VoltDB (NewSQL)
SUNNIE CHUNG
CIS 612

VoltDB

VoltDB is an ACID-compliant relational database management system that uses main memory as storage to maximize performance. It also uses a shared-nothing architecture, in which each node is independent and self-sufficient. This architecture is what lets an RDBMS such as VoltDB scale.

Architecture

VoltDB is a NewSQL relational database system. NewSQL is a class of modern database management systems that seek to provide the same scalable performance as NoSQL while still maintaining the ACID guarantees of a traditional database system.

- Automatic partitioning across a shared-nothing server cluster
- Main-memory data architecture
- Elimination of multi-threading and locking overhead
- Automatic replication and command logging
- Stored procedure interface for transactions

Features

VoltDB uses in-memory storage to maximize throughput, avoiding costly disk access. Further performance gains are achieved by serializing all data access, which avoids many of the time-consuming functions of traditional databases such as locking, latching, and maintaining transaction logs.

Scalability, reliability, and high availability are achieved through clustering and replication across multiple servers and server farms.

Scaling is transparent to applications and can be done in two dimensions:
- Up (by increasing the capacity of existing database nodes)
- Out (by increasing the number of nodes in the cluster)

ACID in VoltDB

VoltDB is a fully ACID-compliant transactional database, which relieves application developers from writing their own code to perform transactions and manage rollbacks. It guarantees that each transaction either completes in full or leaves the data unchanged.
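To make the all-or-nothing behaviour concrete, here is a minimal Python sketch. It is not VoltDB code and does not reflect how the server is implemented. The `run_procedure` helper, the `add_town` procedure, and the dictionary standing in for a partition are all hypothetical names introduced only for illustration.

```python
import copy


def run_procedure(partition, procedure, *args):
    """Run `procedure` against `partition` as one all-or-nothing unit.

    Hypothetical helper: the procedure works on a private copy, and the
    copy is published only if the procedure finishes without raising.
    """
    working = copy.deepcopy(partition)
    try:
        procedure(working, *args)
    except Exception:
        return False           # roll back: the original data is untouched
    partition.clear()
    partition.update(working)  # commit: publish every write together
    return True


def add_town(data, town, county, state):
    """Illustrative procedure: insert one row, then validate the state code."""
    data.setdefault("towns", []).append((town, county, state))
    if len(state) != 2:
        raise ValueError("state must be a two-letter code")


partition = {"towns": []}
ok = run_procedure(partition, add_town, "Billerica", "Middlesex", "MA")
failed = run_procedure(partition, add_town, "Buffalo", "Erie", "New York")
print(ok, failed, partition["towns"])
```

The second call writes a row and then fails validation, yet the partial write never becomes visible. This is the property the application developer would otherwise have to code by hand.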
ACID is ensured by:
- Data is organized into in-memory partitions.
- Clients connect to the database and send transactions.
- Incoming transactions are routed to the data and executed serially.
- Each stored procedure is defined as a transaction; it succeeds or rolls back as a whole, which ensures database consistency.

By using serialized (single-threaded) processing, VoltDB also ensures transactional consistency without the overhead of locking, latching, and transaction logs. Multiple requests are handled at the same time through partitioning. A multi-partition transaction is slower than a single-partition transaction, but integrity is maintained and throughput is maximized.

How VoltDB Works

Tested with VoltDB Enterprise version 4.2 for Mac.

VoltDB is unlike traditional database products in that there is no generic database. Instead, each VoltDB database is optimized for a specific application by compiling the schema, stored procedures, and partitioning information into a VoltDB application catalog. The catalog is then loaded on one or more host machines to create a distributed database.

VoltDB example

Example: a schema is saved in a text file, towns.sql

    CREATE TABLE towns (
        town VARCHAR(128),
        county VARCHAR(64),
        state VARCHAR(2)
    );

Compiling the Application Catalog

    $ voltdb compile towns.sql
    ------------------------------------------
    Successfully created catalog.jar
    Includes schema: towns.sql

    [MP][WRITE] TOWNS.insert
      INSERT INTO TOWNS VALUES (?, ?, ?);
    ------------------------------------------
    Catalog contains 1 built-in CRUD procedures.
      Simple insert, update, delete and select procedures are created
      automatically for convenience.
    ------------------------------------------
    Full catalog report can be found at file:///Users/nqt289/Desktop/voltdb/catalog-report.html
      Or can be viewed at "http://localhost:8080" when the server is running.
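As a rough illustration of the partitioning and serial execution described under ACID in VoltDB, the following Python sketch routes each insert to one partition by hashing a key, and each partition then drains its own queue one transaction at a time. This is a sketch under stated assumptions, not VoltDB's implementation: the partition count, the CRC32-based `partition_for` function, and the `Partition` class are all hypothetical.

```python
import zlib
from collections import deque

NUM_PARTITIONS = 3  # assumed value, chosen only for this sketch


def partition_for(key: str) -> int:
    """Map a partitioning-column value to a partition (illustrative hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


class Partition:
    """One in-memory partition with its own queue of pending transactions."""

    def __init__(self):
        self.rows = []
        self.queue = deque()

    def submit(self, row):
        self.queue.append(row)

    def run_all(self):
        # Serial execution: one transaction at a time per partition,
        # so no locks or latches are needed inside the partition.
        while self.queue:
            self.rows.append(self.queue.popleft())


partitions = [Partition() for _ in range(NUM_PARTITIONS)]
rows = [
    ("Billerica", "Middlesex", "MA"),
    ("Buffalo", "Erie", "NY"),
    ("Bay View", "Erie", "OH"),
]
for row in rows:
    # A single-partition insert touches exactly one partition.
    partitions[partition_for(row[0])].submit(row)
for p in partitions:
    p.run_all()

# Counting all rows has to visit every partition (a multi-partition read).
total = sum(len(p.rows) for p in partitions)
print(total)
```

In this model, different partitions can work independently of each other, which is where the concurrency comes from. An operation that needs every partition, such as the final count, has to coordinate across all of them.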
VoltDB Command

To name the catalog (the default is catalog.jar):

    $ voltdb compile -o towns.jar towns.sql

Starting the Database:

    $ voltdb create towns.jar
    Initializing VoltDB...
    Build: 4.2 voltdb-4.2-0-gc9751d3-local Enterprise Edition
    Connecting to VoltDB cluster as the leader...
    Host id of this node is: 0
    Starting VoltDB with trial license. License expires on May 17, 2014.
    Initializing the database and command logs. This may take a moment...
    WARN: This is not a highly available cluster. K-Safety is set to 0.
    Server completed initialization.

Check the report, schema, procedures, etc. at http://localhost:8080/

Command Line interface

VoltDB provides a SQL shell interpreter that lets users execute VoltDB SQL and stored procedures interactively, as well as non-interactively via scripts. This command line interface is accessed through sqlcmd:

    $ sqlcmd
    SQL Command :: localhost:21212
    1>

Three key options at the sqlcmd prompt:
- SQL queries: for ad hoc SQL queries
- Procedure calls: execute stored procedures
- Exit: to exit the interactive session

VoltDB Query/Syntax

VoltDB supports a subset of ANSI-standard SQL 99, including CREATE INDEX, CREATE TABLE, and CREATE VIEW for schema definition and SELECT, INSERT, UPDATE, and DELETE for data manipulation.

Insert statement:

    1> insert into towns values ('Billerica','Middlesex','MA');
    (1 row(s) affected)
    2> insert into towns values ('Buffalo','Erie','NY');
    (1 row(s) affected)
    3> insert into towns values ('Bay View','Erie','OH');
    (1 row(s) affected)

Select statement:

    4> select count(*) as total from towns;
    TOTAL
    ------
    3

    (1 row(s) affected)

    5> select town, state from towns ORDER BY town;
    TOWN       STATE
    ---------- ------
    Bay View   OH
    Billerica  MA
    Buffalo    NY

    (3 row(s) affected)

Exit:

    6> exit

VoltDB Input

CSV and TXT files are the standard input files for loading data into a VoltDB database. VoltDB provides a simplified CSV loader through the shell script csvloader.
Command:

    csvloader tableName < dataFile.csv
    csvloader tableName -f dataFile.csv

VoltDB Input Example:

Create a database with two tables, towns and people, from a schema saved in towns.sql

    $ voltdb compile -o towns.jar towns.sql
    $ voltdb create towns.jar

Prepare input files

    $ cut -d'|' -f2,4-7,16 POP_PLACES_20140401.txt | grep -v '|$' | grep -v '||' > towns.txt

Loading the data:

    $ csvloader --separator "|" --skip 1 --file towns.txt towns
    Read 194465 rows from file and successfully inserted 194465 rows (final)
    Elapsed time: 4.989 seconds
    Invalid row file: /Users/nqt289/Desktop/voltdb/csvloader_TOWNS_insert_invalidrows.csv
    Log file: /Users/nqt289/Desktop/voltdb/csvloader_TOWNS_insert_log.log
    Report file: /Users/nqt289/Desktop/voltdb/csvloader_TOWNS_insert_report.log

Querying the Database

    1> SELECT town,state,elevation from towns order by elevation desc limit 5;
    TOWN                      STATE  ELEVATION
    ------------------------- ------ ----------
    Corona (historical)       CO     3573
    Quartzville (historical)  CO     3527
    Logtown (historical)      CO     3524
    Tomboy (historical)       CO     3508
    Rexford (historical)      CO     3484

    (5 row(s) affected)

    2> select town, count(town) as duplicates from towns
    3> group by town order by duplicates desc limit 5;
    TOWN         DUPLICATES
    ------------ -----------
    Midway       214
    Fairview     211
    Oak Grove    167
    Five Points  150
    Riverside    130

    (5 row(s) affected)

Load another file: people.txt

    $ csvloader --file people.txt --skip 1 people
    Read 3143 rows from file and successfully inserted 1802 rows (final)
    Elapsed time: 0.467 seconds

Check the "people" table

    1> select * from people order by population desc limit 5;
    STATE_NUM  COUNTY_NUM  STATE       COUNTY              POPULATION
    ---------- ----------- ----------- ------------------- -----------
    6          37          California  Los Angeles County  9818605
    17         31          Illinois    Cook County         5194675
    4          13          Arizona     Maricopa County     3817117
    6          73          California  San Diego County    3095313
    6          59          California  Orange County       3010232
    (5 row(s) affected)

Joining tables

    2> select top 5 min(t.elevation) as height,
    3> t.state,t.county, max(p.population)
    4> from towns as t, people as p
    5> where t.state_num=p.state_num and t.county_num=p.county_num
    6> group by t.state, t.county order by height desc;
    HEIGHT  STATE  COUNTY    C4
    ------- ------ --------- ------
    2754    CO     Lake      7310
    2640    CO     Hinsdale  843
    2609    CO     Mineral   712
    2523    CO     San Juan  699
    2452    CO     Summit    27994

    (5 row(s) affected)

Save and Recover

Because VoltDB keeps its operational data in memory, it provides a tool to save database snapshots. A snapshot is a complete disk-based representation of a VoltDB database, including everything needed to reproduce the database after a shutdown.

Save:

    $ voltadmin save /Users/nqt289/Desktop/voltdb/voltdbroot/snapshots/ "townsandpeople"
    -- Snapshot Save Results --
    HOST_ID HOSTNAME                 TABLE  RESULT  ERR_MSG
    ------- ------------------------ ------ ------- -------
    0       Thuats-MacBook-Pro.local PEOPLE SUCCESS
    0       Thuats-MacBook-Pro.local STATES SUCCESS
    0       Thuats-MacBook-Pro.local TOWNS  SUCCESS
    0       Thuats-MacBook-Pro.local        SUCCESS

Recover:

    $ voltdb recover
    Initializing VoltDB...
    Build: 4.2 voltdb-4.2-0-gc9751d3-local Enterprise Edition
    Connecting to VoltDB cluster as the leader...
    Host id of this node is: 0
    Starting VoltDB with trial license. License expires on May 17, 2014.
    Initializing the database and command logs. This may take a moment...
    WARN: This is not a highly available cluster. K-Safety is set to 0.
    Restoring from path: voltdbroot/snapshots with nonce: townsandpeople
    Finished restore of voltdbroot/snapshots with nonce: townsandpeople in 0.87 seconds
    Server completed initialization.

Adding or dropping tables and changing stored procedures can be done while the database is running. A new catalog is created, and the data can then be recovered. When updating the schema, the deployment.xml is
