
DB2 and NoSQL - The end of relational?

George Baklarz IBM Canada Lab

Session Code: B08 Thursday, September 12 (11:15 am – 12:15 pm) | DB2 for Linux, Unix, and Windows


Agenda
• Market
  • Application APIs
• NoSQL
  • Types of NoSQL Databases
  • Current Market Trends with NoSQL
• JavaScript Object Notation
  • What is it and why do developers like it
• JSON Datastore in DB2
  • MongoDB Query API
  • Java Programming API
  • JSON Wire-listener

The Database Market and Choice

You are here

Database API Support

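[Diagram: language and framework support layered on the DB2 drivers]
• Languages: JavaScript, Java, PHP, Python/Jython, Ruby/JRuby, Scala, node.js
• APIs, frameworks, and adapters: JDBC API, pureQuery, OpenJPA, JSON API, node.js node-odbc driver, Zend adapter, SQLAlchemy adapter, Rails adapters, Lift framework
• Runtimes: Python interpreter (C, Java), Ruby interpreter (C, Java)
• Drivers: DB2 CLI and ODBC driver, DB2 JCC JDBC driver
• Database: DB2

ADO.NET Support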

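[Diagram: ADO.NET stack]
• Application (ASP.NET)
• ADO.NET
• Data providers: ODBC .NET Data Provider, OLE DB .NET Data Provider, DB2 .NET Data Provider, other .NET data providers
• Below the ODBC and OLE DB providers: IDS/DB2 ODBC Driver, IDS/DB2 OLE Provider
• Databases: Informix DB, DB2 DB, DBMS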

Now supporting: Visual Studio 2012, .NET Framework 4.5, Entity Framework 4.3, and Entity Framework 5.0, delivered with DB2 10.5

What is NoSQL? Dominant Flavors

122+ NoSQL Database Offerings Today!

• Key Value Stores
  • Hash of keys, where the data part of key-value is in a binary object
  • Examples of pure key-value stores: …, …, WebSphere eXtreme Scale
• Document Stores
  • Stores documents made up of tagged elements, which have keys and document-like objects
  • Examples: MongoDB, CouchDB
• Column Family
  • Each storage block contains data from only one column/column set
  • Examples: HBase, Cassandra
• Graph Store
  • Key-values are related through graph structure
  • Common Model: RDF
  • Examples: Jena, Sesame

Motivation
• Many apps need fewer database features (simplicity)
• Need rapid application evolution/deployment, with minimal interaction with DBA
• Some apps need extremely high scale (e.g. Twitter)
• Need for a low-latency, low-overhead API to access data
• Increasing use of distributed analytics

NoSQL Market Data
• 2011: Market Research Media noted the worldwide NoSQL market was expected to reach ~$3.4B by 2018, generating $14B from 2013-2018 with a CAGR of 21%
• Comparison: data implies NoSQL market ~$895M
  • MySQL in 2011 ~$100M

"MongoDB is the new MySQL."

[Chart: % NoSQL Enterprise Adoption, 2010, 2014, 2015]

"Current adoption of NoSQL in enterprises is 4% in 2010, expected to double by 2014, and grow to 20% by 2015. 10% of existing apps could move to NoSQL." [Sept 2011]

"NoSQL is in its infancy, many immature with only community support and limited tool capability; however, expect to see a few used more widely during the next 5 years."

Hacker News Trends

The JSON-XML Shift
• Developers find it easier to move data back and forth without losing information in JSON vs. XML
• XML is more powerful and more sophisticated than JSON
• BUT JSON found to be 'good enough' → it makes programming tasks easier
• By the time the RDBMS world got very sophisticated with XML, developers had chosen JSON
• Application shift led to the emergence of databases that store data in JSON (i.e. MongoDB)
• JSON on the server side is appealing for developers using JSON on the client tier side

Open APIs State of the Market
• JSON is the new cool
• XML declining: 5 years ago hardly any JSON
• Why? JSON is:
  • Less verbose and smaller docs size
    • <Mytag>value</Mytag> vs. Mytag:value
  • Tightly integrated with JavaScript which has a lot of focus
  • Most new development tools support JSON and not XML
• 50% of NoSQL Engines are for JSON Documents
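The "less verbose" point can be made concrete with a small sketch. It reuses the Jones record from the data access example in this deck; the XML element names and the wrapping `<record>` element are assumptions made for illustration, and the XML is built by hand rather than with a library.

```javascript
// Record taken from the deck's data access example.
const record = { Lastname: "Jones", Firstname: "Billy", Street: "123 Maple Drive" };

// JSON: each field name is written once, as "name":value.
const asJson = JSON.stringify(record);

// XML (hypothetical layout): each field name appears twice,
// once in the opening tag and once in the closing tag.
const asXml =
  "<record>" +
  Object.entries(record)
    .map(([name, value]) => `<${name}>${value}</${name}>`)
    .join("") +
  "</record>";

// Compare the serialized sizes of the two forms.
console.log("JSON length:", asJson.length);
console.log("XML length:", asXml.length);
```

The gap grows with the number of fields, since every XML element pays for its name twice.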

New Era Application Characteristics
• Applications evolve rapidly as the needs for mobile and Web presence try to keep pace with internet user needs
• Application developers are increasingly looking for solutions that allow nearly continuous integration of application changes
  • Amazon.com allows 1000's of their developers to check in product code changes daily…
• Developers resist solutions that require delays to sync up with DBA change windows

NoSQL JSON Stores are Appealing to Developers
• JSON schema can be evolved rapidly without intervention by DBAs or data modelers.
  • Objects like "shopping cart" in these applications really aren't used outside the Web application, so there is no need to interlock closely with the rest of the enterprise data model
• JSON offers a very simple and elegant model for persisting Java or JavaScript objects, without needing a heavy-weight persistence solution like OpenJPA or Hibernate
• Performance and … is very good for JSON
  • Store a single JSON document representing the object versus
  • Store "n" rows in relational as a "normalized" object
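The "single document versus n rows" contrast can be sketched as follows. The cart object, the table names `CART` and `CART_ITEM`, and the column names are all hypothetical; the point is only to count what has to be written in each model.

```javascript
// Hypothetical shopping cart object; field names are illustrative.
const cart = {
  cartId: 176,
  customer: "Bill",
  items: [
    { sku: "BA9444", qty: 1 },
    { sku: "CR2549", qty: 2 },
  ],
};

// Document store: the whole object is persisted as one JSON document.
const documentsToStore = [JSON.stringify(cart)];

// Relational: the same object is normalized into one parent row
// plus one child row per item, linked by cartId.
function normalize(c) {
  const cartRow = { table: "CART", cartId: c.cartId, customer: c.customer };
  const itemRows = c.items.map((item, i) => ({
    table: "CART_ITEM",
    cartId: c.cartId,
    lineNo: i + 1,
    sku: item.sku,
    qty: item.qty,
  }));
  return [cartRow, ...itemRows];
}

const rowsToStore = normalize(cart);

console.log("documents:", documentsToStore.length);
console.log("rows:", rowsToStore.length);
```

Reading the object back mirrors this: one fetch for the document, versus a join or several selects plus reassembly for the rows.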

Typical JSON Open Source Datastore Attributes
• Logging is often turned off to improve performance
• By default, no return code on insert (a.k.a. "fire and forget")
  • App must verify update was performed
• Data is sharded for scalability
  • Shards are replicated asynchronously for availability
  • Queries to replica nodes can return back-level data sometimes…
• No concept of commit or rollback
  • Each JSON update is independent
• No document-level locking
  • App must manage a "revision" tag to detect document update conflicts
• Applications have to implement compensation logic to update multiple documents with ACID properties
• JSON documents are stored in collections
  • But no "join" across collections
• No document-level or tag-level security
• No built-in temporal or geo-spatial query support
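The application-managed "revision" tag mentioned above can be sketched as a simple optimistic check. This is a minimal illustration against an in-memory `Map`, not the behavior of any particular datastore; the `_rev` field name and the helper functions are assumptions.

```javascript
// In-memory stand-in for a document collection (assumption for illustration).
const store = new Map();

function insert(id, body) {
  store.set(id, { ...body, _rev: 1 });
}

function read(id) {
  // Return a copy so callers cannot mutate the stored document directly.
  return { ...store.get(id) };
}

// The write succeeds only if the caller still holds the latest revision.
function updateIfUnchanged(id, changed) {
  const current = store.get(id);
  if (current._rev !== changed._rev) {
    return false; // another writer updated the document first
  }
  store.set(id, { ...changed, _rev: current._rev + 1 });
  return true;
}

insert("cart-176", { customer: "Bill", total: 54.33 });

// Two clients read the same revision of the document.
const clientA = read("cart-176");
const clientB = read("cart-176");

clientA.total = 60.0;
const firstWrite = updateIfUnchanged("cart-176", clientA);

clientB.total = 10.0;
const secondWrite = updateIfUnchanged("cart-176", clientB);

console.log(firstWrite, secondWrite, read("cart-176"));
```

The second writer has to re-read and retry, which is the conflict handling a store without document-level locking leaves to the application.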

Data Access Example Using JavaScript and JSON
• Relational representation

    Lastname   Firstname   Street
    Jones      Billy       123 Maple Drive

• JSON representation

    var JSON_string = '{"Lastname":"Jones", "Firstname":"Billy", "Street":"123 Maple Drive"}';

• JavaScript data access

    var JSONrecord = JSON.parse(JSON_string);
    var l_name = JSONrecord.Lastname;
    var f_name = JSONrecord.Firstname;
    var l_street = JSONrecord.Street;

Simple Database API for JSON
• Insert a record, a blog post by Joe:
    db.posts.insert({author:"Joe", date:"2012-04-20", post:"…"})
• Find all posts by Joe:
    db.posts.find({author:"Joe"})
• Update date of Joe's post:
    db.posts.update({author:"Joe"}, {$set:{date:"2013-08-22"}})
• Delete all posts of Joe:
    db.posts.remove({author:"Joe"})
• Remove the posts collection:
    db.posts.drop()
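To show what these calls mean semantically, here is a minimal in-memory sketch. The `Collection` class is hypothetical and supports only equality matching and `$set`; it is not the MongoDB or DB2 implementation, and it assumes `update` changes only the first matching document.

```javascript
// Minimal in-memory collection: equality matching and $set only.
class Collection {
  constructor() {
    this.docs = [];
  }
  // A document matches when every field in the query is equal.
  matches(doc, query) {
    return Object.keys(query).every((key) => doc[key] === query[key]);
  }
  insert(doc) {
    this.docs.push({ ...doc });
  }
  find(query) {
    return this.docs.filter((doc) => this.matches(doc, query));
  }
  update(query, change) {
    // Assumption: only the first matching document is changed.
    const target = this.docs.find((doc) => this.matches(doc, query));
    if (target) Object.assign(target, change.$set);
  }
  remove(query) {
    this.docs = this.docs.filter((doc) => !this.matches(doc, query));
  }
}

const posts = new Collection();
posts.insert({ author: "Joe", date: "2012-04-20", post: "…" });
posts.insert({ author: "Ann", date: "2012-05-01", post: "…" });

posts.update({ author: "Joe" }, { $set: { date: "2013-08-22" } });
const joePosts = posts.find({ author: "Joe" });

posts.remove({ author: "Joe" });
const remaining = posts.find({});

console.log(joePosts, remaining);
```

An empty query object matches every document, which is why `find({})` returns whatever is left in the collection.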

The Guardian Move from Oracle to MongoDB

• Existing system: Modern Java Application using Spring, Hibernate, and Oracle RDBMS
  • Load became an issue; was difficult to scale
• Problems:
  • Schema upgrades caused downtime
  • Complexity
    • Found 300 tables complex (imagine SAP)
    • 10,000 lines of Hibernate configuration code found to be buggy
    • 1,000 domain objects for database mapping
    • 70,000 lines of domain object code, tight binding to database
    • ORM did not really mask complexity
    • Database had strong influence on domain model, complex mapping joins
    • Required complex Hibernate features, complex caching strategy, lots of optimizations, hand-coded complex queries in SQL
  • Load problems: introduced caching, and SOLR/… API to optimize reads
    • Introduced memcached to patch up load problems
    • Solved load problems, but increased complexity to 3 models: tables, objects, JSON API

The Guardian Move to MongoDB

Found it a simple key store… too simple

• First project: Migrate Oracle system with 3M users, complex PL-SQL
• Embraced MongoDB:
  • Simple flexible schema with similar query and indexing to RDBMS
  • Multiple domain concepts expressed in single document
  • Can be designed in forwardly extensible way
  • Document oriented
  • Stores parsed JSON documents
  • Can express complex queries
  • Can be flexible about consistency
  • Malleable schema, can change at runtime
  • Can work at both large and small scales
  • Easy for developers to get going
  • Commercial support available (10Gen)
• Produces a net reduction in lines of code and complexity

DB2 JSON Support: Agility With a Trusted Foundation
• Interoperate seamlessly with modern applications
  • Flexible schemas allow rapid delivery of applications
• Preserve traditional DBMS capabilities, leverage existing skills and tools
  • Multi-statement Transactions
  • Management / Operations
  • Security
  • Scale, performance and high availability
• Extend with advanced features (future)
  • Temporal semantics
  • Full Text search
  • Multi-collection joins
  • Combine with Enterprise RDBMS data

DB2 JSON API Technology Details
• IBM provided Java Driver for JSON API
  • Java Driver supporting JSON API for data access layer
  • Transactions
  • Parametric SQL statements (Delete, Select)
  • Temporal tables
• CLP-Like Command Shell
  • Ad-hoc updates / queries
  • Administration commands
• Open Source Driver Wire Listener
  • Leverage NoSQL community drivers for data access layer

DB2 JSON Java API
• Java Driver that translates API calls to SQL + function invocations
• Supports Transactions
• Batches insertions
• Fire-forget inserts (fast)
• Indexing
• Time travel query
• Smart Query re-write
• Java command line

[Diagram: Java Apps; JSON API and JSON Command Shell; JDBC Driver; DRDA; DB2 Engine with JSON UDFs (JSON_TABLE(), JSON_UPDATE(), …)]
• JSON_VAL() - builtin, supports extraction of (SQL) values from BSON
• IoE w/ BLOB in the expression

DB2 NoSQL/JSON API from Java

    /*Set up Conn. and Database handle*/
    Context ctx = new InitialContext();
    DataSource ds = (DataSource)ctx.lookup("jdbc/myDB2");
    Connection conn = ds.getConnection();

    Database db = new Database(conn);
    DBCollection shop = db.getCollection("shop");

    /*Create JSON objects and insert*/
    BasicDBObject cart = new BasicDBObject();
    BasicDBObject amtDue = new BasicDBObject();
    cart.put("sid", "176");
    cart.put("customer", "Bill");
    amtDue.put("subtotal", 50.07);
    amtDue.put("tax", 4.26);
    amtDue.put("total", 54.33);
    cart.put("amtDue", amtDue);
    shop.insert(cart);

    /* Use to fetch back the JSON */
    DBCursor cursor = shop.find(new BasicDBObject("customer", "Bill"));

    try {
        while(cursor.hasNext()) {
            DBObject obj = cursor.next();
            doSomething(obj);
        }
    } finally {
        cursor.close(); //close the cursor
    }

NoSQL JSON Wire Listener
• Built on JSON API
• Leverage community BSON Wire Protocol
• Immediate reach to more applications and developers
• Presence in "New style apps"
• (Future) Extend existing community drivers with DB specific features
  • Multi-statement commit scope
  • Temporal
  • Geo-spatial

[Diagram: Applications (Java, PHP, NodeJS); BSON Wire Protocol; AIM Developed MongoDB Wire Protocol; NoSQL JSON Wire Listener; JSON API; JSON CLP; JDBC Driver; DRDA; DB2 Engine with JSON UDFs (JSON_TABLE(), JSON_UPDATE(), …)]

[Diagram legend: Community Provided Drivers; IBM extension to enable DB2 features]

NodeJS Code Sample

    var databaseUrl = "shop";
    var collections = ["cart"];
    var db = require("mongojs").connect(databaseUrl, collections);

    …
    // Save a cart document into the "cart" collection
    db.cart.save({sid: "176", customer: "[email protected]",
                  amtDue: {subtotal: 50.07, tax: 4.26, total: 54.33}},
      function(err, saved) {
        if (err || !saved) console.log("cart not saved");
        else console.log("cart saved");
      });

    …
    // Find all carts for one customer and print each one
    db.cart.find({customer: "[email protected]"},
      function(err, carts) {
        if (err || !carts) console.log("No carts found");
        else carts.forEach(function(iCart) { console.log(iCart); });
      });

JavaScript Access to DB2 – Node.js

    db.query("select * from employee fetch first 5 rows only",
      function(err, rows, moreResultSets)
      {
        for (var i=0;i …

Indexes
• Simple index – Ascending Index on SID
    db.collection.ensureIndex({sid:[1, "$int"]});
• Simple Index – Ascending Index on Customer Field
    db.collection.ensureIndex({"customer":1});
• Composite index – Customer Ascending with Total Descending
    db.collection.ensureIndex({customer:[1, "$string", 20], total:[-1, "$int"]});
• Index on nested object
    db.collection.ensureIndex({"amtDue.total":[1, "$int"]});
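The ordering implied by a composite index with mixed directions (customer ascending, total descending) can be sketched with a plain comparator. This only illustrates the key ordering; it says nothing about how DB2 or MongoDB store the index, and the sample data and the `keySpec` layout are assumptions.

```javascript
// Hypothetical cart summaries.
const carts = [
  { customer: "Bill", total: 54 },
  { customer: "Ann", total: 20 },
  { customer: "Bill", total: 99 },
  { customer: "Ann", total: 75 },
];

// Key spec in the spirit of {customer: 1, total: -1}:
// 1 means ascending, -1 means descending.
const keySpec = [
  ["customer", 1],
  ["total", -1],
];

// Compare two documents field by field, honoring each field's direction.
function compareByKeys(a, b) {
  for (const [field, direction] of keySpec) {
    if (a[field] < b[field]) return -direction;
    if (a[field] > b[field]) return direction;
  }
  return 0;
}

// Sort a copy so the original array is left untouched.
const indexOrder = [...carts].sort(compareByKeys);

console.log(indexOrder);
```

A query that asks for one customer's carts from largest to smallest total can read entries in this order without a separate sort step.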

What is JSON's Role in the Enterprise?
• Flexible Schema is agile, liberating for application developers
• But will we abandon years of expertise in data modeling / normalization theory?
  • How to maintain control in an enterprise, mission critical DBMS?
• Identification of appropriate applications is critical
• Application deployment procedures need to adapt
  • New controls to prevent schema chaos
  • Application Development Groups need to implement controls
• When combining with application that uses relational schema
  • Identify portions that need to remain dynamic
  • Allocate / accommodate space for that as JSON
  • Future – combination of SQL and JSON will make this easier

"If I have seen further, it is by standing on the shoulders of giants" - Sir Isaac Newton

What Format to Use?
• Consider NoSQL JSON when:
  • Application and schema subject to frequent changes
  • Prototyping, early stages of application development
  • De-normalized data has advantages
    • Entity / document is in the form you want to save
    • Read efficiency – return in one fetch without sorting, grouping or ORM mapping
  • "Systems of Engagement"
    • Less stringent "CAP" requirements in favor of speed
    • Eventual consistency is good enough
    • Social media
• Relational still best suited when these are critical:
  • Data Normalization to
    • Eliminate redundancy
    • Ensure master data consistency
  • Database enforced constraints
  • Database-server JOINs on secondary indexes

Data Normalization - Choose the Right Solution

Relational
[Diagram: simple normalized schema (DB2 sample) with relational constraints]

NoSQL JSON - Two approaches:

Embedded (de-normalized): chance for data redundancy

    {dept: "A10",
     deptname: "Shipping",
     manager: "Samuel",
     emp: [
       {empno: "000999", lastname: "Harrison", edlevel: "16"},
       {empno: "370001", lastname: "Davis", edlevel: "12"}
     ],
     proj: [
       {projno: "397", projname: "Site Renovation", respemp: "370001"},
       {projno: "397", projname: "Site Renovation", respemp: "370001"}
       …
     ]
    }

Using references: requires application-side join
[Diagram: three documents, each with its own _id, linked through "dept ref" and "emp ref" fields]

If you need normalization and database-enforced constraints, JSON may not be the best choice
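The application-side join that the reference approach requires can be sketched as follows. The two arrays stand in for separate collections, and the `deptRef` field name is an assumption; the department and employee values come from the embedded example on this slide.

```javascript
// Hypothetical collections that use references instead of embedding.
const depts = [{ _id: "A10", deptname: "Shipping" }];
const emps = [
  { _id: "000999", lastname: "Harrison", deptRef: "A10" },
  { _id: "370001", lastname: "Davis", deptRef: "A10" },
];

// The application performs the join: one lookup for the department,
// then a second pass to collect the employees that point at it.
function deptWithEmployees(deptId) {
  const dept = depts.find((d) => d._id === deptId);
  if (!dept) return null;
  const members = emps.filter((e) => e.deptRef === deptId);
  return { ...dept, emp: members.map((e) => e.lastname) };
}

const joined = deptWithEmployees("A10");
const missing = deptWithEmployees("Z99");

console.log(joined, missing);
```

Nothing in the store stops an employee from carrying a `deptRef` that matches no department, which is the kind of constraint a relational foreign key would enforce.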

Using Schema-less Model In a Traditional Environment
• During prototyping, early development, validation
  • As system stabilizes, stable schema parts could be moved to relational columns
• For cases where web message or document will be retrieved as-is
  • Yet retrieval on internal fields is needed
• When parts of the schema will always be dynamic
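Moving the stable parts of a document into relational columns while leaving the rest as JSON might look like the sketch below. The choice of stable fields, the `json_data` column name, and the sample product are assumptions for illustration only.

```javascript
// Hypothetical choice of fields whose shape has stabilized.
const stableColumns = ["prodnum", "name", "price"];

// Split a document into fixed columns plus one JSON column
// that holds everything that is still dynamic.
function splitDocument(doc) {
  const row = {};
  const dynamicPart = {};
  for (const [key, value] of Object.entries(doc)) {
    if (stableColumns.includes(key)) {
      row[key] = value;
    } else {
      dynamicPart[key] = value;
    }
  }
  row.json_data = JSON.stringify(dynamicPart);
  return row;
}

const productRow = splitDocument({
  prodnum: "BA9444",
  name: "Mahogany Desk",
  price: 349.0,
  details: { construction: "veneer", weight: 80 },
});

console.log(productRow);
```

As more fields settle, they can be added to the stable list and promoted to columns without touching the fields that remain dynamic.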

JSON Use Case – Inheritance of Common Fields
• Documents share a common structure but products may have unique variations
• Example:
  • website stores product descriptions in single collection
  • All have product number, price, supplier, name, description
  • Different product types have unique fields
• As new products are introduced they need no change
• Common fields are indexed, others are queryable but not indexed

    {prodnum: "BA9444",
     name: "Mahogany Desk",
     type: "furniture",
     price: 349.00,
     description: "Small Writing Desk",
     supplier: "Elegant Wood Designs",
     details: {
       construction: "veneer",
       weight: 80,
       units: "pounds",
       dimensions: {height: 29, width: 48, depth: 28, units: "inches"}
     }
    }

    {prodnum: "CR2549",
     name: "Gulliver's Travels",
     type: "book",
     price: 15.97,
     description: "Classic novel",
     supplier: "Penguin Group",
     details: {
       author: "Jonathan Swift",
       categories: ["adventure", "travel", "fantasy"],
       publish_date: 1726
     }
    }
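Querying such a mixed collection can be sketched in a few lines. The two products are abbreviated from the slide; the `getPath` helper is hypothetical and simply walks a dotted field path, returning `undefined` when a document lacks that field.

```javascript
// Two product types sharing common fields, abbreviated from the slide.
const products = [
  { prodnum: "BA9444", type: "furniture", price: 349.0, details: { weight: 80 } },
  { prodnum: "CR2549", type: "book", price: 15.97, details: { author: "Jonathan Swift" } },
];

// Walk a dotted path such as "details.author"; missing steps yield undefined.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Query on a common field: works for every product type.
const under100 = products.filter((p) => p.price < 100).map((p) => p.prodnum);

// Query on a type-specific field: documents without it simply do not match.
const bySwift = products
  .filter((p) => getPath(p, "details.author") === "Jonathan Swift")
  .map((p) => p.prodnum);

console.log(under100, bySwift);
```

A new product type with its own `details` fields can be added to the array without changing either query.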

JSON and DB2 – Complementary Technologies
• Does NoSQL mean NoDBA? NoDB2?
  • Definitely not - the relational database isn't going away anytime soon
  • We see JSON as becoming a complementary technology to relational
• Transactional atomicity is essential for mission critical business transactions
  • DB2 JSON Store solution brings commits, transaction scope
• Possible Future Direction – access JSON data directly with SQL
  • JSON data co-exists with relational columns in the same table
  • Enables the proper balance between fixed schema and flexible schema

DB2 JSON Summary
• MongoDB Wire Listener
  • Leverage NoSQL community drivers for data access layer
• IBM provided Java Driver for JSON API
  • Java Driver supporting JSON API for data access layer
  • Transactions
  • Parametric SQL statements (Delete, Select)
  • Temporal tables
• Insert, Update, Delete, Select support
  • Select projection list
  • Batching, order by, paging queries (API only)
  • Fire and forget inserts
  • Limited aggregate functions (group-by / having)
• Indexing support in API
  • Primary index and secondary single value index
• Import/Export
  • Import/Export from/to MongoDB export JS-files
• Command line tools
  • Execute JSON queries and display results
• Install
  • Files and scripts that are part of server and DS Driver

George Baklarz
IBM Canada Laboratory
baklarz@ca..com