Nosql: the New Normal

NoSQL: The New Normal Matt Asay (@mjasay) VP, Corporate Strategy, 10gen 10gen Overview 180+ employees 500+ customers Offices in New York, Palo Alto, Washington Over $81 million in funding DC, London, Dublin, Barcelona and Sydney 2 Global Community 3,800,000+ MongoDB Downloads 47,000+ Online Education Registrants 15,000+ MongoDB User Group Members 14,000+ MongoDB Monitoring Service Users (MMS) 10,000+ Annual MongoDB Days Attendees 3 the past… is prologue? Back to the Future? What has been will be again, what has been done will be done again; there is nothing new under the sun. (Ecclesiastes 1:9, NIV) 5 What Do You Remember about 1969? 6 nothing? hold that thought 7 Back to the Future: PreSQL • IBM’s IMS (1969) – Developed as part of the Apollo Project • IDS (Integrated Data Store), navigational database, 1973 • High performance but: – Forced developers to worry about both query design and schema design upfront – Made it hard to change anything mid-stream 8 Enter SQL • Designed to overcome “PreSQL” deficiencies – Decoupled query design from schema design – Allowed developers to focus on schema design – Could be confident that you could query the data as you wanted later • 30 years of dominance later… 9 …the present… RDBMS Is Like a Spreadsheet 11 With “Relations” Between Rows 12 Lots of relations. Lots of rows. 13 It Hides What You’re Really Doing 14 It Makes Development Hard Code XML Config DB Schema Object Relational Application Relational Mapping Database 15 And Makes Things Hard to Change New Table New Column New Table Name Pet Phone Email New Column 3 months later… 16 RDBMS Scale = Bigger Computers “Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.” IBM Press Release 28 Aug, 2012 17 This Was a Problem for Google 2010 Search Index Size: 100,000,000 GB New data added per day 100,000+ GB Databases they could use 0 == 4.1miles 250,000+ MBP’s 18 Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html And for Facebook 2010: 13,000,000 queries per second 19 And for Facebook 2010: 13,000,000 queries per second TPC #1 DB: 504161 tps TPC Top Results 20 And for Facebook 2010: 13,000,000 queries per second TPC #1 DB: 504161 tps TPC Top Results Top 10 combined: 1,370,368 tps 21 Big Data Changes Everything… for Everyone Variety of Data Agile Development • Unstructured data • Iterative • Semi-structured • Short development data cycles • Polymorphic data • New workloads Volume/Velocity of Data New Architectures • Petabytes of data • Horizontal scaling • Trillions of records • Commodity servers • Millions of queries per • Cloud computing second 22 Shift in What We’re Computing 23 Living in the Post-transactional Future Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL) Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS) Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL) 24 Shift in How We Develop Applications “Systems of Engagement are built by front-line developers using modern languages who are driven by time to market, the need for rapid deployment and iteration….They value solutions that make it easy for them to deploy their application code with as little friction as possible.” (Forrester 2013) 25 Why NoSQL? Agile Scalable Lower TCO 27 what does this mean? Developers Are More Productive Code XML Config DB Schema Object Relational Application Relational Mapping Database 29 Developers Are More Productive Code XML Config DB Schema Object Relational Application Relational Mapping Database 30 Developers? They Matter 31 some nosql examples Agility RDBMS MongoDB { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] } 33 Example Serves targeted content to users using MongoDB- powered identity system Problem Why MongoDB Results • 20M+ unique visitors • Easy-to-manage • Rapid rollout of new per month dynamic data model features enables limitless • Rigid relational schema growth, interactive • Customized, social unable to evolve with content conversations changing data types throughout site and new features • Support for ad hoc • Tracks user data to queries • Slow development increase engagement, cycles • Highly extensible revenue 34 Scalability Auto-Sharding • Increase capacity as you go • Commodity and cloud architectures • Improved operational simplicity and cost visibility 35 Case Study Manages a wide range of content and services for its web properties using MongoDB Problem Why MongoDB Results • Trouble dealing with a • Move from 6 billion rows • Significant performance huge variety of content in RDBMS to simplicity gains despite big • MySQL unable to keep of 1 document increase in volume and variety of data up with performance • Automated failover and and scalability requirements ability to add nodes • Greater agility, faster without downtime development iteration • Problems compounded by integrating • “Blazingly fast” query • Saved £2m in licenses information from T- performance: “blown and hardware Mobile joint venture away by [MongoDB’s] performance” 36 Better Total Cost of Ownership (TCO) Developer/Ops Savings • Ease of Use • Agile development • Less maintenance Hardware Savings • Commodity servers • Internal storage (no SAN) • Scale out, not up Software/Support Savings DB Alternative • No upfront license • Cost visibility for usage growth 37 Case Study Stores one of world’s largest record repositories and searchable catalogues in MongoDB Problem Why MongoDB Results • One of world’s largest • Fast, easy scalability • Will scale to 100s of TB record repositories by 2013, PB by 2020 • Full query language • Move to SOA required • Searchable catalogue • Complex metadata new approach to data of varied data types store storage • Decreased SW and • Delivers high scalability, • RDBMS could not support costs support centralized data fast performance, and mgt and federation of easy maintenance, while information services keeping support costs low 38 Performance Better Data In-Memory In-Place Locality Caching Updates 39 Case Study Uses MongoDB to safeguard over 6 billion images served to millions of customers Problem Why MongoDB Results • 6B images, 20TB of • JSON-based data • 5x cost reduction data model • 9x performance • Brittle code base on top • Agile, high improvement of Oracle database – performance, scalable hard to scale, add • Faster time-to-market • Alignment with features • Dev cycles in weeks vs. Shutterfly’s services- • High SW and HW costs based architecture tens of months 40 …the future Leading Organizations Rely on NoSQL 42 NoSQL Adoption NoSQL First Policy MultipleNoSQL NoSQLCentre of ProjectsExcellence Multiple NoSQL First Projects NoSQL Project 43 NoSQL: The New Normal Document Store Meets Requirements RDBMSs Meet Requirements Key/Value or Column Stores Meet Requirements 44 Is It a Polyglot Future? 45 Remember the Long Tail? 46 It Didn’t Work out Well 47 General Purpose, High Performance Database Popularity Jobs, Searches, Mentions, Etc. Rank Database Database Type Score % Change 1 Oracle RDBMS 1567.93 + 8.6 Microsoft SQL 2 Server RDBMS 1310.61 - 0.73 3 MySQL RDBMS 1284.78 - 28.89 4 Microsoft Access RDBMS 186.05 + 15.12 5 PostgreSQL RDBMS 183.47 + 16 6 DB2 RDBMS 162.05 + 8.39 7 MongoDB Document store 116.14 + 20.01 8 Sybase RDBMS 84.52 + 1.07 9 SQLite RDBMS 81.01 + 0.65 10 Solr Search engine 47.35 Wide column 11 Cassandra store 36.16 + 1.24 12 Redis Key-value store 32.03 + 6.06 13 Informix RDBMS 25.43 + 1.52 14 Memcached Key-value store 23.28 - 1.83 Wide column 15 Hbase store 20.74 + 1.05 Source: DBEngines, Feb 2013 48 NoSQL Benefits Increased Developer Productivity Better User Experience Faster Time to Market Lower TCO 49 Leading NoSQL Database Google Search LinkedIn Job Skills MongoDB Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 MongoDB Competitor 2 Competitor 4 All Others Competitor 1 Competitor 3 Jaspersoft Big Data Index Indeed.com Trends Direct Real-Time Downloads Top Job Trends 1. HTML 5 MongoDB 2. MongoDB Competitor 1 3. iOS 4. Android Competitor 2 5. Mobile Apps 6. Puppet Competitor 3 7. Hadoop 8. jQuery 9. PaaS 51 10. Social Media .

Nosql: the New Normal

Demonstrate an Understanding of Computer Database Management Systems 114049

L21: Data Models & Relational Algebra

What Is Database? Types and Examples

Graph-Based Source Code Analysis of Dynamically Typed Languages

Database History

Databases Theoretical Introduction Contents

Graph Databases

Thomas Schwarz SJ

Query Analysis on a Distributed Graph Database Student: Bc

The Neo Database – a Technology Introduction (20061123)

Comparative Analysis of Legacy Database Systems P a G E | 1

Overview of Database Evolution and Elevation