NoSQL: The New Normal
Matt Asay (@mjasay) VP, Corporate Strategy, 10gen 10gen Overview
180+ employees 500+ customers
Offices in New York, Palo Alto, Washington Over $81 million in funding DC, London, Dublin, Barcelona and Sydney
2 Global Community
3,800,000+ MongoDB Downloads 47,000+ Online Education Registrants 15,000+ MongoDB User Group Members 14,000+ MongoDB Monitoring Service Users (MMS) 10,000+ Annual MongoDB Days Attendees
3 the past… is prologue? Back to the Future?
What has been will be again, what has been done will be done again; there is nothing new under the sun.
(Ecclesiastes 1:9, NIV)
5 What Do You Remember about 1969?
6 nothing? hold that thought
7 Back to the Future: PreSQL
• IBM’s IMS (1969) – Developed as part of the Apollo Project • IDS (Integrated Data Store), navigational database, 1973 • High performance but: – Forced developers to worry about both query design and schema design upfront – Made it hard to change anything mid-stream
8 Enter SQL
• Designed to overcome “PreSQL” deficiencies – Decoupled query design from schema design – Allowed developers to focus on schema design – Could be confident that you could query the data as you wanted later
• 30 years of dominance later…
9 …the present… RDBMS Is Like a Spreadsheet
11 With “Relations” Between Rows
12 Lots of relations. Lots of rows.
13 It Hides What You’re Really Doing
14 It Makes Development Hard
Code XML Config DB Schema
Object Relational Application Relational Mapping Database
15 And Makes Things Hard to Change
New Table New Column New Table
Name Pet Phone Email
New Column
3 months later…
16 RDBMS Scale = Bigger Computers
“Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.”
IBM Press Release 28 Aug, 2012
17 This Was a Problem for Google
2010 Search Index Size: 100,000,000 GB
New data added per day 100,000+ GB
Databases they could use 0 4.1miles== 250,000+ MBP’s
18 Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html And for Facebook
2010: 13,000,000 queries per second
19 And for Facebook
2010: 13,000,000 queries per second
TPC #1 DB: 504161 tps
TPC Top Results
20 And for Facebook
2010: 13,000,000 queries per second
TPC #1 DB: 504161 tps
TPC Top Results
Top 10 combined: 1,370,368 tps
21 Big Data Changes Everything… for Everyone
Variety of Data Agile Development • Unstructured data • Iterative • Semi-structured • Short development data cycles • Polymorphic data • New workloads
Volume/Velocity of Data New Architectures • Petabytes of data • Horizontal scaling • Trillions of records • Commodity servers • Millions of queries per • Cloud computing second
22 Shift in What We’re Computing
23 Living in the Post-transactional Future
Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL)
Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS)
Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL)
24 Shift in How We Develop Applications
“Systems of Engagement are built by front-line developers using modern languages who are driven by time to market, the need for rapid deployment and iteration….They value solutions that make it easy for them to deploy their application code with as little friction as possible.” (Forrester 2013)
25
Why NoSQL?
Agile Scalable Lower TCO
27 what does this mean? Developers Are More Productive
Code XML Config DB Schema
Object Relational Application Relational Mapping Database
29 Developers Are More Productive
Code XML Config DB Schema
Object Relational Application Relational Mapping Database
30 Developers? They Matter
31 some nosql examples Agility
RDBMS MongoDB
{ _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] }
33 Example
Serves targeted content to users using MongoDB- powered identity system
Problem Why MongoDB Results
• 20M+ unique visitors • Easy-to-manage • Rapid rollout of new per month dynamic data model features enables limitless • Rigid relational schema growth, interactive • Customized, social unable to evolve with content conversations changing data types throughout site and new features • Support for ad hoc • Tracks user data to queries • Slow development increase engagement, cycles • Highly extensible revenue
34 Scalability
Auto-Sharding
• Increase capacity as you go • Commodity and cloud architectures • Improved operational simplicity and cost visibility
35 Case Study
Manages a wide range of content and services for its web properties using MongoDB
Problem Why MongoDB Results
• Trouble dealing with a • Move from 6 billion rows • Significant performance huge variety of content in RDBMS to simplicity gains despite big • MySQL unable to keep of 1 document increase in volume and variety of data up with performance • Automated failover and and scalability requirements ability to add nodes • Greater agility, faster without downtime development iteration • Problems compounded by integrating • “Blazingly fast” query • Saved £2m in licenses information from T- performance: “blown and hardware Mobile joint venture away by [MongoDB’s] performance”
36 Better Total Cost of Ownership (TCO)
Developer/Ops Savings • Ease of Use • Agile development • Less maintenance Hardware Savings • Commodity servers • Internal storage (no SAN) • Scale out, not up
Software/Support Savings DB Alternative • No upfront license • Cost visibility for usage growth
37 Case Study
Stores one of world’s largest record repositories and searchable catalogues in MongoDB
Problem Why MongoDB Results
• One of world’s largest • Fast, easy scalability • Will scale to 100s of TB record repositories by 2013, PB by 2020 • Full query language • Move to SOA required • Searchable catalogue • Complex metadata new approach to data of varied data types store storage • Decreased SW and • Delivers high scalability, • RDBMS could not support costs support centralized data fast performance, and mgt and federation of easy maintenance, while information services keeping support costs low
38 Performance
Better Data In-Memory In-Place Locality Caching Updates
39 Case Study
Uses MongoDB to safeguard over 6 billion images served to millions of customers
Problem Why MongoDB Results
• 6B images, 20TB of • JSON-based data • 5x cost reduction data model • 9x performance • Brittle code base on top • Agile, high improvement of Oracle database – performance, scalable hard to scale, add • Faster time-to-market • Alignment with features • Dev cycles in weeks vs. Shutterfly’s services- • High SW and HW costs based architecture tens of months
40 …the future Leading Organizations Rely on NoSQL
42 NoSQL Adoption
NoSQL First Policy MultipleNoSQL NoSQLCentre of ProjectsExcellence Multiple NoSQL First Projects NoSQL Project
43 NoSQL: The New Normal
Document Store Meets Requirements
RDBMSs Meet Requirements
Key/Value or Column Stores Meet Requirements
44 Is It a Polyglot Future?
45 Remember the Long Tail?
46 It Didn’t Work out Well
47 General Purpose, High Performance
Database Popularity Jobs, Searches, Mentions, Etc.
Rank Database Database Type Score % Change 1 Oracle RDBMS 1567.93 + 8.6 Microsoft SQL 2 Server RDBMS 1310.61 - 0.73 3 MySQL RDBMS 1284.78 - 28.89 4 Microsoft Access RDBMS 186.05 + 15.12 5 PostgreSQL RDBMS 183.47 + 16 6 DB2 RDBMS 162.05 + 8.39 7 MongoDB Document store 116.14 + 20.01 8 Sybase RDBMS 84.52 + 1.07 9 SQLite RDBMS 81.01 + 0.65 10 Solr Search engine 47.35 Wide column 11 Cassandra store 36.16 + 1.24 12 Redis Key-value store 32.03 + 6.06 13 Informix RDBMS 25.43 + 1.52 14 Memcached Key-value store 23.28 - 1.83 Wide column 15 Hbase store 20.74 + 1.05 Source: DBEngines, Feb 2013
48 NoSQL Benefits
Increased Developer Productivity Better User Experience
Faster Time to Market Lower TCO
49
Leading NoSQL Database
Google Search LinkedIn Job Skills
MongoDB
Competitor 1
Competitor 2
Competitor 3
Competitor 4
Competitor 5 MongoDB Competitor 2 Competitor 4 All Others Competitor 1 Competitor 3
Jaspersoft Big Data Index Indeed.com Trends
Direct Real-Time Downloads Top Job Trends 1. HTML 5 MongoDB 2. MongoDB Competitor 1 3. iOS 4. Android Competitor 2 5. Mobile Apps 6. Puppet Competitor 3 7. Hadoop 8. jQuery 9. PaaS 51 10. Social Media