NoSQL: The New Normal

Matt Asay (@mjasay) VP, Corporate Strategy, 10gen 10gen Overview

180+ employees 500+ customers

Offices in New York, Palo Alto, Washington Over $81 million in funding DC, London, Dublin, Barcelona and Sydney

2 Global Community

3,800,000+ MongoDB Downloads 47,000+ Online Education Registrants 15,000+ MongoDB User Group Members 14,000+ MongoDB Monitoring Service Users (MMS) 10,000+ Annual MongoDB Days Attendees

3 the past… is prologue? Back to the Future?

What has been will be again, what has been done will be done again; there is nothing new under the sun.

(Ecclesiastes 1:9, NIV)

5 What Do You Remember about 1969?

6 nothing? hold that thought

7 Back to the Future: PreSQL

• IBM’s IMS (1969) – Developed as part of the Apollo Project • IDS (), navigational , 1973 • High performance but: – Forced developers to worry about both query design and schema design upfront – Made it hard to change anything mid-stream

8 Enter SQL

• Designed to overcome “PreSQL” deficiencies – Decoupled query design from schema design – Allowed developers to focus on schema design – Could be confident that you could query the data as you wanted later

• 30 years of dominance later…

9 …the present… RDBMS Is Like a Spreadsheet

11 With “Relations” Between Rows

12 Lots of relations. Lots of rows.

13 It Hides What You’re Really Doing

14 It Makes Development Hard

Code XML Config DB Schema

Object Relational Application Relational Mapping Database

15 And Makes Things Hard to Change

New Table New Column New Table

Name Pet Phone Email

New Column

3 months later…

16 RDBMS Scale = Bigger Computers

“Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.”

IBM Press Release 28 Aug, 2012

17 This Was a Problem for Google

2010 Search Index Size: 100,000,000 GB

New data added per day 100,000+ GB

Databases they could use 0 4.1miles== 250,000+ MBP’s

18 Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine. And for Facebook

2010: 13,000,000 queries per second

19 And for Facebook

2010: 13,000,000 queries per second

TPC #1 DB: 504161 tps

TPC Top Results

20 And for Facebook

2010: 13,000,000 queries per second

TPC #1 DB: 504161 tps

TPC Top Results

Top 10 combined: 1,370,368 tps

21 Big Data Changes Everything… for Everyone

Variety of Data Agile Development • Unstructured data • Iterative • Semi-structured • Short development data cycles • Polymorphic data • New workloads

Volume/Velocity of Data New Architectures • Petabytes of data • Horizontal scaling • Trillions of records • Commodity servers • Millions of queries per • Cloud computing second

22 Shift in What We’re Computing

23 Living in the Post-transactional Future

Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL)

Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS)

Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL)

24 Shift in How We Develop Applications

“Systems of Engagement are built by front-line developers using modern languages who are driven by time to market, the need for rapid deployment and iteration….They value solutions that make it easy for them to deploy their application code with as little friction as possible.” (Forrester 2013)

25

Why NoSQL?

Agile Scalable Lower TCO

27 what does this mean? Developers Are More Productive

Code XML Config DB Schema

Object Relational Application Relational Mapping Database

29 Developers Are More Productive

Code XML Config DB Schema

Object Relational Application Relational Mapping Database

30 Developers? They Matter

31 some examples Agility

RDBMS MongoDB

{ _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] }

33 Example

Serves targeted content to users using MongoDB- powered identity system

Problem Why MongoDB Results

• 20M+ unique visitors • Easy-to-manage • Rapid rollout of new per month dynamic data model features enables limitless • Rigid relational schema growth, interactive • Customized, social unable to evolve with content conversations changing data types throughout site and new features • Support for ad hoc • Tracks user data to queries • Slow development increase engagement, cycles • Highly extensible revenue

34 Scalability

Auto-Sharding

• Increase capacity as you go • Commodity and cloud architectures • Improved operational simplicity and cost visibility

35 Case Study

Manages a wide range of content and services for its web properties using MongoDB

Problem Why MongoDB Results

• Trouble dealing with a • Move from 6 billion rows • Significant performance huge variety of content in RDBMS to simplicity gains despite big • MySQL unable to keep of 1 document increase in volume and variety of data up with performance • Automated failover and and scalability requirements ability to add nodes • Greater agility, faster without downtime development iteration • Problems compounded by integrating • “Blazingly fast” query • Saved £2m in licenses information from T- performance: “blown and hardware Mobile joint venture away by [MongoDB’s] performance”

36 Better Total Cost of Ownership (TCO)

Developer/Ops Savings • Ease of Use • Agile development • Less maintenance Hardware Savings • Commodity servers • Internal storage (no SAN) • Scale out, not up

Software/Support Savings DB Alternative • No upfront license • Cost visibility for usage growth

37 Case Study

Stores one of world’s largest record repositories and searchable catalogues in MongoDB

Problem Why MongoDB Results

• One of world’s largest • Fast, easy scalability • Will scale to 100s of TB record repositories by 2013, PB by 2020 • Full query language • Move to SOA required • Searchable catalogue • Complex metadata new approach to data of varied data types store storage • Decreased SW and • Delivers high scalability, • RDBMS could not support costs support centralized data fast performance, and mgt and federation of easy maintenance, while information services keeping support costs low

38 Performance

Better Data In-Memory In-Place Locality Caching Updates

39 Case Study

Uses MongoDB to safeguard over 6 billion images served to millions of customers

Problem Why MongoDB Results

• 6B images, 20TB of • JSON-based data • 5x cost reduction data model • 9x performance • Brittle code base on top • Agile, high improvement of Oracle database – performance, scalable hard to scale, add • Faster time-to-market • Alignment with features • Dev cycles in weeks vs. Shutterfly’s services- • High SW and HW costs based architecture tens of months

40 …the future Leading Organizations Rely on NoSQL

42 NoSQL Adoption

NoSQL First Policy MultipleNoSQL NoSQLCentre of ProjectsExcellence Multiple NoSQL First Projects NoSQL Project

43 NoSQL: The New Normal

Document Store Meets Requirements

RDBMSs Meet Requirements

Key/Value or Column Stores Meet Requirements

44 Is It a Polyglot Future?

45 Remember the Long Tail?

46 It Didn’t Work out Well

47 General Purpose, High Performance

Database Popularity Jobs, Searches, Mentions, Etc.

Rank Database Database Type Score % Change 1 Oracle RDBMS 1567.93 + 8.6 Microsoft SQL 2 Server RDBMS 1310.61 - 0.73 3 MySQL RDBMS 1284.78 - 28.89 4 Microsoft Access RDBMS 186.05 + 15.12 5 PostgreSQL RDBMS 183.47 + 16 6 DB2 RDBMS 162.05 + 8.39 7 MongoDB Document store 116.14 + 20.01 8 Sybase RDBMS 84.52 + 1.07 9 SQLite RDBMS 81.01 + 0.65 10 Solr Search engine 47.35 Wide column 11 Cassandra store 36.16 + 1.24 12 Redis Key-value store 32.03 + 6.06 13 Informix RDBMS 25.43 + 1.52 14 Memcached Key-value store 23.28 - 1.83 Wide column 15 Hbase store 20.74 + 1.05 Source: DBEngines, Feb 2013

48 NoSQL Benefits

Increased Developer Productivity Better User Experience

Faster Time to Market Lower TCO

49

Leading NoSQL Database

Google Search LinkedIn Job Skills

MongoDB

Competitor 1

Competitor 2

Competitor 3

Competitor 4

Competitor 5 MongoDB Competitor 2 Competitor 4 All Others Competitor 1 Competitor 3

Jaspersoft Big Data Index Indeed.com Trends

Direct Real-Time Downloads Top Job Trends 1. HTML 5 MongoDB 2. MongoDB Competitor 1 3. iOS 4. Android Competitor 2 5. Mobile Apps 6. Puppet Competitor 3 7. Hadoop 8. jQuery 9. PaaS 51 10. Social Media