Couchbase Architecture
©2015 Couchbase Inc.

$whoami

Laurent Doguin
Couchbase Developer Advocate
@ldoguin | [email protected]

Big Data = Operational + Analytic (NoSQL + Hadoop)

Operational (velocity): real-time, interactive databases
- Online
- Web/Mobile/IoT apps
- Millions of customers/consumers

Analytical (volume): batch-oriented analytic databases
- Offline, batch-oriented
- Analytics apps
- Hundreds of business analysts

Key Capabilities

Combines the flexibility of JSON, the power of SQL and the scale of NoSQL

Develop with Agility
- Multiple data models
- N1QL: SQL-like query language
- Multiple indexes
- Languages, ODBC / JDBC drivers and frameworks you already know

Operate at Any Scale
- Push-button scalability
- Consistent high performance
- Always on 24x7 with HA - DR
- Easy administration with Web UI, REST API and CLI

Couchbase provides a complete Data Management solution

General purpose capabilities support a broad range of apps and use cases
- Highly available cache
- Key-value store
- Document database
- Embedded database
- Sync management

Enterprises use Couchbase to enable key objectives
- Profile Management
- 360 Degree Customer View
- Internet of Things
- Mobile Applications
- Personalization
- Content Management
- Catalog
- Real Time Big Data
- Digital Communication
- Fraud Detection

Develop with Agility

What does a JSON document look like?

    {
      "ID": 1,
      "FIRST": "Dipti",
      "LAST": "Borkar",
      "ZIP": "94040",
      "CITY": "MV",
      "STATE": "CA"
    }

All data in a single document

Storing and retrieving documents
- Documents: user/application data
- Read from / written to Data Buckets
- Which live on Server Nodes, based on hash partitioning
- That form a Couchbase Cluster, dynamically scalable
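
As a rough illustration of the two slides above, the sketch below builds the example document and round-trips it through a plain Python dict that stands in for a data bucket. The key `customer::1`, the helper names `upsert` and `get`, and the dict-based bucket are assumptions made for illustration; this is not the Couchbase SDK API.

```python
import json

# The example document from the slide, written as a Python dict.
customer = {
    "ID": 1,
    "FIRST": "Dipti",
    "LAST": "Borkar",
    "ZIP": "94040",
    "CITY": "MV",
    "STATE": "CA",
}


def upsert(bucket, key, doc):
    """Store the document as a JSON string under its key (toy stand-in for a bucket write)."""
    bucket[key] = json.dumps(doc)


def get(bucket, key):
    """Read the JSON string back and decode it into a dict."""
    return json.loads(bucket[key])


bucket = {}  # hypothetical in-memory bucket, not a real Couchbase bucket
upsert(bucket, "customer::1", customer)  # "customer::1" is an illustrative key
fetched = get(bucket, "customer::1")
print(fetched["FIRST"], fetched["CITY"])
```

The whole record travels as one value under one key, which is the point of the "all data in a single document" slide.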

Accessing Data in Couchbase

Multiple access paths:
- CRUD (Data Service): gives the application developer a concurrent API for basic (k-v) or document management.
  Operations: get(), insert(), upsert(), remove()
- View Query API: allows view querying, building of queries and reasonable error handling from the cluster.
  abucket.NewViewQuery().Limit().Stale()
- N1QL Query (Query & Index Services): allows querying and execution of other directives such as defining indexes and checking on index state.
  abucket.NewN1QLQuery("SELECT * FROM default LIMIT 5").Consistency(gocouchbase.RequestPlus());
- Cluster: holds on to cluster information such as topology and manages connections to the bucket within the cluster for different services.
  Operations: openBucket(), info(), disconnect()
- Cluster Management API: provides a way to manage buckets.
  Operations: insertDesignDocument(), listDesignDocuments(), flush()
- Core layer: provides a place where IO can be managed and optimized.

Couchbase SDKs and Connectors

Operate at Any Scale

Couchbase Architecture – Single Node
- Data Service: builds and maintains Distributed secondary indexes (MapReduce Views)
- Indexing Engine: builds and maintains Global Secondary Indexes
- Query Engine: plans, coordinates, and executes queries against either Global or Distributed indexes
- Cluster Manager: configuration, heartbeat, statistics, RESTful Management interface

[Diagram: Couchbase Server Node. Labels: Data Service (Managed Cache, View Engine, Storage), Index Service (Managed Cache, Indexing Engine, Storage), Query Service (Query Engine), Cluster Manager (Management REST API, Web UI, Node / Cluster Orchestration, Node Manager, Erlang / OTP)]

Data Service: Write Operation

Single-node type means easier administration and scaling
- Writes are asynchronous by default
- The application gets an acknowledgement once the write is successfully in RAM, and can choose per write to wait for replication or persistence
- Replication to 1, 2 or 3 other nodes
- Replication is RAM-based, so extremely fast
- Off-node replication is the primary level of HA
- Disk is written to as fast as possible, with no waiting

[Diagram: Application Server sends DOC 1 to the Managed Cache; from there it goes to the replication queue (replication / XDCR / connectors / views / indexing) and to the disk queue, then to disk]

Data Service: Read Operation

Single-node type means easier administration and scaling
- Reads out of cache are extremely fast
- No other process/system to communicate with
- The data connection uses a binary protocol over TCP

[Diagram: Application Server issues GET for DOC 1, served from the Managed Cache; DOC 1 is also on disk]

Data Service: Cache Miss

Single-node type means easier administration and scaling
- Layer consolidation means one single interface for the application to talk to and get its data back as fast as possible
- Separation of cache and disk allows for fastest access out of RAM while pulling data from disk in parallel

[Diagram: Application Server issues GET for DOC 1; the Managed Cache holds DOC 2 to DOC 5 while DOC 1 is pulled from disk, which holds DOC 1 to DOC 5]

Couchbase Views
- Local Index
  - Distributed indexing and scatter-gather querying
- Incremental Map-Reduce
  - Distributed simple real-time analytics
  - Only considers changes due to updated data

Index Service
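
The incremental Map-Reduce point on the Couchbase Views slide ("only considers changes due to updated data") can be illustrated with a toy model. The sketch below is a minimal in-memory illustration, not Couchbase's view engine: the class name `IncrementalCountView`, the `apply_change` method and the `city_of` map function are all hypothetical. Each change touches only the entry for the document that changed instead of recomputing the whole index.

```python
from collections import Counter


class IncrementalCountView:
    """Toy view that counts documents per emitted key, updated one change at a time."""

    def __init__(self, map_fn):
        self.map_fn = map_fn      # doc -> key (or None to emit nothing)
        self.emitted = {}         # doc_id -> key currently emitted for that doc
        self.counts = Counter()   # reduce result: key -> number of documents

    def apply_change(self, doc_id, doc):
        """Apply one mutation; doc=None means the document was deleted."""
        old_key = self.emitted.pop(doc_id, None)
        if old_key is not None:
            # Retract the contribution of the previous version of this document.
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]
        if doc is not None:
            new_key = self.map_fn(doc)
            if new_key is not None:
                self.emitted[doc_id] = new_key
                self.counts[new_key] += 1


def city_of(doc):
    """Hypothetical map function: emit the document's CITY field."""
    return doc.get("CITY")


view = IncrementalCountView(city_of)
view.apply_change("c1", {"CITY": "MV"})
view.apply_change("c2", {"CITY": "MV"})
view.apply_change("c3", {"CITY": "SF"})
view.apply_change("c2", {"CITY": "SF"})  # update: c2 moves from MV to SF
view.apply_change("c1", None)            # delete: c1 leaves the index
print(dict(view.counts))
```

The work per mutation is proportional to the one changed document, which is what lets an index of this kind stay close to real time as data is updated.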

Couchbase Global Indexing Service
- New to 4.0
- Indexes partitioned independently from data
- Each index receives only its own mutations
- Managed Caching layer
- ForestDB storage engine
  - B+ Trie optimized for very large data volumes
  - Optimized for SSDs

[Diagram: Global Secondary Index Service. Labels: Index#1, Index#2, Index#3, Index#4, Supervisor, Index maintenance & Scan coordinator, Indexing Service]

Query Service

Query Execution Flow

Query:

    SELECT c_id, c_first, c_last, c_max
    FROM CUSTOMER
    WHERE c_id = 49165;

Result:

    {
      "c_first": "Joe",
      "c_id": 49165,
      "c_last": "Montana",
      "c_max": 50000
    }

1. Clients submit the query over the REST API to the Query Service
2. Parse, analyze, create plan
3. Scan request with index filters, sent to the Index Service
4. Get qualified doc keys
5. Fetch request with doc keys, sent to the Data Service
6. Fetch the documents
7. Evaluate: documents to results
8. Query result returned to the clients

Couchbase Clustering Architecture

Auto sharding – Bucket and vBuckets

[Diagram: data buckets, each divided into active virtual buckets vB 1 … 1024 and replica virtual buckets vB 1 … 1024]

[Diagram: the Couchbase SDK applies a CRC32 hashing algorithm and the cluster map to select one of vBucket1 … vBucket1024 in the Couchbase Cluster]
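
The key-to-node lookup in the diagram above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the exact algorithm used by the Couchbase SDKs: it takes CRC32 of the key modulo 1024, numbers vBuckets from 0 to 1023 rather than 1 to 1024, and fills the cluster map round-robin. The function names, the server names and the key `customer::49165` are hypothetical.

```python
import zlib

NUM_VBUCKETS = 1024  # number of vBuckets per bucket, as shown on the slide


def vbucket_for_key(key, num_vbuckets=NUM_VBUCKETS):
    """Hash the key with CRC32 and reduce it to a vBucket id (simplified)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_cluster_map(nodes, num_vbuckets=NUM_VBUCKETS):
    """Assign each vBucket to a node round-robin (an assumption for this sketch)."""
    return [nodes[vb % len(nodes)] for vb in range(num_vbuckets)]


def node_for_key(key, cluster_map):
    """Look up which node holds the active vBucket for this key."""
    return cluster_map[vbucket_for_key(key, len(cluster_map))]


nodes = ["server1", "server2", "server3"]  # hypothetical node names
cluster_map = build_cluster_map(nodes)

key = "customer::49165"  # illustrative document key
print(key, "-> vBucket", vbucket_for_key(key), "on", node_for_key(key, cluster_map))

# Adding a node changes the map, not the hash: the key stays in the same vBucket.
bigger_map = build_cluster_map(nodes + ["server4"])
print(key, "-> vBucket", vbucket_for_key(key), "on", node_for_key(key, bigger_map))
```

Because the key always hashes to the same vBucket, only the vBucket-to-node assignment has to change when the cluster grows or shrinks, and clients only need an updated cluster map.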
Cluster Map

Data Services – Sharding and Replication
- Application has a single logical connection to the cluster (client object)
- Multiple nodes added or removed at once
- One-click operation
- Incremental movement of active and replica vBuckets and data
- Client library updated via cluster map
- Fully online operation, no downtime or loss of performance

[Diagram: READ/WRITE/UPDATE traffic to Couchbase Server 1 to Couchbase Server 5, each holding active shards and replica shards; shards 1 to 9 are spread across the servers, with each replica placed on a different server from its active copy]

What is Multi-Dimensional Scaling?

MDS is the architecture that enables independent scaling of data, query and indexing workloads while being managed as one cluster.

[Diagram: Couchbase Cluster, node1 … node8, running Index Service, Query Service and Data Service]

Modern Architecture
- Independent scalability for best computational capacity per service
  - Heavier indexing (index more fields): scale up index service nodes
  - More RAM for query processing: scale up query service nodes

[Diagram: Couchbase Cluster, node1 … node8, node9, running Query Service, Index Service and Data Service]

Cross Data Center Replication

Market leading memory-to-memory replication

[Diagram: NYC Server Cluster in New York (Couchbase Server 1 to 4, each with memory and disk) replicating with the SF Server Cluster in San Francisco (Couchbase Server 1 to 3, each with memory and disk)]

In summary

The best of both worlds

Develop with Agility
- Multiple data models
- N1QL: SQL-like query language
- Multiple indexes
- Languages, ODBC / JDBC drivers and frameworks you already know

Operate at Any Scale
- Push-button scalability
- Consistent high performance
- Always on 24x7 with HA - DR
- Easy administration with Web UI, REST API and CLI

Thanks!