Couchbase Analytics: NoETL for Scalable NoSQL Data Analysis

Murtadha Al Hubail (1), Ali Alsuliman (1), Michael Blow (1), Michael Carey (1, 2), Dmitry Lychagin (1), Ian Maxon (1, 2), Till Westmann (1)

(1) Couchbase, Inc. (2) University of California, Irvine
[email protected]

ABSTRACT

Couchbase Server is a highly scalable document-oriented database management system. With a shared-nothing architecture, it exposes a fast key-value store with a managed cache for sub-millisecond data operations, indexing for fast queries, and a powerful query engine for executing declarative SQL-like queries. Its Query Service debuted several years ago and supports high volumes of low-latency queries and updates for JSON documents. Its recently introduced Analytics Service complements the Query Service. Couchbase Analytics, the focus of this paper, supports complex analytical queries (e.g., ad hoc joins and aggregations) over large collections of JSON documents. This paper describes the Analytics Service from the outside in, including its user model, its SQL++ based query language, and its MPP-based storage and query processing architecture. It also briefly touches on the relationship of Couchbase Analytics to Apache AsterixDB, the open source Big Data management system at the core of Couchbase Analytics.

[Figure 1: Couchbase Server Overview. Services shown: Data, Query, Indexing, Full-Text Search, Eventing, Analytics.]

however, since the early days of mainframes and their attached terminals.

Today’s mission-critical applications demand support for millions of interactions with end-users via the Web and mobile devices. In contrast, traditional database systems were built for thousands of users. Designed for strict consistency and data control, they tend to lack agility, flexibility, and scalability.