Key-Value, Graph, Document, Column-family, Summary, Polyglot persistence

NoSQL Systems

February 15, 2016

Key-Value: Introduction, Consistency - Scaling, Transactions - Use cases, Memcached - Redis, Oracle NoSQL, DynamoDB, Aerospike, Riak

Data in a key-value store is organized around the associative array (a.k.a. map, dictionary, hash). Key-value stores contain a collection of key-value pairs, with each key being unique in the collection. They can store any value in the value field, including data structures such as JSON. They operate as caches or as data-structure servers. They do not use SQL and have a flexible schema. An important feature of key-value stores is the sharding of data, which delivers a scale-out architecture.

Consistency applies only to operations on a single key (operations are limited to get, put, or delete on a single key). Optimistic writes can be performed, but are very expensive to implement, because the data store cannot determine whether a value has changed. Eventual consistency in Riak → update conflicts: either the newest write wins, or all values are returned to let the client decide. Scaling uses sharding on the key.
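
Sharding on the key can be sketched as follows. This is only an illustration: the node names are hypothetical, and MD5 is used here simply because it gives a stable hash across runs, not because any particular store uses it.

```python
import hashlib

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def shard_for(key: str, nodes=NODES) -> str:
    """Map a key to one node using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    for key in ("session:42", "cart:alice", "profile:bob"):
        print(key, "->", shard_for(key))
```

With this naive modulo scheme, changing the number of nodes remaps most keys, which is one reason ring-based (consistent hashing) schemes exist.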

KV stores follow different strategies. In general, there are no guarantees on writes. Riak uses the concept of quorum to obtain write tolerance (e.g. W=3, R=5). Write quorum: W > N/2, i.e. the number of nodes participating in a write (W) must be more than half the number of nodes involved in replication (N). Example: with a replication factor of 3, only 2 nodes need to confirm a write. Read quorum (how many nodes must be contacted to be sure of reading the most up-to-date value): R + W > N, so R > N − W.
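
The two quorum conditions can be checked with a few lines of code. The function names below are illustrative and do not belong to any Riak API.

```python
def is_write_quorum(w: int, n: int) -> bool:
    """Write quorum condition: W > N/2."""
    return 2 * w > n


def min_read_quorum(w: int, n: int) -> int:
    """Smallest R satisfying R + W > N, i.e. R > N - W."""
    return n - w + 1


# Replication factor of 3, 2 nodes confirming each write.
n, w = 3, 2
print(is_write_quorum(w, n))   # True
print(min_read_quorum(w, n))   # 2
```

With N = 3 and W = 2, reading from 2 nodes guarantees that at least one of them saw the latest write.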

Use cases: storing session information (e.g. web sessions), user profiles and preferences, shopping cart data. Do not use for: joins between keys, multi-operation transactions (support for rollback), queries that need secondary indexes, operations on sets of keys.

Memcached

The name stands for memory cache daemon. In-memory indexing. Implementation started in 2003 at LiveJournal (a social blogging site). Project: Danga Interactive, BSD licence. Implemented in C. Used by YouTube, Facebook, Twitter, Orange, GAE, AWS, Flickr, Slashdot, etc. Characteristics: distributed memory caching system.

Memcached is a server that caches name-value pairs (NVPs) in memory. A value can be anything: rows of data, HTML, binary objects. When some data is needed, the application checks whether it is available in Memcached; if it is not, the application retrieves it from disk and stores it in Memcached for future accesses.
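
This read path can be sketched as below. A plain dictionary stands in for Memcached and `load_from_disk` is a hypothetical placeholder for the slow lookup; no real client library is used.

```python
cache = {}  # stands in for a Memcached server


def load_from_disk(key):
    """Hypothetical slow lookup, e.g. a database query."""
    return f"value-for-{key}"


def get(key):
    """Return the value for key, filling the cache on a miss."""
    if key in cache:
        return cache[key]            # cache hit
    value = load_from_disk(key)      # cache miss: go to the slow store
    cache[key] = value               # keep it for future accesses
    return value
```

The first call for a key pays the cost of the slow lookup; later calls are served from memory.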

Query model: 4 main commands sent over a TCP or UDP connection.
SET: add a new item or replace an existing one with new data.
ADD: only store the data if the key does not exist.
REPLACE: only store the data if the key already exists.
GET: return the data.
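
The difference between the three storage commands is easiest to see side by side. The class below is a toy in-memory model of their semantics only; `TinyCache` is a made-up name and the code does not speak the Memcached protocol.

```python
class TinyCache:
    """Toy model of SET / ADD / REPLACE / GET semantics."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Always stores, whether or not the key exists.
        self._data[key] = value
        return True

    def add(self, key, value):
        # Stores only if the key does not exist yet.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def replace(self, key, value):
        # Stores only if the key already exists.
        if key not in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key):
        # Returns None on a miss.
        return self._data.get(key)
```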

Memcached is not a persistent data store: it cannot be dumped to disk. It has no built-in security mechanism. It does not support any fail-over or high-availability mechanism. A Least Recently Used (LRU) algorithm removes data when space is needed. An expiration time can be associated with an NVP.
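
The LRU idea can be illustrated with a small capacity-bound cache. This is a generic sketch of LRU eviction, not Memcached's actual memory management, and it ignores expiration times.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used entry when capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

Reading a key counts as a use, so a frequently read entry survives while untouched entries are evicted first.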

Components. Server: an NVP server that stores and retrieves data by key. Limitations: length(key) < 250 characters, size(value) < 1 MB. Each server is atomic (it does not care about other servers). Client: knows which server contains a given NVP. Client libraries exist for most languages. Clients can compress NVPs whose values are greater than 1 MB.

Redis: high-performance key-value store. Flexible schema. Supports publish-subscribe messaging. Supports many data structures: lists, sets, sorted sets, hashes, HyperLogLogs, etc.

Hashes: collections of key-value pairs that map string keys to string values. Efficient for representing objects and tabular data. Operations: HLEN, HKEYS, HVALS, HDEL, HEXISTS, HINCRBY, HINCRBYFLOAT.

Sets: unordered collections of strings. Similar to lists, but all members are unique. The same element can be added multiple times without first checking whether it is already in the set. Adding, removing and testing the existence of members take constant time. Set-based operations (union, intersection, difference) are supported. Operations: SADD, SREM, SISMEMBER, SMEMBERS, SDIFF, SDIFFSTORE, SINTER, SINTERSTORE, SPOP, etc.

Sorted sets: ordered collections of strings. Each member carries a rank field (called the score in Redis) that determines the order, and members are automatically sorted by rank. Members must be unique. Elements can be retrieved by position or by rank. Set-based operations (union, intersection, difference) are supported. Adding, removing and updating members is in O(log(n)). Operations: ZADD, ZREM, ZSCORE, ZRANGE, ...
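
To make the behaviour of the four listed operations concrete, here is a small in-memory model. It only mimics the semantics: method names echo the Redis commands but this is not a Redis client, `zrange` accepts non-negative positions only, and the list-based storage makes updates O(n) rather than the O(log(n)) that Redis achieves.

```python
import bisect


class SortedSet:
    """Toy model of a sorted set: unique members ordered by score."""

    def __init__(self):
        self._scores = {}    # member -> score
        self._ordered = []   # sorted list of (score, member)

    def zadd(self, score, member):
        # Re-adding an existing member updates its score.
        if member in self._scores:
            self.zrem(member)
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zrem(self, member):
        score = self._scores.pop(member, None)
        if score is None:
            return False
        index = bisect.bisect_left(self._ordered, (score, member))
        del self._ordered[index]
        return True

    def zscore(self, member):
        return self._scores.get(member)

    def zrange(self, start, stop):
        # Members by position; stop is inclusive.
        return [member for _, member in self._ordered[start:stop + 1]]
```

A typical use is a leaderboard: players are members, points are scores, and `zrange(0, 9)` returns the ten lowest-scored members.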

Bitmaps: bit operations. Can count the number of bits set to 1 (BITCOUNT). Perform AND, OR, XOR and NOT operations. Operations: SETBIT, GETBIT, BITOP AND, BITOP OR, BITOP XOR, BITOP NOT.
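
A sketch of what these operations do, using a Python integer as the bit array. The class and function names are illustrative; the bit layout inside a real Redis string differs, but the results of set, get, count and AND are the same in spirit.

```python
class Bitmap:
    """Toy bitmap backed by an arbitrary-precision integer."""

    def __init__(self, bits=0):
        self._bits = bits

    def setbit(self, offset, value):
        """Set the bit at offset to 0 or 1 and return its previous value."""
        old = (self._bits >> offset) & 1
        if value:
            self._bits |= 1 << offset
        else:
            self._bits &= ~(1 << offset)
        return old

    def getbit(self, offset):
        return (self._bits >> offset) & 1

    def bitcount(self):
        """Number of bits set to 1."""
        return bin(self._bits).count("1")


def bitop_and(a, b):
    """Bits set in both bitmaps."""
    return Bitmap(a._bits & b._bits)
```

One common pattern is one bitmap per day with one bit per user id: `bitcount` gives the daily active users and `bitop_and` over two days gives the users active on both.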

HyperLogLogs: a probabilistic data structure to count unique elements → gives an estimate. There is no need to keep a copy of all members. Operations: PFADD, PFCOUNT, PFMERGE.

Oracle NoSQL uses the BerkeleyDB storage engine. It provides transactional semantics with fine-grained concurrency, primary and secondary indexes, and high availability.

Amazon DynamoDB: automatic sharding. Client example: AdRoll serves 100 billion ad impressions per day.

Aerospike: hybrid in-memory architecture combining dynamic random access memory (DRAM) and SSD. Self-healing architecture.

Riak: secondary indexes, full-text search, MapReduce compliant. Couchbase: membase.

Graph: Introduction, Consistency, Scaling - Use cases, Neo4J

A graph database stores entities (aka nodes) and relationships (aka edges) between these entities. Nodes and edges can have properties (aka attributes) in the property graph model. Edges may be directed or not. A query on a graph is known as traversing the graph. Traversing is fast because relationships are persisted rather than computed at query time through joins.

Some solutions do not support distributing the nodes, so there is no consistency problem → ACID. Otherwise, systems adopt a master-slave approach; in some of them, slaves accept writes and synchronize with the master.

Scaling: partition the graph using round-robin, range, hash, graph partitioning, or clustering (ML) approaches such as K-means. Use cases: connected data (social networks), routing and location-based services, recommendation engines. Not suited to use cases where many nodes need to be modified at once.

Neo4J: project by Neo Technology, AGPL/GPL licence. Started in 2009. Implemented in Java. Characteristics: represents everything with nodes and relationships, persisted, fully transactional.

Nodes. Relationships between nodes (can be either directed or bidirectional). Data on nodes and relationships (an arbitrary number of key/value pairs).

Queries are graph traversal based: navigate from a starting node via relationships to the nodes matching a criterion. This usually takes the form of a Java API. The graph can also be queried with Cypher or Gremlin (https://github.com/tinkerpop/gremlin/wiki): Cypher is declarative, Gremlin is procedural. See also openCypher (http://www.opencypher.org/).

Cypher example
START barbara = node:nodeIndex(name = "Barbara")
MATCH (barbara)-[:FRIEND]->(friend_node)
RETURN friend_node.name, friend_node.location
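
For comparison, the same one-hop traversal written procedurally over in-memory dictionaries. The nodes, locations and edges below are hypothetical sample data, and this is not how Neo4J stores or executes anything; it only shows what the query asks for.

```python
# Hypothetical sample graph: node properties and directed FRIEND edges.
nodes = {
    "n1": {"name": "Barbara", "location": "Paris"},
    "n2": {"name": "Carl", "location": "Lyon"},
    "n3": {"name": "Dana", "location": "Nice"},
}
friend_edges = {"n1": ["n2", "n3"], "n2": ["n3"]}


def friends_of(name):
    """Return (name, location) of every node reached by a FRIEND edge."""
    starts = [nid for nid, props in nodes.items() if props["name"] == name]
    results = []
    for nid in starts:
        for friend_id in friend_edges.get(nid, []):
            friend = nodes[friend_id]
            results.append((friend["name"], friend["location"]))
    return results
```

The declarative Cypher version states the pattern to match; the loop above spells out how to walk it.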

Every operation must happen inside a transaction. Query model: graph traversal based. Persisted on disk in a custom binary format. Sharding is hard and requires the involvement of the end user.

Architecture based on master-slave. Concurrency control: locks. Support for MapReduce: no.

Document: Introduction, Transactions, Querying, Use cases, MongoDB

Documents are the main concept in document stores. The DB stores and retrieves documents in XML, JSON, BSON, etc. The documents are stored in the value of a key-value store. The documents are self-describing, hierarchical tree data structures consisting of maps, collections and scalar values. From one document to another, the "schema" can differ, yet the documents can still belong to the same collection (the counterpart of a table in an RDBMS).

Example:
{ "name": "ACTIFED ALLERGIE", "price": 5.61 }
{ "name": "HUMEX ALLERGIE", "rembRate": 0 }
There are no empty or null-valued attributes in documents (contrary to RDBMS). If an attribute is not found in a document, we assume that it is not set or not relevant to the document. Documents allow new attributes to be created without having to define them first or to change the existing documents.

Popular document stores are: MongoDB, CouchDB, Terrastore, OrientDB, RavenDB, ArangoDB. Like other NoSQL stores, they have their differences (transactions, consistency, representation, query model, etc.).

As aggregate-oriented stores, document systems support transactions at a single document level (→ atomic transactions). RavenDB supports transactions across multiple operations.

Compared to KV stores, it is frequently possible to query the data inside the document without having to retrieve the whole document by its key. CouchDB: queries go through views, which can be materialized (automatically updated when queried if any data has changed since the last update) or dynamic.

Use cases: event logging, especially when events keep changing; content management systems; blogging platforms; web and real-time analytics; e-commerce applications, thanks to schema flexibility for products and orders. Not to be used for complex transactions spanning different operations, or for queries against varying aggregate structures.

Presentation: document-oriented database. Project by 10gen, AGPL licence. Started in 2009. Implemented in C++. Characteristics: scale-out, MapReduce-style aggregation, geospatial indexes, together with RDBMS features: secondary indexes, range queries, sorting.

Data model: the document is the basic unit. Support for embedded documents and arrays → complex hierarchical relationships within a single document. Every document has a special key, "_id", that is unique across the document's collection. A collection is like a table but schema-free: it is a group of documents. Subcollections can be defined, e.g. blog.posts. Collections are grouped in databases. Document format: JSON (i.e. key/value pairs), stored as BSON (16 MB maximum per document in current versions; 4 MB in early releases).

MongoDB is type and case sensitive. No duplicate keys in a document. The MongoDB shell is a full-featured JavaScript interpreter. There are only 6 types in JSON; MongoDB extends that: null, boolean, 32-bit and 64-bit integer, 64-bit floating point number, string, objectid, date, regexp, code, binary data, array, embedded document, etc.

Queries are expressed as query expression objects (BSON documents), e.g. db.users.find({}) or db.users.find({'name': 'smith'}). Support for conditions, e.g. db.users.find({'age': {'$gte': 18, '$lte': 25}}), disjunction (with $in and $nin), negation ($not), regular expressions, sorting, skip, limit. Querying arrays ($all, $size, $slice to get a subset of the values stored in an array). Querying embedded documents, e.g. db.people.find({"name": {"first": "Joe", "last": "Schmoe"}}). API and drivers for many languages.
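
How such a query object selects documents can be sketched with a tiny matcher over a list of dictionaries. This is a deliberately simplified model, not MongoDB's implementation and not a driver API: it handles only flat fields, exact matches and the four operators shown, and a document lacking the queried field never matches.

```python
OPERATORS = {
    "$gte": lambda value, bound: value >= bound,
    "$lte": lambda value, bound: value <= bound,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(doc, query):
    """True if doc satisfies every field condition in query."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gte": 18, "$lte": 25}: all must hold.
            if not all(OPERATORS[op](value, arg)
                       for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain value: exact match.
            return False
    return True


def find(collection, query):
    """Return the documents of collection that match query."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical sample collection.
users = [
    {"name": "smith", "age": 30},
    {"name": "ann", "age": 20},
    {"name": "bob", "age": 17},
]
```

For instance, `find(users, {"age": {"$gte": 18, "$lte": 25}})` keeps only the document for "ann", and an empty query `{}` matches every document.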

Aggregation tools: count (db.foo.count()), distinct (finds all the distinct values for a given key), group (divides the collection according to each value of the chosen key, similar to GROUP BY). MapReduce: map and reduce functions are written in JavaScript. It does not rely on a framework like Hadoop.
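
The map and reduce steps can be illustrated with a minimal single-process version. It is written in Python for consistency with the other sketches (MongoDB itself expects JavaScript functions), and the sample documents and function names are hypothetical.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_category(doc):
    # Map step: emit one (category, 1) pair per document.
    yield doc["category"], 1


def count_values(key, values):
    # Reduce step: sum the emitted values for one key.
    return sum(values)


# Hypothetical sample collection.
posts = [
    {"title": "a", "category": "tech"},
    {"title": "b", "category": "music"},
    {"title": "c", "category": "tech"},
]
```

Calling `map_reduce(posts, emit_category, count_values)` counts posts per category, which is the same result a group-by-and-count would give.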

GridFS stores large documents (up to 2 GB): large documents are split into chunks, which are stored in collections. Indexes can be defined on collections.

Replication: a replica set is basically a master-slave cluster with automatic failover. The master is elected by the cluster and may change if the current one goes down. Replication is asynchronous, so be careful if slaves are used for reads (stale data). One starts servers as master or slave. The local database stores the operation log (oplog) of the master; slaves sync by asking for the oplog. Replication is used for scaling reads (from slaves) or for data processing.

Sharding: MongoDB's solution to scale out. Principle: break up collections into smaller chunks, which are distributed over shards. A mongos process is started and handles the distribution of queries: the client connects to mongos instead of mongod, and the rest is transparent to the client. When to shard: disk space problem on the current machine, need for faster writes, need to keep more data in memory. How: select a shard key (e.g. username). MongoDB takes care of shard balancing.

Architecture based on master-slave or master-master (not recommended). Concurrency control: last update wins. Support for MapReduce: yes, but a built-in one (not Hadoop).

Column-family: Introduction, Cassandra

Also known as wide-row stores or wide-column stores. Aggregate-oriented stores where the value can have multiple columns. A column is a set of data values of a particular type. Column-family databases store and process data by column instead of by row. Popular systems are Cassandra, HBase and Google's BigTable.

Use cases: event logging, CMS, blogging platforms. Counters: count and categorize the visitors of a page to compute analytics. Expiring usage: you can define expiring columns (e.g. to remove ad banners from a website after a specific time); such columns are removed automatically after a given time (Time To Live = TTL). Not suited to: early prototypes, because of schema changes (Cassandra), and ACID transactions.

Column-family database. Apache project (since 2008), started at Facebook in 2007. Started in 2009. Implemented in Java. Used at Digg, Twitter, Reddit, Rackspace, Netflix, etc. Characteristics: fault tolerant, decentralized (P2P), shared-nothing architecture, tunable consistency, elastic.

Lineage: Dynamo (for the distribution model) and Bigtable (for the data model and storage architecture). Fully distributed (no single point of failure). Fast reads and writes ('optimize for reads, writes are cheap'). Eventually consistent.

Distributed multi-level hash map

A column is a triple: a key, a value and a timestamp. Columns and super columns are sorted (the order is customizable and defined per column family). Predefined sorts are: BytesType, LongType, AsciiType, UTF8Type, LexicalUUIDType, TimeUUIDType.
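
The timestamp in the triple is what allows two versions of the same column to be reconciled: the most recent write wins. A minimal sketch follows, with a hypothetical `Column` type and integer timestamps; on equal timestamps this sketch simply keeps the first argument, which is a simplification.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """A column as a (key, value, timestamp) triple."""
    key: str
    value: str
    timestamp: int


def reconcile(a: Column, b: Column) -> Column:
    """Keep the version of a column with the most recent timestamp."""
    if a.key != b.key:
        raise ValueError("cannot reconcile two different columns")
    return a if a.timestamp >= b.timestamp else b
```

For example, reconciling `Column("email", "old@example.org", 100)` with `Column("email", "new@example.org", 200)` keeps the second version.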

API
Writes:
insert(): insert/update a single column
remove(): remove a column/super column/row
batch_mutate(): update/remove several columns
Reads:
get(): a single column
get_slice(): retrieve a group of columns (by names or range)
get_range_slices(): retrieve a set of slices for a range of row keys
count(): the number of columns in a row

CQL: Cassandra Query Language. An SQL-like language with DDL (CREATE, ALTER, DROP) and DML (INSERT, UPDATE, DELETE, SELECT) query operations. Supported datatypes: numerical, character, date, unstructured and specialized datatypes (JSON). Transactions are more AID than ACID: data consistency is tunable across a database (from strong to eventual).

Replication: set per keyspace. Specified in the server's configuration file. It tells how many copies of the data one wants.

Architecture based on a ring (consistent hashing). Concurrency control: OCC (no locking). Support for MapReduce: yes.
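
A ring based on consistent hashing can be sketched as follows. This is a bare-bones illustration under stated assumptions: node names are hypothetical, each node gets a single token derived from an MD5 hash, and replicas are simply the next nodes clockwise. Cassandra's actual partitioners and replication strategies are more elaborate.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Stable position of a string on the ring."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Nodes placed on a ring; a key belongs to the next node clockwise."""

    def __init__(self, nodes):
        self._tokens = sorted((_token(node), node) for node in nodes)

    def _index_for(self, key):
        index = bisect.bisect_left(self._tokens, (_token(key), ""))
        # Past the last token: wrap around to the first node.
        return index % len(self._tokens)

    def node_for(self, key):
        return self._tokens[self._index_for(key)][1]

    def replicas_for(self, key, count):
        """The owner followed by the next nodes clockwise."""
        start = self._index_for(key)
        size = len(self._tokens)
        return [self._tokens[(start + i) % size][1]
                for i in range(min(count, size))]
```

The useful property is that removing a node only moves the keys that node owned; keys owned by the other nodes stay where they are.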

Summary: Exercise

We consider a simple blog application containing the following information. Blog entries are written by users, who are characterized by a userId, a name and an email address. For each blog entry, we store the content of the entry, its storage date (that is, the date at which it is stored in the database), the user who produced the entry and the category of the entry. A category corresponds to a subject area (e.g. sports, music, computer science). Finally, each user can subscribe to the blogs of other users. Apart from joinless queries (e.g. display blog entries in chronological order), the application needs to answer questions such as 'which users subscribe to one's blog' and 'show the most recent entries of the blogs one has subscribed to'. Model this application for the relational, document, column-family and graph models.

user (userid, username, state)
category (catId, catName)
blog (blogid, blogContent, blogDate, #userid, #catID)
follow (#follower, #followed, followDate)

Schema from http://www.datastax.com/docs/0.8/ddl/index

users {
  id: "jbellis",
  name: "Jonathan",
  state: "TX",
  following: ["dhutch", "egilmore"],
  followers: ["dhutch", "egilmore"],
  blogs: [ { date: "128..", body: "Today ..", category: "tech" }, .. ]
}
with secondary indexes on following, followers and date of blogs

Polyglot persistence: Introduction, scenario, Example

Polyglot persistence: a term coined after Neal Ford's polyglot programming, which advocates writing programs with a mix of programming languages. Polyglot persistence aims to use different data stores within your applications. Imagine an e-commerce application. What would you use for the shopping cart, the completed orders and the session data?

The shopping cart and the session data can be efficiently stored in a key-value store. Their keys are, respectively, userID and sessionID. Once an order is completed, that data can be stored in an RDBMS or a document store. What if we want to add a product recommendation service? Think collaborative filtering: 'those who bought that product also like that product', or 'your friends bought ...'. What about inventory and item prices?

A graph database is well suited to storing recommendation data. Inventory and item prices fit nicely in an RDBMS. If we have a lot of text, we can index that text using a store like Solr (part of the Lucene project). With polyglot persistence, one has to be careful with deployment complexity: all databases are needed in production at the same time. It may be a good solution to design services on top of these databases: it reduces the impact of data storage choices.

It used to be considered good practice to use an RDBMS for every aspect of an application's storage. With the various NoSQL stores available now, it seems wiser to implement applications that access data stored in RDBMS, NoSQL stores, RDF stores, etc. The key to this decision is to understand the pros and cons of each DB system and to identify the storage issues related to your app's functionalities.

Twitter
When a user writes a tweet, the tweet enters the 'WriteAPI', which calls the Fanout module to send it to all followers, i.e. it is stored in each follower's array of tweets (in Redis). The Redis cluster stores all users' timelines (not persisted, everything in RAM, duplicated 3 times). In case of failure, a timeline can be reconstructed. The last 800 tweets of each user are kept in RAM. Fanout asks the Social Graph service who is following whom. In Redis, the data model is tweetId (8 bytes), UserID (8 bytes), bits (4 bytes) plus retweet (tweetID). The Timeline service provides the Redis server where your home timeline is stored.
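
The write path described here (fan-out on write into capped per-user timelines) can be sketched in a few lines. Plain dictionaries stand in for Redis and for the Social Graph service, and all function names are hypothetical; only the limit of 800 tweets comes from the text.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # tweets kept per user, as stated above

# Stand-ins for the Social Graph service and the Redis timelines.
followers = defaultdict(set)
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def follow(follower, followed):
    """Record that follower subscribes to followed."""
    followers[followed].add(follower)


def write_tweet(author, tweet_id):
    """Fan the tweet out to the timeline of every follower of author."""
    for user in followers[author]:
        timelines[user].append(tweet_id)


def home_timeline(user):
    """Tweet ids for user, most recent first."""
    return list(reversed(timelines[user]))
```

The cost is paid at write time, once per follower, so that reading a home timeline is a single lookup. The bounded deque silently drops the oldest tweet once the limit is reached.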

Twitter (2)
The WriteAPI also sends tweets to the 'Search Ingester', which stores them in a modified Lucene index (named Earlybird). The index is in memory. Blender is the service that gives access to Earlybird. Twitter also has a push solution (pushes tweets to users): the WriteAPI sends tweets to HTTP Push, which contains 'Hosebird', the component that works out how to send each tweet. A similar service, named Mobile Push, exists for mobile devices. The WriteAPI also sends all tweets to HDFS to run MR jobs.
