Alfred Zimmermann, Alexander Rossmann (Eds.): Digital Enterprise Computing 2015, Lecture Notes in Informatics (LNI), Gesellschaft für Informatik, Bonn 2015

On Practical Implications of Trading ACID for CAP in the Big Data Transformation of Enterprise Applications

Andreas Tönne1

Abstract: Transactional enterprise applications today depend on the ACID property of the underlying database. ACID is a valuable model to reason about concurrency and consistency constraints in the application requirements. The new class of NoSQL databases that is used at the foundation of many big data architectures drops the ACID properties and especially gives up strong consistency for eventual consistency. The common justification for this change is the CAP theorem, although general scalability concerns are also noted. In this paper we summarize the arguments for dropping ACID in favour of a weaker consistency model for big data architectures. We then show the implications of the weaker consistency for the business customer and his requirements when a transactional Java EE application is transformed to a big data architecture.

Keywords: big data, CAP theorem, consistency, requirements

1 Introduction

The ACID property of relational databases has proven to be a valuable guarantee for specifying enterprise application requirements. Business customers are enabled to express their business needs as if they are sequentially executed. The complicated details of concurrency control of transactions and maintaining consistency are delegated to the database. ACID has become a synonym for a promise of effective and safe concurrency of business services. And business customers have learned to assume this as given and as a reasonable requirement.

The broad scale introduction of NoSQL [Ev09] [Fo12] databases to enterprises for solving big data business needs introduces a new class of promises to the business customer. Strong consistency as promised by ACID is replaced by the rather weak eventual consistency [Vo09]. The CAP theorem [Br00] is the excuse of the NoSQL developers and generally of the big data architects to weaken consistency in favour of availability, although modern NoSQL databases also offer higher levels of consistency.

In this paper, we discuss the consequences of the transformation of enterprise applications to a big data architecture. Our focus lies on the practical implications of loosening consistency for the business customers and their requirements. We consider the severity of the implications, the difficulty to balance these with the customer expectations and strategies for remedies.

1 NovaTec Consulting GmbH, Dieselstrasse 18/1, 70771 Leinfelden-Echterdingen, andreas.toenne@novatec-gmbh.de

1.1 Transformation Scope

The class of enterprise applications which we consider are the OLTP (online transaction processing) applications. As the name implies, these applications critically depend on transactional properties, and for this upholding ACID is seen as mandatory. Our general interest lies in the question to what degree (and at what cost) business needs that today require OLTP applications can be transformed to the big data paradigm.

1.2 Big Data Challenges

OLTP applications are under a strong big data pressure. The three main characteristics of big data, volume, variety and velocity, also apply to many of today's enterprise applications. The volume of data to handle increases through new relevant sources like for example the Internet of Things or the German government's Industrie 4.0 initiative [Bm15]. Also globalization in general increases the amount of data to handle. The variety of data increases greatly because structured data sources are increasingly accompanied or replaced by unstructured sources like the Web or Enterprise 2.0 initiatives. Acceleration of business processes and the success of big data analytics (data science) of competitors increase the demands for a higher velocity, especially combined with the required higher data volumes.

We think it is fair to state that many OLTP applications already operate under big data conditions. However they show a strong inertia to complete the transformation. Their architecture hinders the integration of unstructured data sources. Their choice of for example Java EE with a relational database means a hard limitation of scalability. And as will be shown in this paper, the practical consequences and compromises for big data scale processing are threatening to the business customer.

1.3 Contribution

We experienced these transformation challenges in the migration of a customer's middleware solution to a big data architecture. The first transformation milestone, establishing demonstration capabilities of the solution, was an eleven man-year effort. The architecture changed from a layered monolith based on Java EE and a relational database to a cloud-ready microservices topology. We were using the Titan graph database [Th15], Cassandra [Ap15] [LM10] as the NoSQL storage backend and Elasticsearch [El15] for indexing. The transformation was critical to our customer for the simple reason that the existing implementation was falling short of the scaling requirements by three orders of magnitude.

The solution's main purpose is to provide ETL (extract, transform, load) services to enterprise applications in a flexible way. Structured and unstructured data is extracted from a large range of sources (using agents) and losslessly converted to a common data model. This data is persisted in the style of a data lake [Fo15] but structured and enriched with metadata. The metadata describes semantic and formal relations between the data records similar to an ontology. The combined data structure is persisted in a distributed graph database.

The heart of the solution are the analysis algorithms and their quality requirements. Running such analysis concurrently while preserving consistency is a critical requirement. If for example two data records are analyzed for their relationships concurrently, duplicate relations like "my user matches your author" and "my author matches your user" would lead to wrong overall relation weights between the two records. Such a concurrency conflict is actually a common case: if sets of related records are modified in an application, these would be imported for analysis concurrently and the consistency of relations is threatened across many concurrent nodes. Consistency rules like requiring uniqueness of commutative relations ("we have something in common") were enforced in the business requirements through uniqueness constraints on the database. It should be needless to point out that this was a scalability killer by design.
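To illustrate the commutativity problem, the following minimal Java sketch (our illustration with hypothetical names, not the project's code) reduces a relation between two records to a canonical key, so that "my user matches your author" and "my author matches your user" collapse to the same identity. A database uniqueness constraint on such a key is exactly the construct that killed scalability.

```java
import java.util.Objects;

// Hypothetical sketch: canonical key for a commutative relation between two
// records. Ordering the pair makes (a, b) and (b, a) identical, so a single
// uniqueness constraint can reject the duplicate relation.
public final class RelationKey {
    private final String lowId;   // lexicographically smaller record id
    private final String highId;  // lexicographically larger record id
    private final String type;    // relation type, e.g. "user-author-match"

    public RelationKey(String recordA, String recordB, String type) {
        if (recordA.compareTo(recordB) <= 0) {
            this.lowId = recordA;
            this.highId = recordB;
        } else {
            this.lowId = recordB;
            this.highId = recordA;
        }
        this.type = type;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RelationKey)) return false;
        RelationKey k = (RelationKey) o;
        return lowId.equals(k.lowId) && highId.equals(k.highId) && type.equals(k.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(lowId, highId, type);
    }
}
```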

1.4 Structure of this Paper

In the following section 2 we introduce the reader to the CAP theorem and eventual consistency and discuss their relevance for scalability. Section 3 is the main result of this paper, showing the implications of switching to eventual consistency for scalability reasons to our business project. Some remaining challenges are described in section 4.

2 The CAP Theorem, Eventual Consistency and Scalability

In this section, we provide an overview of the influences of the CAP theorem and eventual consistency on scalability concerns of distributed transactional applications. We motivate the decision to loosen consistency up to the point of eventual consistency. And we briefly show possibilities to achieve higher levels of consistency and their price in terms of scalability and effort.

2.1 Scalability is the New Top Priority

The transformation of application architectures from transactional monoliths to distributed cloud service topologies, from traditional to web and big data, usually comes with a shift of priorities. An influential statement for this shift was given by Werner Vogels, Amazon CTO:

"Each node in a system should be able to make decisions purely based on local state. If you need to do something under high load with failures occurring and you need to reach agreement, you're lost. If you're concerned about scalability, any algorithm that forces you to run agreement will eventually become your bottleneck. Take that as a given." [Vo07] [In07]

The shift was from consistency as the main promise for application properties to availability of services, which is treated second-class by ACID systems. And the ruling regime, as pointed out by Vogels, is the (horizontal) scalability. These new architectures are characterized by being extremely distributed, both in terms of number of computing and data nodes and distance measured in communication latency and network partition probability. Data replication is used to achieve better proximity of data and client and better availability in the presence of network partitions. Also for big data, replication is often the only sensible backup strategy.

2.2 BASE Semantics and Eventual Consistency

The conflict between ACID's consistency focus and the availability demands of web-based scalable large distributed systems led to the alternative BASE data semantics [Fo97] (basically available, soft state, eventual consistency) which uses the weak eventual consistency model. This basically means that after a write there is an unspecified period of time of inconsistency while the change is replicated through the distributed topology of data nodes. Which version of the data is presented at a node in this time window is pure luck. Eventually, after a period of no changes, the network of data nodes stabilizes and agrees on the last written version. We do not want to embark on a discussion of other stronger consistency models that are also offered by some NoSQL databases like Cassandra using quorums or Paxos protocols.

When discussing this consistency issue with business customers in the light of their requirements, the interesting question is "how on earth did they get away with such a lousy consistency model?".

2.3 CAP Theorem

The answer lies in the CAP theorem [Br00] and its proof [GL02], which was (ab-)used by the developers of NoSQL databases to ignore the matter of stronger consistency to achieve better scalability and availability, and to justify eventual consistency as a law of nature. Any distributed system may have the following three important properties, and Brewer's conjecture was that you can only have two of these at any time2:

1. Consistency, which means single-copy consistency (ACID consistency without the data invariants): all nodes have the same version of a value.

2 Actually the proof by Gilbert and Lynch makes use of an asynchronous model and thus introduces the assumption that there is no common notion of time in the distributed system. This seems to be consistent with the intention of Brewer but limits the scope of CAP in practical systems.

2. Availability, which can be characterized as the promise: as long as a node is up, it always returns answers (after an unbounded duration of time).

3. Partition Tolerance, which means it is ok to lose arbitrarily many messages between nodes.

This theorem and its adoption as a justification for eventual consistency sparked lasting discussions about its relevance and which choice of CA, CP or AP is the best. Brewer himself summarizes his view of the state of affairs of CAP [Br12], highlighting the very important interpretation of CAP that "First, because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned. Second, the choice between C and A can occur many times within the same system at very fine granularity..." and "Because partitions are rare, CAP should allow perfect C and A most of the time, but when partitions are present or perceived, a strategy that detects partitions and explicitly accounts for them is in order."

CAP is all about managing the exceptional case of a network partition, which is a split-brain situation. The decision to take is called the partition decision by Brewer [Br12]:

" Cancel an operationafter a timeout anddecrease availabilityor

" Proceedwith the operationand risk inconsistency The matter of healingnetwork partitionsishandled by NoSQL implementations using forexample brute-forcestrategies like last-write-wins. Amore graceful solutionwould be to use vector clockstoidentify andhandle merge conflictsofpartitioned data change.

2.4 NoSQL With Higher Levels Of Consistency

The NoSQL community initially has settled for AP with eventual consistency, inspired by Amazon's Dynamo database [De07]. But we are seeing more and more additions to NoSQL databases that allow stronger consistency models, like for example Cassandra [Da15] allowing a quorum consistency level. Thus the choice of AP or CP is no longer fixed by our architecture, and in Cassandra for instance it is possible at the level of individual reads and writes.

There is however a price attached to higher levels of consistency and that is the degree of scalability we can achieve. As Vogels pointed out in the above quote, any need for agreement opposes scalability. Achieving strong consistency with a quorum or even naively requiring all data nodes in a system to acknowledge a write will likely reduce scalability of a big data scale system drastically.
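For illustration, the DataStax Java driver (the 2.x era API, roughly contemporary with this project) lets a client pick the consistency level per statement; keyspace, table and column names below are hypothetical.

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class ConsistencyPerOperation {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("demo_keyspace"); // hypothetical keyspace

        // Fast, weakly consistent write: a single replica acknowledges.
        SimpleStatement write = new SimpleStatement(
                "INSERT INTO records (id, payload) VALUES (?, ?)", "r1", "data");
        write.setConsistencyLevel(ConsistencyLevel.ONE);
        session.execute(write);

        // Stronger read: a majority of replicas must agree,
        // trading latency and availability for consistency.
        SimpleStatement read = new SimpleStatement(
                "SELECT payload FROM records WHERE id = ?", "r1");
        read.setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(read);

        cluster.close();
    }
}
```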

In our example application, lowering the consistency level to single node writes resulted in orders of magnitude speedups (throughput)3 while exposing architectural issues that we will describe in the following section 3.

3 Performance figures are unfortunately not available and would not be meaningful since the system went through a rapid sequence of changes.

The good news is that eventual consistency is not as bad as it sounds. Recent analysis [BG13] shows that eventually consistent databases can achieve consistency on a frequent basis within tens or hundreds of milliseconds. What this does not tell us is what damages are done during this period and how to deal with them.

3 Business Implications

This section is the main result of the transformation of our OLTP application to a big data stack with a NoSQL database. We discuss the tension in the project between the business customer, requiring ACID properties that he was used to, and the requirement to achieve a substantial increase in scalability and throughput. The reduction in consistency leads to some difficult questions about the business requirements. We also give some technical details of how we dealt with the errors introduced by the reduced consistency level.

3.1 ACID Requirements

Judging the business implications of turning towards a big data database with loosened consistency requires us to respect the business customer needs and expectations. In an ideal world, a business service is specified under the assumption of perfect singularity of the executing system. No side effects, no concurrency, no temporal considerations, no other users, no failures. Consistency is defined in terms of business rules about data integrity. In the last decades, transactional systems upholding the ACID properties were understood and accepted by business customers as the nearest approximation of this simplified ideal that could be delivered by their IT people. The ACID properties of a transaction provide the following beneficial guarantees for its execution:

Atomicity - The guarantee that the transaction is executed in one atomic step (from an outside view) or aborted as a whole.

Consistency - The guarantee that the transaction transforms the database from a consistent state to another consistent state. The notion of consistency can vary largely, depending on the capabilities of the underlying database. Usually this includes structural

consistency rules about uniqueness and referential integrity as well as data consistency rules.

Isolation - The guarantee that concurrent transactions are isolated against their respective database changes to a configurable degree.

Durability - The guarantee that once a transaction was committed (success), its changes are permanent, even in the presence of system failures.

ACID has proven to be a simple enough model to talk about requirements in the presence of such nasty things as concurrency and failures. At times, the architect would come back with "Would it be allowed that..." questions, asking for exceptions to the simplified world and complicating things a bit. The I (Isolation) in ACID is the adjusting screw for such optimizations of SQL systems, possibly affecting their consistency. Locking and database constraints on uniqueness are commonly used in requirements to hedge against consistency errors and to control concurrency.
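As a concrete example of that last point, here is a minimal JPA sketch (illustrative names, not the project's schema, reusing the canonical ordering idea from the sketch in section 1.3) of how such a uniqueness requirement is typically hedged in a Java EE application: the database rejects the duplicate inside the transaction, so the business rule holds without application-level coordination.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Table;
import javax.persistence.UniqueConstraint;

// A relation record whose uniqueness is enforced by the database itself.
// Committing a duplicate (low_id, high_id, rel_type) makes the transaction
// fail, so the commutative-relation rule holds without extra coordination.
@Entity
@Table(name = "record_relation",
       uniqueConstraints = @UniqueConstraint(columnNames = {"low_id", "high_id", "rel_type"}))
public class RecordRelation {

    @Id
    @GeneratedValue
    private Long id;

    @Column(name = "low_id", nullable = false)
    private String lowId;   // canonical smaller record id

    @Column(name = "high_id", nullable = false)
    private String highId;  // canonical larger record id

    @Column(name = "rel_type", nullable = false)
    private String relType;

    protected RecordRelation() { } // required by JPA

    public RecordRelation(String lowId, String highId, String relType) {
        this.lowId = lowId;
        this.highId = highId;
        this.relType = relType;
    }
}
```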

3.2 Dropping ACID to Achieve Higher Scalability

It is common wisdom that big data is a disruptive concept that requires the transformation of the business's core for adoption. Business requirements for achieving business goals need to be redefined to reflect the paradigms of big data architectures. That means: what promises for consistency, availability and scalability can your IT people deliver on a big data scale? Taking a technology approach by replacing relational databases by NoSQL and calling it good is no option. Unfortunately, as we experienced in the mentioned project, this is an all too alluring idea to waive it easily. The business customer must be educated in the consequences of asking to "solve the problem with big data". And this means to find the sweet spot in the triangle of loosening consistency, achievable scalability and implementation effort.

In our example, we opted for loosening consistency as much as possible because that was the quick route4 to deliver a system with the necessary performance. Going for eventual consistency means to drop ACID in favour of a BASE semantics of transactions. That in turn exposed all the unknown invariants of the analysis algorithms. Dropping ACID also created an enormous technical debt to handle the consistency errors.

4 We had to deliver the demo-ready version within six months.

3.3 Consequences of Lowering Consistency

We mentioned the uniqueness constraint on commutative relations between records. The analysis requirements disallowed dual relations of the kind "My user is your author" and "My author is your user" since they are considered commutative and duplicating them puts a too strong emphasis on the relationship of the two records. Trading ACID for an unconstrained database created questions for this example:

" Howoften does this constraintviolationtake place in practice?

" Howsevere is theerror introduced? - Follow-up question: Has anyone a measure formeta annotationerrors?

" Do otheralgorithms usingthe meta annotationbreak on the duplicationitself or on the annotationweight error? - Follow-up question: Whodepends on this annotationanyway? External clientsoralso internal algorithms?

" Canwe ignore the errorfor some time or is there a build-upeffect over time?

" Do we need to filter the duplicate annotations at read time or canwe repair the consistency at intervals? - Follow-up question:What wouldbe a reasonable interval? These questionsled to rather uncomfortable andtime-consumingmeetingswiththe business customer.The customer didnot have aneedtothink about these issues before; he wasshielded from them by a single requirementofuniquenessofcommutative relations. As Brewer [Br12] noted: "Anotheraspect of CAPconfusion is the hiddencost of forfeiting consistency,which is theneedtoknowthe system’s invariants.The subtle beauty of a consistent system is that theinvariantstendtoholdevenwhenthe designer does not knowwhat they are. Consequently,awide range of reasonable invariants will work just fine." We observedafewdistinct typesofissueswithweakening consistency anddropping constraints that have distinct reasonsand justifications.The following sections show a fewcommoncases.

3.4 Internal Phantom Reads

In our example, this was the type of effect with the most quality damages. This is the premier effect of eventual consistency. An internal phantom read happens when one concurrent service execution runs in isolation (due to the eventual consistency time window) to another execution and sees old versions of data modified by the second execution. Example: two related records are analyzed in parallel and due to bad luck of replication one analysis does not see the other's updated record. This is severe since it means that the analysis results become random for records imported in a tight time window. Also consider that there is a higher probability for a meaningful relation between two records received at the same time. We might lose valuable analysis results. In a transactional system, this scenario calls for a serialization of execution with locking.

A solution was to keep a journal of analysis order and determine those imports that might need a re-analysis to stabilize the annotation consistency.
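A minimal sketch of the journal idea (hypothetical names and a fixed conflict window; the real system used more involved heuristics): every analysis is logged with a timestamp, and records analyzed within the assumed replication window of each other become candidates for re-analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical journal of analysis executions. Entries that fall within a
// short window of each other may have missed each other's writes under
// eventual consistency and are therefore candidates for re-analysis.
public class AnalysisJournal {

    private static final long CONFLICT_WINDOW_MILLIS = 500; // assumed replication window

    public static final class Entry {
        final String recordId;
        final long analyzedAtMillis;

        Entry(String recordId, long analyzedAtMillis) {
            this.recordId = recordId;
            this.analyzedAtMillis = analyzedAtMillis;
        }
    }

    private final ConcurrentLinkedQueue<Entry> journal = new ConcurrentLinkedQueue<>();

    public Entry logAnalysis(String recordId) {
        Entry e = new Entry(recordId, System.currentTimeMillis());
        journal.add(e);
        return e;
    }

    /** Records analyzed within the conflict window of the given entry. */
    public List<String> reanalysisCandidates(Entry latest) {
        List<String> candidates = new ArrayList<>();
        for (Entry e : journal) {
            if (!e.recordId.equals(latest.recordId)
                    && Math.abs(e.analyzedAtMillis - latest.analyzedAtMillis) <= CONFLICT_WINDOW_MILLIS) {
                candidates.add(e.recordId);
            }
        }
        return candidates;
    }
}
```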

3.5 Where is my Data?

Eventual consistency does not only affect concurrent execution but sometimes also serial execution. This depends on the implementation of the replication strategy of the database but also on the architecture of the system itself. Some databases do not give a guarantee that data written can be immediately read back, even at the same node, because the replication preparation takes some time. We experienced such problems when chaining analysis algorithms. A similar error (but not consistency caused) can take place with the Elasticsearch refresh interval. Here we did not find our own analysis results in the index for a short time until Elasticsearch had prepared and executed the index refresh.

We only found a brute-force solution to assure that chains of algorithms can work on their own results, apart from passing the results between the algorithm steps. We needed to use delays, either explicitly between each step or as retries with an exponential backoff.
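The retry variant reduces to a few lines. The following sketch is our illustration; the number of attempts and the base delay are assumptions, not the project's tuning.

```java
import java.util.concurrent.Callable;

// Retry a read until the expected result is visible, doubling the delay each
// time. This bridges the gap until replication or the index refresh catches up.
public final class Backoff {

    public static <T> T retryWithBackoff(Callable<T> read, int maxAttempts,
                                         long initialDelayMillis) throws Exception {
        long delay = initialDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = read.call();
            if (result != null) {
                return result; // the write has become visible
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delay);
                delay *= 2; // exponential backoff
            }
        }
        throw new IllegalStateException("result not visible after " + maxAttempts + " attempts");
    }
}
```

A call such as retryWithBackoff(() -> index.lookup(id), 5, 50) (with a hypothetical index accessor) then polls for up to five attempts with delays of 50, 100, 200 and 400 milliseconds.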

3.6 Zombie Data

Combining updates and deletes in a concurrent system gives you the usual zombie problem if updates and deletes of the same record overrun each other. We employed measures like timestamps to ensure a monotonic order of analysis executions for a record. And we wrote tombstones for deleted records to distinguish true data from zombies.

Tombstoning became an issue because we could not use locking for record updates. Acquiring locks in a distributed database like Cassandra is a very expensive operation that requires a consensus between all accessible nodes. Since updates were a very frequent operation, this was not acceptable. So there is a very small chance that a tombstone is overwritten, creating a zombie record. As usual with big data, small chances become significant given enough data updates. This problem awaits a solution still.
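A minimal sketch of the timestamp-plus-tombstone idea (a single-JVM illustration without the distributed machinery): updates apply only if they are newer than the stored version, and a delete writes a tombstone instead of removing the entry. Within one JVM the merge below is atomic per key; across database replicas no such atomicity exists, which is precisely the small remaining zombie chance described above.

```java
import java.util.concurrent.ConcurrentHashMap;

// Last-writer-wins record store with tombstones. A delete is just a newer
// version whose value is null, so late updates can be recognized and dropped.
public class TombstoneStore {

    static final class Versioned {
        final String value;   // null marks a tombstone (deleted record)
        final long timestamp; // monotonic analysis/update time

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final ConcurrentHashMap<String, Versioned> store = new ConcurrentHashMap<>();

    public void update(String id, String value, long timestamp) {
        // Keep whichever version carries the newer timestamp.
        store.merge(id, new Versioned(value, timestamp),
                (old, neu) -> neu.timestamp > old.timestamp ? neu : old);
    }

    public void delete(String id, long timestamp) {
        update(id, null, timestamp); // write a tombstone
    }

    public String read(String id) {
        Versioned v = store.get(id);
        return (v == null || v.value == null) ? null : v.value;
    }
}
```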

3.7 Who is Unique?

Assuring uniqueness has the same costs as a distributed lock for record updates. The key strategy to avoid this cost was to detach the creation of unique data from the remaining analysis algorithms. Some of the unique records were pre-created in a startup phase from sample data to reduce frequent locks. This was particularly successful for sets of unique data with an upper bound, for example the words of a particular language.
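A sketch of the pre-creation strategy (hypothetical names; the actual creation call into the database is abstracted away): the bounded set of unique records is created once at startup, so the hot analysis path performs only lock-free lookups and pays the expensive coordinated creation only for true misses.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Pre-created set of unique records (e.g. the words of a language). The hot
// path is a lock-free lookup; only unseen keys pay the expensive, coordinated
// unique creation in the distributed database.
public class UniqueRecordCache {

    private final ConcurrentHashMap<String, String> idsByKey = new ConcurrentHashMap<>();
    private final Function<String, String> expensiveUniqueCreate; // hits the database

    public UniqueRecordCache(Iterable<String> sampleKeys,
                             Function<String, String> expensiveUniqueCreate) {
        this.expensiveUniqueCreate = expensiveUniqueCreate;
        // Startup phase: create the bounded set of unique records up front.
        for (String key : sampleKeys) {
            idsByKey.put(key, expensiveUniqueCreate.apply(key));
        }
    }

    public String idFor(String key) {
        // Atomic per key within this JVM; the distributed uniqueness cost is
        // only paid for keys missing from the sample data.
        return idsByKey.computeIfAbsent(key, expensiveUniqueCreate);
    }
}
```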

3.8 Probabilistic Truth

This is an issue that originates from the distributed nature of the database itself and from the big data scale. Keeping a correct tally of data can be too expensive at real-time and one has to use approximation techniques. Consider for example the number of distinct words in the whole database. Keeping such a tally does require some sort of synchronization between the concurrent updates. We experimented with Cassandra counter columns [Da14] and with elastic caching approaches using Hazelcast [Ha15] but finally decided to use a probabilistic approach with a defined error rate, based on a Google implementation [HNH13] of HyperLogLog [Fl07].
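To show the flavour of the approach, here is a compact textbook HyperLogLog [Fl07] sketch in plain Java (our illustration; the HyperLogLog++ variant of [HNH13] adds bias correction and a sparse representation on top of this).

```java
// Minimal textbook HyperLogLog cardinality estimator [Fl07]. Production use
// should prefer a hardened implementation such as HyperLogLog++ [HNH13].
public class HyperLogLog {

    private final int b;        // number of index bits, m = 2^b registers
    private final int m;
    private final byte[] registers;

    public HyperLogLog(int b) {
        this.b = b;
        this.m = 1 << b;
        this.registers = new byte[m];
    }

    public void offer(Object item) {
        long x = mix64(item.hashCode());                // spread the weak hashCode
        int idx = (int) (x >>> (64 - b));               // first b bits pick a register
        long rest = x << b;                             // remaining bits
        int rank = Long.numberOfLeadingZeros(rest) + 1; // position of first 1-bit
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    public long cardinality() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1.0 + 1.079 / m);      // bias correction for m >= 128
        double estimate = alpha * m * m / sum;
        if (estimate <= 2.5 * m && zeros > 0) {          // small-range correction
            estimate = m * Math.log((double) m / zeros);
        }
        return Math.round(estimate);
    }

    private static long mix64(long z) {                  // splitmix64 finalizer
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }
}
```

With b = 14 (16384 registers, 16 KB of state) the standard error is about 1.04/sqrt(m), roughly 0.8 percent; a defined error rate of this kind is what the probabilistic approach trades for synchronization-free counting.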

4 Outlook

Our experience with the transformation of our OLTP Java EE / relational application to a native big data stack shows the importance of a hybrid persistency approach. As advocates of ACID for NoSQL [Pi15] highlight, there is a need for the option to choose stronger consistency at high performance. In a hybrid approach, we could separate data needing linearizability consistency from data tolerating weaker consistency. For example, dumping the mass of data in HBase and keeping the strongly consistent references to that data elsewhere. Google Spanner [Cr13] shows that by having an accurate and coherent distributed clock, one can achieve strong consistency at a reasonable cost.

Another idea we want to pursue is to avoid certain types of consistency errors using CRDTs (commutative replicated data types) [Sh11] to propagate conflict-free distributed changes in the case of network partitions. The case of internal phantom reads can be considered to be caused by a short-time network partition. We would be happy to resolve this partition conflict-free. The riak database [Ba15] shows how to combine NoSQL concepts with CRDTs [Ba15a] to achieve conflict-free data types.
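For a taste of the CRDT idea, here is a minimal grow-only counter (G-Counter) sketch following [Sh11]: each replica increments only its own slot and merging is an entry-wise maximum, so all replicas converge regardless of message ordering or duplication, with no agreement protocol required.

```java
import java.util.HashMap;
import java.util.Map;

// Grow-only counter CRDT (G-Counter). Replicas converge under any delivery
// order because merge is an entry-wise maximum (commutative and idempotent).
public final class GCounter {

    private final String nodeId;
    private final Map<String, Long> counts = new HashMap<>();

    public GCounter(String nodeId) {
        this.nodeId = nodeId;
    }

    public void increment() {
        counts.merge(nodeId, 1L, Long::sum); // a node only touches its own slot
    }

    public long value() {
        long sum = 0;
        for (long c : counts.values()) sum += c;
        return sum;
    }

    /** Merge a replica's state: per-node maximum, safe to apply repeatedly. */
    public void merge(GCounter other) {
        for (Map.Entry<String, Long> e : other.counts.entrySet()) {
            counts.merge(e.getKey(), e.getValue(), Math::max);
        }
    }
}
```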

5 Conclusions

Transforming existing business requirements that are implemented by OLTP applications to a big data solution using today's best practice big data technology means to give up many benefits of ACID transactions and especially strong consistency in general. We showed our practical experiences with a sizable (and from the customer's perspective very successful) transformation of an existing OLTP application to a big data stack. This example shows that although the weakening of consistency to the level of eventual consistency does not prohibit this transformation, it still creates a significant debt. We gave some examples of the errors introduced by the lower consistency and how we dealt with them technically.

The cost of compensating for consistency defects and the achievable scalability had to be compared and judged together with the business customer. The effort for the transformation of the business requirements, and thus the effort put on the business customer and architect, turned out to be significant and it needed to be carefully accompanied by consulting and mentoring. Invariants of business requirements that are usually hidden by the ACID properties and database locks and constraints become important when high scalability needs to be achieved with a NoSQL database.

References

[Ap15] Apache: Welcome to Apache Cassandra, http://cassandra.apache.org, April 9, 2015.

[Ba15] Basho: Riak product site, http://basho.com/riak/, April 9, 2015.

[Ba15a] Basho: How to Build a Client Library for Riak 2.0, http://basho.com/tag/crdt/, April 9, 2015.

[BG13] Bailis, P.; Ghodsi, A.: Eventual Consistency Today: Limitations, Extensions, and Beyond. acmqueue, volume 11, issue 3, 2013.

[Bm15] Bundesministerium für Bildung und Forschung: Zukunftsprojekt Industrie 4.0, http://www.bmbf.de/de/9072.php, April 9, 2015.

[Br00] Brewer, E.: Towards Robust Distributed Systems. In: Proceedings of the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC '00), pages 7-10, 2000.

[Br12] Brewer, E.: CAP Twelve Years Later: How the 'Rules' Have Changed. Computer, pages 23-29, 2012.

[Cr13] Corbett, J. et al.: Spanner: Google's Globally-Distributed Database. In: ACM Transactions on Computer Systems (TOCS), volume 31, issue 3, 2013.

[Da14] DataStax: What's New in Cassandra 2.1: A Better Implementation of Counters. http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters, May 20, 2014.

[Da15] DataStax: Configuring Data Consistency. http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html, April 9, 2015.

[De07] DeCandia, G. et al.: Dynamo: Amazon's Highly Available Key-Value Store. In: SOSP '07: Proceedings of the Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, pages 205-220, 2007.

[El15] Elasticsearch BV: Elasticsearch | Search & Analyze Data in Real Time, https://www.elastic.co/products/elasticsearch, April 9, 2015.

[Ev09] Evans, E.: Eric Evan's Weblog, http://blog.sym-link.com/2009/05/12/nosql_2009.html, April 9, 2015.

[Fl07] Flajolet, P. et al.: HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. In: AOFA '07: Proceedings of the 2007 International Conference on the Analysis of Algorithms, 2007.

[Fo97] Fox, A.; Gribble, S.; Chawathe, Y.; Brewer, E.; Gauthier, P.: Cluster-Based Scalable Network Services. In: Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles (SOSP-16), 1997.

[Fo12] Fowler, M.: NosqlDefinition, http://martinfowler.com/bliki/NosqlDefinition.html, April 9, 2015.

[Fo15] Fowler, M.: DataLake, http://martinfowler.com/bliki/DataLake.html, April 9, 2015.

[GL02] Gilbert, S.; Lynch, N.: Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. In: ACM SIGACT News 33, pages 51-59, 2002.

[Ha15] Hazelcast: Company site, http://hazelcast.com, April 9, 2015.

[HNH13] Heule, S.; Nunkesser, M.; Hall, A.: HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardinality Estimation Algorithm. In: EDBT '13: Proceedings of the 16th International Conference on Extending Database Technology, 2013.

[In07] InfoQ: Presentation: Amazon CTO Werner Vogels on Availability and Consistency, http://www.infoq.com/news/2007/08/werner-vogels-pres, April 9, 2015.

[LM10] Lakshman, A.; Malik, P.: Cassandra: a decentralized structured storage system. In: ACM SIGOPS Operating Systems Review, Volume 44, Issue 2, pages 35-40, 2010.

[Pi15] Pimentel, S.: The Return of ACID in the Design of NoSQL Databases, http://www.methodsandtools.com/archive/acidnosqldatabase.php, April 9, 2015.

[Sh11] Shapiro, M. et al.: A comprehensive study of convergent and commutative replicated data types. INRIA Technical Report RR-7506, 2011.

[Th15] Thinkaurelius: Titan Distributed Graph Database, http://thinkaurelius.github.io/titan/, April 9, 2015.

[Vo07] Vogels, W.: Availability & Consistency. QCon 2007.

[Vo09] Vogels, W.: Eventually Consistent. Communications of the ACM 52, 2009.