Transactions at the PB Scale with Marklogic Server
Total Page:16
File Type:pdf, Size:1020Kb
ACID Transac,ons at the PB Scale with MarkLogic Server A talk by Nuno Job, Welcome to Berlin Buzzwords 2011! PB Scalable Transactions @dscape | #bbuzz portugal, new york, toronto, san francisco, london! 83! 08! 09! 10! 11! past.! H E L L O my name is present.! stuff I like! ! Nuno! @dscape foreseeable future.! the idea ! search! engines! Structure ims! rdbms! idms! dinosaurs ! are ! Pre-defined Ad-hoc fun!! (o0)—’,--! Pre-defined Ad-hoc! ! Queries modern database! relational! search ! NoSQL! ancestors! bloom! engine! 60’s! 80’s! 90’s! 00’s! timeline! PB Scalable Transactions @dscape | #bbuzz a database for unstructured information unstructured! also stores:! schema-less*! c++ core! text and binaries! easy evolution.! ~ pb scale! xml or json.! features! acid, backups! replication, query ! native search! language (xquery).! a database built! on a search engine?! stop shredding your data! no tables, rows, columns! !!!start storing data as is! thinkin’ documents! uris? looks like a filesystem! * they have this universal index thing.! an inverted index that is structure aware! PB Scalable Transactions @dscape | #bbuzz application server single tier:! XQuery: ! - no boundaries ! dynamic, functional! between languages! programming language! - smaller stack! - king is dead!! long live the king!! features:! ! stop exposing your database! - easy geospatial! !start exposing your data! - http client! !! - facets! - alerting! - store applications! apps! in the database! marklogic! - url rewriting! api! ! apps! other dbs! github.com/dscape/rewrite! apps! ( for rails like routing, session later on )! REST! PB Scalable Transactions @dscape | #bbuzz In MarkLogic we were thinkin’ thinkin’ documents.! ✓ Unstructured Informaon flexible! json or xml?! ✓ Mul,ple TBs to PB scale ! big! ✓ Sub-second response mes fast! ever wonder what! that lotus flower! ✓ Data immediately durable realtime! video-clip is all ! ✓ about?! ! Mix of complex database Queries, aler,ng, full text search, transformaons, love to talk ! about this! geospaal, and real ,me analy,cs. unstructured? To be great, be whole; ! blank verse! Exclude nothing, ! line! exaggerate nothing that is not you. ! Be whole in everything. ! Put all you are ! Into the smallest thing you do. ! So, in each lake, the moon shines ! Because it blooms up above.! buttons! ! thread! Ricardo Reis, Odes! Wool! author! silk! title! poem! unstructured! relational! closet! closet! universal index universal index:! slides:! an inverted index that understands http://www.slideshare.net/cbiow/ structure, organization, and security! mark-logic-strangeloop-2010! “berlin” 123, 126, 130, 152, … “buzzwords” 122, 125, 126, 130, … “fast” 123, 126, 130, 142, … “nosql” 123, 130, 131, 135, … Document “hadoop” 125, 131, 167, 212, … References <preso> 122, 126, 130, 131, … <preso> / <tle> 126, 130, 131, 167, … 126, 130 <y>2011</y> 122, 126, 130, 131, … pub Like learning how ! nosql relational worked in! Editor Read the 80s “It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change.”! " - Charles Darwin ACID Helps Atomicity Either all operations of the transaction - Easy to reason about are correctly executed or none is. data! - Guaranteed persistent! Consistency state Database will remain in a consistent state after the transaction commits. Hurts Isolation In a concurrent transactional system - Hard to scale transactions are unaware of each other. horizontally! - Hard to assure high Durability availability After a transaction completes, changes persist even if the system fails.! PB Scalable Transactions @dscape | #bbuzz CAP Consistency mysql Each client always has the same view of the data. Availability All clients can always read and write. Partition Tolerance redis riak System works well across physical network partitions.! Pick Two! credit: blog.nahurst.com! /visual-guide-to-nosql-systems PB Scalable Transactions @dscape | #bbuzz “It’s naive to explain NoSQL with CAP... for x tending to infinite it's like stating that in the world there are just 3 databases.”! " - Salvatore Sanfillipo, @antirez “There is a magic bullet! ! It's called relaxing the requirements.”! " - Evan Weaver, @evan scaling an inverted index * journaled! durability ++ ! memory disk Ingestion is limited to a size where indexes are manageable! ! On query both in memory and on disk stands behave the same -> transparent to the developer! ! Means:! Fast ingestion with Log-Structured transactions!! Merge-Tree! ! (LSM-Tree)! Zero-latency ingestion and Indexing! PB Scalable Transactions @dscape | #bbuzz “You cannot take a car, grow it 10 times and expect to get a mining truck.”! " - Ivan Pepelnjak, @ioshints divide and conquer ! : ease of use level of abstraction even distribution! database across nodes! stand! a group of trees! makes sense to ! par,,on1 par,,on2 par,,on3 have indexes in ! the same stand! PB Scalable Transactions @dscape | #bbuzz shared nothing cluster E Host 1 E Host 2 E Host n D Host 4 D Host 5 D Host 6 D Host k par--on1 par--on2 par--on3 par--on4 par--onm PB Scalable Transactions @dscape | #bbuzz MVCC Append only database! ! High Throughput! Queries don’t require locks! Queries and Updates do ! not conflict! ! ACID! Cluster consistency: 2-phase commit PB Scalable Transactions @dscape | #bbuzz mvcc queries never lock!! Series 1 delete eric.json replace node mary.json maria.json insert john.json System query ,mestamp 0 5 10 15 20 25 PB Scalable Transactions @dscape | #bbuzz How does the 2-phase commit work? insertinsert-childdelete “/“foo.jsonfoo.json ””, { “foo”: “” } Journal A insert“foo.deletejson “/“bar.jsonbar.json”.foo, ””, { “bar”: “” } Insert fragment 123 “/foo.json” “stuff” Distributed Begin, A added(123), B added(234) Shard A Commit, ,mestamp 1, added (123) Distributed End ID ✔ ✗ URI Insert fragment 345 “/foo.json” Commit, ,mestamp 2, added (345), deleted (123) Distributed Begin, A added(123), B added(234) 123! 1! 2 /foo.json! Commit, ,mestamp 3, deleted (34567) 345! 2 3 /foo.json! Distributed End Journal B Shard B Insert fragment 234 “/bar.json” ID ✔ ✗ URI Prepare Commit, added (123) Prepare 234! 1! 3 /bar.json! Commit, deleted (234) doesn’t lock documents, locks uris! PB Scalable Transactions @dscape | #bbuzz developer.marklogic.com awesome project btw…! 365q.ca “You have database problem. You research blog and HN. You start use NoSQL product. Now you not know anymore if you have problem.”! " - Devops BORAT, @devops_borat Questions? @dscape PB Scalable Transactions @dscape | #bbuzz .