Consistency at Facebook
Vijay Chidambaram
CS 380D Spring 2018

Consistency Models
• Why do we care?
• Stronger consistency models are easier to reason about (and program for), but more expensive to obtain
• Weaker consistency models provide more performance, but are harder to understand and program for

Linearizability
• Talks about single operations on single objects
• Literally means: “did the operations happen in a straight line (one after the other)?”
• Once a write completes, all later reads on this object should return the written value
• Once a read returns value V1, all later reads have to return V1 or later values

Linearizable Schedule
• A schedule of events that satisfies the previously stated properties
• Given a linearizable execution of a collection of objects, it is impossible to tell if the system was distributed or not

Linearizable System
• Every execution it produces is a linearizable execution (linearizable schedule)
• A linearizable system cannot be differentiated, based on external behavior, from a single-node system
• Thus, programming for a linearizable system is as easy as programming for a single-node system

Formal definition

Real-Time Partial Order
• read(1) < read(2)
• read(2) < snapshot
• write(1) < write(2)
• write(1) < read(2) < write(3)
• read(1) < write(2) < snapshot
• Important: read(2) and write(2) are concurrent
• Important: snapshot and write(3) are concurrent

Example of Linearizable execution

Serializability
• Guarantees that the execution of a set of operations (usually each a transaction) is equivalent to some serial execution order
• Given operations A1 and A2, serializability only demands that the execution order is A1 followed by A2, or A2 followed by A1
• Serializability makes it seem as if there are no concurrent operations: everything happened one after another

Strict Serializability
• Combines linearizability and serializability
• Transactions need to happen in real-time order
• T1 and T2 are executing concurrently
• T1 writes object A, and later T2 reads object A
• Strict Serializability: T1 before T2
• Serializability: T2 before T1 also valid
• In this case, T2 will read the old value of object A

Serializability
• Serializability and linearizability both need coordination
• Expensive to obtain
• Most systems do not provide these properties

Sequential Consistency
• Relaxation of linearizability
• Instead of conforming to a real-time partial order, we use a client-observed partial order

In this system, the write ends as soon as the requests are sent out
write(3) < snapshot
But snapshot returns 0 for read(3)
Not linearizable

Client partial order does not order events across locations
snapshot and write(3) are concurrent in the client partial order
Sequentially consistent

LZ and SC
• Sequential consistency is a weaker model than linearizability
• All linearizable schedules are sequentially consistent
• But the other way around does not hold

Facebook Study
• Analyzed a small portion of the Facebook traffic to the TAO graph system
• Analyzed what consistency models hold
• Analyzed when readers get anomalous results

Facebook Data Model
• Graph Data Model
• Vertex: unique ID + data
• Edges: between two vertices, contain data, indexed by source vertex

Database
• Horizontally sharded, geo-replicated database
• Each region has a full copy
• Each shard has a master which asynchronously updates the other regions

Caching
• Root cache sits in front of the database
• Leaf caches sit in front of the root caches
• Write-through caches

Local Consistency Models
• If each object provides C, whole system provides C
• Used in study because this can be tested with sampling
• Testing linearizability requires testing all objects

Consistency Models Considered
• Local Consistency Model
• Linearizability
• Per-Object Sequential Consistency
• Read-after-Write Consistency
• Eventual Consistency
• Facebook Consistency: per-object sequential consistency + read-after-write (per-cache) + eventual (across caches)

Analysis
• Trace all requests to a small subset of vertices and their edges
• Traces include invocation time (IT), response time (RT), user id
• IT and RT used to determine real-time partial order
• User id used to determine per-client partial order
• Check:
  • Per-cluster, per-region, and global consistency

Analysis
• Clocks synchronized using Network Time Protocol (NTP)
• 99.9th percentile skew was observed to be 35 ms
• Subtract 35 ms from invocation time
• Add 35 ms to response time
• Analysis done over 12 days, 2B requests, 939M vertices, 1.8B edges

Why is 5 (a) a problem?
Lack of total ordering:
r1 observes w1 after w2
r2 observes w2 and w1

Why stale reads?
Replication lag. Invalidations are async.

Why total order anomalies?
Replication lag.
“Likes” on FB: different users reading different versions cause 60% of the anomalies (harmless)

Why so few anomalies?
Low frequency of writes (1 in 450 reqs was a write)
Request locality

Why more anomalies on edges?
More writes (1 in 188 reqs)

Practical Consistency Monitoring
• Previous analysis was offline analysis performed at the end of the day
• Real-time analysis
• Phi(P) consistency: frequency that reads return the same value from replicas
• Inject reads in different caches, observe results

Sources of errors
• Misconfiguration errors
• Developer errors:
  • Caching failures
  • Negative caching
  • Multiple levels of invalidation
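
Worked sketch: detecting stale reads on one object
The Linearizability slide says that once a write completes, later reads on that object should return the written value or a later one. The Python sketch below checks only that one condition on a single-object trace. It is not a full linearizability checker and not the method used in the Facebook study. The `Event` type and `stale_reads` function are hypothetical names introduced for illustration, and the sketch assumes every write stores a distinct value so a read can be matched to the write it observed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str   # "write" or "read"
    value: int  # value written, or value returned by the read
    start: int  # invocation time
    end: int    # response time


def stale_reads(trace):
    """Return reads whose value was overwritten by a write that
    completed before the read was even invoked.

    Assumption: write values are unique within the trace.
    """
    writes = [e for e in trace if e.kind == "write"]
    by_value = {w.value: w for w in writes}
    stale = []
    for r in (e for e in trace if e.kind == "read"):
        src = by_value.get(r.value)
        if src is None:
            continue  # initial or unknown value: not judged by this sketch
        # Some write began after src finished and finished before r began,
        # so r returned a value that was already superseded in real time.
        if any(src.end < w.start and w.end < r.start for w in writes):
            stale.append(r)
    return stale


trace = [
    Event("write", 1, 0, 10),
    Event("write", 2, 20, 30),
    Event("read", 1, 25, 35),  # concurrent with write(2): allowed
    Event("read", 2, 40, 50),  # sees the latest write: allowed
    Event("read", 1, 40, 50),  # write(2) finished at 30: stale
]
flagged = stale_reads(trace)
```

A read that overlaps the newer write is not flagged, because concurrent operations may be ordered either way.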
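
Worked sketch: real-time order under clock skew
The second Analysis slide widens each request by the observed NTP skew: subtract 35 ms from the invocation time and add 35 ms to the response time. One way to read this is that two requests are ordered in real time only when their widened intervals do not overlap, and are otherwise treated as concurrent. The sketch below illustrates that reading. `Op`, `widen`, `happens_before`, and `concurrent` are hypothetical names, not taken from the study.

```python
from dataclasses import dataclass

SKEW_MS = 35  # 99.9th percentile skew quoted in the slides


@dataclass(frozen=True)
class Op:
    name: str
    invocation_ms: int
    response_ms: int


def widen(op, skew_ms=SKEW_MS):
    """Stretch the interval by the skew bound on both sides."""
    return Op(op.name, op.invocation_ms - skew_ms, op.response_ms + skew_ms)


def happens_before(a, b, skew_ms=SKEW_MS):
    """a < b only if a's widened response precedes b's widened invocation."""
    return widen(a, skew_ms).response_ms < widen(b, skew_ms).invocation_ms


def concurrent(a, b, skew_ms=SKEW_MS):
    """Neither operation is provably before the other."""
    return not happens_before(a, b, skew_ms) and not happens_before(b, a, skew_ms)


w = Op("write", 0, 10)
r_near = Op("read_near", 50, 60)   # 40 ms after the write's response
r_far = Op("read_far", 100, 110)   # 90 ms after the write's response
```

With a 35 ms bound, `r_near` cannot be ordered after `w` because the gap of 40 ms is smaller than the combined 70 ms of widening. `r_far` is safely after `w`. Widening makes the checker conservative: it may miss some anomalies, but it should not report one that is only an artifact of clock skew.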
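
Worked sketch: measuring Phi(P) consistency
The Practical Consistency Monitoring slide describes Phi(P) consistency as the frequency with which reads return the same value from the replicas, measured by injecting reads into different caches. A minimal sketch of that ratio follows. The input shape (one list of returned values per injected probe) and the function name `phi_consistency` are assumptions made for illustration.

```python
def phi_consistency(probes):
    """Fraction of probes for which every replica returned the same value.

    probes: list of lists; each inner list holds the values returned by
    the replicas in P for one injected read (hypothetical input shape).
    """
    if not probes:
        raise ValueError("need at least one probe")
    agreeing = sum(1 for values in probes if len(set(values)) == 1)
    return agreeing / len(probes)


probes = [
    ["v7", "v7", "v7"],  # all replicas agree
    ["v7", "v6", "v7"],  # one replica lags
    ["v8", "v8", "v8"],
    ["v9", "v9", "v9"],
]
score = phi_consistency(probes)
```

Here three of the four probes agree, so `score` is 0.75. A monitor could track this value over time and alert when it drops.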