Scale Out and Conquer: Architectural Decisions Behind Distributed In-Memory Systems
Denis Magda Apache Ignite PMC Chair GridGain Product Management
© 2018 GridGain Systems, Inc. GridGain Company Confidential Agenda
• Partitioning – Pitfalls of Even Distribution
• Affinity Collocation – Laid Out in Numbers
• Multi-threading of Distributed Systems – Things to Know
© 2018 GridGain Systems, Inc. GridGain Company Confidential Memory-centric distributed database, caching, and processing platform
Memory-centric platform based on Apache Ignite adds enterprise features and support for mission critical deployments
© 2018 GridGain Systems, Inc. GridGain Company Confidential Among Top 5 Apache Projects Total number of Apache Projects: 318 Top-Level Apache Projects: 193
Top 5 Apache Projects by Commits Top 10 most active Apache mailing lists 1. Flex 1. Hadoop 2. Lucene 2. Ambari 3. Ignite 3. Lucene-Solr 4. Kafka 4. Camel 5. Geode 5. Ignite 6. Flink 7. Tomcat 8. Cassandra 9. Beam 10. Sentry © 2018 GridGain Systems, Inc. GridGain Company Confidential Partitioning – Pitfalls of Even Distribution
© 2018 GridGain Systems, Inc. GridGain Company Confidential Where request goes?
© 2018 GridGain Systems, Inc. GridGain Company Confidential Affinity Function
Server Node
Key Partition
ON-DISK
© 2018 GridGain Systems, Inc. GridGain Company Confidential Affinity Function
© 2018 GridGain Systems, Inc. GridGain Company Confidential Naïve Affinity
© 2018 GridGain Systems, Inc. GridGain Company Confidential Naïve Function: New Node
© 2018 GridGain Systems, Inc. GridGain Company Confidential Naïve Function: What’s Wrong?
© 2018 GridGain Systems, Inc. GridGain Company Confidential Naïve Affinity: Redundant Partitions Re-shuffling!
© 2018 GridGain Systems, Inc. GridGain Company Confidential Productized Affinity Functions
• Consistent hashing [1] • Rendezvous hashing (HRW) [2]
[1] https://en.wikipedia.org/wiki/Consistent_hashing [2] https://en.wikipedia.org/wiki/Rendezvous_hashing
© 2018 GridGain Systems, Inc. GridGain Company Confidential Rendezvous Affinity: Weight Calculation
WEIGHT(partition, node)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Rendezvous Affinity: Weight Calculation
WEIGHT(partition, node)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Rendezvous Affinity: Weight Calculation
WEIGHT(partition, node)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Rendezvous Affinity: Two Nodes to Three
© 2018 GridGain Systems, Inc. GridGain Company Confidential Rendezvous Affinity: Distribute Evenly at Best
© 2018 GridGain Systems, Inc. GridGain Company Confidential Rendezvous Affinity: Possible Distribution
© 2018 GridGain Systems, Inc. GridGain Company Confidential Affinity Collocation – Laid Out in Numbers
© 2018 GridGain Systems, Inc. GridGain Company Confidential Running Transaction: No Collocation
1: class Account { 2: long id; 3: long city; 4: }
© 2018 GridGain Systems, Inc. GridGain Company Confidential Running Transaction: No Collocation
© 2018 GridGain Systems, Inc. GridGain Company Confidential Non Collocated Data: Number of Operations
2 (2 nodes)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Non Collocated Data: Number of Operations
2 (2 nodes) 2 (primary + backup)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Non Collocated Data: Number of Operations
2 (2 nodes) 2 (primary + backup) 2 (two-phase commit)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Non Collocated Data: Number of Operations
2 (2 nodes) 2 (primary + backup) 2 (two-phase commit) 2 (request-response) -- 16 network operations
© 2018 GridGain Systems, Inc. GridGain Company Confidential Network Operations!
© 2018 GridGain Systems, Inc. GridGain Company Confidential Running Transaction: Collocated Data
1: class Account{ 2: long id; 3: 4: @AffinityKeyMapped 5: long city; 6: }
© 2018 GridGain Systems, Inc. GridGain Company Confidential Running Transaction: Collocated Data!
© 2018 GridGain Systems, Inc. GridGain Company Confidential Collocated Data: Number of Operations
1 (1 node)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Collocated Data: Number of Operations
1 (1 node) 2 (primary + backup)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Collocated Data: Number of Operations
1 (1 node) 2 (primary + backup) 1 (one-phase commit)
© 2018 GridGain Systems, Inc. GridGain Company Confidential Collocated Data: Number of Operations
1 (1 node) 2 (primary + backup) 1 (one-phase commit) 2 (request-response) -- 4 network operations
© 2018 GridGain Systems, Inc. GridGain Company Confidential Let’s Run an SQL Query – No Collocation
© 2018 GridGain Systems, Inc. GridGain Company Confidential Executing SQL: Full Scan
• 1/3x latency • 3x throughput
© 2018 GridGain Systems, Inc. GridGain Company Confidential What About Indexes?
© 2018 GridGain Systems, Inc. GridGain Company Confidential Index Search Complexity log 1_000_000 ≈ 20 vs. log 333_333 ≈ 18 log 333_333 ≈ 18 log 333_333 ≈ 18
© 2018 GridGain Systems, Inc. GridGain Company Confidential Executing SQL: Indexes Search
• ~same latency • ~same throughput
© 2018 GridGain Systems, Inc. GridGain Company Confidential Let’s Collocate and Run SQL Again
© 2018 GridGain Systems, Inc. GridGain Company Confidential Executing SQL: Indexed Search and Collocation
• same latency • 3x throughput
© 2018 GridGain Systems, Inc. GridGain Company Confidential Multi-threading of Distributed Systems – Things to Know
© 2018 GridGain Systems, Inc. GridGain Company Confidential How Requests/Tasks/Queries Processed?
© 2018 GridGain Systems, Inc. GridGain Company Confidential Requests Processing: Bottleneck!
© 2018 GridGain Systems, Inc. GridGain Company Confidential Making Situation Worse
© 2018 GridGain Systems, Inc. GridGain Company Confidential Cluster Node Chokes!
© 2018 GridGain Systems, Inc. GridGain Company Confidential Tread-Per-Partition and Workers
© 2018 GridGain Systems, Inc. GridGain Company Confidential Thread-Per-Partition and I/O Queues
© 2018 GridGain Systems, Inc. GridGain Company Confidential Thread-Per-Partition: Not a Magic Bullet
© 2018 GridGain Systems, Inc. GridGain Company Confidential Thread-Per-Partition: Not a Magic Bullet
© 2018 GridGain Systems, Inc. GridGain Company Confidential Upshot
© 2018 GridGain Systems, Inc. GridGain Company Confidential Upshot
• Partitioning – it’s not about even distribution only
• Affinity Collocation – a golden concept of distributed systems
• Business Data Model should be adopted/reconsidered
• Thread-per-partition – speeds up most but not all usage scenarios
© 2018 GridGain Systems, Inc. GridGain Company Confidential These Days in New York
Docker NYC Meetup, Tomorrow, 6:00 PM • Location: PeerSpace, 1740 Broadway, 15th Floor • Distributed Database DevOps Dilemmas? Kubernetes to the rescue
https://ignite.apache.org/events.html
© 2018 GridGain Systems, Inc. GridGain Company Confidential Any Questions?
#apacheignite Thank you for joining us. Follow the conversation. #gridgain https://ignite.apache.org #TheASF #denismagda
https://www.meetup.com/NYC-In-Memory-Computing-Meetup/
© 2018 GridGain Systems, Inc. GridGain Company Confidential