Building a Large Tarantool Cluster with 100+ Nodes (Yaroslav Dynnikov)

Building a large Tarantool cluster with 100+ nodes
Yaroslav Dynnikov, Tarantool, Mail.Ru Group
10 October 2019
Slides: rosik.github.io/2019-bigdatadays

Tarantool = database + application server.
The database side: transactions, WAL.
The application-server side: business logic in Lua, HTTP.

Core team: 20 C developers, product development.
Solution team: 35 Lua developers, commercial projects.
Common goal: make development fast and reliable.

Tarantool is an in-memory NoSQL database, but not only in-memory: there is the vinyl disk engine, and SQL is supported since version 2.
But we need horizontal scaling.

Vshard: horizontal scaling in Tarantool.
Vshard assigns data to virtual buckets; the buckets are distributed across servers.

Vshard configuration
Plain Lua tables, passed to vshard.router.cfg() and vshard.storage.cfg():

    sharding_cfg = {
        ['cbf06940-0790-498b-948d-042b62cf3d29'] = {
            replicas = { ... },
        },
        ['ac522f65-aa94-4134-9f64-51ee384f1a54'] = {
            replicas = { ... },
        },
    }

    vshard.router.cfg(...)
    vshard.storage.cfg(...)
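The slide only shows the skeleton of the table. As an illustration, here is a minimal sketch of a complete vshard setup with two single-replica replica sets on localhost; the replica set UUIDs come from the slide, while the bucket count, instance UUIDs, names, ports and credentials are assumptions invented for this example.

    -- Minimal illustrative vshard setup: two replica sets, one replica each,
    -- all on localhost.  Instance UUIDs, ports, names and credentials are invented.
    local vshard = require('vshard')

    local cfg = {
        bucket_count = 3000,  -- total number of virtual buckets
        sharding = {
            ['cbf06940-0790-498b-948d-042b62cf3d29'] = {
                replicas = {
                    ['8a274925-a26d-47fc-9e1b-af88ce939412'] = {
                        uri = 'storage:secret@localhost:3301',
                        name = 'storage_1_a',
                        master = true,
                    },
                },
            },
            ['ac522f65-aa94-4134-9f64-51ee384f1a54'] = {
                replicas = {
                    ['3de2e3e1-9ebe-4d0d-abb1-26d301b84633'] = {
                        uri = 'storage:secret@localhost:3302',
                        name = 'storage_2_a',
                        master = true,
                    },
                },
            },
        },
    }

    -- On the router instance:
    vshard.router.cfg(cfg)

    -- On each storage instance, the same table plus that instance's own UUID:
    -- vshard.storage.cfg(cfg, '8a274925-a26d-47fc-9e1b-af88ce939412')

The router and every storage receive the same configuration table; only the second argument of vshard.storage.cfg() differs from instance to instance, which is why keeping the table identical everywhere matters.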
Vshard automation: the options
Deployment scripts
Docker Compose
ZooKeeper
...or our own orchestrator.

Orchestrator requirements
Does not start or stop instances (systemd can do that)
Applies configuration only
Every cluster node can manage the others
Built-in monitoring

Clusterwide configuration
Must be the same everywhere; applied with a two-phase commit.

    topology:
      servers:          # two servers
        s1:
          replicaset_uuid: A
          uri: localhost:3301
        s2:
          replicaset_uuid: A
          uri: localhost:3302
      replicasets:      # one replica set
        A:
          roles: ...

Which came first, the database or the orchestrator?
tcp_listen() vs. apply_2pc()

Membership implementation
SWIM protocol, one of the family of gossip protocols.
Dissemination speed: O(log N)
Network load: O(N)
Messages: PING + payload, ACK

Bootstrapping a new instance
1. The new process starts
2. The new process joins membership
3. The cluster checks that the new process is alive
4. The cluster applies the configuration
5. The new process polls it from membership
6. The new process bootstraps
7. Repeat N times (here N = 100)

Benefits so far
1. Orchestration works
2. Monitoring works
3. We can assign any role to any instance

Role management
A role implements four callbacks (a minimal sketch follows at the end of this transcript):

    function init()
    function validate_config()
    function apply_config()
    function stop()

Refactoring the bootstrap process
Assembling large clusters with 100+ instances is slow:
N two-phase commits are slow
N config polls are slow
Solution:
Bootstrap all instances with a single two-phase commit
Re-implement the binary protocol and reuse the port

Links
tarantool.io
github.com/tarantool/tarantool
Telegram: @tarantool, @tarantool_news
Cartridge framework: github.com/tarantool/cartridge
Cartridge CLI: github.com/tarantool/cartridge-cli
Posts on Habr: habr.com/users/rosik/
This presentation: rosik.github.io/2019-bigdatadays

Questions?
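To make the role-management slide concrete, here is a minimal sketch of a role module exposing those four callbacks, written in the style of the Cartridge framework listed in the links above. The role name, the space it creates and the cache_size field it validates are assumptions invented for this illustration, not part of the talk.

    -- Sketch of a cluster role with the four callbacks from the role-management
    -- slide.  'my_role', the 'cache' space and 'cache_size' are invented here.
    local log = require('log')

    local function init(opts)
        -- Called when the role is enabled on an instance.
        if opts.is_master then
            box.schema.space.create('cache', { if_not_exists = true })
            box.space.cache:create_index('pk', { if_not_exists = true })
        end
    end

    local function validate_config(conf_new, conf_old)
        -- Called on every node before a new clusterwide config is committed.
        local cfg = conf_new['my_role'] or {}
        assert(cfg.cache_size == nil or cfg.cache_size > 0,
            'cache_size must be positive')
        return true
    end

    local function apply_config(conf, opts)
        -- Called after the two-phase commit succeeds, on every node.
        log.info('applying new config, is_master = %s', tostring(opts.is_master))
    end

    local function stop()
        -- Called when the role is disabled or the instance shuts down.
        log.info('role stopped')
    end

    return {
        role_name = 'my_role',
        init = init,
        validate_config = validate_config,
        apply_config = apply_config,
        stop = stop,
    }

Splitting the work this way keeps validation cheap and side-effect free, while all state changes happen only after the clusterwide two-phase commit has succeeded.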
