Using Rust to Build a Distributed Transactional Key-Value Database Liutang | [email protected] About Me

Using Rust to Build a Distributed Transactional Key-Value Database Liutang | Tl@Pingcap.Com About Me

Using Rust to Build a Distributed Transactional Key-Value Database LiuTang | [email protected] About me ● Chief Architect at PingCAP ● TiDB and TiKV ● Open source projects ○ LedisDB ○ go-mysql ○ go-mysql-elasticsearch ○ rust-prometheus ○ ... Agenda ● Introduction ● Hierarchy ○ Storage ○ Raft ○ Transaction ○ RPC Framework ○ Monitor ○ Test ● Combine them all When we want to build a distributed transactional key-value database... Consistency Performance Scalability ACID Stability HA Others… A High Building, A Low Foundation Language Let’s start from scratch!!! RocksDB Immutable Memory Memory Table Table WAL Flush Memory Disk Compaction SST Info Log Level 0 SST SST …... SST Manifest Level 1 Current SST SST …... SST Level 2 https://github.com/pingcap/rust-rocksdb Raft Client State State State Machine Raft Machine Raft Machine Raft a = 1 Module a = 1 Module a = 1 Module b = 2 b = 2 b = 2 Log a = 1 b = 2 Log a = 1 b = 2 Log a = 1 b = 2 Multi-Raft Key Space A - B Raft Group Region 1 Region 1 Region 1 Raft Group Region 2 Region 2 Region 2 B - C Raft Group Region 3 Region 3 Region 3 C - D Multi-Raft - Scalability A B C D Region 1 Region 1 Region 1 Region 2 Region 2 Region 2 Multi-Raft - Scalability A B C D Region 1 Region 1 Region 1 Region 2 Region 2 Region 2 Region 2 Raft ConfChange - AddNode Multi-Raft - Scalability A B C D Region 1 Region 1 Region 1 Region 2 Region 2 Region 2 Raft ConfChange - RemoveNode https://github.com/pingcap/raft-rs Transaction Transaction How to keep consistency crossing multi-Raft Groups? let mut txn = store.begin() let value1 = txn.get(region1_key) let value2 = txn.get(region2_key) // do something with value txn.set(region1_key, new_value1) txn.set(region2_key, new_value2) txn.commit() // or txn.rollback() Transaction 1. Inspired by Google Percolator 2. Optimized Two Phase Commit (2 PC) 3. Multiversion Concurrency Control (MVCC) 4. Snapshot Isolation 5. Optimistic Transaction gRPC ● Mode ○ Unary ○ Client streaming ○ Server streaming ○ Duplex streaming ● Using Futures to wrap the asynchronous C gRPC API let f = unary(service, method, request); let resp = f.wait(); https://github.com/pingcap/grpc-rs Prometheus ● Type ○ Counter ○ Gauge ○ Histogram lazy_static! { static ref HTTP_COUNTER: Counter = register_counter!( "http_request_total", "Total number of HTTP request." ).unwrap(); } HTTP_COUNTER.inc(); https://github.com/pingcap/rust-prometheus Testing Testing - Failure Injection // Ingest a failure fn function_foo() { fail_point!("foo"); } // Run and Trigger the failure FAILPOINTS=foo=panic cargo run https://github.com/pingcap/fail-rs Architecture Architecture Prometheus Client gRPC Txn API Txn API Txn API MVCC MVCC MVCC Raft Raft Raft RocksDB RocksDB RocksDB https://github.com/pingcap/tikv Beyond TiKV A Distributed Relational Database TiDB Applications MySQL Drivers(e.g. JDBC) MySQL Protocol TiDB RPC TiKV A Distributed Analytical Database TiSpark Spark Driver Job Spark Cluster Worker Worker Worker RPC TiKV Hybrid Transactional/Analytical Processing Database PD PD TSO/Data location Data location PD PD Cluster Meta data Spark Driver TiDB TiKV TiKV Job TiDB API API Worker TiDB TiKV TiKV Worker TiDB TiKV TiKV Worker TiDB ... ... ... Spark Cluster TiDB Cluster TiKV Cluster (Storage) TiSpark Thank you! https://github.com/pingcap/tidb https://github.com/pingcap/tikv We are hiring… @China @Silicon Valley @Home.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    42 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us