Go in TiDB
Yao Wei | PingCAP

About me
● Yao Wei (姚维)
● TiDB Kernel Expert, General Manager of South Region, China
● 360 Infra team / Alibaba-UC / PingCAP
● Atlas/MySQL-Sniffer
● Infrastructure software engineer

Why a new database?

Brief History
● Standalone RDBMS
● NoSQL
● Middleware & Proxy
● NewSQL
[Timeline: 1970s, 2010, 2015, Present]
RDBMS: MySQL, PostgreSQL, Oracle, DB2, ...
NoSQL: Redis, HBase, Cassandra, MongoDB, ...
NewSQL: Google Spanner, Google F1, TiDB

Architecture
[Diagram: a stateless SQL layer of TiDB servers above a distributed storage layer of TiKV nodes grouped by Raft, with the Placement Driver (PD) alongside. Arrow labels: "Metadata / Timestamp request" and "Control flow: Balance / Failover".]

TiKV - Overview
● Region: a set of continuous key-value pairs
● Data is organized/stored/replicated by Regions
● Highly layered
[Diagram: layer stack of RPC (gRPC), Transaction, MVCC, Raft, RocksDB. The TiKV key space (-∞, +∞) is shown as a sorted map divided into Regions [start_key, end_key), labeled 256MB, with Raft connecting Node A, Node B and Node C.]

PD - Overview
● Metadata management
● Load balance management
[Diagram: TiKV Client, PD and the TiKV cluster. Labels: "Route Info", "Node/Region Info", "Management Command".]

TiKV - Multi-Raft
Multiple Raft groups in the cluster, one group for each Region.
[Diagram: a client reaches Store 1 to Store 4 (TiKV node 1 to TiKV node 4) over RPC. Regions 1 to 5 are replicated across the stores, and the replicas of one Region form a Raft group.]

TiKV - Horizontal Scale
Three steps to move a leader replica:
● Transfer Leader
● Add Replica
● Remove Replica
[Diagram: replicas of Regions 1 to 3 on Nodes A to E, with an "Add Replica" step.]

SQL Layer

Example - SQL
    CREATE TABLE t (c1 INT, c2 VARCHAR(32), INDEX idx1 (c1));
    SELECT COUNT(c1) FROM t WHERE c1 > 10 AND c2 = "gopherchina";

Example - Logical Plan
[Figure: logical plan]

Example - Physical Plan
[Figure: physical plan]

Challenges of a distributed ACID database
● A distributed database is very complex
● Lots of RPC work
● Keep high performance
● Tons of data
● Huge amount of OLTP queries
● Very complex OLAP queries
● External consistency
● SQL is much more complex than KV

Why did TiDB choose Golang?
● Easy to learn
● Productivity
● Concurrency
● Easy to trace bugs and profile
● Standard libraries and tools
● Tolerable GC latency
● Good performance
● Quick improvement

Go in TiDB
● More than 160K lines of Go code and 138 contributors.

Memory && GC
● A query may touch a huge amount of data.
● Memory allocation may cost a lot of time.
● Allocation puts pressure on the GC worker.
● This degrades the performance of SQL.
● OOM sucks!
● runtime.morestack

Reduce the Number of Allocations
● Get enough memory in one allocation operation

    a := []int{1, 2, 3, 4, 5}
    b := []int{}
    // a much better way:
    // b := make([]int, 0, len(a))
    for _, i := range a {
        b = append(b, i)
    }

Reuse Object
● Share a stack for all queries in one session
● Introduce a cache in goyacc
● Resource pool

OLTP & OLAP
[Diagram: Database (OLTP), then ETL, then Data Warehouse (OLAP). Time labels: 8am, 2pm, 6pm, 2am. Caption: "Is the data out-of-date?"]

TiSpark

Thanks!
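
The "Reuse Object" slide lists a resource pool as one way to cut allocations. The following is a minimal sketch of that idea, not TiDB's actual code: it assumes the standard library's sync.Pool (the slides do not say which pooling mechanism TiDB uses), and bufPool and formatRow are hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool is a hypothetical pool of reusable buffers (assumption: sync.Pool
// stands in for the "resource pool" mentioned on the slide).
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// formatRow is a hypothetical helper that joins column values with commas,
// borrowing a buffer from the pool instead of allocating one per call.
func formatRow(cols []string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from its previous use
	defer bufPool.Put(buf)

	for i, c := range cols {
		if i > 0 {
			buf.WriteByte(',')
		}
		buf.WriteString(c)
	}
	// String copies the bytes, so handing buf back to the pool is safe.
	return buf.String()
}

func main() {
	fmt.Println(formatRow([]string{"1", "gopherchina"}))
}
```

sync.Pool may drop idle objects at any garbage collection, so it fits short-lived scratch objects such as buffers better than resources that must stay alive.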
