TiDB is a NewSQL database that provides horizontal scalability, ACID transactions, high availability, and SQL support. It aims to be an HTAP (Hybrid Transactional/Analytical Processing) database, supporting both OLTP and OLAP workloads on the same data through a single SQL interface.
TiDB achieves horizontal scalability through its distributed architecture, with TiKV as the storage engine and PD (Placement Driver) for metadata management. It supports ACID transactions through MVCC and Raft consensus, and it stays highly available by replicating Regions across nodes. TiDB also supports real-time analytics on the same dataset that serves transactions, through its cost-based optimizer and distributed query processing engine. Spark can run queries directly against the data stored in TiKV as well.
6. OLTP & OLAP
[Diagram: OLTP sources (ERP, CRM, files, other databases) feed a Data Warehouse through scheduled ETL jobs (8am, 2pm, 6pm, 2am), leaving users asking "Where is my data?" and "Is the data out of date?"]
7. Why two separate systems?
● Huge data sizes
● Complex query logic
● Latency vs. throughput
● Point queries vs. full range scans
● Transactions & isolation levels
10. What is TiDB
• Scalability as a first-class feature
• SQL is necessary
• Compatible with MySQL, in most cases
• OLTP + OLAP = HTAP (Hybrid Transactional/Analytical Processing)
• 24/7 availability, even in case of datacenter outages
• Open source, of course
12. TiKV - Overview
● Region: a contiguous range of key-value pairs
● Data is organized/stored/replicated by Region
● Highly layered architecture (summarized below)
[Diagram: the TiKV key space is one sorted map covering (-∞, +∞), split into Regions, each a [start_key, end_key) range of roughly 256MB. Every node runs a layered stack: RPC (gRPC) → Transaction → MVCC → Raft → RocksDB, and each Region is replicated across Nodes A, B, and C via Raft.]
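To make the Region abstraction concrete, here is a minimal Go sketch of locating the Region that owns a key in the globally sorted key space. The Region struct and locateRegion helper are illustrative stand-ins, not TiKV's actual types (TiKV itself is written in Rust).

```go
package main

import (
	"fmt"
	"sort"
)

// Region mirrors the slide's definition: a contiguous key range
// [StartKey, EndKey). Field names here are illustrative.
type Region struct {
	ID       uint64
	StartKey string // inclusive; "" means -∞
	EndKey   string // exclusive; "" means +∞
}

// locateRegion finds the Region owning key in a slice of Regions
// sorted by StartKey -- the "sorted map over (-∞, +∞)" idea.
func locateRegion(regions []Region, key string) *Region {
	// Binary search for the first Region whose EndKey lies beyond key.
	i := sort.Search(len(regions), func(i int) bool {
		return regions[i].EndKey == "" || key < regions[i].EndKey
	})
	if i < len(regions) && key >= regions[i].StartKey {
		return &regions[i]
	}
	return nil
}

func main() {
	regions := []Region{
		{ID: 1, StartKey: "", EndKey: "g"},
		{ID: 2, StartKey: "g", EndKey: "p"},
		{ID: 3, StartKey: "p", EndKey: ""},
	}
	fmt.Println(locateRegion(regions, "k").ID) // -> 2
}
```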
13. TiKV - Multi-Raft
Multiple Raft groups in one cluster, one group per Region.
[Diagram: a Client issues RPCs to TiKV nodes 1-4, each hosting one Store. Region replicas are spread across Stores (Region 1 on Stores 1, 2, and 4; Region 3 on Stores 1, 2, and 3; and so on), and the replicas of each Region form one Raft group.]
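A toy Go sketch of the Multi-Raft layout above, using the placement shown in the diagram. Each Region's replica set is an independent Raft group with its own leader and log; the per-group Raft state machines are elided, and the leader assignments below are arbitrary.

```go
package main

import "fmt"

// One Raft group per Region; each group elects its own leader,
// independently of the other groups sharing the same stores.
// Fields are illustrative, not TiKV's real structures.
type RaftGroup struct {
	RegionID uint64
	Peers    []uint64 // store IDs holding replicas
	Leader   uint64   // store ID of the current leader (arbitrary here)
}

func main() {
	// Replica placement copied from the diagram above.
	groups := []RaftGroup{
		{RegionID: 1, Peers: []uint64{1, 2, 4}, Leader: 1},
		{RegionID: 2, Peers: []uint64{2, 3, 4}, Leader: 3},
		{RegionID: 3, Peers: []uint64{1, 2, 3}, Leader: 2},
		{RegionID: 4, Peers: []uint64{1, 2, 4}, Leader: 4},
		{RegionID: 5, Peers: []uint64{1, 3, 4}, Leader: 1},
	}
	for _, g := range groups {
		fmt.Printf("Region %d: Raft group on stores %v, leader on store %d\n",
			g.RegionID, g.Peers, g.Leader)
	}
}
```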
14. TiKV - Horizontal Scale
[Diagram: Region 1's leader replica (Region 1*) lives on Node A with followers elsewhere; to rebalance, a new replica (Region 1^) is added on Node E and the old one is later removed.]
Three steps to move a leader replica (sketched below):
● Transfer Leader
● Add Replica
● Remove Replica
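A hedged Go sketch of those three steps, assuming leadership is first handed to an existing follower so the source replica can be removed safely; the Cluster type and node names are made up for illustration and are not PD's real scheduling operators.

```go
package main

import "fmt"

// Toy replica state for one Region; not PD's real data structures.
type Cluster struct {
	leader   string          // node whose replica is the Raft leader
	replicas map[string]bool // nodes holding a replica
}

// moveLeaderReplica moves the replica on src (currently the leader)
// to dst, following the slide's three steps.
func (c *Cluster) moveLeaderReplica(src, follower, dst string) {
	c.leader = follower     // 1. Transfer Leader to an existing follower
	c.replicas[dst] = true  // 2. Add Replica on the target node
	delete(c.replicas, src) // 3. Remove Replica from the source node
}

func main() {
	c := &Cluster{
		leader:   "A",
		replicas: map[string]bool{"A": true, "B": true, "C": true},
	}
	c.moveLeaderReplica("A", "B", "E") // matches the diagram: A -> E
	fmt.Println("leader:", c.leader, "replicas:", c.replicas)
}
```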
15. PD - Overview
[Diagram: PD sits beside the TiKV cluster. TiKV nodes report Node/Region info to PD, PD sends management commands back, and TiKV clients fetch route info from PD.]
● Metadata management (route lookup sketched below)
● Load-balance management
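A minimal sketch of the routing flow from the diagram: the client asks PD which Store serves a key, then reads directly from the corresponding TiKV node, keeping PD off the data path. The Route and PD types here are hypothetical, not the real PD client API.

```go
package main

import "fmt"

// Hypothetical stand-ins for PD's route info; the real client API differs.
type Route struct {
	RegionID uint64
	StoreID  uint64 // store currently holding the Region's leader
}

// PD holds the authoritative key -> Region -> Store mapping.
type PD struct {
	routes map[string]Route // toy: exact-key routing instead of key ranges
}

func (pd *PD) GetRoute(key string) Route { return pd.routes[key] }

func main() {
	pd := &PD{routes: map[string]Route{"user:42": {RegionID: 3, StoreID: 2}}}
	r := pd.GetRoute("user:42")
	// The client caches this route and sends the read straight to the
	// TiKV node for Store 2; PD only serves metadata.
	fmt.Printf("read key via Region %d on Store %d\n", r.RegionID, r.StoreID)
}
```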
16. PD - TiKV Cluster Management
[Diagram: TiKV nodes (Node 1 and Node 2, holding Regions A, B, C) send heartbeats carrying cluster info to PD. PD combines that info with its scheduling strategy and admin-supplied config, and replies with scheduling commands that drive Region movement between nodes.]
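A toy Go sketch of this feedback loop, assuming a naive balance-by-region-count strategy; the real PD protocol and schedulers carry far more state than this.

```go
package main

import "fmt"

// Hypothetical heartbeat/command shapes; the real protocol is richer.
type Heartbeat struct {
	StoreID     uint64
	RegionCount int
}

type Command struct{ FromStore, ToStore uint64 }

// schedule: a naive strategy that moves one Region from the most
// loaded store to the least loaded one when they differ enough.
func schedule(hbs []Heartbeat) []Command {
	max, min := hbs[0], hbs[0]
	for _, hb := range hbs {
		if hb.RegionCount > max.RegionCount {
			max = hb
		}
		if hb.RegionCount < min.RegionCount {
			min = hb
		}
	}
	if max.RegionCount > min.RegionCount+1 {
		return []Command{{FromStore: max.StoreID, ToStore: min.StoreID}}
	}
	return nil
}

func main() {
	hbs := []Heartbeat{{StoreID: 1, RegionCount: 5}, {StoreID: 2, RegionCount: 2}}
	fmt.Println(schedule(hbs)) // -> [{1 2}]: move a Region from store 1 to 2
}
```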
17. TiDB - Overview
[Diagram: in the TiDB SQL layer, SQL text is parsed into an AST and converted into a logical plan; rule-based rewrites produce an optimized logical plan; then the cost model, fed by statistics, selects a physical plan, which is executed against the TiKV nodes.]
● The stateless SQL layer
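A skeletal Go sketch of that pipeline's shape. All stage types and functions here are stand-ins; TiDB's real planner is far more involved.

```go
package main

import "fmt"

// Illustrative stage types, not TiDB's real planner structures.
type (
	AST          struct{ SQL string }
	LogicalPlan  struct{ Ops string }
	PhysicalPlan struct {
		Ops  string
		Cost float64
	}
)

func parse(sql string) AST { return AST{SQL: sql} }

func buildLogical(a AST) LogicalPlan {
	return LogicalPlan{Ops: "Projection->Filter->Scan"}
}

// Rule-based rewrites (predicate pushdown, column pruning, ...).
func optimizeLogical(p LogicalPlan) LogicalPlan { return p }

// Cost-based selection among physical candidates, using statistics.
func selectPhysical(p LogicalPlan) PhysicalPlan {
	return PhysicalPlan{Ops: p.Ops, Cost: 42.0}
}

func main() {
	plan := selectPhysical(optimizeLogical(buildLogical(parse("SELECT ..."))))
	fmt.Printf("execute %q against TiKV, est. cost %.1f\n", plan.Ops, plan.Cost)
}
```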
18. TiDB - Distributed SQL
SELECT COUNT(c1) FROM t WHERE c1 > 10 AND c2 = 'shanghai';
[Diagram: the physical plan is split in two. On TiKV (index scan): read index idx1 over (10, +∞), fetch row data by RowID, filter c2 = 'shanghai', and compute a partial aggregate COUNT(c1) on each node. On TiDB: a DistSQL scan gathers the per-node COUNT(c1) values and a final aggregate computes SUM(COUNT(c1)).]
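A Go sketch of the two-phase aggregation in this plan: each TiKV node applies the pushed-down filter and returns a partial COUNT(c1), and TiDB sums the partials. Names and data are illustrative.

```go
package main

import "fmt"

// A row of table t; field names mirror the query above.
type row struct {
	c1 int
	c2 string
}

// partialCount is what each TiKV node computes locally:
// the pushed-down filter followed by a partial COUNT(c1).
func partialCount(rows []row) int {
	n := 0
	for _, r := range rows {
		if r.c1 > 10 && r.c2 == "shanghai" {
			n++
		}
	}
	return n
}

func main() {
	// Pretend these partials came back from three Regions via DistSQL.
	partials := []int{
		partialCount([]row{{11, "shanghai"}, {12, "beijing"}}),
		partialCount([]row{{9, "shanghai"}}),
		partialCount([]row{{42, "shanghai"}, {43, "shanghai"}}),
	}
	total := 0
	for _, p := range partials {
		total += p // final aggregate on TiDB: SUM(COUNT(c1))
	}
	fmt.Println("COUNT(c1) =", total) // -> 3
}
```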
19. TiDB - Cost Based Optimizer
● Predicate Pushdown
● Column Pruning
● Eager Aggregate
● Convert Subquery to Join
● Statistics framework
● CBO Framework (join selection sketched below)
○ Index Selection
○ Join Operator Selection
■ Hash join
■ Index lookup join
■ Sort-merge join
○ Stream Operators vs. Hash Operators
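A toy Go sketch of cost-based join operator selection over the three candidates listed above: estimate a cost for each and keep the cheapest. The cost formulas are placeholders, not TiDB's real model.

```go
package main

import "fmt"

// pickJoin compares placeholder costs for the three join operators.
func pickJoin(outerRows, innerRows float64, innerHasIndex bool) string {
	costs := map[string]float64{
		"hash join":       outerRows + innerRows,         // build + probe
		"sort-merge join": 1.2 * (outerRows + innerRows), // pays for sorting
	}
	if innerHasIndex {
		costs["index lookup join"] = outerRows * 1.5 // one index probe per outer row
	}
	best, bestCost := "", 0.0
	for name, c := range costs {
		if best == "" || c < bestCost {
			best, bestCost = name, c
		}
	}
	return best
}

func main() {
	fmt.Println(pickJoin(100, 1e6, true))  // small outer side -> index lookup join
	fmt.Println(pickJoin(1e6, 1e6, false)) // both large, no index -> hash join
}
```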
20. Cost estimation
The cost model has three components: network cost, memory cost, and CPU cost.
In TiDB, the default memory factor is 5 and the default CPU factor is 0.8.
For example, the cost of an operator Sort(r) combines its child's cost with CPU and memory terms scaled by these factors (see the sketch below).
Plan search is DP (dynamic programming) over the plan tree, based on statistics information.
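A hedged reconstruction of the Sort(r) example, assuming the common textbook shape cost(Sort(r)) = cost(r) + cpuFactor · n · log2(n) + memoryFactor · n with n = count(r); only the two factors (memory 5, CPU 0.8) come from the slide, the exact formula is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

const (
	cpuFactor    = 0.8 // TiDB default per the slide
	memoryFactor = 5.0 // TiDB default per the slide
)

// sortCost is an assumed shape for the slide's formula: child cost,
// plus a CPU term for comparison sorting, plus a memory term for
// buffering all n rows. Not guaranteed to match TiDB's real model.
func sortCost(childCost, n float64) float64 {
	return childCost + cpuFactor*n*math.Log2(n) + memoryFactor*n
}

func main() {
	fmt.Printf("cost(Sort(r)) = %.1f\n", sortCost(1000, 10000))
}
```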
21. Parallel Operators
SELECT t.c2, t1.c2 FROM t JOIN t1 ON t.c = t1.c WHERE t1.c1 > 10;
[Diagram: the logical plan is Projection → Join over two DataSources, t and t1, with the filter t1.c1 > 10 applied to t1. At execution time, TiKV coprocessor workers run the scans in parallel (TableScan on t, IndexScan on t1's idx1), a Data Reader feeds their output to the Join Operator, and multiple Join Workers process the join concurrently (sketched below).]
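A minimal Go sketch of the worker layout above: a Data Reader fans probe rows out to several Join Workers over a channel, each worker matching against a shared build side. The Row type and matching logic are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Row stands in for a scanned row of table t.
type Row struct {
	C  int
	C2 string
}

func main() {
	// Build side: t1 rows surviving the pushed-down filter t1.c1 > 10,
	// keyed by the join column; read-only, so shared by all workers.
	build := map[int]string{1: "x", 2: "y"}

	rows := make(chan Row)   // Data Reader -> Join Workers
	out := make(chan string) // joined results

	var wg sync.WaitGroup
	for w := 0; w < 3; w++ { // three Join Workers, as in the diagram
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range rows {
				if c2, ok := build[r.C]; ok {
					out <- fmt.Sprintf("t.c2=%s t1.c2=%s", r.C2, c2)
				}
			}
		}()
	}
	go func() { wg.Wait(); close(out) }()

	// Data Reader: probe-side rows coming back from the TableScan on t.
	go func() {
		for _, r := range []Row{{1, "a"}, {2, "b"}, {3, "c"}} {
			rows <- r
		}
		close(rows)
	}()

	for s := range out {
		fmt.Println(s)
	}
}
```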