Write-optimization in external memory data structures 
Leif Walsh 
Tokutek, Inc. 
leif@tokutek.com 
@leifwalsh 
November 1, 2014 
Leif Walsh (Tokutek) Fractal Trees November 1, 2014 1 / 31
Background 
Data structures: 
Provide retrieval of data. 
Lookup(Key) 
Pred(Key) 
Succ(Key) 
Dynamic data structures let you change the data.
Insert(Key, Value)
Delete(Key) 
[Aggarwal & Vitter ’88] 
DAM model 
Problem size N. 
Memory size M. 
Transfer data to/from memory in blocks of size B.
Efficiency of operations is measured as the number of block transfers, a.k.a. I/Os.
A B-tree (Б-tree?) is an external memory data structure: 
Balanced search tree. 
Fanout of B (block size / key size).
Internal nodes fit in memory (total size < M).
Leaf nodes do not fit in memory (total size > M).
Search: $O(\log_B N)$ I/Os
Insert: $O(\log_B N)$ I/Os
[Brodal & Fagerberg ’03] 
OLAP 
Write-optimization technique #1: OLAP 
OLAP: Online Analytical Processing 
Key idea: Analyze data collected in the past. 
B-tree inserts are slow, but…logging and sorting are fast. 
Merge sort in external memory: 
Merge sort cost in DAM model is:
Cost to scan through all the data once: $N/B$
Multiplied by the # of levels in the merge tree: $\log_{M/B} \frac{N}{B}$
Total: $O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$
Insert N elements into a B-tree (assuming 100-1000 byte elements):
$O\left(N \log_B \frac{N}{M}\right) \approx N \Rightarrow$ 10-100 kB/s = 100 elements/s
Merge sort:
$O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right) \approx \frac{2N}{B} \Rightarrow$ 50 MB/s = 50k-500k elements/s
Typically, M/B is large, so only two passes are needed to sort.
Intuition: Each insert into a B-tree costs 1 seek, while sorting is close to disk bandwidth.
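The gap between the two bounds is easier to see with numbers plugged in. The following is a rough sketch of the two DAM-model formulas, not a benchmark; the function names and the sizes N, B, and M (all counted in elements) are hypothetical values chosen for illustration.

```python
import math

def btree_insert_ios(n, b, m):
    """Approximate I/Os to insert n elements one at a time: n * log_B(n/m).

    The top of the tree is assumed to stay cached in memory, but every
    insert still pays at least one I/O to reach its leaf.
    """
    return n * max(1.0, math.log(n / m, b))

def merge_sort_ios(n, b, m):
    """Approximate I/Os to merge sort n elements: (n/b) * log_{m/b}(n/b)."""
    passes = max(1, math.ceil(math.log(n / b, m / b)))
    return (n / b) * passes

# Hypothetical sizes, all measured in elements.
N, B, M = 10**9, 10**3, 10**7
print(btree_insert_ios(N, B, M))  # roughly N I/Os
print(merge_sort_ios(N, B, M))    # roughly 2N/B I/Os (two passes)
```

With these assumed sizes the sort needs two passes over the data, while the B-tree pays about one I/O per element, so the two counts differ by a factor of about B/2.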
So, how does OLAP work? 
Log new data unindexed until you accumulate a lot of it (10% of the data set). 
Sort the new data. 
Use a merge pass through existing indexes to incorporate new data. 
Use indexes to do analytics. 
Moral: OLAP techniques can handle high insertion volume, but query results are delayed. 
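The log, sort, merge cycle above can be sketched in a few lines. This is a toy in-memory illustration under simplifying assumptions (unique keys, everything fits in Python lists); the class and method names are hypothetical and do not come from any real OLAP system.

```python
import bisect
import heapq

class BatchLoadedIndex:
    """Toy log-then-merge index (hypothetical illustration only)."""

    def __init__(self):
        self.index = []  # sorted list of (key, value): the queryable index
        self.log = []    # new data, appended unsorted and unindexed

    def insert(self, key, value):
        self.log.append((key, value))  # sequential append, no seek

    def merge_batch(self):
        self.log.sort()                                       # sort the new data
        self.index = list(heapq.merge(self.index, self.log))  # one merge pass
        self.log = []

    def lookup(self, key):
        i = bisect.bisect_left(self.index, (key,))
        if i < len(self.index) and self.index[i][0] == key:
            return self.index[i][1]
        return None  # absent, or still sitting in the unindexed log

idx = BatchLoadedIndex()
for k in [42, 7, 19]:
    idx.insert(k, str(k))
before = idx.lookup(7)  # the data is logged but not yet indexed
idx.merge_batch()
after = idx.lookup(7)   # visible only after the merge pass
```

The lookup before `merge_batch()` misses the key even though it has been inserted, which is the delay in query results that the moral refers to.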
LSM-trees 
Write-optimization technique #2: LSM-trees 
The insight for LSM-trees starts by asking: how can we reduce the queryability delay in OLAP? 
The buffer is small, let’s index it! 
Inserts go into the “buffer B-tree”. 
When the buffer gets full, we merge it with the “main B-tree”. 
Queries have to touch both trees and merge results, but results are available immediately. 
(This specific technique, which is not yet an LSM-tree, is used in InnoDB, where it is called the “change buffer”.)
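A minimal sketch of the two-tree idea follows, assuming a dict as a stand-in for the buffer B-tree and a sorted key list for the main B-tree. All names are hypothetical, and the flush rebuilds the main index wholesale rather than performing a true leaf-by-leaf merge; it is not a description of how InnoDB implements its change buffer.

```python
import bisect

class BufferedIndex:
    """Toy buffer-plus-main index (hypothetical illustration only)."""

    def __init__(self, buffer_limit):
        self.buffer = {}      # small in-memory index (stand-in for the buffer B-tree)
        self.main_keys = []   # sorted keys of the main index
        self.main_vals = []   # values aligned with main_keys
        self.buffer_limit = buffer_limit

    def insert(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.buffer_limit:
            self.flush()      # buffer is full: merge it into the main index

    def flush(self):
        merged = dict(zip(self.main_keys, self.main_vals))
        merged.update(self.buffer)  # newer buffered values win
        self.main_keys = sorted(merged)
        self.main_vals = [merged[k] for k in self.main_keys]
        self.buffer = {}

    def lookup(self, key):
        if key in self.buffer:      # queries touch the buffer first...
            return self.buffer[key]
        i = bisect.bisect_left(self.main_keys, key)  # ...then the main index
        if i < len(self.main_keys) and self.main_keys[i] == key:
            return self.main_vals[i]
        return None
```

Unlike the batch-loaded approach, an inserted key is visible to `lookup` immediately, at the price of consulting two structures per query.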
Why is this fast? 
The buffer is in-memory, so inserts are fast. 
When we merge, we put many new elements in each leaf in the main B-tree (this amortizes the I/O cost to read the leaf).
Eventually, we reach a problem:
If the buffer gets too big, inserts get slow.
If the buffer stays too small, the merge gets inefficient because each leaf node receives only a few elements (back to $O(N \log_B N)$).
How can we fix this? More buffering! 
Each level is twice as large as the previous level, for some value of 2 (usually 10). We’ll use 2. 
How do queries work? 
Search cost is:
$\log_B B + \cdots + \log_B \frac{N}{8} + \log_B \frac{N}{4} + \log_B \frac{N}{2} + \log_B N$
$= \frac{1}{\lg B} \left(\lg B + \cdots + (\lg N - 3) + (\lg N - 2) + (\lg N - 1) + \lg N\right) = O(\log N \cdot \log_B N)$
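To check the sum numerically, the sketch below adds up one B-tree search per level, with level sizes doubling from B up to N. The function name and the example sizes are assumptions made for illustration.

```python
import math

def lsm_search_ios(n, b, growth=2):
    """Sum of log_B(level size) over levels of size b, b*growth, ..., n."""
    total = 0.0
    size = b
    while size <= n:
        total += math.log(size, b)  # a full B-tree search in this level
        size *= growth
    return total

# Hypothetical sizes: N = 2^30 elements, B = 2^10 elements per block.
N, B = 2**30, 2**10
cost = lsm_search_ios(N, B)                 # 21 levels searched
single_btree = math.log(N, B)               # one search in a single B-tree
bound = math.log2(N) * math.log(N, B)       # lg N * log_B N
```

For these sizes the levels contribute (10 + 11 + ... + 30) / 10 = 42 I/Os, against 3 for a single B-tree, and the total stays below the $\lg N \cdot \log_B N = 90$ bound.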
How much do inserts cost? 
Cost to flush a tree $T_j$ of size $X$ is $O(X/B)$.
Cost per element to flush $T_j$ is $O(1/B)$.
Each element moves $\le \log N$ times.
Total amortized insert cost per element is $O\left(\frac{\log N}{B}\right)$.
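The amortized argument can be checked with a small simulation. This sketch assumes a growth factor of 2 and a top level holding a single element, so level i holds either nothing or a sorted run of exactly 2^i keys; it counts how many elements are written to their destination level. The function name is hypothetical, and dividing the write count by B would give the I/O count in the DAM model.

```python
import heapq

def simulate_lsm_inserts(keys):
    """Insert keys one by one; return (levels, total elements written)."""
    levels = []  # levels[i] is empty or a sorted run of exactly 2**i keys
    writes = 0
    for key in keys:
        carry = [key]
        i = 0
        # Cascade: merge the carry with every full level on the way down.
        while i < len(levels) and levels[i]:
            carry = list(heapq.merge(levels[i], carry))
            levels[i] = []
            i += 1
        if i == len(levels):
            levels.append([])
        levels[i] = carry      # the merged run lands in the first empty level
        writes += len(carry)   # each element of the run is written once
    return levels, writes

n = 1024
levels, writes = simulate_lsm_inserts(list(range(n - 1, -1, -1)))
per_element = writes / n       # average number of times an element is written
```

For 1024 inserts this counts 6144 element writes, an average of 6 per element, which is within the $\lg N + 1 = 11$ ceiling implied by the number of levels.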
Fractal Trees 
Write-optimization technique #3: Fractal Trees 
The pain in LSM-trees is doing a full $O(\log_B N)$ search in each level.
We use fractional cascading to reduce the search per level to $O(1)$.
The idea is that once we’ve searched $T_i$, we know where the key would be in $T_i$, and we can use that information to guide our search of $T_{i+1}$.
Add forwarding pointers from leaves in $T_i$ to leaves in $T_{i+1}$ (but remove the redundant ones that point to the same leaf):
Add ghost pointers in the leaves of $T_i$ to the leaves of $T_{i+1}$ that no forwarding pointer reaches:
[Bender, Farach-Colton, Fineman, Fogel, Kuszmaul, & Nelson ’07]
Now, after searching $T_i$ for a missing element $c$, we look left and right for forwarding or ghost pointers, and follow them down to look at $O(1)$ leaves in $T_{i+1}$.
This way, search is only $O(\log_R N)$ (in our example, $R = 2$).
The internal node structure in each level is now redundant, so we can represent each level as an array. This is called a Cache-Oblivious Lookahead Array.
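The sketch below illustrates the cascading idea on plain sorted arrays. It is a simplified variant in the style of classic fractional cascading, not the exact forwarding/ghost pointer layout of the slides: every `stride`-th key of the level below is copied into the current level together with its position below. The names `build_cascade`, `contains`, and `stride` are hypothetical, and keys are assumed to be distinct across levels.

```python
import bisect

def build_cascade(levels, stride=2):
    """levels: sorted lists, smallest level first.

    Returns one (keys, is_real, down) triple per level, where keys mixes the
    level's own keys with sampled keys from the level below, and down[j] is
    the position of keys[j] in the augmented array of the level below.
    """
    cascade = [None] * len(levels)
    below_keys = []
    for i in range(len(levels) - 1, -1, -1):
        real = set(levels[i])
        keys = sorted(real.union(below_keys[::stride]))
        is_real = [k in real for k in keys]
        down = [bisect.bisect_left(below_keys, k) for k in keys]
        cascade[i] = (keys, is_real, down)
        below_keys = keys
    return cascade

def contains(cascade, x):
    """Membership test using a single binary search, in the top level only."""
    pos = bisect.bisect_left(cascade[0][0], x)
    for i, (keys, is_real, down) in enumerate(cascade):
        if pos < len(keys) and keys[pos] == x and is_real[pos]:
            return True
        if i + 1 == len(cascade):
            return False
        next_keys = cascade[i + 1][0]
        # Follow the pointer of the successor down, then step left a little.
        pos = down[pos] if pos < len(keys) else len(next_keys)
        while pos > 0 and next_keys[pos - 1] >= x:
            pos -= 1

levels = [[50], [20, 90], [10, 40, 70, 95], [5, 15, 25, 35, 45, 55, 65, 75]]
cascade = build_cascade(levels)
```

After the initial binary search, each level is entered through a stored position and corrected by a short leftward walk, because the sampled keys bound how far the true position can be from the pointer's target.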
Though the amortized analysis says our inserts are fast, when we flush a very large level to the 
next one, we might see a big stall. Concurrent merge algorithms exist, but we can do better. 
We break each level’s array into chunks that can be flushed independently. Each chunk flushes 
to a localized region of a few chunks in the next level down, found using its forwarding pointers. 
Now we have a tree again! 
As it turns out, this structure makes it easier to manage an LRU-style cache of blocks and is more 
flexible in the face of “hotspot” workloads. 
Results 
Modified B-tree-like dynamic (inserts, updates, deletes) data structure that supports point and range queries.
Inserts are a factor $B / \log B$ (typically 10-100x in practice) faster than a B-tree: $O\left(\frac{\log N}{B}\right) < O\left(\frac{\log N}{\log B}\right)$.
Searches are a factor $\log B / \log R$ slower than a B-tree: $O\left(\frac{\log N}{\log R}\right) > O\left(\frac{\log N}{\log B}\right)$.
To amortize flush costs over many elements, we want each block we write to be large (4MB), much larger than typical B-tree blocks (16KB). These compress well.
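As a quick arithmetic check of the two factors, the sketch below evaluates them for assumed values of B (elements per block) and R (growth factor between levels). The function names and the chosen values are hypothetical, and the asymptotic factors ignore constants.

```python
import math

def insert_speedup(b):
    """Factor B / log B by which inserts beat a B-tree (asymptotically)."""
    return b / math.log2(b)

def search_slowdown(b, r):
    """Factor log B / log R by which searches trail a B-tree (asymptotically)."""
    return math.log2(b) / math.log2(r)

B = 1024                            # hypothetical: 1024 elements per block
speedup = insert_speedup(B)         # 1024 / 10
slow_r2 = search_slowdown(B, 2)     # growth factor 2, as in the example
slow_r32 = search_slowdown(B, 32)   # a larger growth factor narrows the gap
```

With these assumed values, inserts gain a factor of about 100 while searches lose a factor of 10 at R = 2 and only 2 at R = 32, which shows how the growth factor trades insert speed against search speed.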
Applications 
TokuDB for MySQL, TokuMX for MongoDB: 
Faster indexed insertions. 
Hot schema changes. 
Compression. 
Faster replication on secondaries (TokuMX). 
Lower impact migrations (TokuMX). 
Fast (no read before write) updates (in TokuDB, coming soon in TokuMX). 
ACID transactions. 
Concurrency (TokuMX). 
Benchmarks (figures not reproduced)
Questions? 
Leif Walsh 
leif@tokutek.com 
@leifwalsh 
Downloads: www.tokutek.com/downloads 
Docs: docs.tokutek.com 
Slides: slidesha.re/1tqwORg 

A deep dive into on-disk data structures: B-trees, LSM-trees, and fractal trees, Leif Walsh (Tokutek)

  • 1. Write-optimization in external memory data structures. Leif Walsh, Tokutek, Inc. leif@tokutek.com, @leifwalsh. November 1, 2014.
  • 2. Background
  • 6. Data structures: Provide retrieval of data: Lookup(Key), Pred(Key), Succ(Key). Dynamic data structures let you change the data: Insert(Key, Value), Delete(Key).
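The five operations on this slide can be illustrated with a minimal in-memory sketch. The class name `SortedDict` and its method names are hypothetical, chosen only to mirror the slide; the sketch keeps keys in a sorted Python list via `bisect`, which is not how an external memory structure stores data.

```python
import bisect


class SortedDict:
    """Hypothetical in-memory ordered dictionary (illustration only)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._vals = {}   # key -> value

    def insert(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def delete(self, key):
        if key in self._vals:
            del self._vals[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def lookup(self, key):
        return self._vals.get(key)

    def pred(self, key):
        """Largest stored key strictly less than `key`, or None."""
        i = bisect.bisect_left(self._keys, key)
        return self._keys[i - 1] if i > 0 else None

    def succ(self, key):
        """Smallest stored key strictly greater than `key`, or None."""
        i = bisect.bisect_right(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None
```

Pred and Succ are what distinguish an ordered dictionary from a hash table, and they are the reason the rest of the talk considers search trees.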
  • 9. DAM model [Aggarwal & Vitter ’88]: Problem size N. Memory size M. Transfer data to/from memory in blocks of size B. Efficiency of operations is measured as the number of block transfers (I/Os).
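To make the cost measure concrete, here is a small sketch that counts block transfers for a sequence of element accesses. The function name is hypothetical, and the LRU eviction policy is an assumption; the DAM model itself does not prescribe one.

```python
from collections import OrderedDict


def count_block_transfers(accesses, mem_size, block_size):
    """Count block transfers for a sequence of element indexes.

    Memory holds mem_size // block_size blocks; LRU eviction is an assumption.
    """
    capacity = mem_size // block_size
    cache = OrderedDict()
    transfers = 0
    for i in accesses:
        block = i // block_size
        if block in cache:
            cache.move_to_end(block)        # hit: no I/O
        else:
            transfers += 1                  # miss: fetch the block
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used block
    return transfers


N, M, B = 1000, 100, 10
sequential = list(range(N))
strided = [b * B + r for r in range(B) for b in range(N // B)]
print(count_block_transfers(sequential, M, B))  # 100, i.e. N/B
print(count_block_transfers(strided, M, B))     # 1000, one transfer per access
```

Both access patterns touch the same N elements, but the scan costs N/B transfers while the strided pattern costs N, which is the gap the rest of the talk exploits.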
  • 14. A B-tree (Б-tree?) is an external memory data structure: Balanced search tree. Fanout of B (block size / key size). Internal nodes total less than M (they fit in memory); leaf nodes total more than M. Search: $O(\log_B N)$ I/Os. Insert: $O(\log_B N)$ I/Os.
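As a rough worked example of the $O(\log_B N)$ bound, the sketch below counts the levels a lookup has to read. The function name and the `cached_levels` parameter are hypothetical; the latter models the slide's assumption that the internal nodes fit in memory.

```python
def btree_search_ios(n_keys, fanout, cached_levels=0):
    """Levels a lookup reads from disk: ceil(log_fanout(n_keys)),
    minus any top levels assumed to be cached in memory."""
    height, reach = 0, 1
    while reach < n_keys:
        reach *= fanout
        height += 1
    return max(height - cached_levels, 0)


print(btree_search_ios(10**9, 1000))                   # 3
print(btree_search_ios(10**9, 1000, cached_levels=2))  # 1
```

With a billion keys and a fanout of 1000 the tree has three levels, and if the two internal levels are cached, each search or insert costs about one I/O at the leaf.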
  • 15. [Brodal & Fagerberg ’03]
  • 16. [Brodal & Fagerberg ’03]
  • 17. OLAP
  • 20. Write-optimization technique #1: OLAP. OLAP: Online Analytical Processing. Key idea: Analyze data collected in the past. B-tree inserts are slow, but logging and sorting are fast.
  • 21. Merge sort:
  • 26. Merge sort in external memory: Merge sort cost in DAM model is the cost to scan through all the data once, $N/B$, multiplied by the number of levels in the merge tree, $\log_{M/B} (N/B)$: $O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$.
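The pass structure behind this bound can be sketched in a few lines. This is an in-memory simulation under stated assumptions: the function name is hypothetical, Python lists stand in for on-disk runs, and the merge fan-in is taken as M/B (a real implementation would reserve a block for output).

```python
import heapq


def external_merge_sort(data, mem_size, block_size):
    """Return (sorted_list, passes), where `passes` counts full passes over the data."""
    # Pass 1: form sorted runs of at most mem_size elements.
    runs = [sorted(data[i:i + mem_size]) for i in range(0, len(data), mem_size)]
    passes = 1
    fan_in = max(2, mem_size // block_size)  # one input block per run held in memory
    # Later passes: merge up to fan_in runs at a time until one run remains.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        passes += 1
    return (runs[0] if runs else []), passes


result, passes = external_merge_sort(list(range(999, -1, -1)), mem_size=100, block_size=10)
print(passes)  # 2
```

Each pass reads and writes every block once, so the total I/O is roughly the number of passes times N/B.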
  • 31. Insert N elements into a B-tree (assuming 100-1000 byte elements): $O\left(N \log_B \frac{N}{M}\right)$, about $N$ I/Os, i.e. 10-100 kB/s = 100 elements/s. Merge sort: $O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$, about $\frac{2N}{B}$ I/Os, i.e. 50 MB/s = 50k-500k elements/s. Typically, M/B is large, so only two passes are needed to sort. Intuition: Each insert into a B-tree costs 1 seek, while sorting is close to disk bandwidth.
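The throughput figures on this slide follow from simple arithmetic, reproduced below. The 100 seeks per second figure is an assumption implied by the slide's "100 elements/s" at one seek per insert; the 50 MB/s bandwidth is taken from the slide. Function names are hypothetical.

```python
SEEKS_PER_SECOND = 100               # assumption: roughly 10 ms per random seek
BANDWIDTH_BYTES_PER_S = 50_000_000   # 50 MB/s sequential bandwidth, from the slide


def btree_rates(element_bytes):
    """(elements/s, bytes/s) when every insert costs one seek."""
    elements_per_s = SEEKS_PER_SECOND
    return elements_per_s, elements_per_s * element_bytes


def sort_rate(element_bytes):
    """Elements/s when sorting runs at sequential disk bandwidth."""
    return BANDWIDTH_BYTES_PER_S // element_bytes


for size in (100, 1000):
    print(size, btree_rates(size), sort_rate(size))
```

For 100-byte elements this gives 10 kB/s for the B-tree against 500k elements/s for sorting; for 1000-byte elements, 100 kB/s against 50k elements/s.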
  • 37. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Write-optimization technique #1: OLAP So, how does OLAP work? Log new data unindexed until you accumulate a lot of it (10% of the data set). Sort the new data. Use a merge pass through existing indexes to incorporate new data. Use indexes to do analytics. Moral: OLAP techniques can handle high insertion volume, but query results are delayed. Leif Walsh (Tokutek) Fractal Trees November 1, 2014 13 / 31
  • 38. Write-optimization technique #1: OLAP. So, how does OLAP work? (1) Log new data unindexed until you accumulate a lot of it (10% of the data set). (2) Sort the new data. (3) Use a merge pass through existing indexes to incorporate new data. (4) Use indexes to do analytics. Moral: OLAP techniques can handle high insertion volume, but query results are delayed.
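The log, sort, merge cycle described on this slide can be sketched in memory as follows. This is a toy illustration, not how any particular OLAP system is implemented: the class name `BatchLoadedIndex`, the `min_batch` parameter and the use of a Python list as the "index" are all assumptions, and keys are assumed to be unique.

```python
import bisect
import heapq


class BatchLoadedIndex:
    """Toy OLAP-style loader: unindexed log plus a sorted, queryable index."""

    def __init__(self, batch_fraction=0.10, min_batch=4):
        self.index = []  # sorted list of (key, value); the only thing queries see
        self.log = []    # new data, appended unindexed
        self.batch_fraction = batch_fraction
        self.min_batch = min_batch

    def insert(self, key, value):
        self.log.append((key, value))
        threshold = max(self.min_batch, int(self.batch_fraction * len(self.index)))
        if len(self.log) >= threshold:
            self.merge()

    def merge(self):
        self.log.sort()                                        # sort the new data
        self.index = list(heapq.merge(self.index, self.log))   # one merge pass
        self.log = []

    def lookup(self, key):
        # Queries only consult the index, so logged data is invisible until merged.
        i = bisect.bisect_left(self.index, (key,))
        if i < len(self.index) and self.index[i][0] == key:
            return self.index[i][1]
        return None
```

A lookup for a freshly inserted key returns `None` until enough data has accumulated to trigger the merge, which is the query delay the slide's moral refers to.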
  • 39. Write-optimization in external memory data structures: LSM-trees
  • 43. Write-optimization technique #2: LSM-trees. The insight for LSM-trees starts by asking: how can we reduce the queryability delay in OLAP? The buffer is small, so let’s index it! Inserts go into the “buffer B-tree”. When the buffer gets full, we merge it with the “main B-tree”. Queries have to touch both trees and merge results, but results are available immediately. (This specific technique, which is not yet an LSM-tree, is used in InnoDB and is called the “change buffer”.)
  • 47. Write-optimization technique #2: LSM-trees. Why is this fast? The buffer is in-memory, so inserts are fast. When we merge, we put many new elements in each leaf in the main B-tree (this amortizes the I/O cost to read the leaf). Eventually, we reach a problem: if the buffer gets too big, inserts get slow. If the buffer stays too small, the merge gets inefficient because each leaf node receives only a few elements (back to $O(N \log_B N)$).
  • 53. Write-optimization technique #2: LSM-trees. How can we fix this? More buffering! Each level is twice as large as the previous level, for some value of 2 (usually 10). We’ll use 2.
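A minimal in-memory sketch of the level structure with growth factor 2 follows. It is a simplification rather than the design of any real LSM-tree: each level is either empty or one sorted run of exactly 2^i pairs (a "binary counter" layout), runs are Python lists rather than on-disk B-trees, keys are assumed unique, and the names `ToyLSM` and `element_writes` are invented here.

```python
import bisect
import heapq


class ToyLSM:
    """Levels of doubling size; a full level is merged into the next one down."""

    def __init__(self):
        self.levels = []          # levels[i] is [] or a sorted run of 2**i pairs
        self.element_writes = 0   # elements rewritten by merges

    def insert(self, key, value):
        carry = [(key, value)]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append([])
            if not self.levels[i]:
                self.levels[i] = carry   # free slot: the run settles here
                return
            # Level i is full: merge it with the incoming run and push it down.
            carry = list(heapq.merge(self.levels[i], carry))
            self.element_writes += len(carry)
            self.levels[i] = []
            i += 1

    def lookup(self, key):
        # A query has to search every level, smallest first.
        for run in self.levels:
            j = bisect.bisect_left(run, (key,))
            if j < len(run) and run[j][0] == key:
                return run[j][1]
        return None
```

After inserting N = 16 elements, every element has been rewritten by at most lg N = 4 merges, so `element_writes` is at most 64; each of those merges is a sequential pass, which is where the write advantage comes from.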
  • 59. Write-optimization technique #2: LSM-trees. How do queries work? Search cost is: $\log_B B + \cdots + \log_B \frac{N}{8} + \log_B \frac{N}{4} + \log_B \frac{N}{2} + \log_B N = \frac{1}{\lg B}\left(\lg B + \cdots + (\lg N - 3) + (\lg N - 2) + (\lg N - 1) + \lg N\right) = O(\log N \log_B N)$.
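The last step of the search-cost bound can be spelled out as an arithmetic series. The derivation below assumes, as on the slide, that level sizes double from B up to N, so the level of size $2^j$ costs $\log_B 2^j = j / \lg B$ to search.

```latex
\[
\sum_{j=\lg B}^{\lg N} \log_B 2^{j}
  = \frac{1}{\lg B} \sum_{j=\lg B}^{\lg N} j
  = \frac{1}{\lg B}\left(\frac{\lg N\,(\lg N + 1)}{2} - \frac{(\lg B - 1)\,\lg B}{2}\right)
  \le \frac{\lg N\,(\lg N + 1)}{2 \lg B}
  = O(\log N \cdot \log_B N).
\]
```

In words: there are about $\lg N$ levels and each costs at most $\log_B N$, so the product is the bound.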
  • 64. Write-optimization technique #2: LSM-trees. How much do inserts cost? Cost to flush a tree $T_j$ of size X is $O(X/B)$. Cost per element to flush $T_j$ is $O(1/B)$. Each element moves log N times (once per level). Total amortized insert cost per element is $O\left(\frac{\log N}{B}\right)$.
  • 65. Write-optimization in external memory data structures: Fractal Trees
  • 68. Write-optimization technique #3: Fractal Trees. The pain in LSM-trees is doing a full $O(\log_B N)$ search in each level. We use fractional cascading to reduce the search per level to O(1). The idea is that once we’ve searched $T_i$, we know where the key would be in $T_i$, and we can use that information to guide our search of $T_{i+1}$.
  • 69. Write-optimization technique #3: Fractal Trees. Add forwarding pointers from leaves in $T_i$ to leaves in $T_{i+1}$ (but remove the redundant ones that point to the same leaf).
  • 70. Write-optimization technique #3: Fractal Trees. For leaves in $T_{i+1}$ that no forwarding pointer reaches, add ghost pointers to them in the leaves of $T_i$.
  • 73. Write-optimization technique #3: Fractal Trees. Now, after searching $T_i$ for a missing element c, we look left and right for forwarding or ghost pointers, and follow them down to look at O(1) leaves in $T_{i+1}$. This way, search is only $O(\log_R N)$ (in our example, R = 2). The internal node structure in each level is now redundant, so we can represent each level as an array. This is called a Cache-Oblivious Lookahead Array [Bender, Farach-Colton, Fineman, Fogel, Kuszmaul, Nelson ’07].
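The effect of the pointers can be illustrated with textbook fractional cascading over plain sorted arrays. This sketch is not the layout from the cited paper: it is static (built once, no inserts), it copies every other key of the next level upward to act as the pointers, and the names `build`, `lookup`, `real` and `bridge` are invented here. Only the top level gets a full binary search; every later level is resolved with a constant number of comparisons.

```python
import bisect


def build(levels):
    """levels: sorted key lists, smallest level first. Returns the cascaded arrays."""
    n = len(levels)
    keys = [None] * n    # keys[i]: level i's own keys plus keys sampled from below
    real = [None] * n    # real[i][p]: True if keys[i][p] is stored in level i
    bridge = [None] * n  # bridge[i][p]: index in keys[i+1] of first key >= keys[i][p]
    below = []
    for i in range(n - 1, -1, -1):
        own = set(levels[i])
        merged = sorted(own | set(below[::2]))  # every other key acts as a pointer
        keys[i] = merged
        real[i] = [k in own for k in merged]
        # Extra final slot covers queries larger than every key on this level.
        bridge[i] = [bisect.bisect_left(below, k) for k in merged] + [len(below)]
        below = merged
    return keys, real, bridge


def lookup(structure, x):
    """Return the index of the first level that stores x, or None."""
    keys, real, bridge = structure
    p = bisect.bisect_left(keys[0], x)  # the only full binary search
    for i in range(len(keys)):
        if p < len(keys[i]) and keys[i][p] == x and real[i][p]:
            return i
        if i + 1 < len(keys):
            q = bridge[i][p]
            # Sampling every other key means x sits at q or q - 1 in the next level.
            if q > 0 and keys[i + 1][q - 1] >= x:
                q -= 1
            p = q
    return None
```

For example, with levels `[[5], [2, 9], [1, 4, 7, 10], [0, 3, 6, 8, 11, 12, 13, 14]]`, looking up 8 does one binary search in the four-entry top array and then one or two comparisons per level on the way down.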
  • 77. Write-optimization technique #3: Fractal Trees. Though the amortized analysis says our inserts are fast, when we flush a very large level to the next one, we might see a big stall. Concurrent merge algorithms exist, but we can do better. We break each level’s array into chunks that can be flushed independently. Each chunk flushes to a localized region of a few chunks in the next level down, found using its forwarding pointers. Now we have a tree again! As it turns out, this structure makes it easier to manage an LRU-style cache of blocks and is more flexible in the face of “hotspot” workloads.
  • 78. Results. Modified B-tree-like dynamic (inserts, updates, deletes) data structure that supports point and range queries. Inserts are a factor $B/\log B$ (typically 10-100x in practice) faster than a B-tree: $O\left(\frac{\log N}{B}\right)$ versus $O\left(\frac{\log N}{\log B}\right)$. Searches are a factor $\log B/\log R$ slower than a B-tree: $O\left(\frac{\log N}{\log R}\right)$ versus $O\left(\frac{\log N}{\log B}\right)$. To amortize flush costs over many elements, we want each block we write to be large (4MB), much larger than typical B-tree blocks (16KB). These compress well.
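The two ratios on this slide are easy to evaluate for sample parameters. The sketch below only plugs numbers into the asymptotic expressions with all constant factors dropped, so the results are indicative at best; B is taken to be the number of elements per block, and the sample values 1024, 2 and 32 are arbitrary choices, not figures from the talk.

```python
import math


def insert_speedup(block_elems):
    """Ratio (log N / log B) / (log N / B) = B / log B."""
    return block_elems / math.log2(block_elems)


def search_slowdown(block_elems, growth_factor):
    """Ratio (log N / log R) / (log N / log B) = log B / log R."""
    return math.log2(block_elems) / math.log2(growth_factor)


print(insert_speedup(1024))        # inserts, B = 1024
print(search_slowdown(1024, 2))    # searches, R = 2
print(search_slowdown(1024, 32))   # searches, R = 32
```

Raising the growth factor R narrows the search gap, which shows the trade-off between insert and search cost that R controls.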
  • 80. Applications. TokuDB for MySQL, TokuMX for MongoDB: faster indexed insertions; hot schema changes; compression; faster replication on secondaries (TokuMX); lower impact migrations (TokuMX); fast (no read before write) updates (in TokuDB, coming soon in TokuMX); ACID transactions; concurrency (TokuMX).
  • 81-83. Benchmarks (three slides; the figures are not reproduced in this transcript).
  • 84. Questions? Leif Walsh, leif@tokutek.com, @leifwalsh. Downloads: www.tokutek.com/downloads. Docs: docs.tokutek.com. Slides: slidesha.re/1tqwORg