Optimizer Histograms
When they Help and When Do Not?
February 1, 2019
Sveta Smirnova
• MySQL Support engineer
• Author of
  • MySQL Troubleshooting
  • JSON UDF functions
  • FILTER clause for MySQL
• Speaker
  • Percona Live, OOW, FOSDEM, DevConf, HighLoad...
• Why do I Care?
• The Use Case
• Even Worse Use Case
• Why the Difference?
• How Histograms Work?
Table of Contents
The column_statistics data dictionary table stores histogram statistics about
column values, for use by the optimizer in constructing query execution plans.
MySQL User Reference Manual
Optimizer Statistics aka Histograms
Why do I Care?
• Data distribution varies
  • Big difference between number of values
  • Constantly changing
• Cardinality is not correct
  • Was not updated in time
  • Updates too often
  • Calculated wrongly
• Index maintenance costs a lot
  • Hardware resources
  • Slow updates
  • Window to run CREATE INDEX
• Optimizer does not work as we wish
  • Examples in my talk @Percona Live
Latest Support Tickets
• Topic based on real Support cases
  • Couple of them are still in progress
• All examples are 100% fake
  • They are created such that
    • No customer can be identified
    • Everything is generated: table names, column names, data
    • Use case itself is fictional
• All examples are simplified
  • Only columns required to show the issue
  • Everything extra removed
  • Real tables usually store much more data
• All disasters happened with version 5.7
Disclaimer
The Use Case
• categories
  • Less than 20 rows
• goods
  • More than 1M rows
  • 20 unique cat_id values
  • Many other fields
    • Price
    • Date: added, last updated, etc.
    • Characteristics
    • Store
    • ...
Two tables
select *
from
goods
join
categories
on
(categories.id=goods.cat_id)
where
date_added between '2018-07-01' and '2018-08-01'
and
cat_id in (16,11)
and
price >= 1000 and price <= 10000 [ and ... ]
[ GROUP BY ... [ORDER BY ... [ LIMIT ...]]]
;
JOIN
• Select from the Small Table
• For each cat_id select from the large table
• Filter result on date_added[ and price[...]]
• Slow with many items in the category
Option 1: Select from the Small Table First
• Filter rows by date_added[ and price[...]]
• Get cat_id values
• Retrieve rows from the small table
• Slow if the number of rows filtered by date_added is larger than the number of goods in the selected categories
Option 2: Select from the Large Table First
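The trade-off between the two options can be sketched with a toy cost model. This is an illustration only and not how the MySQL optimizer computes cost: the function names and all row counts are hypothetical, and "cost" here is simply the number of rows read from goods.

```python
# Toy model of the two join orders; every number below is hypothetical.

def rows_small_table_first(rows_per_category, selected_categories):
    """Option 1: for each selected category, read every goods row in it,
    then filter those rows on date_added (and price)."""
    return sum(rows_per_category[c] for c in selected_categories)


def rows_large_table_first(rows_in_date_range):
    """Option 2: read the goods rows in the date range, then look up
    the category of each row in the small table."""
    return rows_in_date_range


rows_per_category = {11: 400_000, 16: 300_000}  # two large categories

small_first = rows_small_table_first(rows_per_category, [11, 16])
# A narrow date range favors starting from the large table ...
narrow = rows_large_table_first(rows_in_date_range=5_000)
# ... a wide one favors starting from the small table.
wide = rows_large_table_first(rows_in_date_range=900_000)

print(small_first, narrow, wide)
```

The optimizer has to pick one of these orders from estimates alone, which is why the accuracy of those estimates matters.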
• CREATE INDEX index_everything ON goods
  (cat_id, date_added[, price[, ...]])
• It resolves the issue
• But not in all cases
What if We Use Combined Indexes?
• Maintenance cost
  • Slower INSERT/UPDATE/DELETE
  • Disk space
• Index not useful for selecting rows
JOIN categories ON (categories.id=goods.cat_id)
JOIN shops ON (shops.id=goods.shop_id)
[ JOIN ... ]
WHERE
date_added between '2018-07-01' and '2018-08-01'
AND
cat_id in (16,11) AND price >= 1000 AND price <=10000 [ AND ... ]
GROUP BY product_type
ORDER BY date_updated DESC
LIMIT 50,100
• Tables may have wrong cardinality
The Problem
• EXPLAIN without histograms
mysql> explain select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01' -- Large range
-> order by goods.cat_id
-> limit 10\G -- We ask for 10 rows only!
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: categories -- Small table first
partitions: NULL
type: index
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
rows: 20
filtered: 70.00
Extra: Using where; Using index; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: goods -- Large table
partitions: NULL
type: ref
possible_keys: cat_id_2
key: cat_id_2
key_len: 5
ref: orig.categories.id
rows: 51827
filtered: 11.11 -- Default value
Extra: Using where
2 rows in set, 1 warning (0.01 sec)
Example
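As a reading aid, the rows and filtered columns of the EXPLAIN output without histograms can be combined into a rough estimate of how many rows the plan expects to produce. The helper name is made up for this sketch; the numbers are copied from that output.

```python
def estimated_rows(rows, filtered_percent):
    """Rows expected to pass the WHERE conditions on one table."""
    return rows * filtered_percent / 100.0


# Row 1: categories, rows: 20, filtered: 70.00
categories_out = estimated_rows(20, 70.00)

# Row 2: goods, rows: 51827 per category lookup, filtered: 11.11
goods_out_per_category = estimated_rows(51827, 11.11)

# Expected result size of the join before ORDER BY and LIMIT are applied.
total = categories_out * goods_out_per_category

print(round(categories_out), round(goods_out_per_category), round(total))
```

The 11.11 used for goods is the value marked as a default in the slide, so this total rests on a guess about date_added rather than on the data itself.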
• Execution time without histograms
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10;
ab9f9bb7bc4f357712ec34f067eda364 -
10 rows in set (56.47 sec)
• Engine statistics without histograms
mysql> show status like 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
...
| Handler_read_next | 964718 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 10 |
| Handler_read_rnd_next | 951671 |
...
| Handler_write | 951670 |
+----------------------------+--------+
18 rows in set (0.01 sec)
• Now let's add the histogram
mysql> analyze table goods update histogram on date_added;
+------------+-----------+----------+-------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+------------+-----------+----------+-------------------------------------------------------+
| orig.goods | histogram | status | Histogram statistics created for column 'date_added'. |
+------------+-----------+----------+-------------------------------------------------------+
1 row in set (2.01 sec)
• EXPLAIN with the histogram
mysql> explain select goods.* from goods
-> join categories
-> on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: goods -- Large table first
partitions: NULL
type: index
possible_keys: cat_id_2
key: cat_id_2
key_len: 5
ref: NULL
rows: 10 -- Same as we asked
filtered: 98.70 -- True numbers
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: categories -- Small table
partitions: NULL
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: orig.goods.cat_id
rows: 1
filtered: 100.00
Extra: Using index
2 rows in set, 1 warning (0.01 sec)
• Execution time with the histogram
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> select goods.* from goods
-> join categories on (categories.id=goods.cat_id)
-> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17)
-> and
-> date_added between '2000-01-01' and '2001-01-01'
-> order by goods.cat_id
-> limit 10;
eeb005fae0dd3441c5c380e1d87fee84 -
10 rows in set (0.00 sec) -- 56/0 times faster!
• Engine statistics with the histogram
mysql> show status like 'Handler%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 4 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 3 |
| Handler_read_last | 0 |
| Handler_read_next | 9 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
| Handler_savepoint | 0 |
| Handler_savepoint_rollback | 0 |
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
18 rows in set (0.00 sec)
Even Worse Use Case
• goods_characteristics
CREATE TABLE `goods_characteristics` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`good_id` varchar(30) DEFAULT NULL,
`size` int(11) DEFAULT NULL,
`manufacturer` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `good_id` (`good_id`,`size`,`manufacturer`),
KEY `size` (`size`,`manufacturer`)
) ENGINE=InnoDB AUTO_INCREMENT=196606 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
Two Similar Tables
• goods_shops
CREATE TABLE `goods_shops` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`good_id` varchar(30) DEFAULT NULL,
`location` varchar(30) DEFAULT NULL,
`delivery_options` varchar(30) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `good_id` (`good_id`,`location`,`delivery_options`),
KEY `location` (`location`,`delivery_options`)
) ENGINE=InnoDB AUTO_INCREMENT=131071 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
• Size
mysql> select count(*) from goods_characteristics;
+----------+
| count(*) |
+----------+
| 131072 |
+----------+
1 row in set (0.08 sec)
mysql> select count(*) from goods_shops;
+----------+
| count(*) |
+----------+
| 65536 |
+----------+
1 row in set (0.04 sec)
• Data Distribution: goods_characteristics
mysql> select count(*) num_rows, good_id, size
-> from goods_characteristics group by good_id, size;
+----------+---------+------+
| num_rows | good_id | size |
+----------+---------+------+
| 65536 | laptop | 7 |
| 8187 | laptop | 8 |
| 8190 | laptop | 9 |
| 8188 | laptop | 10 |
| 8192 | laptop | 11 |
| 8189 | laptop | 12 |
| 8189 | laptop | 13 |
| 8191 | laptop | 14 |
| 8190 | laptop | 15 |
| 10 | laptop | 16 |
| 10 | laptop | 17 |
+----------+---------+------+
• Data Distribution: goods_characteristics
mysql> select count(*) num_rows, good_id, manufacturer
-> from goods_characteristics group by good_id, manufacturer order by num_rows desc;
+----------+---------+--------------+
| num_rows | good_id | manufacturer |
+----------+---------+--------------+
| 65536 | laptop | Noname |
| 8191 | laptop | Samsung |
| 8191 | laptop | Acer |
| 8189 | laptop | Dell |
| 8189 | laptop | HP |
| 8189 | laptop | Lenovo |
| 8189 | laptop | Toshiba |
| 8189 | laptop | Apple |
| 8189 | laptop | Asus |
| 10 | laptop | Sony |
| 10 | laptop | Casper |
+----------+---------+--------------+
• Data Distribution: goods_shops
mysql> select count(*) num_rows, good_id, location
-> from goods_shops group by good_id, location order by num_rows desc;
+----------+---------+---------------+
| num_rows | good_id | location |
+----------+---------+---------------+
| 8191 | laptop | New York |
| 8191 | laptop | San Francisco |
| 8189 | laptop | Paris |
| 8189 | laptop | Berlin |
| 8189 | laptop | Brussels |
| 8189 | laptop | Tokio |
| 8189 | laptop | Istanbul |
| 8189 | laptop | London |
| 10 | laptop | Moscow |
| 10 | laptop | Kiev |
+----------+---------+---------------+
• Data Distribution: goods_shops
mysql> select count(*) num_rows, good_id, delivery_options
-> from goods_shops group by good_id, delivery_options order by num_rows desc;
+----------+---------+------------------+
| num_rows | good_id | delivery_options |
+----------+---------+------------------+
| 8192 | laptop | DHL |
| 8191 | laptop | PTT |
| 8190 | laptop | Normal Post |
| 8190 | laptop | Tracked |
| 8189 | laptop | Fedex |
| 8189 | laptop | Gruzovichkof |
| 8188 | laptop | Courier |
| 8187 | laptop | No delivery |
| 10 | laptop | Premium |
| 10 | laptop | Urgent |
+----------+---------+------------------+
Histogram statistics are useful primarily for nonindexed columns. Adding an
index to a column for which histogram statistics are applicable might also help
the optimizer make row estimates. The tradeoffs are:
• An index must be updated when table data is modified.
• A histogram is created or updated only on demand, so it adds no overhead
when table data is modified. On the other hand, the statistics become
progressively more out of date when table modifications occur, until the next
time they are updated.
MySQL User Reference Manual
mysql> alter table goods_characteristics stats_sample_pages=5000;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table goods_shops stats_sample_pages=5000;
Query OK, 0 rows affected (0.05 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> analyze table goods_characteristics, goods_shops;
+----------------------------+---------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+----------------------------+---------+----------+----------+
| test.goods_characteristics | analyze | status | OK |
| test.goods_shops | analyze | status | OK |
+----------------------------+---------+----------+----------+
2 rows in set (0.35 sec)
Index Statistics Are More than Good
• The query
mysql> select count(*) from goods_shops join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
^C^C -- query aborted
ERROR 1317 (70100): Query execution was interrupted
Performance?
• Handlers
mysql> show status like 'Handler%';
+----------------------------+-------------+
| Variable_name | Value |
+----------------------------+-------------+
| Handler_commit | 0 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_external_lock | 4 |
| Handler_mrr_init | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 1 |
| Handler_read_key | 13043 |
| Handler_read_last | 0 |
| Handler_read_next | 854,767,916 |
...
• Table order
mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| id | table | type | key | rows | filtered | Extra |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| 1 | goods_characteristics | index | good_id | 131072 | 25.00 | Using where; Using index |
| 1 | goods_shops | ref | good_id | 65536 | 36.00 | Using where; Using index |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
• Table order matters
mysql> explain select count(*) from goods_shops straight_join goods_characteristics
-> using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| id | table | type | key | rows | filtered | Extra |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| 1 | goods_shops | index | good_id | 65536 | 36.00 | Using where; Using index |
| 1 | goods_characteristics | ref | good_id | 131072 | 25.00 | Using where; Using index |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
• Table order matters
mysql> select count(*) from goods_shops straight_join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----------+
| count(*) |
+----------+
| 816640 |
+----------+
1 row in set (2.11 sec)
mysql> show status like 'Handler_read_next';
+-------------------+-----------+
| Variable_name | Value |
+-------------------+-----------+
| Handler_read_next | 5,308,416 |
+-------------------+-----------+
1 row in set (0.00 sec)
mysql> analyze table goods_shops update histogram on location, delivery_options;
+-------------+-----------+----------+-----------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------------+-----------+----------+-----------------------------------------------------+
| goods_shops | histogram | status | Histogram statistics created... 'delivery_options'. |
| goods_shops | histogram | status | Histogram statistics created for column 'location'. |
+-------------+-----------+----------+-----------------------------------------------------+
2 rows in set (0.18 sec)
mysql> analyze table goods_characteristics update histogram on size, manufacturer ;
+-----------------------+-----------+----------+-------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-----------------------+-----------+----------+-------------------------------------------------+
| goods_characteristics | histogram | status | Histogram statistics created... 'manufacturer'. |
| goods_characteristics | histogram | status | Histogram statistics created for column 'size'. |
+-----------------------+-----------+----------+-------------------------------------------------+
2 rows in set (0.23 sec)
Histograms to Rescue
• The query
mysql> select count(*) from goods_shops join goods_characteristics using (good_id)
-> where size < 12 and manufacturer in ('Lenovo', 'Dell', 'Toshiba', 'Samsung', 'Acer')
-> and (location in ('Moscow', 'Kiev') or delivery_options in ('Premium', 'Urgent'));
+----------+
| count(*) |
+----------+
| 816640 |
+----------+
1 row in set (2.16 sec)
mysql> show status like 'Handler_read_next';
+-------------------+-----------+
| Variable_name | Value |
+-------------------+-----------+
| Handler_read_next | 5,308,418 |
+-------------------+-----------+
1 row in set (0.00 sec)
• Filtering effect
mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id) where s
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| id | table | type | key | rows | filtered | Extra |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
| 1 | goods_shops | index | good_id | 65536 | 0.06 | Using where; Using index |
| 1 | goods_characteristics | ref | good_id | 131072 | 15.63 | Using where; Using index |
+----+-----------------------+-------+---------+--------+----------+--------------------------+
2 rows in set, 1 warning (0.00 sec)
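The change in filtered for goods_shops, from 36.00 to 0.06, can be reproduced with a small calculation. This is a sketch under two assumptions that the slides do not state: that without a histogram each value in an IN list is guessed at 10% of the table, and that the two sides of the OR are treated as independent. The function names are made up for this illustration.

```python
def in_list(per_value_selectivities):
    """Selectivity of col IN (...): sum of per-value selectivities, capped at 1."""
    return min(1.0, sum(per_value_selectivities))


def either(p, q):
    """Selectivity of (A OR B), assuming A and B are independent."""
    return p + q - p * q


TOTAL_ROWS = 65536  # rows in goods_shops

# Without histograms: a fixed guess per listed value (assumption: 10%).
GUESS = 0.10
without_hist = either(in_list([GUESS, GUESS]), in_list([GUESS, GUESS]))

# With histograms: real frequencies. Moscow, Kiev, Premium and Urgent
# each occur in 10 rows according to the data distribution slides.
freq = 10 / TOTAL_ROWS
with_hist = either(in_list([freq, freq]), in_list([freq, freq]))

print(f"{without_hist * 100:.2f} {with_hist * 100:.2f}")
```

Under these assumptions the two results agree with the filtered values in the EXPLAIN outputs before and after the histograms were created.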
Why the Difference?
[Chart "Indexes: Number of Items with Same Value": x-axis 1 to 10, y-axis 0 to 800]
[Chart "Indexes: Cardinality": x-axis 1 to 10, y-axis 0 to 800]
[Chart "Histograms: Number of Values in Each Bucket": x-axis 1 to 10, y-axis 0 to 800]
[Chart "Histograms: Data in the Histogram": x-axis 1 to 10, y-axis 0 to 1]
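The difference between the two kinds of statistics can be shown with the manufacturer counts from the goods_characteristics table. This is a simplified sketch: it treats index statistics as a single average (rows divided by cardinality) and a histogram as one exact frequency per value, while real index statistics are sampled and real histograms may group values into buckets.

```python
# Row counts per manufacturer, copied from the goods_characteristics slide.
counts = {
    "Noname": 65536, "Samsung": 8191, "Acer": 8191, "Dell": 8189,
    "HP": 8189, "Lenovo": 8189, "Toshiba": 8189, "Apple": 8189,
    "Asus": 8189, "Sony": 10, "Casper": 10,
}

total_rows = sum(counts.values())
cardinality = len(counts)  # number of distinct values

# Index-style estimate: the same average for every value.
rows_per_value = total_rows / cardinality

# Histogram-style estimate: a separate frequency for every value.
frequency = {name: n / total_rows for name, n in counts.items()}

for name in ("Noname", "Dell", "Sony"):
    print(name, round(rows_per_value), round(frequency[name] * total_rows))
```

With the average, a search for Sony and a search for Noname look equally expensive, although one matches 10 rows and the other half of the table.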
How Histograms Work?
↓ sql/sql_planner.cc
↓ calculate_condition_filter
↓ Item_func_*::get_filtering_effect
• get_histogram_selectivity
• Seen as a percent of filtered rows in EXPLAIN
Low Level
• Example data
mysql> create table example(f1 int) engine=innodb;
mysql> insert into example values(1),(1),(1),(2),(3);
mysql> select f1, count(f1) from example group by f1;
+------+-----------+
| f1 | count(f1) |
+------+-----------+
| 1 | 3 |
| 2 | 1 |
| 3 | 1 |
+------+-----------+
3 rows in set (0.00 sec)
Filtered Rows
• Without a histogram
mysql> explain select * from example where f1 > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• Without a histogram
mysql> explain select * from example where f1 > 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• Without a histogram
mysql> explain select * from example where f1 > 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• Without a histogram
mysql> explain select * from example where f1 > 3\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 33.33
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• With the histogram
mysql> analyze table example update histogram on f1 with 3 buckets;
+-----------------+-----------+----------+-----------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-----------------+-----------+----------+-----------------------------------------------+
| hist_ex.example | histogram | status | Histogram statistics created for column 'f1'. |
+-----------------+-----------+----------+-----------------------------------------------+
1 row in set (0.03 sec)
• With the histogram
mysql> select * from information_schema.column_statistics
-> where table_name='example'\G
*************************** 1. row ***************************
SCHEMA_NAME: hist_ex
TABLE_NAME: example
COLUMN_NAME: f1
HISTOGRAM: {
"buckets": [[1, 0.6], [2, 0.8], [3, 1.0]],
"data-type": "int", "null-values": 0.0, "collation-id": 8,
"last-updated": "2018-11-07 09:07:19.791470",
"sampling-rate": 1.0, "histogram-type": "singleton",
"number-of-buckets-specified": 3
}
1 row in set (0.00 sec)
• With the histogram
mysql> explain select * from example where f1 > 0\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 100.00 -- all rows
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• With the histogram
mysql> explain select * from example where f1 > 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 40.00 -- 2 rows
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• With the histogram
mysql> explain select * from example where f1 > 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 20.00 -- one row
Extra: Using where
1 row in set, 1 warning (0.00 sec)
• With the histogram
mysql> explain select * from example where f1 > 3\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: example
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 5
filtered: 20.00 -- one row
Extra: Using where
1 row in set, 1 warning (0.00 sec)
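The filtered values for f1 > 0, f1 > 1 and f1 > 2 can be approximated directly from the buckets stored in column_statistics. The following is a minimal sketch, assuming a singleton histogram whose buckets are [value, cumulative frequency] pairs and a column without NULL values. The function name is made up for this illustration and is not the server's code.

```python
import bisect

# Buckets from column_statistics: [value, cumulative frequency].
BUCKETS = [[1, 0.6], [2, 0.8], [3, 1.0]]


def selectivity_greater_than(buckets, x):
    """Fraction of rows with value > x, read from a singleton histogram."""
    values = [value for value, _ in buckets]
    # Number of bucket values that are <= x.
    i = bisect.bisect_right(values, x)
    cumulative_le_x = buckets[i - 1][1] if i > 0 else 0.0
    return 1.0 - cumulative_le_x


for x in (0, 1, 2, 3):
    print(x, round(selectivity_greater_than(BUCKETS, x) * 100, 2))
```

For f1 > 0, f1 > 1 and f1 > 2 this gives 100, 40 and 20 percent, in line with the EXPLAIN output. For f1 > 3 the sketch gives 0, while EXPLAIN shows 20.00; the sketch does not try to model how the server treats that case.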
• CREATE INDEX
  • Metadata lock
  • Can be blocked by any query
• UPDATE HISTOGRAM
  • Backup lock
  • Can be blocked only by a backup
  • Can be created any time without fear
Locking
• Helps if query plan can be changed
• Not a replacement for the index:
  • GROUP BY
  • ORDER BY
  • Query on a single table ∗
Outcome
• Data distribution is uniform
• Range optimization can be used
• Full table scan is fast
When Are Histograms Not Helpful?
• Index statistics are collected by the engine
• Optimizer calculates cardinality each time it accesses statistics
• Indexes do not always improve performance
• Histograms can help
  • Still a new feature
• Histograms do not replace other optimizations!
Conclusion
MySQL User Reference Manual
Blog by Erik Froseth
Blog by Frederic Descamps
Talk by Oystein Grovlen @Fosdem
Talk by Sergei Petrunia @PerconaLive
WL #8707
More information
www.slideshare.net/SvetaSmirnova
twitter.com/svetsmirnova
github.com/svetasmirnova
Thank you!
35
DATABASE PERFORMANCE
MATTERS

Mais conteúdo relacionado

Mais procurados

Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Execution
webhostingguy
 
Highload Perf Tuning
Highload Perf TuningHighload Perf Tuning
Highload Perf Tuning
HighLoad2009
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
Jonathan Levin
 
Explaining the MySQL Explain
Explaining the MySQL ExplainExplaining the MySQL Explain
Explaining the MySQL Explain
MYXPLAIN
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
Sveta Smirnova
 

Mais procurados (20)

Preparse Query Rewrite Plugins
Preparse Query Rewrite PluginsPreparse Query Rewrite Plugins
Preparse Query Rewrite Plugins
 
0888 learning-mysql
0888 learning-mysql0888 learning-mysql
0888 learning-mysql
 
Introduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]sIntroduction to MySQL Query Tuning for Dev[Op]s
Introduction to MySQL Query Tuning for Dev[Op]s
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query Tuning
 
Performance Schema in Action: demo
Performance Schema in Action: demoPerformance Schema in Action: demo
Performance Schema in Action: demo
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101
 
Understanding Query Execution
Understanding Query ExecutionUnderstanding Query Execution
Understanding Query Execution
 
Troubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-onsTroubleshooting MySQL Performance add-ons
Troubleshooting MySQL Performance add-ons
 
Why Use EXPLAIN FORMAT=JSON?
 Why Use EXPLAIN FORMAT=JSON?  Why Use EXPLAIN FORMAT=JSON?
Why Use EXPLAIN FORMAT=JSON?
 
Highload Perf Tuning
Highload Perf TuningHighload Perf Tuning
Highload Perf Tuning
 
Optimizer overviewoow2014
Optimizer overviewoow2014Optimizer overviewoow2014
Optimizer overviewoow2014
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQL
 
MySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZEMySQL 8.0 EXPLAIN ANALYZE
MySQL 8.0 EXPLAIN ANALYZE
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
 
Explaining the MySQL Explain
Explaining the MySQL ExplainExplaining the MySQL Explain
Explaining the MySQL Explain
 
Oracle Database Advanced Querying
Oracle Database Advanced QueryingOracle Database Advanced Querying
Oracle Database Advanced Querying
 
New features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in actionNew features in Performance Schema 5.7 in action
New features in Performance Schema 5.7 in action
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
 

Semelhante a Optimizer Histograms: When they Help and When Do Not?

10x improvement-mysql-100419105218-phpapp02
10x improvement-mysql-100419105218-phpapp0210x improvement-mysql-100419105218-phpapp02
10x improvement-mysql-100419105218-phpapp02
promethius
 
10x Performance Improvements
10x Performance Improvements10x Performance Improvements
10x Performance Improvements
Ronald Bradford
 

Semelhante a Optimizer Histograms: When they Help and When Do Not? (20)

5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
5_MariaDB_What's New in MariaDB Server 10.2 and Big Data Analytics with Maria...
 
Part2 Best Practices for Managing Optimizer Statistics
Optimizer Histograms: When they Help and When Do Not?

  • 1. Optimizer Histograms When they Help and When Do Not? February, 01, 2019 Sveta Smirnova
  • 2. • MySQL Support engineer • Author of • MySQL Troubleshooting • JSON UDF functions • FILTER clause for MySQL • Speaker • Percona Live, OOW, Fosdem, DevConf, HighLoad... Sveta Smirnova 2
  • 3. •Why do I Care? •The Use Case •Even Worse Use Case •Why the Difference? •How Histograms Work? Table of Contents 3
  • 4. The column statistics data dictionary table stores histogram statistics about column values, for use by the optimizer in constructing query execution plans MySQL User Reference Manual Optimizer Statistics aka Histograms 4
  • 5. Why do I Care?
  • 6. • Data distribution varies • Big difference between number of values • Constantly changing Latest Support Tickets 6
  • 7. • Data distribution varies • Cardinality is not correct • Was not updated in time • Updates too often • Calculated wrongly Latest Support Tickets 6
  • 8. • Data distribution varies • Cardinality is not correct • Index maintenance costs a lot • Hardware resources • Slow updates • Window to run CREATE INDEX Latest Support Tickets 6
  • 9. • Data distribution varies • Cardinality is not correct • Index maintenance costs a lot • Optimizer does not work as we wish Examples in my talk @Percona Live Latest Support Tickets 6
  • 10. • Topic based on real Support cases • Couple of them are still in progress Disclaimer 7
  • 11. • Topic based on real Support cases • All examples are 100% fake • They were created such that • No customer can be identified • Everything is generated: table names, column names, data • Use case itself is fictional Disclaimer 7
  • 12. • Topic based on real Support cases • All examples are 100% fake • All examples are simplified • Only columns, required to show the issue • Everything extra removed • Real tables usually store much more data Disclaimer 7
  • 13. • Topic based on real Support cases • All examples are 100% fake • All examples are simplified • All disasters happened with version 5.7 Disclaimer 7
  • 15. • categories • Less than 20 rows Two tables 9
  • 16. • categories • Less than 20 rows • goods • More than 1M rows • 20 unique cat_id values • Many other fields: price; dates (added, last updated, etc.); characteristics; store; ... Two tables 9
  • 17. select * from goods join categories on (categories.id=goods.cat_id) where date_added between '2018-07-01' and '2018-08-01' and cat_id in (16,11) and price >= 1000 and price <= 10000 [ and ... ] [ GROUP BY ... [ORDER BY ... [ LIMIT ...]]] ; JOIN 10
  • 18. • Select from the Small Table Option 1: Select from the Small Table First 11
  • 19. • Select from the Small Table • For each cat_id select from the large table Option 1: Select from the Small Table First 11
  • 20. • Select from the Small Table • For each cat_id select from the large table • Filter result on date_added[ and price[...]] Option 1: Select from the Small Table First 11
  • 21. • Select from the Small Table • For each cat_id select from the large table • Filter result on date_added[ and price[...]] • Slow with many items in the category Option 1: Select from the Small Table First 11
  • 22. • Filter rows by date_added[ and price[...]] Option 2: Select from the Large Table First 12
  • 23. • Filter rows by date_added[ and price[...]] • Get cat_id values Option 2: Select from the Large Table First 12
  • 24. • Filter rows by date_added[ and price[...]] • Get cat_id values • Retrieve rows from the small table Option 2: Select from the Large Table First 12
  • 25. • Filter rows by date_added[ and price[...]] • Get cat_id values • Retrieve rows from the small table • Slow if the number of rows filtered by date_added is larger than the number of goods in the selected categories Option 2: Select from the Large Table First 12
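The trade-off between Option 1 and Option 2 can be illustrated with a toy row-count model. This is only a sketch, not the MySQL cost model: the function names and all numbers are hypothetical, and it compares nothing but the number of `goods` rows each option has to read.

```python
def rows_small_table_first(rows_per_category, selected_categories):
    """Option 1: for each selected cat_id, read every goods row in that
    category, then filter those rows on date_added (and price)."""
    return sum(rows_per_category[c] for c in selected_categories)


def rows_large_table_first(rows_matching_date):
    """Option 2: read the goods rows that match the date_added range first,
    then look up their categories in the small table."""
    return rows_matching_date


def cheaper_option(rows_per_category, selected_categories, rows_matching_date):
    """Return the option that reads fewer goods rows, with its row count."""
    opt1 = rows_small_table_first(rows_per_category, selected_categories)
    opt2 = rows_large_table_first(rows_matching_date)
    if opt1 <= opt2:
        return ("small table first", opt1)
    return ("large table first", opt2)


# Hypothetical data: 20 categories with 50,000 goods each.
rows_per_category = {cat_id: 50_000 for cat_id in range(1, 21)}

# Narrow date range: few rows match, so starting from goods reads less.
narrow = cheaper_option(rows_per_category, [16, 11], rows_matching_date=2_000)

# Wide date range: many rows match, so starting from categories reads less.
wide = cheaper_option(rows_per_category, [16, 11], rows_matching_date=900_000)

print(narrow)
print(wide)
```

The point of the model is that the better join order depends on how many rows the `date_added` range really matches, which is exactly what the optimizer has to guess when it has no statistics for that column.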
  • 26. • CREATE INDEX index_everything (cat_id, date_added[, price[, ...]]) • It resolves the issue What if We Use Combined Indexes? 13
  • 27. • CREATE INDEX index_everything (cat_id, date_added[, price[, ...]]) • It resolves the issue • But not in all cases What if We Use Combined Indexes? 13
  • 28. • Maintenance cost • Slower INSERT/UPDATE/DELETE • Disk space The Problem 14
  • 29. • Maintenance cost • Slower INSERT/UPDATE/DELETE • Disk space • Index not useful for selecting rows JOIN categories ON (categories.id=goods.cat_id) JOIN shops ON (shops.id=goods.shop_id) [ JOIN ... ] WHERE date_added between '2018-07-01' and '2018-08-01' AND cat_id in (16,11) AND price >= 1000 AND price <= 10000 [ AND ... ] GROUP BY product_type ORDER BY date_updated DESC LIMIT 50,100 The Problem 14
  • 30. • Maintenance cost • Slower INSERT/UPDATE/DELETE • Disk space • Index not useful for selecting rows • Tables may have wrong cardinality The Problem 14
  • 31. • EXPLAIN without histograms mysql> explain select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between '2000-01-01' and '2001-01-01' -- Large range -> order by goods.cat_id -> limit 10\G -- We ask for 10 rows only! Example 15
  • 32. • EXPLAIN without histograms *************************** 1. row *************************** id: 1 select_type: SIMPLE table: categories -- Small table first partitions: NULL type: index possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL rows: 20 filtered: 70.00 Extra: Using where; Using index; Using temporary; Using filesort Example 15
  • 33. • EXPLAIN without histograms *************************** 2. row *************************** id: 1 select_type: SIMPLE table: goods -- Large table partitions: NULL type: ref possible_keys: cat_id_2 key: cat_id_2 key_len: 5 ref: orig.categories.id rows: 51827 filtered: 11.11 -- Default value Extra: Using where 2 rows in set, 1 warning (0.01 sec) Example 15
  • 34. • Execution time without histograms mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -> order by goods.cat_id -> limit 10; ab9f9bb7bc4f357712ec34f067eda364 - 10 rows in set (56.47 sec) Example 15
  • 35. • Engine statistics without histograms mysql> show status like ’Handler%’; +----------------------------+--------+ | Variable_name | Value | +----------------------------+--------+ ... | Handler_read_next | 964718 | | Handler_read_prev | 0 | | Handler_read_rnd | 10 | | Handler_read_rnd_next | 951671 | ... | Handler_write | 951670 | +----------------------------+--------+ 18 rows in set (0.01 sec) Example 15
  • 36. • Now let's add the histogram mysql> analyze table goods update histogram on date_added; +------------+-----------+----------+------------------------------+ | Table | Op | Msg_type | Msg_text | +------------+-----------+----------+------------------------------+ | orig.goods | histogram | status | Histogram statistics created for column 'date_added'. | +------------+-----------+----------+------------------------------+ 1 row in set (2.01 sec) Example 15
  • 37. • EXPLAIN with the histogram mysql> explain select goods.* from goods -> join categories -> on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between '2000-01-01' and '2001-01-01' -> order by goods.cat_id -> limit 10\G Example 15
  • 38. • EXPLAIN with the histogram *************************** 1. row *************************** id: 1 select_type: SIMPLE table: goods -- Large table first partitions: NULL type: index possible_keys: cat_id_2 key: cat_id_2 key_len: 5 ref: NULL rows: 10 -- Same as we asked filtered: 98.70 -- True numbers Extra: Using where Example 15
  • 39. • EXPLAIN with the histogram *************************** 2. row *************************** id: 1 select_type: SIMPLE table: categories -- Small table partitions: NULL type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: orig.goods.cat_id rows: 1 filtered: 100.00 Extra: Using index 2 rows in set, 1 warning (0.01 sec) Example 15
  • 40. • Execution time with the histogram mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select goods.* from goods -> join categories on (categories.id=goods.cat_id) -> where cat_id in (20,2,18,4,16,6,14,1,12,11,10,9,8,17) -> and -> date_added between ’2000-01-01’ and ’2001-01-01’ -> order by goods.cat_id -> limit 10; eeb005fae0dd3441c5c380e1d87fee84 - 10 rows in set (0.00 sec) -- 56/0 times faster! Example 15
  • 41. • Engine statistics with the histogram mysql> show status like ’Handler%’; +----------------------------+-------++----------------------------+-------+ | Variable_name | Value || Variable_name | Value | +----------------------------+-------++----------------------------+-------+ | Handler_commit | 1 || Handler_read_prev | 0 | | Handler_delete | 0 || Handler_read_rnd | 0 | | Handler_discover | 0 || Handler_read_rnd_next | 0 | | Handler_external_lock | 4 || Handler_rollback | 0 | | Handler_mrr_init | 0 || Handler_savepoint | 0 | | Handler_prepare | 0 || Handler_savepoint_rollback | 0 | | Handler_read_first | 1 || Handler_update | 0 | | Handler_read_key | 3 || Handler_write | 0 | | Handler_read_last | 0 |+----------------------------+-------+ | Handler_read_next | 9 |18 rows in set (0.00 sec) Example 15
  • 43. • goods_characteristics CREATE TABLE `goods_characteristics` ( `id` int(11) NOT NULL AUTO_INCREMENT, `good_id` varchar(30) DEFAULT NULL, `size` int(11) DEFAULT NULL, `manufacturer` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`), KEY `good_id` (`good_id`,`size`,`manufacturer`), KEY `size` (`size`,`manufacturer`) ) ENGINE=InnoDB AUTO_INCREMENT=196606 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci Two Similar Tables 17
  • 44. • goods_shops CREATE TABLE `goods_shops` ( `id` int(11) NOT NULL AUTO_INCREMENT, `good_id` varchar(30) DEFAULT NULL, `location` varchar(30) DEFAULT NULL, `delivery_options` varchar(30) DEFAULT NULL, PRIMARY KEY (`id`), KEY `good_id` (`good_id`,`location`,`delivery_options`), KEY `location` (`location`,`delivery_options`) ) ENGINE=InnoDB AUTO_INCREMENT=131071 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci Two Similar Tables 17
  • 45. • Size mysql> select count(*) from goods_characteristics; +----------+ | count(*) | +----------+ | 131072 | +----------+ 1 row in set (0.08 sec) mysql> select count(*) from goods_shops; +----------+ | count(*) | +----------+ | 65536 | +----------+ 1 row in set (0.04 sec) Two Similar Tables 17
  • 46. • Data Distribution: goods characteristics mysql> select count(*) num_rows, good_id, size -> from goods_characteristics group by good_id, size; +----------+---------+------+ | num_rows | good_id | size | +----------+---------+------+ | 65536 | laptop | 7 | | 8187 | laptop | 8 | | 8190 | laptop | 9 | | 8188 | laptop | 10 | | 8192 | laptop | 11 | | 8189 | laptop | 12 | | 8189 | laptop | 13 | | 8191 | laptop | 14 | | 8190 | laptop | 15 | | 10 | laptop | 16 | | 10 | laptop | 17 | +----------+---------+------+ Two Similar Tables 17
  • 47. • Data Distribution: goods characteristics mysql> select count(*) num_rows, good_id, manufacturer -> from goods_characteristics group by good_id, manufacturer order by num_rows desc; +----------+---------+--------------+ | num_rows | good_id | manufacturer | +----------+---------+--------------+ | 65536 | laptop | Noname | | 8191 | laptop | Samsung | | 8191 | laptop | Acer | | 8189 | laptop | Dell | | 8189 | laptop | HP | | 8189 | laptop | Lenovo | | 8189 | laptop | Toshiba | | 8189 | laptop | Apple | | 8189 | laptop | Asus | | 10 | laptop | Sony | | 10 | laptop | Casper | +----------+---------+--------------+ Two Similar Tables 17
  • 48. • Data Distribution: goods shops mysql> select count(*) num_rows, good_id, location -> from goods_shops group by good_id, location order by num_rows desc; +----------+---------+---------------+ | num_rows | good_id | location | +----------+---------+---------------+ | 8191 | laptop | New York | | 8191 | laptop | San Francisco | | 8189 | laptop | Paris | | 8189 | laptop | Berlin | | 8189 | laptop | Brussels | | 8189 | laptop | Tokio | | 8189 | laptop | Istanbul | | 8189 | laptop | London | | 10 | laptop | Moscow | | 10 | laptop | Kiev | +----------+---------+---------------+ Two Similar Tables 17
  • 49. • Data Distribution: goods shops mysql> select count(*) num_rows, good_id, delivery_options -> from goods_shops group by good_id, delivery_options order by num_rows desc; +----------+---------+------------------+ | num_rows | good_id | delivery_options | +----------+---------+------------------+ | 8192 | laptop | DHL | | 8191 | laptop | PTT | | 8190 | laptop | Normal Post | | 8190 | laptop | Tracked | | 8189 | laptop | Fedex | | 8189 | laptop | Gruzovichkof | | 8188 | laptop | Courier | | 8187 | laptop | No delivery | | 10 | laptop | Premium | | 10 | laptop | Urgent | +----------+---------+------------------+ Two Similar Tables 17
  • 50. Histogram statistics are useful primarily for nonindexed columns. Adding an index to a column for which histogram statistics are applicable might also help the optimizer make row estimates. The tradeoffs are: An index must be updated when table data is modified. A histogram is created or updated only on demand, so it adds no overhead when table data is modified. On the other hand, the statistics become progressively more out of date when table modifications occur, until the next time they are updated. MySQL User Reference Manual Optimizer Statistics aka Histograms 18
  • 51. mysql> alter table goods_characteristics stats_sample_pages=5000; Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table goods_shops stats_sample_pages=5000; Query OK, 0 rows affected (0.05 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> analyze table goods_characteristics, goods_shops; +----------------------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +----------------------------+---------+----------+----------+ | test.goods_characteristics | analyze | status | OK | | test.goods_shops | analyze | status | OK | +----------------------------+---------+----------+----------+ 2 rows in set (0.35 sec) Index Statistics is More than Good 19
  • 52. • The query mysql> select count(*) from goods_shops join goods_characteristics using (good_id) -> where size < 12 and manufacturer in (’Lenovo’, ’Dell’, ’Toshiba’, ’Samsung’, ’Acer’) -> and (location in (’Moscow’, ’Kiev’) or delivery_options in (’Premium’, ’Urgent’)); ^C^C -- query aborted ERROR 1317 (70100): Query execution was interrupted Performance? 20
  • 53. • Handlers mysql> show status like ’Handler%’; +----------------------------+-------------+ | Variable_name | Value | +----------------------------+-------------+ | Handler_commit | 0 | | Handler_delete | 0 | | Handler_discover | 0 | | Handler_external_lock | 4 | | Handler_mrr_init | 0 | | Handler_prepare | 0 | | Handler_read_first | 1 | | Handler_read_key | 13043 | | Handler_read_last | 0 | | Handler_read_next | 854,767,916 | ... Performance? 20
  • 54. • Table order mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id) -> where size < 12 and manufacturer in (’Lenovo’, ’Dell’, ’Toshiba’, ’Samsung’, ’Acer’) -> and (location in (’Moscow’, ’Kiev’) or delivery_options in (’Premium’, ’Urgent’)); +----+-----------------------+-------+---------+--------+----------+--------------------------+ | id | table | type | key | rows | filtered | Extra | +----+-----------------------+-------+---------+--------+----------+--------------------------+ | 1 | goods_characteristics | index | good_id | 131072 | 25.00 | Using where; Using index | | 1 | goods_shops | ref | good_id | 65536 | 36.00 | Using where; Using index | +----+-----------------------+-------+---------+--------+----------+--------------------------+ 2 rows in set, 1 warning (0.00 sec) Performance? 20
  • 55. • Table order matters mysql> explain select count(*) from goods_shops straight_join goods_characteristics -> using (good_id) -> where size < 12 and manufacturer in (’Lenovo’, ’Dell’, ’Toshiba’, ’Samsung’, ’Acer’) -> and (location in (’Moscow’, ’Kiev’) or delivery_options in (’Premium’, ’Urgent’)); +----+-----------------------+-------+---------+--------+----------+--------------------------+ | id | table | type | key | rows | filtered | Extra | +----+-----------------------+-------+---------+--------+----------+--------------------------+ | 1 | goods_shops | index | good_id | 65536 | 36.00 | Using where; Using index | | 1 | goods_characteristics | ref | good_id | 131072 | 25.00 | Using where; Using index | +----+-----------------------+-------+---------+--------+----------+--------------------------+ 2 rows in set, 1 warning (0.00 sec) Performance? 20
  • 56. • Table order matters mysql> select count(*) from goods_shops straight_join goods_characteristics using (good_id) -> where size < 12 and manufacturer in (’Lenovo’, ’Dell’, ’Toshiba’, ’Samsung’, ’Acer’) -> and (location in (’Moscow’, ’Kiev’) or delivery_options in (’Premium’, ’Urgent’)); +----------+ | count(*) | +----------+ | 816640 | +----------+ 1 row in set (2.11 sec) mysql> show status like ’Handler_read_next’; +-------------------+-----------+ | Variable_name | Value | +-------------------+-----------+ | Handler_read_next | 5,308,416 | +-------------------+-----------+ 1 row in set (0.00 sec) Performance? 20
  • 57. mysql> analyze table goods_shops update histogram on location, delivery_options; +-------------+-----------+----------+-----------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +-------------+-----------+----------+-----------------------------------------------------+ | goods_shops | histogram | status | Histogram statistics created... ’delivery_options’. | | goods_shops | histogram | status | Histogram statistics created for column ’location’. | +-------------+-----------+----------+-----------------------------------------------------+ 2 rows in set (0.18 sec) mysql> analyze table goods_characteristics update histogram on size, manufacturer ; +-----------------------+-----------+----------+-------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------------+-----------+----------+-------------------------------------------------+ | goods_characteristics | histogram | status | Histogram statistics created... ’manufacturer’. | | goods_characteristics | histogram | status | Histogram statistics created for column ’size’. | +-----------------------+-----------+----------+-------------------------------------------------+ 2 rows in set (0.23 sec) Histograms to Rescue 21
  • 58. • The query mysql> select count(*) from goods_shops join goods_characteristics using (good_id) -> where size < 12 and manufacturer in (’Lenovo’, ’Dell’, ’Toshiba’, ’Samsung’, ’Acer’) -> and (location in (’Moscow’, ’Kiev’) or delivery_options in (’Premium’, ’Urgent’)); +----------+ | count(*) | +----------+ | 816640 | +----------+ 1 row in set (2.16 sec) mysql> show status like ’Handler_read_next’; +-------------------+-----------+ | Variable_name | Value | +-------------------+-----------+ | Handler_read_next | 5,308,418 | +-------------------+-----------+ 1 row in set (0.00 sec) Histograms to Rescue 21
  • 59. • Filtering effect mysql> explain select count(*) from goods_shops join goods_characteristics using (good_id) where s +----+-----------------------+-------+---------+--------+----------+--------------------------+ | id | table | type | key | rows | filtered | Extra | +----+-----------------------+-------+---------+--------+----------+--------------------------+ | 1 | goods_shops | index | good_id | 65536 | 0.06 | Using where; Using index | | 1 | goods_characteristics | ref | good_id | 131072 | 15.63 | Using where; Using index | +----+-----------------------+-------+---------+--------+----------+--------------------------+ 2 rows in set, 1 warning (0.00 sec) Histograms to Rescue 21
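As a reading aid for the EXPLAIN output above: `filtered` is a percentage, so `rows × filtered / 100` approximates how many of the examined rows the optimizer expects to pass the conditions on that table. Here is a minimal Python sketch of that arithmetic, using the two lines shown on the slide. The function name `rows_after_filter` is hypothetical and is not part of MySQL.

```python
def rows_after_filter(rows: int, filtered_pct: float) -> float:
    """Estimated number of rows that pass the table's conditions."""
    return rows * filtered_pct / 100.0


# (table, rows, filtered) copied from the EXPLAIN output on the slide.
explain_rows = [
    ("goods_shops", 65536, 0.06),
    ("goods_characteristics", 131072, 15.63),
]

estimates = {
    table: round(rows_after_filter(rows, pct))
    for table, rows, pct in explain_rows
}

for table, estimate in estimates.items():
    print(f"{table}: about {estimate} rows expected to pass the filter")
```

A very small `filtered` value on the first table is what tells the optimizer that few rows need to be joined further.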
  • 61. [Chart: x-axis 1 to 10, y-axis 0 to 800] Indexes: Number of Items with Same Value 23
  • 62. [Chart: x-axis 1 to 10, y-axis 0 to 800] Indexes: Cardinality 24
  • 63. [Chart: x-axis 1 to 10, y-axis 0 to 800] Histograms: Number of Values in Each Bucket 25
  • 64. [Chart: x-axis 1 to 10, y-axis 0 to 1] Histograms: Data in the Histogram 26
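The contrast the four charts draw can be sketched in a few lines of Python. The data below is hypothetical (one very common value and nine rare ones) and was not taken from the slides. The "index-style" estimate is a simplification that assumes every distinct value is equally frequent, while the "histogram-style" estimate keeps a frequency per value.

```python
from collections import Counter

# Hypothetical skewed column: value 1 is very common, values 2..10 are rare.
column = [1] * 800 + [v for v in range(2, 11) for _ in range(20)]

total_rows = len(column)          # 980
cardinality = len(set(column))    # 10 distinct values

# Index-style estimate: one number for the whole column,
# so every value is assumed to match total_rows / cardinality rows.
uniform_estimate = total_rows / cardinality

# Histogram-style estimate: the fraction of rows is stored per value.
counts = Counter(column)
frequencies = {value: counts[value] / total_rows for value in sorted(counts)}


def hist_estimate(value: int) -> float:
    """Rows expected for `column = value` when per-value frequencies are known."""
    return frequencies.get(value, 0.0) * total_rows


print(f"uniform estimate for any value: {uniform_estimate:.0f} rows")
print(f"histogram estimate for value 1: {hist_estimate(1):.0f} rows")
print(f"histogram estimate for value 5: {hist_estimate(5):.0f} rows")
```

With skewed data the single cardinality figure is wrong for both the common and the rare values, and this is the case where a histogram gives the optimizer better input.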
  • 70. ↓ sql/sql_planner.cc ↓ calculate_condition_filter ↓ Item_func_*::get_filtering_effect • get_histogram_selectivity • Seen as a percentage of filtered rows in EXPLAIN Low Level 28
  • 71. • Example data mysql> create table example(f1 int) engine=innodb; mysql> insert into example values(1),(1),(1),(2),(3); mysql> select f1, count(f1) from example group by f1; +------+-----------+ | f1 | count(f1) | +------+-----------+ | 1 | 3 | | 2 | 1 | | 3 | 1 | +------+-----------+ 3 rows in set (0.00 sec) Filtered Rows 29
  • 72. • Without a histogram mysql> explain select * from example where f1 > 0\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 73. • Without a histogram mysql> explain select * from example where f1 > 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 74. • Without a histogram mysql> explain select * from example where f1 > 2\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 75. • Without a histogram mysql> explain select * from example where f1 > 3\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 33.33 Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
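All four EXPLAIN outputs above report the same `filtered: 33.33`, whatever constant appears in the predicate. The Python sketch below compares that fixed guess with the true share of matching rows in the example table. The one-third constant is an assumption inferred from the output shown, and the function names are hypothetical.

```python
# Data inserted into the `example` table on the earlier slide.
EXAMPLE_F1 = [1, 1, 1, 2, 3]

# Assumption: with no index and no histogram, a `>` predicate gets a fixed
# guess of about one third, which matches the 33.33 shown in EXPLAIN.
DEFAULT_RANGE_FILTER = 1 / 3


def guessed_filtered_pct(constant: int) -> float:
    """Fixed guess: the constant in the predicate is not used at all."""
    return round(100 * DEFAULT_RANGE_FILTER, 2)


def actual_filtered_pct(values: list, constant: int) -> float:
    """True percentage of rows with f1 > constant."""
    matching = sum(1 for v in values if v > constant)
    return 100 * matching / len(values)


guessed = [guessed_filtered_pct(c) for c in range(4)]
actual = [actual_filtered_pct(EXAMPLE_F1, c) for c in range(4)]

for c, g, a in zip(range(4), guessed, actual):
    print(f"f1 > {c}: guessed {g:.2f}%, actual {a:.2f}%")
```

The guess happens to be close for `f1 > 1`, but it is far off for `f1 > 0` and `f1 > 3`.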
  • 76. • With the histogram mysql> analyze table example update histogram on f1 with 3 buckets; +-----------------+-----------+----------+------------------------------+ | Table | Op | Msg_type | Msg_text | +-----------------+-----------+----------+------------------------------+ | hist_ex.example | histogram | status | Histogram statistics created for column 'f1'. | +-----------------+-----------+----------+------------------------------+ 1 row in set (0.03 sec) Filtered Rows 29
  • 77. • With the histogram mysql> select * from information_schema.column_statistics -> where table_name='example'\G *************************** 1. row *************************** SCHEMA_NAME: hist_ex TABLE_NAME: example COLUMN_NAME: f1 HISTOGRAM: {"buckets": [[1, 0.6], [2, 0.8], [3, 1.0]], "data-type": "int", "null-values": 0.0, "collation-id": 8, "last-updated": "2018-11-07 09:07:19.791470", "sampling-rate": 1.0, "histogram-type": "singleton", "number-of-buckets-specified": 3} 1 row in set (0.00 sec) Filtered Rows 29
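The `buckets` array above holds one entry per distinct value, paired with a cumulative frequency: 3 of 5 rows are `1` (0.6), 4 of 5 are at most `2` (0.8), and all rows are at most `3` (1.0). As a minimal sketch, the same numbers can be rebuilt from the example data in Python. The function name `singleton_buckets` is hypothetical, and the code only mirrors the arithmetic, not MySQL's implementation.

```python
from collections import Counter


def singleton_buckets(values: list) -> list:
    """Return [value, cumulative_frequency] pairs, one per distinct value."""
    counts = Counter(values)
    total = len(values)
    running = 0
    buckets = []
    for value in sorted(counts):
        running += counts[value]          # rows with f1 <= value so far
        buckets.append([value, running / total])
    return buckets


# Data inserted into the `example` table on the earlier slide.
buckets = singleton_buckets([1, 1, 1, 2, 3])
print(buckets)
```

The last cumulative frequency is always 1.0 when the column contains no NULL values, which agrees with `"null-values": 0.0` in the output above.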
  • 78. • With the histogram mysql> explain select * from example where f1 > 0\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 100.00 -- all rows Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 79. • With the histogram mysql> explain select * from example where f1 > 1\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 40.00 -- 2 rows Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 80. • With the histogram mysql> explain select * from example where f1 > 2\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 -- one row Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
  • 81. • With the histogram mysql> explain select * from example where f1 > 3\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: example partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5 filtered: 20.00 -- one row Extra: Using where 1 row in set, 1 warning (0.00 sec) Filtered Rows 29
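The four `filtered` values above can be reproduced from the singleton histogram with simple arithmetic: the share of rows with `f1 > c` is one minus the cumulative frequency of the largest bucket value that is at most `c`. The Python sketch below does that. The floor of one row (1/5 = 20%) is an assumption inferred from the `f1 > 3` output, where no row actually matches, and the function names are hypothetical.

```python
# Bucket data and row count copied from the slides above.
BUCKETS = [[1, 0.6], [2, 0.8], [3, 1.0]]
TABLE_ROWS = 5


def raw_selectivity_gt(buckets: list, constant: int) -> float:
    """Fraction of rows with value > constant, read from cumulative frequencies."""
    cumulative = 0.0
    for value, cumulative_frequency in buckets:
        if value <= constant:
            cumulative = cumulative_frequency
        else:
            break  # buckets are sorted, later values are all > constant
    return 1.0 - cumulative


def filtered_pct(buckets: list, constant: int, rows: int) -> float:
    """Percentage comparable to the `filtered` column in EXPLAIN."""
    # Assumption: the estimate never drops below one row of the table.
    one_row_floor = 1.0 / rows
    selectivity = max(raw_selectivity_gt(buckets, constant), one_row_floor)
    return round(100 * selectivity, 2)


results = [filtered_pct(BUCKETS, c, TABLE_ROWS) for c in range(4)]
for c, pct in zip(range(4), results):
    print(f"f1 > {c}: filtered {pct:.2f}")
```

Unlike the fixed 33.33 seen without a histogram, the estimate now changes with the constant in the predicate.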
  • 83. • CREATE INDEX • Metadata lock • Can be blocked by any query • UPDATE HISTOGRAM • Backup lock • Can be blocked only by a backup • Can be created at any time without fear Locking 30
  • 84. • Helps if the query plan can be changed • Not a replacement for an index: • GROUP BY • ORDER BY • Query on a single table ∗ Outcome 31
  • 85. • Data distribution is uniform • Range optimization can be used • Full table scan is fast When Are Histograms Not Helpful? 32
  • 86. • Index statistics are collected by the engine • The optimizer calculates cardinality each time it accesses the statistics • Indexes do not always improve performance • Histograms can help • Still a new feature • Histograms do not replace other optimizations! Conclusion 33
  • 87. • MySQL User Reference Manual • Blog by Erik Froseth • Blog by Frederic Descamps • Talk by Oystein Grovlen @Fosdem • Talk by Sergei Petrunia @PerconaLive • WL #8707 More information 34