Explain that “Explain”
The Road to Understanding
“You Should Not Fight with the Database, the Database is your Friend”
put together by Fabrizio Parrella
PARTS ARE QUOTED OR COPIED OR REF FROM:
http://devzone.zend.com/1436/the-zendcon-sessions-episode-17-sql-query-tuning-the-legend-of-drunken-query-master/
http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
get to know your friend
➲ Recognize the strengths and also the weaknesses of
your database
➲ No database is perfect -- deal with it, you're not
perfect either
➲ Think of both big things and small things
BIG: Architecture, surrounding servers, caching
SMALL: SQL coding, join rewrites, server config
becoming friends
➲ Understand storage engine abilities and weaknesses
➲ Understand how the query cache and important
buffers work
➲ Understand optimizer's limitations
➲ Understand what should and should not be done at
the application level
➲ If you understand the above, you'll start to see the
database as a friend and not an enemy
the schema
➲ Basic foundation of performance
➲ Everything else depends on it
➲ Choose your data types wisely
➲ “Divide et Impera” the schema through partitioning
A divide and conquer (D&C) algorithm works by recursively breaking a problem down into two or more sub-
problems of the same (or related) type, until these become simple enough to be solved directly. The
solutions to the sub-problems are then combined to give a solution to the original problem.
http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
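As a hedged illustration of "divide et impera" at the schema level, a RANGE-partitioned table split by year (table and column names are made up for this sketch; note that MySQL requires the partitioning column to be part of every unique key, including the primary key):
CREATE TABLE OrdersArchive (
order_id INT UNSIGNED NOT NULL,
order_created DATE NOT NULL,
order_total DECIMAL(10,2) NOT NULL,
PRIMARY KEY (order_id, order_created) -- partitioning column must be in the PK
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(order_created)) (
PARTITION p2011 VALUES LESS THAN (2012),
PARTITION p2012 VALUES LESS THAN (2013),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
Queries that filter on order_created then only touch the relevant partition instead of the whole table.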
size does matter !!
smaller, smaller, SMALLER
The more records you can fit into a single page of
memory/disk, the faster your seeks and scans will be
➲ Do you really need that BIGINT?
➲ Use INT UNSIGNED for IPv4 addresses
➲ Use VARCHAR carefully
Converted to CHAR when used in a temporary table
➲ Use TEXT sparingly
Consider separate tables
➲ Use BLOBs very sparingly
Use the filesystem for what it was intended
real life example... handling IPv4 addresses
CREATE TABLE Sessions (
session_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
ip_address INT UNSIGNED NOT NULL, -- Compare to CHAR(15)...
session_data TEXT NOT NULL,
PRIMARY KEY (session_id),
INDEX (ip_address)
) ENGINE=InnoDB;
-- Insert a new dummy record
INSERT INTO Sessions VALUES
(NULL, INET_ATON('192.168.0.2'), 'some session data');
SELECT
session_id,
ip_address as ip_raw,
INET_NTOA(ip_address) as ip,
session_data
FROM Sessions
WHERE
ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');
+------------+------------+-------------+-------------------+
| session_id | ip_raw | ip | session_data |
+------------+------------+-------------+-------------------+
| 1 | 3232235522 | 192.168.0.2 | some session data |
+------------+------------+-------------+-------------------+
SETs and ENUMs
➲ Often a sign of poor schema design
➲ Changing the definition will most likely require a full
rebuild of the table
➲ Search functions like FIND_IN_SET() are inefficient
compared to an index lookup on a join
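A hedged sketch of the difference, assuming a hypothetical comma-separated (or SET) interests column on Users versus a normalized UserInterests child table:
-- SET-style column: FIND_IN_SET() cannot use an index, every row is examined
SELECT * FROM Users WHERE FIND_IN_SET('hiking', interests);

-- Normalized alternative: an ordinary indexed join
CREATE TABLE UserInterests (
user_id INT UNSIGNED NOT NULL,
interest VARCHAR(30) NOT NULL,
PRIMARY KEY (user_id, interest),
INDEX (interest)
) ENGINE=InnoDB;

SELECT u.*
FROM Users u
INNER JOIN UserInterests ui ON ui.user_id = u.user_id
WHERE ui.interest = 'hiking';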
normalization, taking it too far
DateDate ?
http://thedailywtf.com/forums/thread/75982.aspx
vertical partitioning
➲ Never mix frequently and infrequently
accessed fields in a single table
➲ Splitting tables allows main records to consume the buffer
pages without the extra data taking up space in memory
➲ Do you need FULLTEXT on your text columns (PRE 5.6.4)?
CREATE TABLE Users (
user_id INT NOT NULL AUTO_INCREMENT,
email VARCHAR(80) NOT NULL,
display_name VARCHAR(50) NOT NULL,
password CHAR(41) NOT NULL,
first_name VARCHAR(25) NOT NULL,
last_name VARCHAR(25) NOT NULL,
address VARCHAR(80) NOT NULL,
city VARCHAR(30) NOT NULL,
province CHAR(2) NOT NULL,
postcode CHAR(7) NOT NULL,
interests TEXT NULL,
bio TEXT NULL,
signature TEXT NULL,
skills TEXT NULL,
PRIMARY KEY (user_id),
UNIQUE INDEX (email)
) ENGINE=InnoDB;
CREATE TABLE Users (
user_id INT NOT NULL AUTO_INCREMENT,
email VARCHAR(80) NOT NULL,
display_name VARCHAR(50) NOT NULL,
password CHAR(41) NOT NULL,
PRIMARY KEY (user_id),
UNIQUE INDEX (email)
) ENGINE=InnoDB;
CREATE TABLE UserExtra (
user_id INT NOT NULL,
first_name VARCHAR(25) NOT NULL,
last_name VARCHAR(25) NOT NULL,
address VARCHAR(80) NOT NULL,
city VARCHAR(30) NOT NULL,
province CHAR(2) NOT NULL,
postcode CHAR(7) NOT NULL,
interests TEXT NULL,
bio TEXT NULL,
signature TEXT NULL,
skills TEXT NULL,
PRIMARY KEY (user_id),
FULLTEXT KEY (interests, skills)
) ENGINE=MyISAM;
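A usage sketch under this split: the hot path touches only the slim Users table, and the wide TEXT columns are joined in only when a page actually needs them.
-- Login / hot path: slim table only
SELECT user_id, password
FROM Users
WHERE email = 'someone@example.com';

-- Full profile page: pull the wide columns on demand
SELECT u.display_name, x.bio, x.signature
FROM Users u
LEFT JOIN UserExtra x ON x.user_id = u.user_id
WHERE u.user_id = 42;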
understand MySQL query cache
➲ You must understand your application's read/write
patterns
➲ Internal query cache design is a compromise between
CPU usage and read performance
➲ Stores the MYSQL_RESULT of a SELECT along with a hash
of the SELECT SQL statement
➲ Any modification to any table involved in the SELECT
invalidates the stored result
➲ Write applications to be aware of the query cache
Use SELECT SQL_NO_CACHE
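For example, a one-off report that should not evict hot results from the query cache (reusing the Sessions table from the earlier slide):
SELECT SQL_NO_CACHE
COUNT(*) AS sessions_in_subnet
FROM Sessions
WHERE ip_address BETWEEN INET_ATON('192.168.0.0') AND INET_ATON('192.168.0.255');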
coding like a master
➲ Be consistent (for crying out loud)
➲ Use ANSI SQL coding style (vs. Theta)
➲ Stop thinking in terms of iterators, for loops, while
loops, etc
➲ Instead, think in terms of sets
➲ Break complex SQL statements (or business requests)
into smaller, manageable chunks
Consistency, consistency, CONSISTENCY !!
➲ Tabs and Spacing
➲ Upper and Lower Case
➲ Keywords, function names
Nothing pisses off the query master like inconsistent SQL code!
SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM actor a
INNER JOIN film f ON a.actor_id = f.actor_id
GROUP BY a.actor_id
ORDER BY
num_rentals DESC,
a.last_name,
a.first_name
LIMIT 10;
vs.
select first_name, a.last_name,
count(*) AS num_rentals
FROM actor a join film on a.actor_id = film.actor_id
group by a.actor_id order by
num_rentals DESC, a.last_name, a.first_name
LIMIT 10;
➲ Aliases
➲ Consider your
teammates
➲ Like your code, SQL is
meant to be read, not
written
guidelines
➲ Beware of join hints
“force index” can get “out of date”
➲ Just because it can be done in a single SQL
statement doesn't mean it should
➲ ALWAYS test and benchmark your solution
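A minimal sketch of the kind of hint that can go "out of date" (table and index names are hypothetical; once hinted, the optimizer can no longer pick a better index as the data or the indexes change):
SELECT *
FROM orders FORCE INDEX (idx_customer) -- pins the plan to idx_customer forever
WHERE customer_id = 42
AND order_created >= '2013-01-01';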
ANSI vs. THETA
SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM actor a
INNER JOIN film_actor fa ON a.actor_id = fa.actor_id
INNER JOIN film f ON fa.film_id = f.film_id
INNER JOIN inventory i ON f.film_id = i.film_id
INNER JOIN rental r ON r.inventory_id = i.inventory_id
GROUP BY a.actor_id
ORDER BY
num_rentals DESC,
a.last_name,
a.first_name
LIMIT 10;

SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM
actor a,
film f,
film_actor fa,
inventory i,
rental r
WHERE
a.actor_id = fa.actor_id
AND fa.film_id = f.film_id
AND f.film_id = i.film_id
AND r.inventory_id = i.inventory_id
GROUP BY a.actor_id
ORDER BY
num_rentals DESC,
a.last_name,
a.first_name
LIMIT 10;
ANSI STYLE
Explicitly declare JOIN conditions
using the ON clause
THETA STYLE
Implicitly declare JOIN conditions
in the WHERE clause
why ANSI style kicks THETA style's A55
➲ MySQL THETA style only supports INNER and CROSS
joins
But MySQL ANSI style supports INNER, CROSS, LEFT, RIGHT,
and NATURAL joins
Mixing and matching both styles can lead to hard-to-read
SQL code
➲ It is extremely easy to miss a join condition with
THETA style
Especially when joining many tables
Forgetting a Join will produce a cartesian product (NOT
GOOD !!!)
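A sketch of how cheaply you can see the damage: leave out the join condition between two of the sakila tables used above and the row count explodes to rows(inventory) × rows(rental).
-- THETA style with the join condition accidentally dropped:
SELECT COUNT(*)
FROM
inventory i,
rental r;
-- every inventory row paired with every rental row = a cartesian product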
WITHOUT THE STRENGTH OF
EXPLAIN
YOU WILL GET LOST IN THE FIELDS
OF MISUNDERSTANDING
how to test our SQL
EXPLAIN the basics
➲ Provides the execution plan chosen by the MySQL
optimizer
➲ Simply prepend the word EXPLAIN in front of your
SELECT statement
➲ Each row represents a set of information for each
table used in the SELECT
EXPLAIN the columns
➲ select_type - type of “set” the data in this row
contains (SIMPLE, DERIVED, SUBQUERY, etc..)
➲ table - alias (or full table name if no alias) of the table
or derived table from which the data in this set
comes
➲ type - “access strategy” used to grab the data in this
set (ALL, CONST, REF, etc...)
➲ possible_keys - keys available to optimizer for query
➲ key - the key chosen by the optimizer
➲ key_len – number of bytes used from the key
➲ ref - shows the column used in join relations
➲ rows - estimate of the number of rows in this set
➲ Extra - information the optimizer chooses to give you
EXPLAIN the output
EXPLAIN
SELECT
a.first_name,
a.last_name,
COUNT(*) as num_rentals
FROM film f
INNER JOIN film_category fc ON f.film_id = fc.film_id
INNER JOIN category c ON fc.category_id = c.category_id
WHERE f.title LIKE 'T%'\G
*************************** 1. row ***************************
select_type: SIMPLE
table: c
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 16
Extra:
*************************** 2. row ***************************
select_type: SIMPLE
table: fc
type: ref
possible_keys: PRIMARY, fk_film_category_category
key: fk_film_category_category
key_len: 1
ref: c.category_id
rows: 1
Extra: using index
*************************** 3. row ***************************
select_type: SIMPLE
table: f
type: eq_ref
possible_keys: PRIMARY, idx_title
key: PRIMARY
key_len: 2
ref: fc.film_id
rows: 1
Extra: using where
(callouts from the slide: “rows” is the estimated row count; “possible_keys”/“key” show the available indexes and the chosen one; “using index” means a covering index was used)
EXPLAIN a real world example
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14052
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
➲ The three most important columns returned by EXPLAIN
possible_keys
All possible indexes which MYSQL could have used
Based on a series of very quick lookups and calculations
key: chosen key
rows: estimate of the scanned rows
EXPLAIN a real world example
➲ Interpreting the result:
No suitable indexes for this query
MySQL has to do a full scan of the table
Full table scans are almost always the slowest
Full table scans are usually an indication that an index is
needed
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14052
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
EXPLAIN a real world example
➲ MySQL has two indexes to choose from
➲ “reg” is not “sufficiently unique”
the spread of the values can also be a factor (e.g. when 99% of
rows contain the same value)
➲ Index “uniqueness” is called cardinality
➲ There is space for performance increase
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: conf, reg
key: conf
rows: 331
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE attendees
ADD INDEX conf (conference_id),
ADD INDEX reg (registration_status);
EXPLAIN a real world example
➲ “reg_conf_index” is a much better choice
➲ Other keys are still available, just not as effective
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
AND registration_status > 0
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: reg, conf, reg_conf_index
key: reg_conf_index
rows: 204
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE attendees
ADD INDEX reg_conf_index (registration_status, conference_id);
EXPLAIN a real world example
➲ It seems that even without the “reg” index everything is
working just as expected
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
registration_status = 2
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: reg_conf_index
key: reg_conf_index
rows: 372
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE attendees
DROP INDEX reg,
DROP INDEX conf;
EXPLAIN a real world example
➲ Without the “conf” index we are at square one
➲ The order in which the fields are defined in a composite index
affects whether it is available to a query
➲ Potential workaround
SELECT * FROM attendees WHERE conference_id = 123 AND
registration_status > 0;
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14502
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE attendees
DROP INDEX reg,
DROP INDEX conf;
EXPLAIN a real world example
➲ Great, MySQL is using the index on “lastname”, which is good
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
lastname LIKE 'parr%'
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: lastname
key: lastname
rows: 234
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE attendees
ADD INDEX lastname (lastname);
EXPLAIN a real world example
➲ MySQL doesn't even try to use an index !
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM attendees
WHERE
lastname LIKE '%arr%'
//Let's only show the important parts for now
*************************** 1. row ***************************
table: attendees
possible_keys: NULL
key: NULL
rows: 14052
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE attendees
ADD INDEX lastname (lastname);
EXPLAIN a real world example (pre MySQL 5.1)
➲ MySQL doesn't use an index because of the OR
➲ MySQL performs a full table scan
➲ Workaround, use “UNION” (see the sketch after this slide)
➲ Workaround, add a composite INDEX
ALTER TABLE conferences
ADD INDEX location_topic (location_id, topic_id);
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM conferences
WHERE
location_id = 2
OR topic_id IN (4,6,1)
//Let's only show the important parts for now
*************************** 1. row ***************************
table: conferences
possible_keys: location_id, topic_id
key: NULL
rows: 5043
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
ALTER TABLE conferences
ADD INDEX location_id (location_id),
ADD INDEX topic_id (topic_id);
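A sketch of the UNION workaround mentioned above: each branch can use its own single-column index (the ones added on this slide), and UNION removes the duplicates where both conditions match.
SELECT * FROM conferences WHERE location_id = 2
UNION
SELECT * FROM conferences WHERE topic_id IN (4,6,1);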
EXPLAIN a real world example
➲ Looks like we need an index on “conference_id” on attendees
(see the sketch after this slide)
➲ How many total ROWS are estimated ?
CREATE TABLE `attendees` (
`attendee_id` int(11) NOT NULL,
`lastname` varchar(50) NOT NULL,
`conference_id` int(11) NOT NULL,
`registration_status` tinyint(4) NOT NULL,
PRIMARY KEY (`attendee_id`)
) ENGINE=InnoDB;
EXPLAIN
SELECT *
FROM conferences c
INNER JOIN attendees a USING (conference_id)
WHERE
c.location_id = 2
AND c.topic_id IN (4,6,1)
AND a.registration_status > 1
//Let's only show the important parts for now
*************************** 1. row ***************************
table: c
possible_keys: conference_topic
key: conference_topic
rows: 15
*************************** 1. row ***************************
table: a
possible_keys: NULL
key: NULL
rows: 14502
CREATE TABLE `conferences` (
`conference_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
`topic_id` int(11) NOT NULL,
`date` date NOT NULL,
PRIMARY KEY (`conference_id`)
) ENGINE=InnoDB;
15 x 14502
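A sketch of the missing index: re-adding the single-column index from the earlier slide lets the join probe attendees by conference_id instead of scanning ~14,502 rows per conference (the composite variant with registration_status is a hypothetical extra):
ALTER TABLE attendees
ADD INDEX conf (conference_id);

-- or, to also help the registration_status filter:
ALTER TABLE attendees
ADD INDEX conf_reg (conference_id, registration_status);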
EXPLAIN the “type”
EXPLAIN the type
➲ CONST: SELECT * FROM table WHERE field = “value”;
 The field needs to be indexed with a unique non-nullable key
 If non-unique or nullable the type will be “ref”
 Used when the table has at most one matching row, which is read once at the start of the query
 Can be propagated across multiple joined columns:
EXPLAIN
SELECT r.*
FROM rental r
INNER JOIN customer c ON r.customer_id = c.customer_id
WHERE r.rental_id = 13\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: r
type: const
possible_keys: PRIMARY,idx_fk_customer_id
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: c
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 2
ref: const /* Here is where the propagation occurs...*/
rows: 1
Extra:
2 rows in set (0.00 sec)
EXPLAIN the type
➲ RANGE: SELECT * FROM table WHERE field BETWEEN “value” AND
“value”;
 The field needs to be indexed
 If too many records are estimated, it won't be used
EXPLAIN
SELECT *
FROM rental
WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: range
possible_keys: rental_date
key: rental_date
key_len: 8
ref: NULL
rows: 364
Extra: Using where
1 row in set (0.00 sec)
EXPLAIN the type
➲ ALL: SELECT * FROM table WHERE field BETWEEN “value” AND
“far away from starting value”;
 No WHERE condition (duh)
 No index on the field in the WHERE condition
 Poor selectivity on the indexed field
 Too many records meet the WHERE condition
 SEEK: jumps into random places to fetch the data and repeat for each
piece of data needed
 SCAN: jump to the start and sequentially read the data
 For large amounts of data, SCAN operations tend to be more efficient than
multiple SEEK operations
 Using SELECT * FROM
EXPLAIN
SELECT *
FROM rental
WHERE rental_date BETWEEN '2001-01-14' AND '2012-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: ALL
possible_keys: rental_date /* large range force full scan */
key: NULL
key_len: NULL
ref: NULL
rows: 16298
Extra: Using where
1 row in set (0.00 sec)
EXPLAIN the type
➲ INDEX_MERGE: SELECT * FROM table WHERE field = “value”
AND field1 = “value”;
 Introduced with the MySQL 5.0 optimizer
 Allows the optimizer to use more than one index on the same table to satisfy a condition
 Prior to MySQL 5.0, only one index per table could be used
 In case of OR conditions, MySQL < 5.0 would use a full table scan
EXPLAIN
SELECT *
FROM rental
WHERE
rental_id IN (10,11,12)
OR rental_date = '2006-02-01'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index_merge
possible_keys: PRIMARY,rental_date
key: rental_date,PRIMARY
key_len: 8,4
ref: NULL
rows: 4
Extra: Using sort_union(rental_date,PRIMARY); Using where
1 row in set (0.02 sec)
EXPLAIN the “Extra”
EXPLAIN the Extra
➲ “Extra” shows additional operations invoked to get your result set
➲ Some common values are (more are discussed in the MySQL
manual):
 Using where
 Using temporary
 Using filesort
 Using index
EXPLAIN
SELECT *
FROM rental
WHERE
rental_id IN (10,11,12)
OR rental_date = '2006-02-01'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index_merge
possible_keys: PRIMARY,rental_date
key: rental_date,PRIMARY
key_len: 8,4
ref: NULL
rows: 4
Extra: Using sort_union(rental_date,PRIMARY); Using where
1 row in set (0.02 sec)
EXPLAIN the Extra
➲ Using filesort: AVOID
➲ Avoid because
 Doesn't Use Index
 Involves a full scan
 Uses a generic algorithm (one fits all)
 Uses filesystem (BAD !!)
 Gets slower with more data
➲ It's not all that bad
 Sometimes unavoidable - ORDER BY RAND()
 Acceptable provided you get to your result as quickly as possible, and
keep it predictably small
EXPLAIN
SELECT *
FROM attendees
WHERE
conference_id = 123
ORDER BY lastname
*************************** 1. row ***************************
table: attendees
possible_keys: conference_id
key: conference_id
rows: 331
Extra: Using filesort
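One way to make this particular filesort go away is a composite index whose leading column matches the WHERE equality and whose second column matches the ORDER BY, so rows already come back in lastname order (the index name conf_lastname is hypothetical):
ALTER TABLE attendees
ADD INDEX conf_lastname (conference_id, lastname);

-- EXPLAIN should now show key: conf_lastname and no "Using filesort"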
EXPLAIN the Extra
➲ Using index: GOOD
➲ Celebrate because
 MySQL got your results just by consulting the index
 MySQL didn't need to look at the table to get the results (open table is
expensive)
 Fastest way to get your data
➲ Particularly useful...
 When you are interested in a single column or id
 When you are interested in COUNT(), SUM(), AVG(), etc. of a field
EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
*************************** 1. row ***************************
table: attendees
possible_keys: conference_id
key: conference_id
rows: 331
Extra:
ALTER TABLE attendees ADD INDEX conf_age (conference_id, age);
EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123
*************************** 1. row ***************************
table: attendees
possible_keys: conference_id, conf_age
key: conf_age
rows: 331
Extra: Using index
Nothing is actually wrong with this query, it just could be quicker
Aside from caching, this is the fastest way to get your data
INDEXES... your schema's phone book
➲ Speed up SELECTs, but slow down modifications
➲ Make sure you have indexes on columns used in
WHERE, ON, and GROUP BY clauses
➲ Always ensure that JOIN conditions are indexed AND
have identical data types
➲ Good keys:
Selectivity:
% of distinct values
= distinct values / number of rows
(unique or primary keys are always 1 – a quick calculation sketch follows this slide)
Low selectivity:
Maybe you can put it in a multi-column index
Prefix ? Suffix ? It depends on your application
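A quick way to measure the selectivity described above for any candidate column (the table and column here are just examples; substitute your own):
SELECT COUNT(DISTINCT conference_id) / COUNT(*) AS selectivity
FROM attendees;
-- 1.0 means unique; values near 0 mean the index alone filters very little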
indexed columns and functions don't mix
A full table scan is used because a function (LEFT) is operating on
the lastname column.
Let's Fix this...
EXPLAIN
SELECT *
FROM attendees
WHERE
LEFT(lastname, 2) = 'Pa'
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: attendees
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 951
Extra: Using where
EXPLAIN
SELECT *
FROM attendees
WHERE
lastname LIKE 'Pa%'
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: attendees
type: range
possible_keys: lastname
key: lastname
key_len: 767
ref: NULL
rows: 15
Extra: Using where
let's fix multiple issues with a SELECT query
First, we are operating on an indexed column (order_created) with a function – let's fix that:
SELECT * FROM orders WHERE TO_DAYS(CURRENT_DATE()) - TO_DAYS(order_created) <= 7;
Even if we remove the function from the indexed column, we still have a non-
deterministic function in the statement, which prevents this query from being placed in
the query cache – let's fix that:
SELECT * FROM orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY;
We replaced the function with a constant, however we are still specifying SELECT * instead
of the actual fields that we need.
What if there is a TEXT field in the table that we don't need to see? Including it in
the result means a larger result set, which may not fit in the query cache and may force a
disk-based temporary table – let's fix that:
SELECT * FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
SELECT
order_id,
customer_id,
order_total,
order_created
FROM orders
WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
good indexes vs. bad indexes
Don't forget that MySQL string indexes are limited to roughly 1000 bytes (about 333
characters using UTF-8).
Let's say you have 11,000,000 records in a table called “USERS” with
the following fields:
➲ user, firstname, lastname, gender, email, age, country_id
Our application performs searches on the following fields:
➲ user
➲ firstname, lastname, gender
➲ email
It is obvious to create indexes on user and email, especially if they are
unique, but what about the other fields?
➲ “gender” can be M or F, so its selectivity is very low: 2 / 11,000,000 ≈ 0.
Best would be to remove the index on gender if you have it
➲ “firstname”/“lastname” depend on the uniqueness of the values
stored.
Use COUNT(DISTINCT ...) to calculate the selectivity
if it is above 15%, keep the index
below 15%, you might want to create a composite INDEX (see the sketch after this slide)
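A sketch of that composite index for the hypothetical USERS table, putting the more selective name columns first and leaving the near-useless gender column last (or out entirely):
ALTER TABLE USERS
ADD INDEX name_gender (lastname, firstname, gender);

-- searches on firstname + lastname (+ gender) can use this index;
-- a standalone index on gender alone would filter almost nothing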
removing crappy or redundant indexes
SELECT
t.TABLE_SCHEMA AS `db`,
t.TABLE_NAME AS `table`,
s.INDEX_NAME AS `index name`,
s.COLUMN_NAME AS `field name`,
s.SEQ_IN_INDEX `seq in index`,
s2.max_columns AS `# cols`,
s.CARDINALITY AS `card`,
t.TABLE_ROWS AS `est rows`,
ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
FROM
INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t ON s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
SELECT
TABLE_SCHEMA,
TABLE_NAME,
INDEX_NAME,
MAX(SEQ_IN_INDEX) AS max_columns
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA != 'mysql'
GROUP BY
TABLE_SCHEMA,
TABLE_NAME,
INDEX_NAME
) AS s2 ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA AND s.TABLE_NAME = s2.TABLE_NAME AND s.INDEX_NAME = s2.INDEX_NAME
WHERE
t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */
AND t.TABLE_ROWS > 10 /* Only tables with some rows */
AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */
AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* unique indexes are perfect anyway */
ORDER BY
`sel %`, /* DESC for best non-unique indexes */
s.TABLE_SCHEMA,
s.TABLE_NAME
LIMIT 100
Mais conteúdo relacionado

Mais procurados

MySQL8.0_performance_schema.pptx
MySQL8.0_performance_schema.pptxMySQL8.0_performance_schema.pptx
MySQL8.0_performance_schema.pptxNeoClova
 
MySQL Atchitecture and Concepts
MySQL Atchitecture and ConceptsMySQL Atchitecture and Concepts
MySQL Atchitecture and ConceptsTuyen Vuong
 
MySQL Database Architectures - 2020-10
MySQL Database Architectures -  2020-10MySQL Database Architectures -  2020-10
MySQL Database Architectures - 2020-10Kenny Gryp
 
MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바NeoClova
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performanceoysteing
 
What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?Mydbops
 
MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바NeoClova
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performancePostgreSQL-Consulting
 
MySQL 상태 메시지 분석 및 활용
MySQL 상태 메시지 분석 및 활용MySQL 상태 메시지 분석 및 활용
MySQL 상태 메시지 분석 및 활용I Goo Lee
 
Parallel Query in AWS Aurora MySQL
Parallel Query in AWS Aurora MySQLParallel Query in AWS Aurora MySQL
Parallel Query in AWS Aurora MySQLMydbops
 
MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsOSSCube
 
MySQL Shell - the best DBA tool !
MySQL Shell - the best DBA tool !MySQL Shell - the best DBA tool !
MySQL Shell - the best DBA tool !Frederic Descamps
 
MySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software TestMySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software TestI Goo Lee
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Mydbops
 
MySQL: Indexing for Better Performance
MySQL: Indexing for Better PerformanceMySQL: Indexing for Better Performance
MySQL: Indexing for Better Performancejkeriaki
 
How to Take Advantage of Optimizer Improvements in MySQL 8.0
How to Take Advantage of Optimizer Improvements in MySQL 8.0How to Take Advantage of Optimizer Improvements in MySQL 8.0
How to Take Advantage of Optimizer Improvements in MySQL 8.0Norvald Ryeng
 
MySQL Performance Schema in MySQL 8.0
MySQL Performance Schema in MySQL 8.0MySQL Performance Schema in MySQL 8.0
MySQL Performance Schema in MySQL 8.0Mayank Prasad
 
MariaDB Optimization
MariaDB OptimizationMariaDB Optimization
MariaDB OptimizationJongJin Lee
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 MinutesSveta Smirnova
 

Mais procurados (20)

MySQL8.0_performance_schema.pptx
MySQL8.0_performance_schema.pptxMySQL8.0_performance_schema.pptx
MySQL8.0_performance_schema.pptx
 
MySQL Atchitecture and Concepts
MySQL Atchitecture and ConceptsMySQL Atchitecture and Concepts
MySQL Atchitecture and Concepts
 
MySQL Database Architectures - 2020-10
MySQL Database Architectures -  2020-10MySQL Database Architectures -  2020-10
MySQL Database Architectures - 2020-10
 
MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바MySQL Administrator 2021 - 네오클로바
MySQL Administrator 2021 - 네오클로바
 
Using Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query PerformanceUsing Optimizer Hints to Improve MySQL Query Performance
Using Optimizer Hints to Improve MySQL Query Performance
 
What is new in PostgreSQL 14?
What is new in PostgreSQL 14?What is new in PostgreSQL 14?
What is new in PostgreSQL 14?
 
MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바
 
Mysql security 5.7
Mysql security 5.7 Mysql security 5.7
Mysql security 5.7
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
MySQL 상태 메시지 분석 및 활용
MySQL 상태 메시지 분석 및 활용MySQL 상태 메시지 분석 및 활용
MySQL 상태 메시지 분석 및 활용
 
Parallel Query in AWS Aurora MySQL
Parallel Query in AWS Aurora MySQLParallel Query in AWS Aurora MySQL
Parallel Query in AWS Aurora MySQL
 
MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 Tips
 
MySQL Shell - the best DBA tool !
MySQL Shell - the best DBA tool !MySQL Shell - the best DBA tool !
MySQL Shell - the best DBA tool !
 
MySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software TestMySQL/MariaDB Proxy Software Test
MySQL/MariaDB Proxy Software Test
 
Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0Redo log improvements MYSQL 8.0
Redo log improvements MYSQL 8.0
 
MySQL: Indexing for Better Performance
MySQL: Indexing for Better PerformanceMySQL: Indexing for Better Performance
MySQL: Indexing for Better Performance
 
How to Take Advantage of Optimizer Improvements in MySQL 8.0
How to Take Advantage of Optimizer Improvements in MySQL 8.0How to Take Advantage of Optimizer Improvements in MySQL 8.0
How to Take Advantage of Optimizer Improvements in MySQL 8.0
 
MySQL Performance Schema in MySQL 8.0
MySQL Performance Schema in MySQL 8.0MySQL Performance Schema in MySQL 8.0
MySQL Performance Schema in MySQL 8.0
 
MariaDB Optimization
MariaDB OptimizationMariaDB Optimization
MariaDB Optimization
 
MySQL Performance Schema in 20 Minutes
 MySQL Performance Schema in 20 Minutes MySQL Performance Schema in 20 Minutes
MySQL Performance Schema in 20 Minutes
 

Semelhante a Explain that explain

15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performanceguest9912e5
 
Optimizing MySQL Queries
Optimizing MySQL QueriesOptimizing MySQL Queries
Optimizing MySQL QueriesAchievers Tech
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Altinity Ltd
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOAltinity Ltd
 
Mysqlppt
MysqlpptMysqlppt
MysqlpptReka
 
My SQL Skills Killed the Server
My SQL Skills Killed the ServerMy SQL Skills Killed the Server
My SQL Skills Killed the ServerdevObjective
 
Sydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution plansSydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution planspaulguerin
 
ABAP Programming Overview
ABAP Programming OverviewABAP Programming Overview
ABAP Programming Overviewsapdocs. info
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02wingsrai
 
Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01tabish
 
chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01tabish
 
Chapter 1 Abap Programming Overview
Chapter 1 Abap Programming OverviewChapter 1 Abap Programming Overview
Chapter 1 Abap Programming OverviewAshish Kumar
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02tabish
 
PHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & PinbaPHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & PinbaPatrick Allaert
 
MySQL Scaling Presentation
MySQL Scaling PresentationMySQL Scaling Presentation
MySQL Scaling PresentationTommy Falgout
 

Semelhante a Explain that explain (20)

15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
 
Optimizing MySQL Queries
Optimizing MySQL QueriesOptimizing MySQL Queries
Optimizing MySQL Queries
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
 
Mysqlppt
MysqlpptMysqlppt
Mysqlppt
 
Optimizando MySQL
Optimizando MySQLOptimizando MySQL
Optimizando MySQL
 
PHP tips by a MYSQL DBA
PHP tips by a MYSQL DBAPHP tips by a MYSQL DBA
PHP tips by a MYSQL DBA
 
My SQL Skills Killed the Server
My SQL Skills Killed the ServerMy SQL Skills Killed the Server
My SQL Skills Killed the Server
 
Sql killedserver
Sql killedserverSql killedserver
Sql killedserver
 
Sydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution plansSydney Oracle Meetup - execution plans
Sydney Oracle Meetup - execution plans
 
Sql for dbaspresentation
Sql for dbaspresentationSql for dbaspresentation
Sql for dbaspresentation
 
ABAP Programming Overview
ABAP Programming OverviewABAP Programming Overview
ABAP Programming Overview
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02
 
Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01Chapter 1abapprogrammingoverview-091205081953-phpapp01
Chapter 1abapprogrammingoverview-091205081953-phpapp01
 
chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01chapter-1abapprogrammingoverview-091205081953-phpapp01
chapter-1abapprogrammingoverview-091205081953-phpapp01
 
Chapter 1 Abap Programming Overview
Chapter 1 Abap Programming OverviewChapter 1 Abap Programming Overview
Chapter 1 Abap Programming Overview
 
Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02Abapprogrammingoverview 090715081305-phpapp02
Abapprogrammingoverview 090715081305-phpapp02
 
PHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & PinbaPHP applications/environments monitoring: APM & Pinba
PHP applications/environments monitoring: APM & Pinba
 
MySQL Scaling Presentation
MySQL Scaling PresentationMySQL Scaling Presentation
MySQL Scaling Presentation
 
Sql General
Sql General Sql General
Sql General
 

Último

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Explain that explain

  • 1. Explain that “Explain” The Road to Understanding “You Should Not Fight with the Database, the Database is your Friend” put together by Fabrizio Parrella PARTS ARE QUOTED OR COPIED OR REF FROM: http://devzone.zend.com/1436/the-zendcon-sessions-episode-17-sql-query-tuning-the-legend-of-drunken-query-master/ http://www.slideshare.net/phpcodemonkey/mysql-explain-explained
  • 2. get to know your friend ➲ Recognize the strengths and also the weaknesses of your database ➲ No database is perfect -- deal with it, you're not perfect either ➲ Think of both big things and small things BIG: Architecture, surrounding servers, caching SMALL: SQL coding, join rewrites, server config
  • 3. becoming friends ➲ Understand storage engine abilities and weaknesses ➲ Understand how the query cache and important buffers works ➲ Understand optimizer's limitations ➲ Understand what should and should not be done at the application level ➲ If you understand the above, you'll start to see the database as a friend and not an enemy
  • 4. the schema ➲ Basic foundation of performance ➲ Everything else depends on it ➲ Choose your data types wisely ➲ “Divide et Impera” the schema through partitioning A divide and conquer (D&C) algorithm works by recursively break down a problem into two or more sub- problems of the same (or related) type, until there become simple enough to be solved directly. The solution to the sub-problems is then combined to give a solution to the original problem. http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm
  • 5. size does matter !! smaller, smaller, SMALLER The more records you can fit into a single page of memory/disk, the faster your seeks and scans will be ➲ Do you really need that BIGINT? ➲ Use INT UNSIGNED for IPv4 addresses ➲ Use VARCHAR carefully Converted to CHAR when used in a temporary table ➲ Use TEXT sparingly Consider separate tables ➲ Use BLOBs very sparingly Use the filesystem for what it was intended
  • 6. real life example... handling IPv4 addresses CREATE TABLE Sessions ( session_id INT UNSIGNED NOT NULL AUTO_INCREMENT, ip_address INT UNSIGNED NOT NULL, // Compare to CHAR(15)... session_data TEXT NOT NULL, PRIMARY KEY (session_id), INDEX (ip_address) ) ENGINE=InnoDB; // Insert a new dummy record INSERT INTO Sessions VALUES (NULL, INET_ATON('192.168.0.2'), 'some session data'); SELECT session_id, ip_address as ip_raw, INET_NTOA(ip_address) as ip, session_data FROM Sessions WHERE ip_address BETWEEN INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255'); +------------+------------+-------------+-------------------+ | session_id | ip_raw | ip | session_data | +------------+------------+-------------+-------------------+ | 1 | 3232235522 | 192.168.0.2 | some session data | +------------+------------+-------------+-------------------+
  • 7. SETs and ENUMs ➲ Often sign of poor schema design ➲ Changing the definition will most likely require a full rebuild of the table ➲ Search functions like FIND_IN_SET() are inefficient compared to index operation on a join
  • 8. normalization, taking it too far DateDate ? http://thedailywtf.com/forums/thread/75982.aspx
  • 9. vertical partitioning ➲ Never mix frequently and infrequently accessed fields in a single table ➲ Splitting tables allows main records to consume the buffer pages without the extra data taking up space in memory ➲ Do you need FULLTEXT on your text columns (PRE 5.6.4)? CREATE TABLE Users ( user_id INT NOT NULL AUTO_INCREMENT, email VARCHAR(80) NOT NULL, display_name VARCHAR(50) NOT NULL, password CHAR(41) NOT NULL, first_name VARCHAR(25) NOT NULL, last_name VARCHAR(25) NOT NULL, address VARCHAR(80) NOT NULL, city VARCHAR(30) NOT NULL, province CHAR(2) NOT NULL, postcode CHAR(7) NOT NULL, interests TEXT NULL, bio TEXT NULL, signature TEXT NULL, skills TEXT NULL, PRIMARY KEY (user_id), UNIQUE INDEX (email) ) ENGINE=InnoDB; CREATE TABLE Users ( user_id INT NOT NULL AUTO_INCREMENT, email VARCHAR(80) NOT NULL, display_name VARCHAR(50) NOT NULL, password CHAR(41) NOT NULL, PRIMARY KEY (user_id), UNIQUE INDEX (email) ) ENGINE=InnoDB; CREATE TABLE UserExtra ( user_id INT NOT NULL first_name VARCHAR(25) NOT NULL last_name VARCHAR(25) NOT NULL address VARCHAR(80) NOT NULL city VARCHAR(30) NOT NULL province CHAR(2) NOT NULL postcode CHAR(7) NOT NULL interests TEXT NULL bio TEXT NULL signature TEXT NULL skills TEXT NULL PRIMARY KEY (user_id) FULLTEXT KEY (interests, skills) ) ENGINE=MyISAM;
  • 10. understand MySQL query cache ➲ You must understand your application's read/write patterns ➲ Internal query cache design is a compromise between CPU usage and read performance ➲ Stores the MYSQL_RESULT of a SELECT along with a hash of the SELECT SQL statement ➲ Any modification to any table involved in the SELECT invalidates the stored result ➲ Write applications to be aware of the query cache Use SELECT SQL_NO_CACHE
  • 11. coding like a master ➲ Be consistent (for crying out loud) ➲ Use ANSI SQL coding style (vs. Theta) ➲ Stop thinking in terms of iterators, for loops, while loops, etc ➲ Instead, think in terms of sets ➲ Break complex SQL statements (or business requests) into smaller, manageable chunks
  • 12. Consistency, consistency, CONSISTENCY !! ➲ Tabs and Spacing ➲ Upper and Lower Case ➲ Keywords, function names Nothing pisses offthe query master likeinconsistent SQL code! SELECT a.first_name, a.last_name, COUNT(*) as num_rentals FROM actor a INNER JOIN film f ON a.actor_id = f.actor_id GROUP BY a.actor_id ORDER BY num_rentals DESC, a.last_name, a.first_name LIMIT 10; vs. select first_name, a.last_name, count(*) AS num_rentals FROM actor a join film on a.actor_id = film.actor_id group by a.actor_id order by num_rentals DESC, a.last_name, a.first_name LIMIT 10; ➲ Aliases ➲ Consider your teammates ➲ Like your code, SQL is meant to be read, not written
  • 13. guidelines ➲ Beware of join hints “force index” can get “out of date” ➲ Just because it can be done in a single SQL statement doesn't meat it should ➲ ALWAYS test and benchmark your solution
  • 14. ANSI vs. THETA SELECT a.first_name, a.last_name, COUNT(*) as num_rentals FROM actor a INNER JOIN film_actor fa ON a.actor_id = fa.actor_id INNER JOIN film f ON fa.film_id = f.film_id INNER JOIN inventory I ON f.film_id = i.film_id INNER JOIN rental r ON r.inventory_id = i.inventory_id GROUP BY a.actor_id ORDER BY num_rentals DESC, a.last_name, a.first_name LIMIT 10; SELECT a.first_name, a.last_name, COUNT(*) as num_rentals FROM actor a, film f, film_actor fa, inventory i, rental r WHERE a.actor_id = fa.actor_id AND fa.film_id = f.film_id AND f.film_id = i.film_id AND r.inventory_id = i.inventory_id GROUP BY a.actor_id ORDER BY num_rentals DESC, a.last_name, a.first_name LIMIT 10; ANSI STYLE Explicitly declare JOIN conditions using the ON clause THETA STYLE Implicitly declare JOIN conditions in the WHERE clause
  • 15. why ANSI style kicks THETA style's A55 ➲ MySQL THETA style only supports INNER and CROSS join But MySQL ANSI style supports INNER, CROSS, LEFT, RIGHT, and NATURAL joins Mixing and matching both styles can lead to hard-to-read SQL code ➲ It is extremely easy to miss a join condition with THETA style Especially when joining many tables Forgetting a Join will produce a cartesian product (NOT GOOD !!!)
  • 16. WITHOUT THE STRENGHT OF EXPLAIN YOU WILL GET LOST IN THE FIELDS OF MISUNDERSTANDING how to test our SQL
  • 17. EXPLAIN the basics ➲ Provides the execution plan chosen by the MySQL optimizer ➲ Simply prepend the word EXPLAIN in front of your SELECT statement ➲ Each row represent a set of information for each table used in the SELECT
  • 18. EXPLAIN the columns ➲ select_type - type of “set” the data in this row contains (SIMPLE, DERIVATE, SUBQUERY, etc..) ➲ table - alias (or full table name if no alias) of the table or derived table from which the data in this set comes ➲ type - “access strategy” used to grab the data in this set (ALL, CONST, REF, etc...) ➲ possible_keys - keys available to optimizer for query ➲ keys - keys chosen by the optimizer ➲ key_len – number of bytes used from the keys ➲ ref - shows the column used in join relations ➲ rows - estimate of the number of rows in this set ➲ Extra - information the optimizer chooses to give you
  • 19. EXPLAIN the output EXPLAIN SELECT f.title, c.name FROM film f INNER JOIN film_category fc ON f.film_id = fc.film_id INNER JOIN category c ON fc.category_id = c.category_id WHERE f.title LIKE 'T%'\G *************************** 1. row *************************** select_type: SIMPLE table: c type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 16 Extra: *************************** 2. row *************************** select_type: SIMPLE table: fc type: ref possible_keys: PRIMARY, fk_film_category_category key: fk_film_category_category key_len: 1 ref: c.category_id rows: 1 Extra: using index *************************** 3. row *************************** select_type: SIMPLE table: f type: eq_ref possible_keys: PRIMARY, idx_title key: PRIMARY key_len: 2 ref: fc.film_id rows: 1 Extra: using where estimate row count available indexes and the chosen one a covering index was used
  • 20. EXPLAIN a real world example CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0 //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: NULL key: NULL rows: 14052 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ➲ The three most important columns returned by EXPLAIN possible_keys All possible indexes which MYSQL could have used Based on a series of very quick lookups and calculations key: chosen key rows: estimate of the scanned rows
  • 21. EXPLAIN a real world example ➲ Interpreting the result: No suitable indexes for this query MySQL has to do a full scan of the table Full table scans are almost always the slowest Full table scans are usually an indication that an index is needed CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0 //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: NULL key: NULL rows: 14052 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB;
  • 22. EXPLAIN a real world example ➲ MySQL has two indexes to choose from ➲ “reg” is not “sufficiently unique” the spread of the values can also be a factor (e.g. when 99% of rows contain the same value) ➲ Index “uniqueness” is called cardinality ➲ There is still room for a performance increase CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0 //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: conf, reg key: conf rows: 331 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE attendees ADD INDEX conf (conference_id), ADD INDEX reg (registration_status);
  • 23. EXPLAIN a real world example ➲ “reg_conf_index” is a much better choice ➲ Other keys are still available, just not as effective CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0 //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: reg, conf, reg_conf_index key: reg_conf_index rows: 204 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE attendees ADD INDEX reg_conf_index (registration_status, conference_id);
  • 24. EXPLAIN a real world example ➲ It seems that even without the “reg” index everything works just as expected CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE registration_status = 2 //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: reg_conf_index key: reg_conf_index rows: 372 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE attendees DROP INDEX reg, DROP INDEX conf;
  • 25. EXPLAIN a real world example ➲ Without the “conf” index we are back at square one ➲ The order in which the fields are defined in a composite index affects whether it is available to a query ➲ Potential workaround SELECT * FROM attendees WHERE conference_id = 123 AND registration_status > 0; CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: NULL key: NULL rows: 14502 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE attendees DROP INDEX reg, DROP INDEX conf;
  • 26. EXPLAIN a real world example ➲ Great, MySQL is using the index on “lastname”, which is good CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE lastname LIKE 'parr%' //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: lastname key: lastname rows: 234 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE attendees ADD INDEX lastname (lastname);
  • 27. EXPLAIN a real world example ➲ MySQL doesn't even try to use an index ! With a leading wildcard ('%arr%') the index on lastname cannot be used CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM attendees WHERE lastname LIKE '%arr%' //Let's only show the important parts for now *************************** 1. row *************************** table: attendees possible_keys: NULL key: NULL rows: 14052 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE attendees ADD INDEX lastname (lastname);
  • 28. EXPLAIN a real world example (pre MySQL 5.1) ➲ MySQL doesn't use an index because of the OR ➲ MySQL performs a full table scan ➲ Workaround: use “UNION” (sketched below) ➲ Workaround: add a composite INDEX ALTER TABLE conferences ADD INDEX location_topic (location_id, topic_id); CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM conferences WHERE location_id = 2 OR topic_id IN (4,6,1) //Let's only show the important parts for now *************************** 1. row *************************** table: conferences possible_keys: location_id, topic_id key: NULL rows: 5043 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; ALTER TABLE conferences ADD INDEX location_id (location_id), ADD INDEX topic_id (topic_id);
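The UNION workaround mentioned above, sketched against the same conferences table: each branch can now use its own single-column index, and UNION (as opposed to UNION ALL) removes the rows that match both conditions.

SELECT * FROM conferences WHERE location_id = 2
UNION
SELECT * FROM conferences WHERE topic_id IN (4,6,1);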
  • 29. EXPLAIN a real world example ➲ Looks like we need an index on “conference_id” on attendees (a possible fix is sketched below) ➲ How many total ROWS are estimated? CREATE TABLE `attendees` ( `attendee_id` int(11) NOT NULL, `lastname` varchar(50) NOT NULL, `conference_id` int(11) NOT NULL, `registration_status` tinyint(4) NOT NULL, PRIMARY KEY (`attendee_id`) ) ENGINE=InnoDB; EXPLAIN SELECT * FROM conferences c INNER JOIN attendees a USING (conference_id) WHERE c.location_id = 2 AND c.topic_id IN (4,6,1) AND a.registration_status > 1 //Let's only show the important parts for now *************************** 1. row *************************** table: c possible_keys: location_topic key: location_topic rows: 15 *************************** 2. row *************************** table: a possible_keys: NULL key: NULL rows: 14502 CREATE TABLE `conferences` ( `conference_id` int(11) NOT NULL, `location_id` int(11) NOT NULL, `topic_id` int(11) NOT NULL, `date` date NOT NULL, PRIMARY KEY (`conference_id`) ) ENGINE=InnoDB; 15 x 14502 = 217,530
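A hedged fix for the join above (the index name conf_reg is made up, and the improved plan is only an expectation, not output from the original deck):

ALTER TABLE attendees ADD INDEX conf_reg (conference_id, registration_status);

-- Re-running the EXPLAIN, the "a" row should now show type: ref with key: conf_reg,
-- turning 15 x 14502 examined rows into 15 x (rows per matching conference)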
  • 31. EXPLAIN the type ➲ CONST: SELECT * FROM table WHERE field = “value”;  The field needs to be indexed with a unique non-nullable key  If non-unique or nullable the type will be “ref”  It refers to when a table with a single row is referenced in the SELECT  Can be propagated across multiple joined columns: EXPLAIN SELECT r.* FROM rental r INNER JOIN customer c ON r.customer_id = c.customer_id WHERE r.rental_id = 13\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: r type: const possible_keys: PRIMARY,idx_fk_customer_id key: PRIMARY key_len: 4 ref: const rows: 1 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: c type: const possible_keys: PRIMARY key: PRIMARY key_len: 2 ref: const /* Here is where the propagation occurs...*/ rows: 1 Extra: 2 rows in set (0.00 sec)
  • 32. EXPLAIN the type ➲ RANGE: SELECT * FROM table WHERE field BETWEEN “value” AND “value”;  The field needs to be indexed  If too many records are estimated, it won't be used EXPLAIN SELECT * FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: range possible_keys: rental_date key: rental_date key_len: 8 ref: NULL rows: 364 Extra: Using where 1 row in set (0.00 sec)
  • 33. EXPLAIN the type ➲ ALL: SELECT * FROM table WHERE field BETWEEN “value” AND “far away from starting value”;  No WHERE condition (duh)  No index on the field in the WHERE condition  Poor selectivity on the indexed field  Too many records meet the WHERE condition  SEEK: jumps into random places to fetch the data and repeats for each piece of data needed  SCAN: jumps to the start and sequentially reads the data  For large amounts of data, a SCAN operation tends to be more efficient than multiple SEEK operations  Using SELECT * FROM EXPLAIN SELECT * FROM rental WHERE rental_date BETWEEN '2001-01-14' AND '2012-12-31'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ALL possible_keys: rental_date /* large range forces a full scan */ key: NULL key_len: NULL ref: NULL rows: 16298 Extra: Using where 1 row in set (0.00 sec)
  • 34. EXPLAIN the type ➲ INDEX_MERGE: SELECT * FROM table WHERE field = “value” AND field1 = “value”;  Introduced with the MySQL 5.0 optimizer  Allows the optimizer to use more than one index to satisfy a join condition  Prior to MySQL 5.0, only one index per table could be used  In case of OR conditions, MySQL < 5.0 would use a full table scan EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12) OR rental_date = '2006-02-01'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: index_merge possible_keys: PRIMARY,rental_date key: rental_date,PRIMARY key_len: 8,4 ref: NULL rows: 4 Extra: Using sort_union(rental_date,PRIMARY); Using where 1 row in set (0.02 sec)
  • 36. EXPLAIN the Extra ➲ “Extra” shows additional operations invoked to get your result set ➲ Some common values are (more are discussed in the MySQL manual):  Using where  Using temporary  Using filesort  Using index EXPLAIN SELECT * FROM rental WHERE rental_id IN (10,11,12) OR rental_date = '2006-02-01'\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: index_merge possible_keys: PRIMARY,rental_date key: rental_date,PRIMARY key_len: 8,4 ref: NULL rows: 4 Extra: Using sort_union(rental_date,PRIMARY); Using where 1 row in set (0.02 sec)
  • 37. EXPLAIN the Extra ➲ Using filesort: AVOID ➲ Avoid because  Doesn't use an index  Involves a full scan  Uses a generic algorithm (one size fits all)  Uses the filesystem (BAD !!)  Gets slower with more data ➲ It's not all that bad  Sometimes unavoidable - ORDER BY RAND()  Acceptable provided you get to your result as quickly as possible, and keep it predictably small EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 ORDER BY lastname *************************** 1. row *************************** table: attendees possible_keys: conference_id key: conference_id rows: 331 Extra: Using filesort
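One hedged way to remove the filesort in the example above: extend the index so that, within a conference, rows already come back in lastname order (the index name conf_lastname is made up):

ALTER TABLE attendees ADD INDEX conf_lastname (conference_id, lastname);

EXPLAIN SELECT * FROM attendees WHERE conference_id = 123 ORDER BY lastname
-- Expected: key: conf_lastname and no more "Using filesort" in Extra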
  • 38. EXPLAIN the Extra ➲ Using index: GOOD ➲ Celebrate because  MySQL got your results just by consulting the index  MySQL didn't need to look at the table to get the results (opening the table is expensive)  Fastest way to get your data ➲ Particularly useful...  When you are interested in a single piece of data or an id  When you are interested in COUNT(), SUM(), AVG(), etc. of a field EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123 *************************** 1. row *************************** table: attendees possible_keys: conference_id key: conference_id rows: 331 Extra: ALTER TABLE attendees ADD INDEX conf_age (conference_id, age); EXPLAIN SELECT AVG(age) FROM attendees WHERE conference_id = 123 *************************** 1. row *************************** table: attendees possible_keys: conference_id, conf_age key: conf_age rows: 331 Extra: Using index Nothing is actually wrong with this query, it just could be quicker Aside from caching, this is the fastest way to get your data
  • 39. INDEXES... your schema's phone book ➲ Speed up SELECTs, but slow down modifications ➲ Make sure you have indexes on columns used in WHERE, ON, and GROUP BY clauses ➲ Always ensure that JOIN conditions are indexed AND have identical data types ➲ Good keys: Selectivity = % of distinct values = distinct values / number of rows (a unique or primary key is always 1) Low selectivity: maybe you can put the column in a multi-column index Prefix? Suffix? It depends on your application
  • 40. indexed columns and functions don't mix A full table scan is used because a function (LEFT) is operating on the lastname column. Let's fix this... EXPLAIN SELECT * FROM attendees WHERE LEFT(lastname, 2) = 'Pa' *************************** 1. row *************************** id: 1 select_type: SIMPLE table: attendees type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 951 Extra: Using where EXPLAIN SELECT * FROM attendees WHERE lastname LIKE 'Pa%' *************************** 1. row *************************** id: 1 select_type: SIMPLE table: attendees type: range possible_keys: lastname key: lastname key_len: 767 ref: NULL rows: 15 Extra: Using where
  • 41. let's fix multiple issues with a SELECT query First, we are operating on an indexed column (order_created) with a function – let's fix that: SELECT * FROM orders WHERE TO_DAYS(CURRENT_DATE()) - TO_DAYS(order_created) <= 7; Even if we removed the function in the WHERE expression, we still have a non-deterministic function in the statement, which prevents this query from being placed in the query cache – let's fix that: SELECT * FROM orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY; We replaced the function with a constant, however we are specifying a SELECT * instead of the actual fields that we need. What if there is a TEXT field in the table that we don't even need to see? Having it included in the result means a larger result set which may not fit in the query cache and may force a disk-based temporary table – let's fix that: SELECT * FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY; SELECT order_id, customer_id, order_total, date_created FROM orders WHERE order_created >= '2013-01-13' - INTERVAL 7 DAY;
  • 42. good indexes vs. bad indexes Don't forget that MySQL string indexes allow only 1000 characters (333 using UTF-8). Let's say you have 11,000,000 records in a table called “USERS” with the following fields: ➲ user, firstname, lastname, gender, email, age, country_id Our application performs searches on the following fields: ➲ user ➲ firstname, lastname, gender ➲ email It is obvious to create indexes on user and email, especially if they are unique, but what about the other fields? ➲ “gender” can be M or F, so selectivity is very low: 2/11,000,000 ≈ 0. Best would be to remove the index on gender if you have it ➲ “firstname”/”lastname” depend on the uniqueness of the values stored. Use SELECT DISTINCT to calculate the selectivity (see the sketch below): if it is above 15%, keep the index; if it is below 15%, you might want to create a composite INDEX instead
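A hedged sketch of that selectivity check on the hypothetical USERS table described above:

-- Selectivity = distinct values / total number of rows
SELECT
  COUNT(DISTINCT lastname) / COUNT(*) AS lastname_selectivity,
  COUNT(DISTINCT firstname) / COUNT(*) AS firstname_selectivity,
  COUNT(DISTINCT gender) / COUNT(*) AS gender_selectivity
FROM users;
-- gender comes out close to 0, so at best it earns a spot inside a composite index;
-- the lastname/firstname numbers decide between standalone and composite indexes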
  • 43. removing crappy or redundant indexes SELECT t.TABLE_SCHEMA AS `db`, t.TABLE_NAME AS `table`, s.INDEX_NAME AS `index name`, s.COLUMN_NAME AS `field name`, s.SEQ_IN_INDEX `seq in index`, s2.max_columns AS `# cols`, s.CARDINALITY AS `card`, t.TABLE_ROWS AS `est rows`, ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %` FROM INFORMATION_SCHEMA.STATISTICS s INNER JOIN INFORMATION_SCHEMA.TABLES t ON s.TABLE_SCHEMA = t.TABLE_SCHEMA AND s.TABLE_NAME = t.TABLE_NAME INNER JOIN ( SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, MAX(SEQ_IN_INDEX) AS max_columns FROM INFORMATION_SCHEMA.STATISTICS WHERE TABLE_SCHEMA != 'mysql' GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME ) AS s2 ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA AND s.TABLE_NAME = s2.TABLE_NAME AND s.INDEX_NAME = s2.INDEX_NAME WHERE t.TABLE_SCHEMA != 'mysql' /* Filter out the mysql system DB */ AND t.TABLE_ROWS > 10 /* Only tables with some rows */ AND s.CARDINALITY IS NOT NULL /* Need at least one non-NULL value in the field */ AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* unique indexes are perfect anyway */ ORDER BY `sel %`, /* DESC for best non-unique indexes */ s.TABLE_SCHEMA, s.TABLE_NAME LIMIT 100