3. What is a Query Optimizer?
- A query optimizer attempts to determine the most efficient
way to execute a given query by considering
the possible query plans.
4. Topics
1. Data Types and Schema Design
2. Indexing
3. Query Optimization
- These three go hand in hand to achieve high
query performance.
5. Data Types and Schema Design Optimization
Good logical and physical design is the cornerstone of high
performance, and you must design your schema for the specific
queries you will run.
6. Smaller is Usually Better.
- Use the smallest data type that can correctly store and represent
your data. Smaller data types are usually faster, because they use
less space on disk, in memory, and in the CPU cache.
- For storage and computational purposes, INT(1) is identical to
INT(20).
- The “1” in INT(1) is a display width: it does not restrict
the legal range of values, but simply specifies the number of
characters MySQL’s interactive tools (such as the command-line
client) will reserve for display purposes.
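A minimal sketch of the "smallest type" rule, assuming only MySQL's documented integer storage sizes (1, 2, 3, 4 and 8 bytes). The helper name smallest_int_type is hypothetical and not part of MySQL.

```python
# Storage size in bytes of MySQL's integer types, smallest first.
INT_TYPES = [
    ("TINYINT", 1),
    ("SMALLINT", 2),
    ("MEDIUMINT", 3),
    ("INT", 4),
    ("BIGINT", 8),
]


def smallest_int_type(min_value, max_value):
    """Return the smallest integer column type that can hold the range.

    Hypothetical helper for illustration; it is not part of MySQL.
    """
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    for name, size in INT_TYPES:
        bits = size * 8
        if min_value >= 0:
            # UNSIGNED range: 0 .. 2^bits - 1
            if max_value <= 2**bits - 1:
                return name + " UNSIGNED"
        elif min_value >= -(2 ** (bits - 1)) and max_value <= 2 ** (bits - 1) - 1:
            # Signed range: -2^(bits-1) .. 2^(bits-1) - 1
            return name
    raise ValueError("range does not fit in any MySQL integer type")


print(smallest_int_type(0, 200))         # TINYINT UNSIGNED
print(smallest_int_type(-5, 200))        # SMALLINT
print(smallest_int_type(0, 4294967295))  # INT UNSIGNED
```

Note that the display width never appears in the calculation: only the byte size decides the range.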
7. Simple is Good.
● Integers are cheaper to compare than characters, because
character sets and collations (sorting rules) make character
comparisons complicated.
● Use an unsigned INT rather than a dotted-quad string to store an IPv4 network
address.
○ INSERT INTO user(ip) VALUES (INET_ATON('127.0.0.1')); -- inserts 2130706433
○ SELECT INET_NTOA(ip) FROM user; -- displays 127.0.0.1
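The conversion itself can be checked outside the database. The sketch below mirrors INET_ATON and INET_NTOA with Python's standard ipaddress module; the two function names are only borrowed from MySQL for readability and are defined here, not imported from it.

```python
import ipaddress


def inet_aton(dotted_quad):
    """Convert a dotted-quad IPv4 string to an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted_quad))


def inet_ntoa(number):
    """Convert an unsigned 32-bit integer back to a dotted-quad string."""
    return str(ipaddress.IPv4Address(number))


print(inet_aton("127.0.0.1"))  # 2130706433
print(inet_ntoa(2130706433))   # 127.0.0.1
```

Addresses from 128.0.0.0 upward exceed the signed INT range, which is why the column should be INT UNSIGNED.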
8. Simple is Good.
- Use MySQL’s built-in DATETIME and TIMESTAMP types for date
and time rather than VARCHAR.
9. Avoid NULL if Possible.
- It’s usually best to specify columns as NOT NULL unless you intend
to store NULL in them.
- It’s harder for MySQL to optimize queries that refer to nullable columns,
because they make indexes, index statistics, and value comparisons more
complicated.
- The performance improvement from changing NULL columns to NOT
NULL is usually small, so don’t make it a priority to find and change them
on an existing schema unless you know they are causing problems.
However, if you’re planning to index columns, avoid making them nullable
if possible.
10. Schema Design Gotchas in MySQL
1. Too many columns
Q. Storing the value 'hello' requires the same amount of space
in a VARCHAR(5) and a VARCHAR(200) column. Is there any
advantage to using the shorter column?
A. The larger column can use much more memory, because
MySQL often allocates fixed-size chunks of memory to hold values
internally.
11. Schema Design Gotchas in MySQL
2. Too many joins
o Limit each query to a dozen or fewer joins.
o MySQL has a limit of 61 tables per join.
3. Beware of overusing ENUM.
4. Other tips:
o If you’re not going to store negative numbers, use UNSIGNED.
o If you’re not sure about the length, VARCHAR(100) is still much better
than VARCHAR(255).
12. Index Optimization
Indexes are critical for good performance, and become more
important as your data grows larger.
Index optimization is perhaps the most powerful way to improve
query performance.
13. Basic Rule for Choosing Columns to Index
- Choose the columns most commonly used in
your WHERE clauses.
14. Index
Table t (columns a, b, c) with key(b)
SELECT * FROM t WHERE b = 7;
[Figure: point queries. Table t and its index key(b), showing the lookup b=7 and its result rows.]
16. Cardinality
• Number of unique values in a column.
• Higher cardinality = more unique values.
• If a column has high cardinality and is commonly
used in WHERE clauses, it can be a good candidate for an index.
col_cardinality = SELECT COUNT(DISTINCT col_name) FROM tbl_name
17. Selectivity
- Selectivity describes how well an index narrows down rows: the ratio of distinct values to total rows.
- Selectivity Formula
S(I) = d/n * 100%
S(I) = selectivity
d = cardinality
n = number of records
- As a query:
SELECT COUNT(DISTINCT column_name) / COUNT(column_name) * 100 FROM tbl_name;
18. Selectivity
- Suppose you have a user base of 10,000 users. You have an index on the country field and you want to run this
query:
SELECT * FROM users WHERE country='Netherlands';
You have a cardinality of 10 (there are 10 unique countries in your user base), so the selectivity will be:
selectivity = 10 / 10000 * 100% = 0.1%
which is very low (although much lower values occur in practice).
MySQL estimates the selectivity of an index it is planning to use and, based on that, decides whether the
index is useful. With such a low selectivity, it is guessing that there are so many duplicate values that
skipping the index will probably be faster. This is part of the cost-based optimizer: when it decides that
browsing an index is too costly (because of the low selectivity), it will not use that index.
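The arithmetic can be reproduced with a few lines of Python. This is a sketch on made-up data (10,000 users spread evenly over 10 hypothetical country names), not a measurement from a real table.

```python
def selectivity(values):
    """Return selectivity as a percentage: distinct values / rows * 100."""
    rows = len(values)
    if rows == 0:
        raise ValueError("selectivity is undefined for an empty column")
    return 100.0 * len(set(values)) / rows


# Hypothetical data: 10,000 users spread evenly over 10 countries.
countries = ["country_%d" % (i % 10) for i in range(10000)]
print(selectivity(countries))  # 0.1

# A unique column (for example a primary key) reaches 100%.
print(selectivity(list(range(10000))))  # 100.0
```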
19. Selectivity
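- As a rule of thumb, when the selectivity percentage is below about 30%, the optimizer is likely
to choose a table scan. This is a heuristic rather than a fixed cutoff; the actual decision comes from
the optimizer’s cost estimates.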
20. Rules on selectivity
1. Put the more selective column first among the conditions in the WHERE
clause.
2. If the WHERE clause contains an equality and a range condition,
put the equality condition first.
a. Ex. SELECT col1, col2 FROM tbl_name WHERE col1=2 AND
col2 > 5;
3. If the conditions include one or more inequality (range) columns, make
the first inequality column in the index as selective as possible.
a. Ex. col2 >= 5 AND col3 > 5
21. Isolating the Column
MySQL generally can’t use indexes on columns unless the
columns are isolated in the query. “Isolating” the column means it
should not be part of an expression or be inside a function in the
query.
Wrong: mysql> SELECT actor_id FROM sakila.actor WHERE
actor_id + 1 = 5;
Correct: mysql> SELECT actor_id FROM sakila.actor WHERE
actor_id = 4;
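The effect can be observed with any engine that exposes its query plan. The sketch below uses SQLite purely as a stand-in because it ships with Python; MySQL's EXPLAIN output looks different, and the table here is a hypothetical two-column version of sakila.actor.

```python
import sqlite3

# SQLite stands in for MySQL here only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_actor_id ON actor (actor_id)")


def plan(sql):
    """Return the plan description lines SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the text


# Column wrapped in an expression: every entry has to be examined.
wrapped = plan("SELECT actor_id FROM actor WHERE actor_id + 1 = 5")
# Column isolated: the index can be searched for the value directly.
isolated = plan("SELECT actor_id FROM actor WHERE actor_id = 4")

print(wrapped)   # a SCAN
print(isolated)  # a SEARCH that names idx_actor_id
```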
22. Prefix Indexes for Indexing TEXT or BLOB Columns
If you’re indexing BLOB or TEXT columns, or very long
VARCHAR columns, you must define prefix indexes, because MySQL
disallows indexing their full length. The trick is to choose a prefix
that’s long enough to give good selectivity, but short enough to
save space. The prefix should be long enough to make the index
nearly as useful as it would be if you’d indexed the whole column.
What is a prefix index?
23. Prefix Indexing
A prefix index lets you specify how many leading characters (bytes, for binary
types) to index, which can reduce index size or allow you to index the larger
data types (e.g. BLOB/TEXT).
Ex. CREATE INDEX part_of_address ON user_address(address(10));
24. How to choose a good prefix index
1. Check the full selectivity of your column.
Ex.
2. Check selectivity of several prefix lengths.
Ex.
25. Index Optimization
3. Choose the prefix length whose selectivity is almost the same as
the full column selectivity.
In the example, the selectivity of prefix length 7 is almost the same
as the full column selectivity, so we will use it for the index.
4. Create the prefix index.
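The four steps can be sketched in Python on sample data. Everything below is hypothetical: the address values, the helper names and the 0.05 tolerance are illustrative assumptions. Against a real table the same ratios would typically come from COUNT(DISTINCT LEFT(col, n)) / COUNT(*).

```python
def prefix_selectivity(values, length=None):
    """Selectivity (distinct / rows) of a column, or of its first `length` characters."""
    if not values:
        raise ValueError("need at least one value")
    keys = values if length is None else [v[:length] for v in values]
    return len(set(keys)) / len(values)


def shortest_good_prefix(values, tolerance=0.05):
    """Shortest prefix length whose selectivity is within `tolerance` of the full column."""
    full = prefix_selectivity(values)
    longest = max(len(v) for v in values)
    for length in range(1, longest + 1):
        if full - prefix_selectivity(values, length) <= tolerance:
            return length
    return longest


# Hypothetical sample data; real checks should run against the actual table.
addresses = [
    "12 Main Street", "12 Main Road", "14 Main Street",
    "7 Oak Avenue", "7 Oak Lane", "9 Pine Court",
]
for n in (2, 4, 8, 12):
    print(n, round(prefix_selectivity(addresses, n), 2))
print("full", prefix_selectivity(addresses))
print("chosen length", shortest_good_prefix(addresses))
```

The tolerance is a judgment call: a smaller value gives a longer, more selective prefix at the cost of a larger index.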
26. Multicolumn Indexes
- Common mistakes are to index many or all of the columns
separately, or to index columns in the wrong order.
- An index can contain at most 16 columns.
The first mistake: indexing many columns separately.
28. Your query
SELECT * FROM t where c1 = ? and c2 = ?;
SELECT * FROM t where c1 = ?;
29. Clustering Index
Every InnoDB table has a special index called the clustered
index where the data for the rows is stored. Typically, the
clustered index is synonymous with the primary key.
A clustered index means that the records are physically stored
in order (at least near each other), based on the index.
A table can have only one clustered index.
TokuDB can have multiple clustering indexes.
30. SELECT name, birthday, address FROM user
WHERE id = ? AND sex = 'M' ORDER BY name
ASC LIMIT 10;
31. Creating indexes for the columns in the query’s WHERE clause is the basic
rule of thumb, right?
Is it possible to index everything the query needs?
32. Covering Index
Indexes need to be designed for the whole query, not just the
WHERE clause.
A covering index contains (or “covers”) all the data needed to satisfy a query.
33. Sample Query
SELECT sum(value)
FROM Table1
WHERE item_id=? AND category_id = ?
GROUP BY customer_id;
ALTER TABLE Table1 ADD INDEX
t1_index(item_id, category_id, customer_id, value);
34. Check if used covering index
mysql> EXPLAIN SELECT sum(value)
FROM Table1
WHERE item_id=? AND category_id = ?
GROUP BY customer_id;
*************************** 1. row ***************************
table: Table1
…..
possible_keys: t1_index
key: t1_index
….
Extra: Using where; Using index
“Using index” in the Extra column indicates that the query used a covering index.
35. Check if used covering index
mysql> EXPLAIN SELECT name FROM City WHERE CountryCode =
'USA' AND District = 'Alaska' AND population > 10000\G
*************************** 1. row ***************************
table: City
type: range
possible_keys: cov1
key: cov1
key_len: 27
ref: NULL
rows: 1
Extra: Using where; Using index
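A similar check can be tried locally without a MySQL server. The sketch below uses SQLite as a stand-in (its plan text says COVERING INDEX where MySQL's Extra column says Using index). The note column is a hypothetical addition so that one query needs data the index does not hold.

```python
import sqlite3

# SQLite is a stand-in for MySQL; its plan text differs from MySQL's EXPLAIN.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 ("
    "item_id INTEGER, category_id INTEGER, customer_id INTEGER, "
    "value INTEGER, note TEXT)"  # `note` is a hypothetical extra column
)
conn.execute(
    "CREATE INDEX t1_index ON Table1 (item_id, category_id, customer_id, value)"
)


def plan(sql):
    """Return the plan description SQLite reports for a query, as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Every referenced column is in t1_index, so the table is never touched.
covered = plan(
    "SELECT sum(value) FROM Table1 "
    "WHERE item_id = 1 AND category_id = 2 GROUP BY customer_id"
)
# `note` is not in the index, so each match needs a lookup in the table.
not_covered = plan(
    "SELECT sum(value), max(note) FROM Table1 "
    "WHERE item_id = 1 AND category_id = 2 GROUP BY customer_id"
)

print(covered)      # mentions COVERING INDEX t1_index
print(not_covered)  # uses t1_index but must also read the table rows
```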
36. Covering Index: 3 Star
What are the three stars of indexing?
See:
Relational Database Index Design and the Optimizers,
by Tapio Lahdenmaki and Mike Leach (Wiley).
37. Column Order in Covering Index
1. Const or equality
WHERE a = 5
2. Range
WHERE a = 5 AND b > 5
3. Order or Group By
WHERE a = 5 AND b > 5 GROUP BY c
WHERE a = 5 AND b > 5 ORDER BY c DESC
4. Select
count(d), sum(d)
38. What are Bad Indexes?
Duplicate indexes: always bad.
Redundant indexes: generally bad
Low-cardinality indexes: depends
Unused indexes: always bad
Indexing all columns.
39. Did you know?
Key: key(item_id, person_id)
Query: SELECT * FROM t1 WHERE item_id=5
ORDER BY person_id ASC;
No separate sort operation is needed here: the index already returns the rows for item_id=5 in person_id order.
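One way to see the difference is to compare the plan for this query with one that orders by a column outside the index. The sketch uses SQLite as a stand-in and adds a hypothetical price column. SQLite reports the extra step as a temporary B-tree, whereas MySQL reports it as "Using filesort" in Extra.

```python
import sqlite3

# SQLite stands in for MySQL; MySQL reports the extra sort as "Using filesort".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (item_id INTEGER, person_id INTEGER, price INTEGER)"
)  # `price` is a hypothetical column that is not part of the index
conn.execute("CREATE INDEX k ON t1 (item_id, person_id)")


def plan(sql):
    """Return the plan description SQLite reports for a query, as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Within item_id = 5 the index entries are already ordered by person_id.
by_index = plan("SELECT * FROM t1 WHERE item_id = 5 ORDER BY person_id ASC")
# price is not in the index, so the matching rows must be sorted separately.
by_sort = plan("SELECT * FROM t1 WHERE item_id = 5 ORDER BY price ASC")

print(by_index)
print(by_sort)
```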
41. Slow Query Basics: Optimize Data Access
The most basic reason a query doesn’t
perform well is because it’s working with
too much data.
42. Fetching more rows than needed
Ex. A SELECT query that returns many rows when only 10 are needed
for display on the front page.
Solution: Add a LIMIT clause so that only the needed rows are fetched.
43. Fetching all columns from a multitable join
If you want to retrieve all actors who appear in the film Academy
Dinosaur, don’t write the query this way:
mysql> SELECT * FROM sakila.actor
-> INNER JOIN sakila.film_actor USING(actor_id)
-> INNER JOIN sakila.film USING(film_id)
-> WHERE sakila.film.title = 'Academy Dinosaur';
That returns all columns from all three tables. Instead, write the query as follows:
mysql> SELECT sakila.actor.* FROM sakila.actor...;
44. Fetching all columns
Do you really need all columns? Probably not.
Retrieving all columns can prevent optimizations such
as covering indexes, as well as adding I/O, memory,
and CPU overhead for the server.
But it’s OK when your application uses a caching
system.
45. Fetching the same data repeatedly
Solution:
Cache the data in the application.
46. The Query Optimizer
MySQL uses a cost-based optimizer, which
means it tries to predict the cost of various
execution plans and choose the least
expensive.
47. The MySQL Optimizer Chooses:
1. Which indexes will be useful.
2. Which indexes should be avoided.
3. Which index is best when there is more than
one.
48. Book to Read
High Performance MySQL, 3rd Edition
Optimization, Backups, and Replication
By Baron Schwartz, Peter Zaitsev, Vadim Tkachenko
49. Video to watch for Indexing
http://www.youtube.com/watch?v=AVNjqgf7zNw&list=FLUvfNgLXOkXVKfQyO5mQ6w&index=1