SlideShare a Scribd company logo
1 of 48
Download to read offline
Join-fu: The Art of SQL

                                                       Jay Pipes
                                    Community Relations Manager
                                                          MySQL
                                                 jay@mysql.com
                                              http://jpipes.com




These slides released under the Creative Commons Attribution­Noncommercial­Share Alike 3.0 License
before we start

    ●
        Who am I?
           –   Just some dude who works at MySQL
               (eh...Sun)
           –   Oh, I co-wrote a book on MySQL
           –   Active PHP/MySQL community member
           –   Other than that, semi-normal geek,
               married, 2 dogs, 2 cats, blah blah
    ●
        This talk is about how to code your
        app to get the best performance
        out of your (My)SQL

09/18/08                     zendcon08 - join-fu    2
system architecture of MySQL


                                  Clients                                              Query
                                                                                       Cache




                    Query
                    Cache                                      Net I/O                 Parser




                                 “Packaging”
                                                                                      Optimizer




                                                Pluggable Storage Engine API




                                                                                                    Cluster
           MyISAM       InnoDB     MEMORY          Falcon         Archive      PBXT       SolidDB
                                                                                                     (Ndb)




09/18/08                                    zendcon08 - join-fu                                               3
the schema

    ●
        Basic foundation of
        performance
    ●   Normalize first, de-normalize
        later
    ●
        Smaller, smaller, smaller
    ●   Divide and conquer                   The Leaning Tower of Pisa
                                             from Wikipedia:
    ●
        Understand benefits and              “Although intended to stand vertically, the
                                             tower began leaning to the southeast

        disadvantages of different           soon after the onset of construction in
                                             1173 due to a poorly laid foundation
                                             and loose substrate that has allowed the
        storage engines                      foundation to shift direction.”




09/18/08               zendcon08 - join-fu                                            4
taking normalization way too far


                        Hmm......
                        DateDate?




 http://thedailywtf.com/forums/thread/75982.aspx


09/18/08                                           zendcon08 - join-fu   5
smaller, smaller, smaller

                                         The more records you can fit into a single page of 
                                        memory/disk, the faster your seeks and scans will be
                                   ●   Do you really need that BIGINT?
                                   ●   Use INT UNSIGNED for IPv4 addresses
                                   ●   Use VARCHAR carefully
                                        –   Converted to CHAR when used in a
                                            temporary table
The Pygmy Marmoset
world's smallest monkey
                                   ●   Use TEXT sparingly
This picture is a cheap stunt
intended to induce kind feelings
                                        –   Consider separate tables
for the presenter.
                                   ●   Use BLOBs very sparingly
Oh, and I totally want one of
these guys for a pet.
                                        –   Use the filesystem for what it was intended


09/18/08                                zendcon08 - join-fu                                    6
handling IPv4 addresses

           CREATE TABLE Sessions (
             session_id INT UNSIGNED NOT NULL AUTO_INCREMENT
           , ip_address INT UNSIGNED NOT NULL // Compare to CHAR(15)...
           , session_data TEXT NOT NULL
           , PRIMARY KEY (session_id)
           , INDEX (ip_address)
                                            // Insert a new dummy record
           ) ENGINE=InnoDB;
                                            INSERT INTO Sessions VALUES
                                            (NULL, INET_ATON('192.168.0.2'), 'some session data');


                                                      INSERT INTO Session VALUES (NULL, 3232235522, 'some session data');


           // Find all sessions coming from a local subnet
           SELECT
             session_id
           , ip_address as ip_raw
           , INET_NTOA(ip_address) as ip
           , session_data
           FROM Sessions
           WHERE ip_address
           BETWEEN INET_ATON('192.168.0.1')   WHERE ip_address BETWEEN 3232235521 AND 3232235775
           AND INET_ATON('192.168.0.255');

           mysql> SELECT session_id, ip_address as ip_raw, INET_NTOA(ip_address) as ip, session_data
               -> FROM Sessions
               -> WHERE ip_address BETWEEN
               -> INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255');
           +------------+------------+-------------+-------------------+
           | session_id | ip_raw     | ip          | session_data      |
           +------------+------------+-------------+-------------------+
           |          1 | 3232235522 | 192.168.0.2 | some session data |
           +------------+------------+-------------+-------------------+




09/18/08                                      zendcon08 - join-fu                                                           7
how much storage space is consumed?

    ●
        With this definition, how many bytes will the
        “a” column consume per row?

  CREATE TABLE t1 (
    a INT(1) UNSIGNED NOT NULL
  );

    ●
        The number in parentheses is the ZEROFILL
        argument, not the storage space
    ●
        INT takes 4 bytes of space
           –   Regardless of the UNSIGNED or NOT NULL

09/18/08             zendcon08 - legend of drunken query master   8
divide et impera

    ●
        Vertical partitioning
           –   Split tables with many columns
               into multiple tables
    ●   Horizontal partitioning
           –   Split table with many rows into
                                                   Niccolò Machiavelli
               multiple tables                     The Art of War, (1519-1520):

                                                   “A Captain ought, among all the other
    ●   Partitioning in MySQL 5.1 is               actions of his, endeavor with every
                                                   art to divide the forces of the
        transparent horizontal partitioning        enemy, either by making him
                                                   suspicious of his men in whom he
        within the DB...                           trusted, or by giving him cause that he
                                                   has to separate his forces, and,
                                                   because of this, become weaker.”


    ...and it's got issues at the moment.

09/18/08                     zendcon08 - join-fu                                             9
vertical partitioning
                                                             CREATE TABLE Users (
         CREATE TABLE Users (                                  user_id INT NOT NULL AUTO_INCREMENT
           user_id INT NOT NULL AUTO_INCREMENT               , email VARCHAR(80) NOT NULL
         , email VARCHAR(80) NOT NULL                        , display_name VARCHAR(50) NOT NULL
         , display_name VARCHAR(50) NOT NULL                 , password CHAR(41) NOT NULL
         , password CHAR(41) NOT NULL                        , PRIMARY KEY (user_id)
         , first_name VARCHAR(25) NOT NULL                   , UNIQUE INDEX (email)
         , last_name VARCHAR(25) NOT NULL                    ) ENGINE=InnoDB; CREATE TABLE UserExtra (
         , address VARCHAR(80) NOT NULL                                         user_id INT NOT NULL
         , city VARCHAR(30) NOT NULL                                          , first_name VARCHAR(25) NOT NULL
         , province CHAR(2) NOT NULL                                          , last_name VARCHAR(25) NOT NULL
         , postcode CHAR(7) NOT NULL                                          , address VARCHAR(80) NOT NULL
         , interests TEXT NULL                                                , city VARCHAR(30) NOT NULL
         , bio TEXT NULL                                                      , province CHAR(2) NOT NULL
         , signature TEXT NULL                                                , postcode CHAR(7) NOT NULL
         , skills TEXT NULL                                                   , interests TEXT NULL
         , PRIMARY KEY (user_id)                                              , bio TEXT NULL
         , UNIQUE INDEX (email)                                               , signature TEXT NULL
         ) ENGINE=InnoDB;                                                     , skills TEXT NULL
                                                                              , PRIMARY KEY (user_id)
                                                                              , FULLTEXT KEY (interests, skills)
                                                                              ) ENGINE=MyISAM;

     ●
           Mixing frequently and infrequently accessed attributes in a
           single table?
     ●
           Space in buffer pool at a premium?
            –   Splitting the table allows main records to consume the buffer
                pages without the extra data taking up space in memory
     ●     Need FULLTEXT on your text columns?
09/18/08                                    zendcon08 - join-fu                                                    10
the MySQL query cache
                                                                                      ●   You must understand your
                                                                 Query
                                                                                          application's read/write patterns
                        Clients
                                                                 Cache
                                                                                      ●
                                                                                          Internal query cache design is a
            Query
                                                                                          compromise between CPU usage
            Cache                             Net I/O
                                                                 Parser                   and read performance
                                                                                      ●   Stores the MYSQL_RESULT of a
                       “Packaging”                                                        SELECT along with a hash of the
                                                                Optimizer
                                                                                          SELECT SQL statement

                           Pluggable Storage Engine API
                                                                                      ●
                                                                                          Any modification to any table
                                                                                          involved in the SELECT
                                                                                          invalidates the stored result
   MyISAM     InnoDB     MEMORY      Falcon    Archive   PBXT     SolidDB
                                                                            Cluster
                                                                             (Ndb)
                                                                                      ●
                                                                                          Write applications to be aware
                                                                                          of the query cache
                                                                                           –   Use SELECT SQL_NO_CACHE

09/18/08                                                 zendcon08 - join-fu                                               11
vertical partitioning ... continued
                                                         CREATE TABLE Products (
CREATE TABLE Products (                                    product_id INT NOT NULL
  product_id INT NOT NULL                                , name VARCHAR(80) NOT NULL
, name VARCHAR(80) NOT NULL                              , unit_cost DECIMAL(7,2) NOT NULL
, unit_cost DECIMAL(7,2) NOT NULL                        , description TEXT NULL
, description TEXT NULL                                  , image_path TEXT NULL
, image_path TEXT NULL                                   , PRIMARY KEY (product_id)
, num_views INT UNSIGNED NOT NULL                        , INDEX (name(20))
, num_in_stock INT UNSIGNED NOT NULL                     ) ENGINE=InnoDB; ProductCounts (
                                                             CREATE TABLE
, num_on_order INT UNSIGNED NOT NULL                              product_id INT NOT NULL
, PRIMARY KEY (product_id)                                    ,   num_views INT UNSIGNED NOT NULL
, INDEX (name(20))                                            ,   num_in_stock INT UNSIGNED NOT NULL
) ENGINE=InnoDB;                                              ,   num_on_order INT UNSIGNED NOT NULL
                                                              ,   PRIMARY KEY (product_id)
// Getting a simple COUNT of products                         )   ENGINE=InnoDB;
// easy on MyISAM, terrible on InnoDB
SELECT COUNT(*)                                               CREATE TABLE TableCounts (
FROM Products;                                                  total_products INT UNSIGNED NOT NULL
                                                              ) ENGINE=MEMORY;



     ●     Mixing static attributes with frequently updated fields in a single table?
            –   Thrashing occurs with query cache. Each time an update occurs on any
                record in the table, all queries referencing the table are invalidated in
                the query cache
     ●     Doing COUNT(*) with no WHERE on an indexed field on an InnoDB table?
            –   Complications with versioning make full table counts very slow

09/18/08                                zendcon08 - join-fu                                            12
Don't be afraid
       of SQL!

            It (usually)
           doesn't bite.



09/18/08         zendcon08 - legend of drunken query master   13
coding like a join-fu master

                                        ●
                                            Be consistent (for crying out
                                            loud)
                                        ●   Use ANSI SQL coding style
                                        ●
                                            Do not think in terms of
                                            iterators, for loops, while
                                            loops, etc
                                        ●
                                            Instead, think in terms of sets
Did you know?

                                            Break complex SQL statements
from Wikipedia:
                                        ●
Join-fu is a close cousin to Jun Fan
Gung Fu, the method of martial arts
Bruce Lee began teaching in 1959.           (or business requests) into
OK, not really.                             smaller, manageable chunks
09/18/08                               zendcon08 - join-fu                    14
join-fu guidelines

                                        ●   Always try variations on a
                                            theme
                                        ●   Beware of join hints
                                              –   Can get “out of date”
See, even bears practice join-fu.
                                        ●
                                            Just because it can be done
                                            in a single SQL statement
                                            doesn't mean it should
                                        ●
                                            Always test and benchmark
                                            your solutions
                                              –   Use http_load (simple and
                                                  effective for web stuff)
09/18/08                            zendcon08 - join-fu                       15
SQL coding consistency
                                                                 Inconsistent SQL
                                                                 code makes baby
 ●     Tabs and spacing                                             kittens cry.


 ●     Upper and lower case
 ●
       Keywords, function names
 ●
       Some columns aliased, some not
 SELECT
   a.first_name, a.last_name, COUNT(*) as num_rentals
 FROM actor a
  INNER JOIN film f
                                                            ●
                                                                Consider your
   ON a.actor_id = fa.actor_id
 GROUP BY a.actor_id                                            teammates
 ORDER BY num_rentals DESC, a.last_name, a.first_name

                                                                Like your
 LIMIT 10;
                                                            ●
 vs.

 select first_name, a.last_name,
 count(*) AS num_rentals
                                                                programming code,
 FROM actor a join film f on a.actor_id = fa.actor_id
  group by a.actor_id order by                                  SQL is meant to be
 num_rentals DESC, a.last_name, a.first_name
 LIMIT 10;
                                                                read, not written
09/18/08                  zendcon08 - legend of drunken query master                 16
ANSI vs. Theta SQL coding style

 SELECT
   a.first_name, a.last_name, COUNT(*) as num_rentals                             ANSI STYLE
 FROM actor a
  INNER JOIN film f
   ON a.actor_id = fa.actor_id                                               Explicitly declare JOIN 
  INNER JOIN film_actor fa                                                  conditions using the ON 
   ON fa.film_id = f.film_id                                                          clause
  INNER JOIN inventory I
   ON f.film_id = i.film_id
  INNER JOIN rental r
   ON r.inventory_id = i.inventory_id
 GROUP BY a.actor_id
 ORDER BY num_rentals DESC, a.last_name, a.first_name
 LIMIT 10;



                                      SELECT
                                        a.first_name, a.last_name, COUNT(*) as num_rentals
                                      FROM actor a, film f, film_actor fa, inventory i, rental r
           Theta STYLE                WHERE a.actor_id = fa.actor_id
                                      AND fa.film_id = f.film_id
      Implicitly declare JOIN         AND f.film_id = i.film_id
                                      AND r.inventory_id
                                      GROUP BY a.actor_id= i.inventory_id
     conditions in the WHERE 
                                      GROUP
                                      ORDER BY num_rentals DESC, a.last_name, a.first_name
                                               a.actor_id
               clause                 ORDER BY
                                      LIMIT 10;num_rentals DESC, a.last_name, a.first_name
                                      LIMIT 10;




09/18/08                             zendcon08 - join-fu                                                17
why ANSI style's join-fu kicks Theta style's ass

    ●
        MySQL only supports the INNER and CROSS join
        for the Theta style
           –   But, MySQL supports the INNER, CROSS, LEFT, RIGHT,
               and NATURAL joins of the ANSI style
           –   Mixing and matching both styles can lead to hard-to-
               read SQL code
    ●
        It is supremely easy to miss a join condition with
        Theta style
           –   Especially when joining many tables together
           –   Leaving off a join condition by accident in the
               WHERE clause will lead to a cartesian product
09/18/08                     zendcon08 - join-fu                  18
EXPLAIN Basics

     ●     Provides the execution plan chosen by the MySQL
           optimizer for a specific SELECT statement
     ●     Simply append the word EXPLAIN to the beginning
           of your SELECT statement
     ●     Each row in output represents a set of information
           used in the SELECT
           –   A real schema table
           –   A virtual table (derived table) or temporary table
           –   A subquery in SELECT or WHERE
           –   A unioned set

09/18/08                   OSCON 2007 - Target Practice             19
EXPLAIN columns

     ●
           select_type - type of “set” the data in this row contains
     ●
           table - alias (or full table name if no alias) of the table or
           derived table from which the data in this set comes
     ●
           type - “access strategy” used to grab the data in this set
     ●
           possible_keys - keys available to optimizer for query
     ●
           keys - keys chosen by the optimizer
     ●     rows - estimate of the number of rows in this set
     ●
           Extra - information the optimizer chooses to give you
     ●
           ref - shows the column used in join relations



09/18/08                   OSCON 2007 - Target Practice                     20
Example #1 - the const access type

EXPLAIN SELECT * FROM rental WHERE rental_id = 13G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: rental
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra:
1 row in set (0.00 sec)




09/18/08              OSCON 2007 - Target Practice          21
Constants in the optimizer

     ●
           a field indexed with a unique non-nullable key
     ●     The access strategy of system is related to const
           and refers to when a table with only a single
           row is referenced in the SELECT
     ●
           Can be propogated across joined columns




09/18/08               OSCON 2007 - Target Practice         22
Example #2 - constant propogation
EXPLAIN SELECT r.*, c.first_name, c.last_name
FROM rental r INNER JOIN customer c
ON r.customer_id = c.customer_id WHERE r.rental_id = 13G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: r
         type: const
possible_keys: PRIMARY,idx_fk_customer_id
          key: PRIMARY
      key_len: 4
          ref: const
         rows: 1
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: c
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 2
          ref: const /* Here is where the propogation occurs...*/
         rows: 1
        Extra:
2 rows in set (0.00 sec)
09/18/08                  OSCON 2007 - Target Practice              23
Example #3 - the range access type

SELECT * FROM rental
WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: rental
         type: range
possible_keys: rental_date
          key: rental_date
      key_len: 8
          ref: NULL
         rows: 364
        Extra: Using where
1 row in set (0.00 sec)




09/18/08              OSCON 2007 - Target Practice          24
Considerations with range accesses

     ●
           Index must be available on the field operated
           upon by a range operator
     ●     If too many records are estimated to be
           returned by the condition, the range
           optimization won't be used
           –   index or full table scan will be used instead
     ●     The indexed field must not be operated on by a
           function call! (Important for all indexing)



09/18/08                   OSCON 2007 - Target Practice        25
The scan vs. seek dilemma

     ●
           A seek operation, generally speaking, jumps into a random
           place -- either on disk or in memory -- to fetch the data
           needed.
           –   Repeat for each piece of data needed from disk or
               memory
     ●     A scan operation, on the other hand, will jump to the start
           of a chunk of data, and sequentially read data -- either
           from disk or from memory -- until the end of the chunk of
           data
     ●
           For large amounts of data, scan operations tend to be more
           efficient than multiple seek operations



09/18/08                   OSCON 2007 - Target Practice              26
Example #4 - Full table scan

EXPLAIN SELECT * FROM rental
WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-21'G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: rental
         type: ALL
possible_keys: rental_date /* larger range forces scan choice */
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 16298
        Extra: Using where
1 row in set (0.00 sec)




09/18/08              OSCON 2007 - Target Practice            27
Why full table scans pop up

     ●
           No WHERE condition (duh.)
     ●     No index on any field in WHERE condition
     ●     Poor selectivity on an indexed field
     ●
           Too many records meet WHERE condition
     ●
           < MySQL 5.0 and using OR in a WHERE clause




09/18/08               OSCON 2007 - Target Practice     28
Example #5 - Full index scan

EXPLAIN SELECT rental_id, rental_date FROM rentalG
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: rental
         type: index
possible_keys: NULL
          key: rental_date
      key_len: 13
          ref: NULL                              CAUTION!
         rows: 16325
                                          Extra=“Using index” 
        Extra: Using index
                                           is NOT the same as 
1 row in set (0.00 sec)                        type=”index”




09/18/08              OSCON 2007 - Target Practice          29
indexed columns and functions don't mix
 mysql> EXPLAIN SELECT * FROM film WHERE title LIKE 'Tr%'G
 *************************** 1. row ***************************
            id: 1
   select_type: SIMPLE
         table: film
          type: range
 possible_keys: idx_title
           key: idx_title
       key_len: 767
           ref: NULL
          rows: 15
         Extra: Using where

   ●
       A fast range access strategy is chosen by the optimizer, and the
       index on title is used to winnow the query results down
 mysql> EXPLAIN SELECT * FROM film WHERE LEFT(title,2) = 'Tr' G
 *************************** 1. row ***************************
            id: 1
   select_type: SIMPLE
         table: film
          type: ALL
 possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: 951
         Extra: Using where


   ●   A slow full table scan (the ALL access strategy) is used because
       a function (LEFT) is operating on the title column

09/18/08                                   zendcon08 - join-fu            30
solving multiple issues in a SELECT query

 SELECT * FROM Orders WHERE TO_DAYS(CURRENT_DATE()) – TO_DAYS(order_created) <= 7;

    ●
           First, we are operating on an indexed column (order_created) with a function – let's
           fix that:
 SELECT * FROM Orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY;

    ●      Although we rewrote the WHERE expression to remove the operating function, we still
           have a non-deterministic function in the statement, which eliminates this query from
           being placed in the query cache – let's fix that:
 SELECT * FROM Orders WHERE order_created >= '2008-01-11' - INTERVAL 7 DAY;

    ●
           We replaced the function with a constant (probably using our application
           programming language). However, we are specifying SELECT * instead of the actual
           fields we need from the table.
    ●      What if there is a TEXT field in Orders called order_memo that we don't need to see?
           Well, having it included in the result means a larger result set which may not fit into
           the query cache and may force a disk-based temporary table
 SELECT order_id, customer_id, order_total, order_created
 FROM Orders WHERE order_created >= '2008-01-11' - INTERVAL 7 DAY;




09/18/08                              zendcon08 - join-fu                                         31
set-wise problem solving

   “Show the last payment information for each customer”
 CREATE TABLE `payment` (
   `payment_id` smallint(5) unsigned NOT NULL auto_increment,
   `customer_id` smallint(5) unsigned NOT NULL,
   `staff_id` tinyint(3) unsigned NOT NULL,
   `rental_id` int(11) default NULL,
   `amount` decimal(5,2) NOT NULL,
   `payment_date` datetime NOT NULL,
   `last_update` timestamp NOT NULL ... on update CURRENT_TIMESTAMP,
   PRIMARY KEY (`payment_id`),
   KEY `idx_fk_staff_id` (`staff_id`),
   KEY `idx_fk_customer_id` (`customer_id`),
   KEY `fk_payment_rental` (`rental_id`),
   CONSTRAINT `fk_payment_rental` FOREIGN KEY (`rental_id`)
     REFERENCES `rental` (`rental_id`),
   CONSTRAINT `fk_payment_customer` FOREIGN KEY (`customer_id`)
     REFERENCES `customer` (`customer_id`) ,
   CONSTRAINT `fk_payment_staff` FOREIGN KEY (`staff_id`)
     REFERENCES `staff` (`staff_id`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8

 http://forge.mysql.com/wiki/SakilaSampleDB
09/18/08                               zendcon08   - join-fu           32
thinking in terms of foreach loops...

           OK, for each customer, find the maximum date the
            payment was made and get that payment record(s)
  mysql> EXPLAIN SELECT
      -> p.*
      -> FROM payment p
      -> WHERE p.payment_date =
                                                                   ●
                                                                       A correlated
                                                                       subquery in the
      -> ( SELECT MAX(payment_date)
      ->   FROM payment
      ->   WHERE customer_id=p.customer_id


                                                                       WHERE clause is used
      -> )G
  *************************** 1. row ***************************
             id: 1
    select_type: PRIMARY
          table: p
           type: ALL
           rows: 16567
          Extra: Using where
                                                                   ●   It will be re-
                                                                       executed for each
  *************************** 2. row ***************************
             id: 2
    select_type: DEPENDENT SUBQUERY


                                                                       row in the primary
          table: payment
           type: ref
  possible_keys: idx_fk_customer_id
            key: idx_fk_customer_id
        key_len: 2
            ref: sakila.p.customer_id
           rows: 15
                                                                       table (payment)
  2 rows in set (0.00 sec)

                                                                   ●
                                                                       Produces 623 rows in
                                                                       an average of 1.03s
09/18/08                                    zendcon08 - join-fu                             33
what about adding an index?

           Will adding an index on (customer_id, payment_date)
                             make a difference?
mysql> EXPLAIN SELECT                                         mysql> EXPLAIN SELECT
    -> p.*                                                        -> p.*
    -> FROM payment p                                             -> FROM payment p
    -> WHERE p.payment_date =                                     -> WHERE p.payment_date =
    -> ( SELECT MAX(payment_date)                                 -> ( SELECT MAX(payment_date)
    ->   FROM payment                                             ->   FROM payment
    ->   WHERE customer_id=p.customer_id                          ->   WHERE customer_id=p.customer_id
    -> )G                                                        -> )G
*************************** 1. row ************************   *************************** 1. row ***************************
           id: 1                                                         id: 1
  select_type: PRIMARY                                          select_type: PRIMARY
        table: p                                                      table: p
         type: ALL                                                     type: ALL
         rows: 16567                                                   rows: 15485
        Extra: Using where                                            Extra: Using where
*************************** 2. row ************************   *************************** 2. row ***************************
           id: 2                                                         id: 2
  select_type: DEPENDENT SUBQUERY                               select_type: DEPENDENT SUBQUERY
        table: payment                                                table: payment
         type: ref                                                     type: ref
possible_keys: idx_fk_customer_id                             possible_keys: idx_fk_customer_id,ix_customer_paydate
          key: idx_fk_customer_id                                       key: ix_customer_paydate
      key_len: 2                                                    key_len: 2
          ref: sakila.p.customer_id                                     ref: sakila.p.customer_id
         rows: 15                                                      rows: 14
                                                                      Extra: Using index
2 rows in set (0.00 sec)                                      2 rows in set (0.00 sec)


   ●
       Produces 623 rows in                                       ●
                                                                      Produces 623 rows in
       an average of 1.03s
09/18/08                                    zendcon08 - join-fu
                                                                      an average of 0.45s                                 34
thinking in terms of sets...

               OK, I have one set of last payments dates and another set
             containing payment information (so, how do I join these sets?)
  mysql> EXPLAIN SELECT
      -> p.*
      -> FROM (
      -> SELECT customer_id, MAX(payment_date) as last_order
                                                                       ●
                                                                           A derived table, or
                                                                           subquery in the
      -> FROM payment
      -> GROUP BY customer_id
      -> ) AS last_orders
      -> INNER JOIN payment p


                                                                           FROM clause, is used
      -> ON p.customer_id = last_orders.customer_id
      -> AND p.payment_date = last_orders.last_orderG
  *************************** 1. row ***************************
             id: 1
    select_type: PRIMARY


                                                                           The derived table
          table: <derived2>
           type: ALL                                                   ●
           rows: 599
  *************************** 2. row ***************************
             id: 1
    select_type: PRIMARY
          table: p
                                                                           represents a set:
                                                                           last payment dates
           type: ref
  possible_keys: idx_fk_customer_id,ix_customer_paydate
            key: ix_customer_paydate
        key_len: 10
            ref: last_orders.customer_id,last_orders.last_order
           rows: 1
  *************************** 3. row ***************************
             id: 2
                                                                           of customers
    select_type: DERIVED
          table: payment
           type: range
            key: ix_customer_paydate
                                                                       ●
                                                                           Produces 623 rows in
                                                                           an average of 0.03s
        key_len: 2
           rows: 1107
          Extra: Using index for group-by
  3 rows in set (0.02 sec)


09/18/08                                         zendcon08 - join-fu                            35
working with “mapping” or N:M tables

    CREATE TABLE Project (                                       CREATE TABLE Tags (
      project_id INT UNSIGNED NOT NULL AUTO_INCREMENT              tag_id INT UNSIGNED NOT NULL AUTO_INCREMENT
    , name VARCHAR(50) NOT NULL                                  , tag_text VARCHAR(50) NOT NULL
    , url TEXT NOT NULL                                          , PRIMARY KEY (tag_id)
    , PRIMARY KEY (project_id)                                   , INDEX (tag_text)
    ) ENGINE=MyISAM;                                             ) ENGINE=MyISAM;

                                    CREATE TABLE Tag2Project (
                                      tag INT UNSIGNED NOT NULL
                                    , project INT UNSIGNED NOT NULL
                                    , PRIMARY KEY (tag, project)
                                    , INDEX rv_primary (project, tag)
                                    ) ENGINE=MyISAM;



    ●
        The next few slides will walk through examples
        of querying across the above relationship
           –   dealing with OR conditions
           –   dealing with AND conditions


09/18/08                                  zendcon08 - join-fu                                                    36
dealing with OR conditions

       Grab all project names which are tagged with “mysql” OR “php”
    mysql> SELECT p.name FROM Project p
        -> INNER JOIN Tag2Project t2p
        -> ON p.project_id = t2p.project
        -> INNER JOIN Tag t
        -> ON t2p.tag = t.tag_id
        -> WHERE t.tag_text IN ('mysql','php');
    +---------------------------------------------------------+
    | name                                                    |
    +---------------------------------------------------------+
    | phpMyAdmin                                              |
    ...
    | MySQL Stored Procedures Auto Generator                  |
    +---------------------------------------------------------+
    90 rows in set (0.05 sec)

    +----+-------------+-------+--------+----------------------+--------------+---------+-------------+------+-------------+
    | id | select_type | table | type   | possible_keys        | key          | key_len | ref         | rows | Extra       |
    +----+-------------+-------+--------+----------------------+--------------+---------+-------------+------+-------------+
    | 1 | SIMPLE       | t     | range | PRIMARY,uix_tag_text | uix_tag_text | 52       | NULL        |    2 | Using where |
    | 1 | SIMPLE       | t2p   | ref    | PRIMARY,rv_primary   | PRIMARY      | 4       | t.tag_id    |   10 | Using index |
    | 1 | SIMPLE       | p     | eq_ref | PRIMARY              | PRIMARY      | 4       | t2p.project |    1 |             |
    +----+-------------+-------+--------+----------------------+--------------+---------+-------------+------+-------------+
    3 rows in set (0.00 sec)




   ●
       Note the order in which the optimizer chose to join the tables is
       exactly the opposite of how we wrote our SELECT


09/18/08                                        zendcon08 - join-fu                                                            37
dealing with AND conditions

       Grab all project names which are tagged with “storage engine”
                               AND “plugin”
   ●
       A little more complex, let's               mysql> SELECT p.name FROM Project p
                                                      -> INNER JOIN (
       grab the project names                         -> SELECT t2p.project
                                                      -> FROM Tag2Project t2p

       which match both the
                                                      -> INNER JOIN Tag t
                                                      -> ON t2p.tag = t.tag_id
                                                      -> WHERE t.tag_text IN ('plugin','storage engine')

       “mysql” tag and the “php”                      -> GROUP BY t2p.project
                                                      -> HAVING COUNT(*) = 2
                                                      -> ) AS projects_having_all_tags
       tag                                            -> ON p.project_id = projects_having_all_tags.project;
                                                  +-----------------------------------+
                                                  | name                              |
                                                  +-----------------------------------+
   ●
       Here is perhaps the most                   | Automatic data revision           |
                                                  | memcache storage engine for MySQL |
                                                  +-----------------------------------+
       common method – using a                    2 rows in set (0.01 sec)


       HAVING COUNT(*) against a
       GROUP BY on the
       relationship table
   ●
       EXPLAIN on next page...

09/18/08                    zendcon08 - join-fu                                                                38
the dang filesort

   ●
       The EXPLAIN plan shows         *************************** 1. row ***************************
                                                 id: 1
                                        select_type: PRIMARY
       the execution plan using               table: <derived2>
                                               type: ALL

       a derived table
                                               rows: 2
                                      *************************** 2. row ***************************
                                                 id: 1

       containing the project           select_type: PRIMARY
                                              table: p
                                               type: eq_ref
       IDs having records in the      possible_keys: PRIMARY
                                                key: PRIMARY

       Tag2Project table with
                                            key_len: 4
                                                ref: projects_having_all_tags.project
                                               rows: 1

       both the “plugin” and          *************************** 3. row ***************************
                                                 id: 2
                                        select_type: DERIVED
       “storage engine” tags                  table: t
                                               type: range
                                      possible_keys: PRIMARY,uix_tag_text
                                                key: uix_tag_text
   ●   Note that a filesort is              key_len: 52
                                               rows: 2
                                              Extra: Using where; Using index; Using temporary; Using filesort
       needed on the Tag table        *************************** 4. row ***************************
                                                 id: 2

       rows since we use the            select_type: DERIVED
                                              table: t2p
                                               type: ref

       index on tag_text but          possible_keys: PRIMARY
                                                key: PRIMARY
                                            key_len: 4
       need a sorted list of                    ref: mysqlforge.t.tag_id
                                               rows: 1

       tag_id values to use in
                                              Extra: Using index
                                      4 rows in set (0.00 sec)


       the join
09/18/08                   zendcon08 - join-fu                                                                   39
removing the filesort using CROSS JOIN

    ●
        We can use a CROSS JOIN technique to remove the filesort
           –   We winnow down two copies of the Tag table (t1 and t2) by
               supplying constants in the WHERE condition
    ●
        This means no need for a sorted list of tag IDs since we already
        have the two tag IDs available from the CROSS JOINs...so no
        more filesort
 mysql> EXPLAIN SELECT p.name
     -> FROM Project p
     -> CROSS JOIN Tag t1
     -> CROSS JOIN Tag t2
     -> INNER JOIN Tag2Project t2p
     -> ON p.project_id = t2p.project
     -> AND t2p.tag = t1.tag_id
     -> INNER JOIN Tag2Project t2p2
     -> ON t2p.project = t2p2.project
     -> AND t2p2.tag = t2.tag_id
     -> WHERE t1.tag_text = quot;pluginquot;
     -> AND t2.tag_text = quot;storage enginequot;;
 +----+-------------+-------+--------+----------------------+--------------+---------+------------------------------+------+-------------+
 | id | select_type | table | type   | possible_keys        | key          | key_len | ref                          | rows | Extra       |
 +----+-------------+-------+--------+----------------------+--------------+---------+------------------------------+------+-------------+
 | 1 | SIMPLE       | t1    | const | PRIMARY,uix_tag_text | uix_tag_text | 52       | const                        |    1 | Using index |
 | 1 | SIMPLE       | t2    | const | PRIMARY,uix_tag_text | uix_tag_text | 52       | const                        |    1 | Using index |
 | 1 | SIMPLE       | t2p   | ref    | PRIMARY,rv_primary   | PRIMARY      | 4       | const                        |    9 | Using index |
 | 1 | SIMPLE       | t2p2 | eq_ref | PRIMARY,rv_primary    | PRIMARY      | 8       | const,mysqlforge.t2p.project |    1 | Using index |
 | 1 | SIMPLE       | p     | eq_ref | PRIMARY              | PRIMARY      | 4       | mysqlforge.t2p2.project      |    1 | Using where |
 +----+-------------+-------+--------+----------------------+--------------+---------+------------------------------+------+-------------+
 5 rows in set (0.00 sec)


09/18/08                                         zendcon08 - join-fu                                                                     40
another technique for dealing with ANDs
●
    Do two separate queries – one which grabs tag_id                                mysql> SELECT t.tag_id FROM Tag t
                                                                                         > WHERE tag_text IN (quot;pluginquot;,quot;storage enginequot;);
    values based on the tag text and another which does                             +--------+
                                                                                    | tag_id |
    a self-join after the application has the tag_id values                         +--------+
                                                                                    |    173 |
    in memory                                                                       |    259 |
                                                                                    +--------+
                                                                                    2 rows in set (0.00 sec)

Benefit #1                                                                                  mysql> SELECT p.name FROM Project p
                                                                                                -> INNER JOIN Tag2Project t2p
                                                                                                -> ON p.project_id = t2p.project
●
    If we assume the Tag2Project table is updated 10X                                           -> AND t2p.tag = 173
                                                                                                -> INNER JOIN Tag2Project t2p2
    more than the Tag table is updated, the first query                                         -> ON t2p.project = t2p2.project
                                                                                                -> AND t2p2.tag = 259;
    on Tag will be cached more effectively in the query                                     +-----------------------------------+
                                                                                            | name                              |
    cache                                                                                   +-----------------------------------+
                                                                                            | Automatic data revision           |
                                                                                            | memcache storage engine for MySQL |

Benefit #2                                                                                  +-----------------------------------+
                                                                                            2 rows in set (0.00 sec)


●
    The EXPLAIN on the self-join query is much better
    than the HAVING COUNT(*) derived table solution
     mysql> EXPLAIN SELECT p.name FROM Project p
         -> INNER JOIN Tag2Project t2p
         -> ON p.project_id = t2p.project
         -> AND t2p.tag = 173
         -> INNER JOIN Tag2Project t2p2
         -> ON t2p.project = t2p2.project
         -> AND t2p2.tag = 259;
     +----+-------------+-------+--------+--------------------+---------+---------+------------------------------+------+-------------+
     | id | select_type | table | type   | possible_keys      | key     | key_len | ref                          | rows | Extra       |
     +----+-------------+-------+--------+--------------------+---------+---------+------------------------------+------+-------------+
     | 1 | SIMPLE       | t2p   | ref    | PRIMARY,rv_primary | PRIMARY | 4       | const                        |    9 | Using index |
     | 1 | SIMPLE       | t2p2 | eq_ref | PRIMARY,rv_primary | PRIMARY | 8        | const,mysqlforge.t2p.project |    1 | Using index |
     | 1 | SIMPLE       | p     | eq_ref | PRIMARY            | PRIMARY | 4       | mysqlforge.t2p2.project      |    1 | Using where |
     +----+-------------+-------+--------+--------------------+---------+---------+------------------------------+------+-------------+
     3 rows in set (0.00 sec)
09/18/08                                        zendcon08 - join-fu                                                                       41
understanding LEFT-join-fu

    CREATE TABLE Project (                                       CREATE TABLE Tags (
      project_id INT UNSIGNED NOT NULL AUTO_INCREMENT              tag_id INT UNSIGNED NOT NULL AUTO_INCREMENT
    , name VARCHAR(50) NOT NULL                                  , tag_text VARCHAR(50) NOT NULL
    , url TEXT NOT NULL                                          , PRIMARY KEY (tag_id)
    , PRIMARY KEY (project_id)                                   , INDEX (tag_text)
    ) ENGINE=MyISAM;                                             ) ENGINE=MyISAM;

                                    CREATE TABLE Tag2Project (
                                      tag INT UNSIGNED NOT NULL
                                    , project INT UNSIGNED NOT NULL
                                    , PRIMARY KEY (tag, project)
                                    , INDEX rv_primary (project, tag)
                                    ) ENGINE=MyISAM;



    ●
        Get the tag phrases which are not related to any
        project
    ●   Get the tag phrases which are not related to any
        project OR the tag phrase is related to project
        #75
    ●
        Get the tag phrases not related to project #75
09/18/08                                  zendcon08 - join-fu                                                    42
LEFT join-fu: starting simple...the NOT EXISTS

                                                                           Get the tag phrases
  mysql> EXPLAIN SELECT
      ->   t.tag_text                                                  ●
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
      -> WHERE t2p.project IS NULL
      -> GROUP BY t.tag_textG
                                                                           which are not
                                                                           related to any
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: t

                                                                           project
           type: index
  possible_keys: NULL
            key: uix_tag_text
        key_len: 52
           rows: 1126
          Extra: Using index
  *************************** 2. row ***************************
             id: 1
    select_type: SIMPLE
                                                                       ●
                                                                           LEFT JOIN ... WHERE
          table: t2p
           type: ref
  possible_keys: PRIMARY
                                                                            x IS NULL
            key: PRIMARY


                                                                           WHERE x IS NOT
        key_len: 4
            ref: mysqlforge.t.tag_id                                   ●
           rows: 1
          Extra: Using where; Using index; Not exists


                                                                           NULL would yield tag
  2 rows in set (0.00 sec)

  mysql> SELECT
      ->   t.tag_text
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
                                                                           phrases that are
                                                                           related to a project
      -> WHERE t2p.project IS NULL
      -> GROUP BY t.tag_text;
  +--------------------------------------+
  | tag_text                             |
  +--------------------------------------+

                                                                               But, then, you'd want
  <snip>
  +--------------------------------------+
  153 rows in set (0.01 sec)
                                                                           –

09/18/08                                         zendcon08 - join-fu
                                                                               to use an INNER JOIN43
LEFT join-fu: a little harder

                                                                           Get the tag phrases
  mysql> EXPLAIN SELECT
      ->   t.tag_text                                                  ●
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
      -> WHERE t2p.project IS NULL
      -> OR t2p.project = 75
                                                                           which are not
                                                                           related to any
      -> GROUP BY t.tag_textG
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE

                                                                           project OR the tag
          table: t
           type: index
            key: uix_tag_text
        key_len: 52
            ref: NULL
           rows: 1126
          Extra: Using index
                                                                           phrase is related to
                                                                           project #75
  *************************** 2. row ***************************
             id: 1
    select_type: SIMPLE
          table: t2p
           type: ref


                                                                           No more NOT EXISTS
  possible_keys: PRIMARY
            key: PRIMARY                                               ●
        key_len: 4
            ref: mysqlforge.t.tag_id
           rows: 1
          Extra: Using where; Using index
  2 rows in set (0.00 sec)
                                                                           optimization :(
  mysql> SELECT
      ->   t.tag_text
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
                                                                       ●   But, isn't this
                                                                           essentially a UNION?
      -> ON t.tag_id = t2p.tag
      -> WHERE t2p.project IS NULL
      -> OR t2p.project = 75
      -> GROUP BY t.tag_text;
  +--------------------------------------+
  | tag_text                             |
  +--------------------------------------+
  <snip>
  +--------------------------------------+
  184 rows in set (0.00 sec)
09/18/08                                         zendcon08 - join-fu                              44
LEFT join-fu: a UNION returns us to optimization
  mysql> EXPLAIN SELECT                                                *************************** 4. row ***************************
      ->   t.tag_text                                                             id: 2
      -> FROM Tag t                                                      select_type: UNION
      -> LEFT JOIN Tag2Project t2p                                             table: t
      -> ON t.tag_id = t2p.tag                                                  type: eq_ref
      -> WHERE t2p.project IS NULL                                     possible_keys: PRIMARY
      -> GROUP BY t.tag_text                                                     key: PRIMARY
      -> UNION ALL                                                           key_len: 4
      -> SELECT                                                                  ref: mysqlforge.t2p.tag
      ->   t.tag_text                                                           rows: 1
      -> FROM Tag t                                                            Extra:
      -> INNER JOIN Tag2Project t2p                                    *************************** 5. row ***************************
      -> ON t.tag_id = t2p.tag                                                    id: NULL
      -> WHERE t2p.project = 75G                                        select_type: UNION RESULT
  *************************** 1. row ***************************               table: <union1,2>
             id: 1                                                     5 rows in set (0.00 sec)
    select_type: PRIMARY
          table: t
           type: index                                                 mysql> SELECT
            key: uix_tag_text                                              ->   t.tag_text
        key_len: 52                                                        -> FROM Tag t
           rows: 1126                                                      -> LEFT JOIN Tag2Project t2p
          Extra: Using index                                               -> ON t.tag_id = t2p.tag
  *************************** 2. row ***************************           -> WHERE t2p.project IS NULL
             id: 1                                                         -> GROUP BY t.tag_text
    select_type: PRIMARY                                                   -> UNION ALL
          table: t2p                                                       -> SELECT
           type: ref                                                       ->   t.tag_text
            key: PRIMARY                                                   -> FROM Tag t
        key_len: 4                                                         -> INNER JOIN Tag2Project t2p
            ref: mysqlforge.t.tag_id                                       -> ON t.tag_id = t2p.tag
           rows: 1                                                         -> WHERE t2p.project = 75;
          Extra: Using where; Using index; Not exists                  +--------------------------------------+
  *************************** 3. row ***************************       | tag_text                             |
             id: 2                                                     +--------------------------------------+
    select_type: UNION                                                 <snip>
          table: t2p                                                   +--------------------------------------+
           type: ref                                                   184 rows in set (0.00 sec)
  possible_keys: PRIMARY,rv_primary
            key: rv_primary
        key_len: 4
            ref: const
           rows: 31
          Extra: Using index


09/18/08                                         zendcon08 - join-fu                                                                    45
LEFT join-fu: the trickiest part...

                                                                           Get the tag phrases
  mysql> SELECT
      ->   t.tag_text                                                  ●
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
      -> WHERE t2p.tag IS NULL
      -> AND t2p.project= 75
                                                                           which are not
                                                                           related to project
      -> GROUP BY t.tag_text;
  Empty set (0.00 sec)

  mysql> EXPLAIN SELECT

                                                                           #75
      ->   t.tag_text
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
      -> WHERE t2p.tag IS NULL
      -> AND t2p.project= 75
      -> GROUP BY t.tag_textG
  *************************** 1. row ***************************
             id: 1
                                                                       ●
                                                                           Shown to the left is
    select_type: SIMPLE
          table: NULL
           type: NULL
                                                                           the most common
                                                                           mistake made with
  possible_keys: NULL
            key: NULL
        key_len: NULL
            ref: NULL
           rows: NULL
          Extra: Impossible WHERE noticed after reading const tables
  1 row in set (0.00 sec)
                                                                           LEFT JOINs
                                                                       ●   The problem is
                                                                           where the filter on
                                                                           project_id is done...
09/18/08                                         zendcon08 - join-fu                              46
LEFT join-fu: the trickiest part...fixed

                                                                           Filters on the LEFT
  mysql> EXPLAIN SELECT
      ->   t.tag_text                                                  ●
      -> FROM Tag t
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
      -> AND t2p.project= 75
      -> WHERE t2p.tag IS NULL
                                                                           joined set must be
                                                                           placed in the ON
      -> GROUP BY t.tag_textG
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE

                                                                           clause
          table: t
           type: index
  possible_keys: NULL
            key: uix_tag_text
        key_len: 52
           rows: 1126
          Extra: Using index
  *************************** 2. row ***************************
             id: 1
                                                                       ●
                                                                           Filter is applied
    select_type: SIMPLE
          table: t2p
           type: eq_ref
                                                                           before the LEFT JOIN
                                                                           and NOT EXISTs is
  possible_keys: PRIMARY,rv_primary
            key: rv_primary
        key_len: 8
            ref: const,mysqlforge.t.tag_id
           rows: 1
          Extra: Using where; Using index; Not exists
  2 rows in set (0.00 sec)
                                                                           evaluated, resulting
  mysql> SELECT
      ->   t.tag_text
      -> FROM Tag t
                                                                           in fewer rows in the
      -> LEFT JOIN Tag2Project t2p
      -> ON t.tag_id = t2p.tag
      -> AND t2p.project= 75
      -> WHERE t2p.tag IS NULL
                                                                           evaluation, and
      -> GROUP BY t.tag_text;
  +-----------------+
  | tag_text        |
                                                                           better performance
  +-----------------+
  <snip>
  +----------------------------------------+
  674 rows in set (0.01 sec)
09/18/08                                         zendcon08 - join-fu                             47
resources and thank you!

●
    PlanetMySQL
    –   300+ writers on MySQL topics
    –   http://planetmysql.com
●   MySQL Forge
    –   Code snippets, project listings,
        wiki, worklog                      Baron Schwartz
                                           http://xaprb.com
    –   http://forge.mysql.org             MySQL performance guru and co-
                                           author of High Performance MySQL,
                                           2nd Edition (O'Reilly, 2008)

                                           “xarpb” is Baron spelled on a Dvorak
                                           keyboard...

More Related Content

What's hot

Introduction Cloud Computing
Introduction Cloud ComputingIntroduction Cloud Computing
Introduction Cloud ComputingRoel Honning
 
Introduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer SystemsIntroduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer SystemsMrMaKKaWi
 
Data in Motion vs Data at Rest
Data in Motion vs Data at RestData in Motion vs Data at Rest
Data in Motion vs Data at RestInternap
 
Architecture of Mobile Computing
Architecture of Mobile ComputingArchitecture of Mobile Computing
Architecture of Mobile ComputingJAINIK PATEL
 
Ant Colony Optimization for Load Balancing in Cloud
Ant  Colony Optimization for Load Balancing in CloudAnt  Colony Optimization for Load Balancing in Cloud
Ant Colony Optimization for Load Balancing in CloudChanda Korat
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File Systemtutchiio
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)DheerajPachauri
 
Difference between molap, rolap and holap in ssas
Difference between molap, rolap and holap  in ssasDifference between molap, rolap and holap  in ssas
Difference between molap, rolap and holap in ssasUmar Ali
 
Design Goals of Distributed System
Design Goals of Distributed SystemDesign Goals of Distributed System
Design Goals of Distributed SystemAshish KC
 
MRI Energy-Efficient Cloud Computing
MRI Energy-Efficient Cloud ComputingMRI Energy-Efficient Cloud Computing
MRI Energy-Efficient Cloud ComputingRoger Rafanell Mas
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...EUDAT
 
Next Generation Network Automation
Next Generation Network AutomationNext Generation Network Automation
Next Generation Network AutomationLaurent Ciavaglia
 
Synchronization in distributed systems
Synchronization in distributed systems Synchronization in distributed systems
Synchronization in distributed systems SHATHAN
 
Ports and protocols
Ports and protocolsPorts and protocols
Ports and protocolssiva rama
 

What's hot (20)

Introduction Cloud Computing
Introduction Cloud ComputingIntroduction Cloud Computing
Introduction Cloud Computing
 
Introduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer SystemsIntroduction to Parallel Distributed Computer Systems
Introduction to Parallel Distributed Computer Systems
 
Data in Motion vs Data at Rest
Data in Motion vs Data at RestData in Motion vs Data at Rest
Data in Motion vs Data at Rest
 
Grid computing
Grid computingGrid computing
Grid computing
 
Architecture of Mobile Computing
Architecture of Mobile ComputingArchitecture of Mobile Computing
Architecture of Mobile Computing
 
Ant Colony Optimization for Load Balancing in Cloud
Ant  Colony Optimization for Load Balancing in CloudAnt  Colony Optimization for Load Balancing in Cloud
Ant Colony Optimization for Load Balancing in Cloud
 
GFS - Google File System
GFS - Google File SystemGFS - Google File System
GFS - Google File System
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
 
Difference between molap, rolap and holap in ssas
Difference between molap, rolap and holap  in ssasDifference between molap, rolap and holap  in ssas
Difference between molap, rolap and holap in ssas
 
Design Goals of Distributed System
Design Goals of Distributed SystemDesign Goals of Distributed System
Design Goals of Distributed System
 
Distributed deadlock
Distributed deadlockDistributed deadlock
Distributed deadlock
 
Network Management System and Protocol
Network Management System and Protocol Network Management System and Protocol
Network Management System and Protocol
 
Data Dissemination and Synchronization
Data Dissemination and SynchronizationData Dissemination and Synchronization
Data Dissemination and Synchronization
 
MRI Energy-Efficient Cloud Computing
MRI Energy-Efficient Cloud ComputingMRI Energy-Efficient Cloud Computing
MRI Energy-Efficient Cloud Computing
 
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
High Performance & High Throughput Computing - EUDAT Summer School (Giuseppe ...
 
Next Generation Network Automation
Next Generation Network AutomationNext Generation Network Automation
Next Generation Network Automation
 
Synchronization in distributed systems
Synchronization in distributed systems Synchronization in distributed systems
Synchronization in distributed systems
 
Ports and protocols
Ports and protocolsPorts and protocols
Ports and protocols
 
Microkernel
MicrokernelMicrokernel
Microkernel
 
Data centerefficiency
Data centerefficiencyData centerefficiency
Data centerefficiency
 

Viewers also liked

MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsOSSCube
 
Webinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuningWebinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuning晓 周
 
Performance Tuning Best Practices
Performance Tuning Best PracticesPerformance Tuning Best Practices
Performance Tuning Best Practiceswebhostingguy
 
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)Aurimas Mikalauskas
 
PHP mysql Mysql joins
PHP mysql  Mysql joinsPHP mysql  Mysql joins
PHP mysql Mysql joinsMudasir Syed
 
Database normalization
Database normalizationDatabase normalization
Database normalizationJignesh Jain
 
LAPORAN HASIL PRAKTEK PEMROGRAMAN KOMPUTER (DLPHI 7)
LAPORAN HASIL PRAKTEK  PEMROGRAMAN KOMPUTER (DLPHI 7)LAPORAN HASIL PRAKTEK  PEMROGRAMAN KOMPUTER (DLPHI 7)
LAPORAN HASIL PRAKTEK PEMROGRAMAN KOMPUTER (DLPHI 7)YOHANIS SAHABAT
 
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisationMySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisationMark Swarbrick
 
浅析My sql事务隔离级别与锁 seanlook
浅析My sql事务隔离级别与锁 seanlook浅析My sql事务隔离级别与锁 seanlook
浅析My sql事务隔离级别与锁 seanlook晓 周
 
MySQL Server Settings Tuning
MySQL Server Settings TuningMySQL Server Settings Tuning
MySQL Server Settings Tuningguest5ca94b
 
MySQL Performance Tuning für Entwickler
MySQL Performance Tuning für EntwicklerMySQL Performance Tuning für Entwickler
MySQL Performance Tuning für EntwicklerFromDual GmbH
 
MySQL Performance Tuning Variables
MySQL Performance Tuning VariablesMySQL Performance Tuning Variables
MySQL Performance Tuning VariablesFromDual GmbH
 
MySQL Performance Tuning
MySQL Performance TuningMySQL Performance Tuning
MySQL Performance TuningFromDual GmbH
 

Viewers also liked (20)

MySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 TipsMySQL Performance Tuning: Top 10 Tips
MySQL Performance Tuning: Top 10 Tips
 
Webinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuningWebinar 2013 advanced_query_tuning
Webinar 2013 advanced_query_tuning
 
Joinfu Part Two
Joinfu Part TwoJoinfu Part Two
Joinfu Part Two
 
Performance Tuning Best Practices
Performance Tuning Best PracticesPerformance Tuning Best Practices
Performance Tuning Best Practices
 
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
MySQL Performance Tuning. Part 1: MySQL Configuration (includes MySQL 5.7)
 
How to Design Indexes, Really
How to Design Indexes, ReallyHow to Design Indexes, Really
How to Design Indexes, Really
 
Mysql count
Mysql countMysql count
Mysql count
 
PHP mysql Mysql joins
PHP mysql  Mysql joinsPHP mysql  Mysql joins
PHP mysql Mysql joins
 
MySQL JOIN & UNION
MySQL JOIN & UNIONMySQL JOIN & UNION
MySQL JOIN & UNION
 
Types of Error in PHP
Types of Error in PHPTypes of Error in PHP
Types of Error in PHP
 
Database normalization
Database normalizationDatabase normalization
Database normalization
 
LAPORAN HASIL PRAKTEK PEMROGRAMAN KOMPUTER (DLPHI 7)
LAPORAN HASIL PRAKTEK  PEMROGRAMAN KOMPUTER (DLPHI 7)LAPORAN HASIL PRAKTEK  PEMROGRAMAN KOMPUTER (DLPHI 7)
LAPORAN HASIL PRAKTEK PEMROGRAMAN KOMPUTER (DLPHI 7)
 
Performance Tuning
Performance TuningPerformance Tuning
Performance Tuning
 
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisationMySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
 
浅析My sql事务隔离级别与锁 seanlook
浅析My sql事务隔离级别与锁 seanlook浅析My sql事务隔离级别与锁 seanlook
浅析My sql事务隔离级别与锁 seanlook
 
MySQL Server Settings Tuning
MySQL Server Settings TuningMySQL Server Settings Tuning
MySQL Server Settings Tuning
 
MySQL Performance Tuning für Entwickler
MySQL Performance Tuning für EntwicklerMySQL Performance Tuning für Entwickler
MySQL Performance Tuning für Entwickler
 
Perf Tuning Short
Perf Tuning ShortPerf Tuning Short
Perf Tuning Short
 
MySQL Performance Tuning Variables
MySQL Performance Tuning VariablesMySQL Performance Tuning Variables
MySQL Performance Tuning Variables
 
MySQL Performance Tuning
MySQL Performance TuningMySQL Performance Tuning
MySQL Performance Tuning
 

Similar to Join-fu: The Art of SQL Tuning for MySQL

SQL Query Tuning: The Legend of Drunken Query Master
SQL Query Tuning: The Legend of Drunken Query MasterSQL Query Tuning: The Legend of Drunken Query Master
SQL Query Tuning: The Legend of Drunken Query MasterZendCon
 
QCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AIQCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AILex Yu
 
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...npinto
 
Hong Qiangning in QConBeijing
Hong Qiangning in QConBeijingHong Qiangning in QConBeijing
Hong Qiangning in QConBeijingshen liu
 
MySQL cluster 72 in the Cloud
MySQL cluster 72 in the CloudMySQL cluster 72 in the Cloud
MySQL cluster 72 in the CloudMarco Tusa
 
XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...
XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...
XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...The Linux Foundation
 
Asynchronous Io Programming
Asynchronous Io ProgrammingAsynchronous Io Programming
Asynchronous Io Programmingl xf
 
Introduction to XtraDB Cluster
Introduction to XtraDB ClusterIntroduction to XtraDB Cluster
Introduction to XtraDB Clusteryoku0825
 
豆瓣技术架构的发展历程 @ QCon Beijing 2009
豆瓣技术架构的发展历程 @ QCon Beijing 2009豆瓣技术架构的发展历程 @ QCon Beijing 2009
豆瓣技术架构的发展历程 @ QCon Beijing 2009Qiangning Hong
 
Kickin' Ass with Cache-Fu (with notes)
Kickin' Ass with Cache-Fu (with notes)Kickin' Ass with Cache-Fu (with notes)
Kickin' Ass with Cache-Fu (with notes)err
 
Deep Learning Computer Build
Deep Learning Computer BuildDeep Learning Computer Build
Deep Learning Computer BuildPetteriTeikariPhD
 
Percona Live 2012PPT: introduction-to-mysql-replication
Percona Live 2012PPT: introduction-to-mysql-replicationPercona Live 2012PPT: introduction-to-mysql-replication
Percona Live 2012PPT: introduction-to-mysql-replicationmysqlops
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementKyong-Ha Lee
 
Exploiting Your File System to Build Robust & Efficient Workflows
Exploiting Your File System to Build Robust & Efficient WorkflowsExploiting Your File System to Build Robust & Efficient Workflows
Exploiting Your File System to Build Robust & Efficient Workflowsjasonajohnson
 
Acunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu
 
Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010
Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010
Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010Arun Gupta
 
Multicore processing
Multicore processingMulticore processing
Multicore processingguestc0be34a
 

Similar to Join-fu: The Art of SQL Tuning for MySQL (20)

SQL Query Tuning: The Legend of Drunken Query Master
SQL Query Tuning: The Legend of Drunken Query MasterSQL Query Tuning: The Legend of Drunken Query Master
SQL Query Tuning: The Legend of Drunken Query Master
 
QCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AIQCon2016--Drive Best Spark Performance on AI
QCon2016--Drive Best Spark Performance on AI
 
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
IAP09 CUDA@MIT 6.963 - Lecture 01: GPU Computing using CUDA (David Luebke, NV...
 
Hong Qiangning in QConBeijing
Hong Qiangning in QConBeijingHong Qiangning in QConBeijing
Hong Qiangning in QConBeijing
 
MySQL cluster 72 in the Cloud
MySQL cluster 72 in the CloudMySQL cluster 72 in the Cloud
MySQL cluster 72 in the Cloud
 
XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...
XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...
XPDS14: MirageOS 2.0: branch consistency for Xen Stub Domains - Anil Madhavap...
 
Mysql talk
Mysql talkMysql talk
Mysql talk
 
Asynchronous Io Programming
Asynchronous Io ProgrammingAsynchronous Io Programming
Asynchronous Io Programming
 
The Smug Mug Tale
The Smug Mug TaleThe Smug Mug Tale
The Smug Mug Tale
 
Introduction to XtraDB Cluster
Introduction to XtraDB ClusterIntroduction to XtraDB Cluster
Introduction to XtraDB Cluster
 
豆瓣技术架构的发展历程 @ QCon Beijing 2009
豆瓣技术架构的发展历程 @ QCon Beijing 2009豆瓣技术架构的发展历程 @ QCon Beijing 2009
豆瓣技术架构的发展历程 @ QCon Beijing 2009
 
Kickin' Ass with Cache-Fu (with notes)
Kickin' Ass with Cache-Fu (with notes)Kickin' Ass with Cache-Fu (with notes)
Kickin' Ass with Cache-Fu (with notes)
 
Deep Learning Computer Build
Deep Learning Computer BuildDeep Learning Computer Build
Deep Learning Computer Build
 
Percona Live 2012PPT: introduction-to-mysql-replication
Percona Live 2012PPT: introduction-to-mysql-replicationPercona Live 2012PPT: introduction-to-mysql-replication
Percona Live 2012PPT: introduction-to-mysql-replication
 
MapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvementMapReduce: A useful parallel tool that still has room for improvement
MapReduce: A useful parallel tool that still has room for improvement
 
Exploiting Your File System to Build Robust & Efficient Workflows
Exploiting Your File System to Build Robust & Efficient WorkflowsExploiting Your File System to Build Robust & Efficient Workflows
Exploiting Your File System to Build Robust & Efficient Workflows
 
XS Boston 2008 Network Topology
XS Boston 2008 Network TopologyXS Boston 2008 Network Topology
XS Boston 2008 Network Topology
 
Acunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFPAcunu & OCaml: Experience Report, CUFP
Acunu & OCaml: Experience Report, CUFP
 
Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010
Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010
Running your Java EE 6 applications in the Cloud @ Silicon Valley Code Camp 2010
 
Multicore processing
Multicore processingMulticore processing
Multicore processing
 

More from ZendCon

Framework Shootout
Framework ShootoutFramework Shootout
Framework ShootoutZendCon
 
Zend_Tool: Practical use and Extending
Zend_Tool: Practical use and ExtendingZend_Tool: Practical use and Extending
Zend_Tool: Practical use and ExtendingZendCon
 
PHP on IBM i Tutorial
PHP on IBM i TutorialPHP on IBM i Tutorial
PHP on IBM i TutorialZendCon
 
PHP on Windows - What's New
PHP on Windows - What's NewPHP on Windows - What's New
PHP on Windows - What's NewZendCon
 
PHP and Platform Independance in the Cloud
PHP and Platform Independance in the CloudPHP and Platform Independance in the Cloud
PHP and Platform Independance in the CloudZendCon
 
I18n with PHP 5.3
I18n with PHP 5.3I18n with PHP 5.3
I18n with PHP 5.3ZendCon
 
Cloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayCloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayZendCon
 
Planning for Synchronization with Browser-Local Databases
Planning for Synchronization with Browser-Local DatabasesPlanning for Synchronization with Browser-Local Databases
Planning for Synchronization with Browser-Local DatabasesZendCon
 
Magento - a Zend Framework Application
Magento - a Zend Framework ApplicationMagento - a Zend Framework Application
Magento - a Zend Framework ApplicationZendCon
 
Enterprise-Class PHP Security
Enterprise-Class PHP SecurityEnterprise-Class PHP Security
Enterprise-Class PHP SecurityZendCon
 
PHP and IBM i - Database Alternatives
PHP and IBM i - Database AlternativesPHP and IBM i - Database Alternatives
PHP and IBM i - Database AlternativesZendCon
 
Zend Core on IBM i - Security Considerations
Zend Core on IBM i - Security ConsiderationsZend Core on IBM i - Security Considerations
Zend Core on IBM i - Security ConsiderationsZendCon
 
Application Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingApplication Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingZendCon
 
Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...
Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...
Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...ZendCon
 
Solving the C20K problem: Raising the bar in PHP Performance and Scalability
Solving the C20K problem: Raising the bar in PHP Performance and ScalabilitySolving the C20K problem: Raising the bar in PHP Performance and Scalability
Solving the C20K problem: Raising the bar in PHP Performance and ScalabilityZendCon
 
Joe Staner Zend Con 2008
Joe Staner Zend Con 2008Joe Staner Zend Con 2008
Joe Staner Zend Con 2008ZendCon
 
Tiery Eyed
Tiery EyedTiery Eyed
Tiery EyedZendCon
 
Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...
Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...
Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...ZendCon
 
DB2 Storage Engine for MySQL and Open Source Applications Session
DB2 Storage Engine for MySQL and Open Source Applications SessionDB2 Storage Engine for MySQL and Open Source Applications Session
DB2 Storage Engine for MySQL and Open Source Applications SessionZendCon
 
Digital Identity
Digital IdentityDigital Identity
Digital IdentityZendCon
 

More from ZendCon (20)

Framework Shootout
Framework ShootoutFramework Shootout
Framework Shootout
 
Zend_Tool: Practical use and Extending
Zend_Tool: Practical use and ExtendingZend_Tool: Practical use and Extending
Zend_Tool: Practical use and Extending
 
PHP on IBM i Tutorial
PHP on IBM i TutorialPHP on IBM i Tutorial
PHP on IBM i Tutorial
 
PHP on Windows - What's New
PHP on Windows - What's NewPHP on Windows - What's New
PHP on Windows - What's New
 
PHP and Platform Independance in the Cloud
PHP and Platform Independance in the CloudPHP and Platform Independance in the Cloud
PHP and Platform Independance in the Cloud
 
I18n with PHP 5.3
I18n with PHP 5.3I18n with PHP 5.3
I18n with PHP 5.3
 
Cloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go AwayCloud Computing: The Hard Problems Never Go Away
Cloud Computing: The Hard Problems Never Go Away
 
Planning for Synchronization with Browser-Local Databases
Planning for Synchronization with Browser-Local DatabasesPlanning for Synchronization with Browser-Local Databases
Planning for Synchronization with Browser-Local Databases
 
Magento - a Zend Framework Application
Magento - a Zend Framework ApplicationMagento - a Zend Framework Application
Magento - a Zend Framework Application
 
Enterprise-Class PHP Security
Enterprise-Class PHP SecurityEnterprise-Class PHP Security
Enterprise-Class PHP Security
 
PHP and IBM i - Database Alternatives
PHP and IBM i - Database AlternativesPHP and IBM i - Database Alternatives
PHP and IBM i - Database Alternatives
 
Zend Core on IBM i - Security Considerations
Zend Core on IBM i - Security ConsiderationsZend Core on IBM i - Security Considerations
Zend Core on IBM i - Security Considerations
 
Application Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server TracingApplication Diagnosis with Zend Server Tracing
Application Diagnosis with Zend Server Tracing
 
Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...
Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...
Insights from the Experts: How PHP Leaders Are Transforming High-Impact PHP A...
 
Solving the C20K problem: Raising the bar in PHP Performance and Scalability
Solving the C20K problem: Raising the bar in PHP Performance and ScalabilitySolving the C20K problem: Raising the bar in PHP Performance and Scalability
Solving the C20K problem: Raising the bar in PHP Performance and Scalability
 
Joe Staner Zend Con 2008
Joe Staner Zend Con 2008Joe Staner Zend Con 2008
Joe Staner Zend Con 2008
 
Tiery Eyed
Tiery EyedTiery Eyed
Tiery Eyed
 
Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...
Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...
Make your PHP Application Software-as-a-Service (SaaS) Ready with the Paralle...
 
DB2 Storage Engine for MySQL and Open Source Applications Session
DB2 Storage Engine for MySQL and Open Source Applications SessionDB2 Storage Engine for MySQL and Open Source Applications Session
DB2 Storage Engine for MySQL and Open Source Applications Session
 
Digital Identity
Digital IdentityDigital Identity
Digital Identity
 

Recently uploaded

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Recently uploaded (20)

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Join-fu: The Art of SQL Tuning for MySQL

  • 1. Join-fu: The Art of SQL Jay Pipes Community Relations Manager MySQL jay@mysql.com http://jpipes.com These slides released under the Creative Commons Attribution­Noncommercial­Share Alike 3.0 License
  • 2. before we start ● Who am I? – Just some dude who works at MySQL (eh...Sun) – Oh, I co-wrote a book on MySQL – Active PHP/MySQL community member – Other than that, semi-normal geek, married, 2 dogs, 2 cats, blah blah ● This talk is about how to code your app to get the best performance out of your (My)SQL 09/18/08 zendcon08 - join-fu 2
  • 3. system architecture of MySQL Clients Query Cache Query Cache Net I/O Parser “Packaging” Optimizer Pluggable Storage Engine API Cluster MyISAM InnoDB MEMORY Falcon Archive PBXT SolidDB (Ndb) 09/18/08 zendcon08 - join-fu 3
  • 4. the schema ● Basic foundation of performance ● Normalize first, de-normalize later ● Smaller, smaller, smaller ● Divide and conquer The Leaning Tower of Pisa from Wikipedia: ● Understand benefits and “Although intended to stand vertically, the tower began leaning to the southeast disadvantages of different soon after the onset of construction in 1173 due to a poorly laid foundation and loose substrate that has allowed the storage engines foundation to shift direction.” 09/18/08 zendcon08 - join-fu 4
  • 5. taking normalization way too far Hmm...... DateDate? http://thedailywtf.com/forums/thread/75982.aspx 09/18/08 zendcon08 - join-fu 5
  • 6. smaller, smaller, smaller The more records you can fit into a single page of  memory/disk, the faster your seeks and scans will be ● Do you really need that BIGINT? ● Use INT UNSIGNED for IPv4 addresses ● Use VARCHAR carefully – Converted to CHAR when used in a temporary table The Pygmy Marmoset world's smallest monkey ● Use TEXT sparingly This picture is a cheap stunt intended to induce kind feelings – Consider separate tables for the presenter. ● Use BLOBs very sparingly Oh, and I totally want one of these guys for a pet. – Use the filesystem for what it was intended 09/18/08 zendcon08 - join-fu 6
  • 7. handling IPv4 addresses CREATE TABLE Sessions ( session_id INT UNSIGNED NOT NULL AUTO_INCREMENT , ip_address INT UNSIGNED NOT NULL // Compare to CHAR(15)... , session_data TEXT NOT NULL , PRIMARY KEY (session_id) , INDEX (ip_address) // Insert a new dummy record ) ENGINE=InnoDB; INSERT INTO Sessions VALUES (NULL, INET_ATON('192.168.0.2'), 'some session data'); INSERT INTO Session VALUES (NULL, 3232235522, 'some session data'); // Find all sessions coming from a local subnet SELECT session_id , ip_address as ip_raw , INET_NTOA(ip_address) as ip , session_data FROM Sessions WHERE ip_address BETWEEN INET_ATON('192.168.0.1') WHERE ip_address BETWEEN 3232235521 AND 3232235775 AND INET_ATON('192.168.0.255'); mysql> SELECT session_id, ip_address as ip_raw, INET_NTOA(ip_address) as ip, session_data -> FROM Sessions -> WHERE ip_address BETWEEN -> INET_ATON('192.168.0.1') AND INET_ATON('192.168.0.255'); +------------+------------+-------------+-------------------+ | session_id | ip_raw | ip | session_data | +------------+------------+-------------+-------------------+ | 1 | 3232235522 | 192.168.0.2 | some session data | +------------+------------+-------------+-------------------+ 09/18/08 zendcon08 - join-fu 7
  • 8. how much storage space is consumed? ● With this definition, how many bytes will the “a” column consume per row? CREATE TABLE t1 ( a INT(1) UNSIGNED NOT NULL ); ● The number in parentheses is the ZEROFILL argument, not the storage space ● INT takes 4 bytes of space – Regardless of the UNSIGNED or NOT NULL 09/18/08 zendcon08 - legend of drunken query master 8
  • 9. divide et impera ● Vertical partitioning – Split tables with many columns into multiple tables ● Horizontal partitioning – Split table with many rows into Niccolò Machiavelli multiple tables The Art of War, (1519-1520): “A Captain ought, among all the other ● Partitioning in MySQL 5.1 is actions of his, endeavor with every art to divide the forces of the transparent horizontal partitioning enemy, either by making him suspicious of his men in whom he within the DB... trusted, or by giving him cause that he has to separate his forces, and, because of this, become weaker.” ...and it's got issues at the moment. 09/18/08 zendcon08 - join-fu 9
  • 10. vertical partitioning CREATE TABLE Users ( CREATE TABLE Users ( user_id INT NOT NULL AUTO_INCREMENT user_id INT NOT NULL AUTO_INCREMENT , email VARCHAR(80) NOT NULL , email VARCHAR(80) NOT NULL , display_name VARCHAR(50) NOT NULL , display_name VARCHAR(50) NOT NULL , password CHAR(41) NOT NULL , password CHAR(41) NOT NULL , PRIMARY KEY (user_id) , first_name VARCHAR(25) NOT NULL , UNIQUE INDEX (email) , last_name VARCHAR(25) NOT NULL ) ENGINE=InnoDB; CREATE TABLE UserExtra ( , address VARCHAR(80) NOT NULL user_id INT NOT NULL , city VARCHAR(30) NOT NULL , first_name VARCHAR(25) NOT NULL , province CHAR(2) NOT NULL , last_name VARCHAR(25) NOT NULL , postcode CHAR(7) NOT NULL , address VARCHAR(80) NOT NULL , interests TEXT NULL , city VARCHAR(30) NOT NULL , bio TEXT NULL , province CHAR(2) NOT NULL , signature TEXT NULL , postcode CHAR(7) NOT NULL , skills TEXT NULL , interests TEXT NULL , PRIMARY KEY (user_id) , bio TEXT NULL , UNIQUE INDEX (email) , signature TEXT NULL ) ENGINE=InnoDB; , skills TEXT NULL , PRIMARY KEY (user_id) , FULLTEXT KEY (interests, skills) ) ENGINE=MyISAM; ● Mixing frequently and infrequently accessed attributes in a single table? ● Space in buffer pool at a premium? – Splitting the table allows main records to consume the buffer pages without the extra data taking up space in memory ● Need FULLTEXT on your text columns? 09/18/08 zendcon08 - join-fu 10
  • 11. the MySQL query cache ● You must understand your Query application's read/write patterns Clients Cache ● Internal query cache design is a Query compromise between CPU usage Cache Net I/O Parser and read performance ● Stores the MYSQL_RESULT of a “Packaging” SELECT along with a hash of the Optimizer SELECT SQL statement Pluggable Storage Engine API ● Any modification to any table involved in the SELECT invalidates the stored result MyISAM InnoDB MEMORY Falcon Archive PBXT SolidDB Cluster (Ndb) ● Write applications to be aware of the query cache – Use SELECT SQL_NO_CACHE 09/18/08 zendcon08 - join-fu 11
  • 12. vertical partitioning ... continued CREATE TABLE Products ( CREATE TABLE Products ( product_id INT NOT NULL product_id INT NOT NULL , name VARCHAR(80) NOT NULL , name VARCHAR(80) NOT NULL , unit_cost DECIMAL(7,2) NOT NULL , unit_cost DECIMAL(7,2) NOT NULL , description TEXT NULL , description TEXT NULL , image_path TEXT NULL , image_path TEXT NULL , PRIMARY KEY (product_id) , num_views INT UNSIGNED NOT NULL , INDEX (name(20)) , num_in_stock INT UNSIGNED NOT NULL ) ENGINE=InnoDB; ProductCounts ( CREATE TABLE , num_on_order INT UNSIGNED NOT NULL product_id INT NOT NULL , PRIMARY KEY (product_id) , num_views INT UNSIGNED NOT NULL , INDEX (name(20)) , num_in_stock INT UNSIGNED NOT NULL ) ENGINE=InnoDB; , num_on_order INT UNSIGNED NOT NULL , PRIMARY KEY (product_id) // Getting a simple COUNT of products ) ENGINE=InnoDB; // easy on MyISAM, terrible on InnoDB SELECT COUNT(*) CREATE TABLE TableCounts ( FROM Products; total_products INT UNSIGNED NOT NULL ) ENGINE=MEMORY; ● Mixing static attributes with frequently updated fields in a single table? – Thrashing occurs with query cache. Each time an update occurs on any record in the table, all queries referencing the table are invalidated in the query cache ● Doing COUNT(*) with no WHERE on an indexed field on an InnoDB table? – Complications with versioning make full table counts very slow 09/18/08 zendcon08 - join-fu 12
  • 13. Don't be afraid of SQL! It (usually) doesn't bite. 09/18/08 zendcon08 - legend of drunken query master 13
  • 14. coding like a join-fu master ● Be consistent (for crying out loud) ● Use ANSI SQL coding style ● Do not think in terms of iterators, for loops, while loops, etc ● Instead, think in terms of sets Did you know? Break complex SQL statements from Wikipedia: ● Join-fu is a close cousin to Jun Fan Gung Fu, the method of martial arts Bruce Lee began teaching in 1959. (or business requests) into OK, not really. smaller, manageable chunks 09/18/08 zendcon08 - join-fu 14
  • 15. join-fu guidelines ● Always try variations on a theme ● Beware of join hints – Can get “out of date” See, even bears practice join-fu. ● Just because it can be done in a single SQL statement doesn't mean it should ● Always test and benchmark your solutions – Use http_load (simple and effective for web stuff) 09/18/08 zendcon08 - join-fu 15
  • 16. SQL coding consistency Inconsistent SQL code makes baby ● Tabs and spacing kittens cry. ● Upper and lower case ● Keywords, function names ● Some columns aliased, some not SELECT a.first_name, a.last_name, COUNT(*) as num_rentals FROM actor a INNER JOIN film f ● Consider your ON a.actor_id = fa.actor_id GROUP BY a.actor_id teammates ORDER BY num_rentals DESC, a.last_name, a.first_name Like your LIMIT 10; ● vs. select first_name, a.last_name, count(*) AS num_rentals programming code, FROM actor a join film f on a.actor_id = fa.actor_id group by a.actor_id order by SQL is meant to be num_rentals DESC, a.last_name, a.first_name LIMIT 10; read, not written 09/18/08 zendcon08 - legend of drunken query master 16
  • 17. ANSI vs. Theta SQL coding style SELECT a.first_name, a.last_name, COUNT(*) as num_rentals ANSI STYLE FROM actor a INNER JOIN film f ON a.actor_id = fa.actor_id Explicitly declare JOIN  INNER JOIN film_actor fa conditions using the ON  ON fa.film_id = f.film_id clause INNER JOIN inventory I ON f.film_id = i.film_id INNER JOIN rental r ON r.inventory_id = i.inventory_id GROUP BY a.actor_id ORDER BY num_rentals DESC, a.last_name, a.first_name LIMIT 10; SELECT a.first_name, a.last_name, COUNT(*) as num_rentals FROM actor a, film f, film_actor fa, inventory i, rental r Theta STYLE WHERE a.actor_id = fa.actor_id AND fa.film_id = f.film_id Implicitly declare JOIN  AND f.film_id = i.film_id AND r.inventory_id GROUP BY a.actor_id= i.inventory_id conditions in the WHERE  GROUP ORDER BY num_rentals DESC, a.last_name, a.first_name a.actor_id clause ORDER BY LIMIT 10;num_rentals DESC, a.last_name, a.first_name LIMIT 10; 09/18/08 zendcon08 - join-fu 17
  • 18. why ANSI style's join-fu kicks Theta style's ass ● MySQL only supports the INNER and CROSS join for the Theta style – But, MySQL supports the INNER, CROSS, LEFT, RIGHT, and NATURAL joins of the ANSI style – Mixing and matching both styles can lead to hard-to- read SQL code ● It is supremely easy to miss a join condition with Theta style – Especially when joining many tables together – Leaving off a join condition by accident in the WHERE clause will lead to a cartesian product 09/18/08 zendcon08 - join-fu 18
  • 19. EXPLAIN Basics ● Provides the execution plan chosen by the MySQL optimizer for a specific SELECT statement ● Simply append the word EXPLAIN to the beginning of your SELECT statement ● Each row in output represents a set of information used in the SELECT – A real schema table – A virtual table (derived table) or temporary table – A subquery in SELECT or WHERE – A unioned set 09/18/08 OSCON 2007 - Target Practice 19
  • 20. EXPLAIN columns ● select_type - type of “set” the data in this row contains ● table - alias (or full table name if no alias) of the table or derived table from which the data in this set comes ● type - “access strategy” used to grab the data in this set ● possible_keys - keys available to optimizer for query ● keys - keys chosen by the optimizer ● rows - estimate of the number of rows in this set ● Extra - information the optimizer chooses to give you ● ref - shows the column used in join relations 09/18/08 OSCON 2007 - Target Practice 20
  • 21. Example #1 - the const access type EXPLAIN SELECT * FROM rental WHERE rental_id = 13G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: const possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: const rows: 1 Extra: 1 row in set (0.00 sec) 09/18/08 OSCON 2007 - Target Practice 21
  • 22. Constants in the optimizer ● a field indexed with a unique non-nullable key ● The access strategy of system is related to const and refers to when a table with only a single row is referenced in the SELECT ● Can be propogated across joined columns 09/18/08 OSCON 2007 - Target Practice 22
  • 23. Example #2 - constant propogation EXPLAIN SELECT r.*, c.first_name, c.last_name FROM rental r INNER JOIN customer c ON r.customer_id = c.customer_id WHERE r.rental_id = 13G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: r type: const possible_keys: PRIMARY,idx_fk_customer_id key: PRIMARY key_len: 4 ref: const rows: 1 Extra: *************************** 2. row *************************** id: 1 select_type: SIMPLE table: c type: const possible_keys: PRIMARY key: PRIMARY key_len: 2 ref: const /* Here is where the propogation occurs...*/ rows: 1 Extra: 2 rows in set (0.00 sec) 09/18/08 OSCON 2007 - Target Practice 23
  • 24. Example #3 - the range access type SELECT * FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-16'G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: range possible_keys: rental_date key: rental_date key_len: 8 ref: NULL rows: 364 Extra: Using where 1 row in set (0.00 sec) 09/18/08 OSCON 2007 - Target Practice 24
  • 25. Considerations with range accesses ● Index must be available on the field operated upon by a range operator ● If too many records are estimated to be returned by the condition, the range optimization won't be used – index or full table scan will be used instead ● The indexed field must not be operated on by a function call! (Important for all indexing) 09/18/08 OSCON 2007 - Target Practice 25
  • 26. The scan vs. seek dilemma ● A seek operation, generally speaking, jumps into a random place -- either on disk or in memory -- to fetch the data needed. – Repeat for each piece of data needed from disk or memory ● A scan operation, on the other hand, will jump to the start of a chunk of data, and sequentially read data -- either from disk or from memory -- until the end of the chunk of data ● For large amounts of data, scan operations tend to be more efficient than multiple seek operations 09/18/08 OSCON 2007 - Target Practice 26
  • 27. Example #4 - Full table scan EXPLAIN SELECT * FROM rental WHERE rental_date BETWEEN '2005-06-14' AND '2005-06-21'G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: ALL possible_keys: rental_date /* larger range forces scan choice */ key: NULL key_len: NULL ref: NULL rows: 16298 Extra: Using where 1 row in set (0.00 sec) 09/18/08 OSCON 2007 - Target Practice 27
  • 28. Why full table scans pop up ● No WHERE condition (duh.) ● No index on any field in WHERE condition ● Poor selectivity on an indexed field ● Too many records meet WHERE condition ● < MySQL 5.0 and using OR in a WHERE clause 09/18/08 OSCON 2007 - Target Practice 28
  • 29. Example #5 - Full index scan EXPLAIN SELECT rental_id, rental_date FROM rentalG *************************** 1. row *************************** id: 1 select_type: SIMPLE table: rental type: index possible_keys: NULL key: rental_date key_len: 13 ref: NULL CAUTION! rows: 16325 Extra=“Using index”  Extra: Using index is NOT the same as  1 row in set (0.00 sec) type=”index” 09/18/08 OSCON 2007 - Target Practice 29
  • 30. indexed columns and functions don't mix mysql> EXPLAIN SELECT * FROM film WHERE title LIKE 'Tr%'G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: range possible_keys: idx_title key: idx_title key_len: 767 ref: NULL rows: 15 Extra: Using where ● A fast range access strategy is chosen by the optimizer, and the index on title is used to winnow the query results down mysql> EXPLAIN SELECT * FROM film WHERE LEFT(title,2) = 'Tr' G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: film type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 951 Extra: Using where ● A slow full table scan (the ALL access strategy) is used because a function (LEFT) is operating on the title column 09/18/08 zendcon08 - join-fu 30
  • 31. solving multiple issues in a SELECT query SELECT * FROM Orders WHERE TO_DAYS(CURRENT_DATE()) – TO_DAYS(order_created) <= 7; ● First, we are operating on an indexed column (order_created) with a function – let's fix that: SELECT * FROM Orders WHERE order_created >= CURRENT_DATE() - INTERVAL 7 DAY; ● Although we rewrote the WHERE expression to remove the operating function, we still have a non-deterministic function in the statement, which eliminates this query from being placed in the query cache – let's fix that: SELECT * FROM Orders WHERE order_created >= '2008-01-11' - INTERVAL 7 DAY; ● We replaced the function with a constant (probably using our application programming language). However, we are specifying SELECT * instead of the actual fields we need from the table. ● What if there is a TEXT field in Orders called order_memo that we don't need to see? Well, having it included in the result means a larger result set which may not fit into the query cache and may force a disk-based temporary table SELECT order_id, customer_id, order_total, order_created FROM Orders WHERE order_created >= '2008-01-11' - INTERVAL 7 DAY; 09/18/08 zendcon08 - join-fu 31
  • 32. set-wise problem solving “Show the last payment information for each customer” CREATE TABLE `payment` ( `payment_id` smallint(5) unsigned NOT NULL auto_increment, `customer_id` smallint(5) unsigned NOT NULL, `staff_id` tinyint(3) unsigned NOT NULL, `rental_id` int(11) default NULL, `amount` decimal(5,2) NOT NULL, `payment_date` datetime NOT NULL, `last_update` timestamp NOT NULL ... on update CURRENT_TIMESTAMP, PRIMARY KEY (`payment_id`), KEY `idx_fk_staff_id` (`staff_id`), KEY `idx_fk_customer_id` (`customer_id`), KEY `fk_payment_rental` (`rental_id`), CONSTRAINT `fk_payment_rental` FOREIGN KEY (`rental_id`) REFERENCES `rental` (`rental_id`), CONSTRAINT `fk_payment_customer` FOREIGN KEY (`customer_id`) REFERENCES `customer` (`customer_id`) , CONSTRAINT `fk_payment_staff` FOREIGN KEY (`staff_id`) REFERENCES `staff` (`staff_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 http://forge.mysql.com/wiki/SakilaSampleDB 09/18/08 zendcon08 - join-fu 32
  • 33. thinking in terms of foreach loops... OK, for each customer, find the maximum date the payment was made and get that payment record(s) mysql> EXPLAIN SELECT -> p.* -> FROM payment p -> WHERE p.payment_date = ● A correlated subquery in the -> ( SELECT MAX(payment_date) -> FROM payment -> WHERE customer_id=p.customer_id WHERE clause is used -> )G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: p type: ALL rows: 16567 Extra: Using where ● It will be re- executed for each *************************** 2. row *************************** id: 2 select_type: DEPENDENT SUBQUERY row in the primary table: payment type: ref possible_keys: idx_fk_customer_id key: idx_fk_customer_id key_len: 2 ref: sakila.p.customer_id rows: 15 table (payment) 2 rows in set (0.00 sec) ● Produces 623 rows in an average of 1.03s 09/18/08 zendcon08 - join-fu 33
  • 34. what about adding an index? Will adding an index on (customer_id, payment_date) make a difference? mysql> EXPLAIN SELECT mysql> EXPLAIN SELECT -> p.* -> p.* -> FROM payment p -> FROM payment p -> WHERE p.payment_date = -> WHERE p.payment_date = -> ( SELECT MAX(payment_date) -> ( SELECT MAX(payment_date) -> FROM payment -> FROM payment -> WHERE customer_id=p.customer_id -> WHERE customer_id=p.customer_id -> )G -> )G *************************** 1. row ************************ *************************** 1. row *************************** id: 1 id: 1 select_type: PRIMARY select_type: PRIMARY table: p table: p type: ALL type: ALL rows: 16567 rows: 15485 Extra: Using where Extra: Using where *************************** 2. row ************************ *************************** 2. row *************************** id: 2 id: 2 select_type: DEPENDENT SUBQUERY select_type: DEPENDENT SUBQUERY table: payment table: payment type: ref type: ref possible_keys: idx_fk_customer_id possible_keys: idx_fk_customer_id,ix_customer_paydate key: idx_fk_customer_id key: ix_customer_paydate key_len: 2 key_len: 2 ref: sakila.p.customer_id ref: sakila.p.customer_id rows: 15 rows: 14 Extra: Using index 2 rows in set (0.00 sec) 2 rows in set (0.00 sec) ● Produces 623 rows in ● Produces 623 rows in an average of 1.03s 09/18/08 zendcon08 - join-fu an average of 0.45s 34
  • 35. thinking in terms of sets... OK, I have one set of last payments dates and another set containing payment information (so, how do I join these sets?) mysql> EXPLAIN SELECT -> p.* -> FROM ( -> SELECT customer_id, MAX(payment_date) as last_order ● A derived table, or subquery in the -> FROM payment -> GROUP BY customer_id -> ) AS last_orders -> INNER JOIN payment p FROM clause, is used -> ON p.customer_id = last_orders.customer_id -> AND p.payment_date = last_orders.last_orderG *************************** 1. row *************************** id: 1 select_type: PRIMARY The derived table table: <derived2> type: ALL ● rows: 599 *************************** 2. row *************************** id: 1 select_type: PRIMARY table: p represents a set: last payment dates type: ref possible_keys: idx_fk_customer_id,ix_customer_paydate key: ix_customer_paydate key_len: 10 ref: last_orders.customer_id,last_orders.last_order rows: 1 *************************** 3. row *************************** id: 2 of customers select_type: DERIVED table: payment type: range key: ix_customer_paydate ● Produces 623 rows in an average of 0.03s key_len: 2 rows: 1107 Extra: Using index for group-by 3 rows in set (0.02 sec) 09/18/08 zendcon08 - join-fu 35
  • 36. working with “mapping” or N:M tables CREATE TABLE Project ( CREATE TABLE Tags ( project_id INT UNSIGNED NOT NULL AUTO_INCREMENT tag_id INT UNSIGNED NOT NULL AUTO_INCREMENT , name VARCHAR(50) NOT NULL , tag_text VARCHAR(50) NOT NULL , url TEXT NOT NULL , PRIMARY KEY (tag_id) , PRIMARY KEY (project_id) , INDEX (tag_text) ) ENGINE=MyISAM; ) ENGINE=MyISAM; CREATE TABLE Tag2Project ( tag INT UNSIGNED NOT NULL , project INT UNSIGNED NOT NULL , PRIMARY KEY (tag, project) , INDEX rv_primary (project, tag) ) ENGINE=MyISAM; ● The next few slides will walk through examples of querying across the above relationship – dealing with OR conditions – dealing with AND conditions 09/18/08 zendcon08 - join-fu 36
  • 37. dealing with OR conditions Grab all project names which are tagged with “mysql” OR “php” mysql> SELECT p.name FROM Project p -> INNER JOIN Tag2Project t2p -> ON p.project_id = t2p.project -> INNER JOIN Tag t -> ON t2p.tag = t.tag_id -> WHERE t.tag_text IN ('mysql','php'); +---------------------------------------------------------+ | name | +---------------------------------------------------------+ | phpMyAdmin | ... | MySQL Stored Procedures Auto Generator | +---------------------------------------------------------+ 90 rows in set (0.05 sec) +----+-------------+-------+--------+----------------------+--------------+---------+-------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+----------------------+--------------+---------+-------------+------+-------------+ | 1 | SIMPLE | t | range | PRIMARY,uix_tag_text | uix_tag_text | 52 | NULL | 2 | Using where | | 1 | SIMPLE | t2p | ref | PRIMARY,rv_primary | PRIMARY | 4 | t.tag_id | 10 | Using index | | 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | t2p.project | 1 | | +----+-------------+-------+--------+----------------------+--------------+---------+-------------+------+-------------+ 3 rows in set (0.00 sec) ● Note the order in which the optimizer chose to join the tables is exactly the opposite of how we wrote our SELECT 09/18/08 zendcon08 - join-fu 37
  • 38. dealing with AND conditions Grab all project names which are tagged with “storage engine” AND “plugin” ● A little more complex, let's mysql> SELECT p.name FROM Project p -> INNER JOIN ( grab the project names -> SELECT t2p.project -> FROM Tag2Project t2p which match both the -> INNER JOIN Tag t -> ON t2p.tag = t.tag_id -> WHERE t.tag_text IN ('plugin','storage engine') “mysql” tag and the “php” -> GROUP BY t2p.project -> HAVING COUNT(*) = 2 -> ) AS projects_having_all_tags tag -> ON p.project_id = projects_having_all_tags.project; +-----------------------------------+ | name | +-----------------------------------+ ● Here is perhaps the most | Automatic data revision | | memcache storage engine for MySQL | +-----------------------------------+ common method – using a 2 rows in set (0.01 sec) HAVING COUNT(*) against a GROUP BY on the relationship table ● EXPLAIN on next page... 09/18/08 zendcon08 - join-fu 38
  • 39. the dang filesort ● The EXPLAIN plan shows *************************** 1. row *************************** id: 1 select_type: PRIMARY the execution plan using table: <derived2> type: ALL a derived table rows: 2 *************************** 2. row *************************** id: 1 containing the project select_type: PRIMARY table: p type: eq_ref IDs having records in the possible_keys: PRIMARY key: PRIMARY Tag2Project table with key_len: 4 ref: projects_having_all_tags.project rows: 1 both the “plugin” and *************************** 3. row *************************** id: 2 select_type: DERIVED “storage engine” tags table: t type: range possible_keys: PRIMARY,uix_tag_text key: uix_tag_text ● Note that a filesort is key_len: 52 rows: 2 Extra: Using where; Using index; Using temporary; Using filesort needed on the Tag table *************************** 4. row *************************** id: 2 rows since we use the select_type: DERIVED table: t2p type: ref index on tag_text but possible_keys: PRIMARY key: PRIMARY key_len: 4 need a sorted list of ref: mysqlforge.t.tag_id rows: 1 tag_id values to use in Extra: Using index 4 rows in set (0.00 sec) the join 09/18/08 zendcon08 - join-fu 39
  • 40. removing the filesort using CROSS JOIN ● We can use a CROSS JOIN technique to remove the filesort – We winnow down two copies of the Tag table (t1 and t2) by supplying constants in the WHERE condition ● This means no need for a sorted list of tag IDs since we already have the two tag IDs available from the CROSS JOINs...so no more filesort mysql> EXPLAIN SELECT p.name -> FROM Project p -> CROSS JOIN Tag t1 -> CROSS JOIN Tag t2 -> INNER JOIN Tag2Project t2p -> ON p.project_id = t2p.project -> AND t2p.tag = t1.tag_id -> INNER JOIN Tag2Project t2p2 -> ON t2p.project = t2p2.project -> AND t2p2.tag = t2.tag_id -> WHERE t1.tag_text = quot;pluginquot; -> AND t2.tag_text = quot;storage enginequot;; +----+-------------+-------+--------+----------------------+--------------+---------+------------------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+----------------------+--------------+---------+------------------------------+------+-------------+ | 1 | SIMPLE | t1 | const | PRIMARY,uix_tag_text | uix_tag_text | 52 | const | 1 | Using index | | 1 | SIMPLE | t2 | const | PRIMARY,uix_tag_text | uix_tag_text | 52 | const | 1 | Using index | | 1 | SIMPLE | t2p | ref | PRIMARY,rv_primary | PRIMARY | 4 | const | 9 | Using index | | 1 | SIMPLE | t2p2 | eq_ref | PRIMARY,rv_primary | PRIMARY | 8 | const,mysqlforge.t2p.project | 1 | Using index | | 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | mysqlforge.t2p2.project | 1 | Using where | +----+-------------+-------+--------+----------------------+--------------+---------+------------------------------+------+-------------+ 5 rows in set (0.00 sec) 09/18/08 zendcon08 - join-fu 40
  • 41. another technique for dealing with ANDs ● Do two separate queries – one which grabs tag_id mysql> SELECT t.tag_id FROM Tag t > WHERE tag_text IN (quot;pluginquot;,quot;storage enginequot;); values based on the tag text and another which does +--------+ | tag_id | a self-join after the application has the tag_id values +--------+ | 173 | in memory | 259 | +--------+ 2 rows in set (0.00 sec) Benefit #1 mysql> SELECT p.name FROM Project p -> INNER JOIN Tag2Project t2p -> ON p.project_id = t2p.project ● If we assume the Tag2Project table is updated 10X -> AND t2p.tag = 173 -> INNER JOIN Tag2Project t2p2 more than the Tag table is updated, the first query -> ON t2p.project = t2p2.project -> AND t2p2.tag = 259; on Tag will be cached more effectively in the query +-----------------------------------+ | name | cache +-----------------------------------+ | Automatic data revision | | memcache storage engine for MySQL | Benefit #2 +-----------------------------------+ 2 rows in set (0.00 sec) ● The EXPLAIN on the self-join query is much better than the HAVING COUNT(*) derived table solution mysql> EXPLAIN SELECT p.name FROM Project p -> INNER JOIN Tag2Project t2p -> ON p.project_id = t2p.project -> AND t2p.tag = 173 -> INNER JOIN Tag2Project t2p2 -> ON t2p.project = t2p2.project -> AND t2p2.tag = 259; +----+-------------+-------+--------+--------------------+---------+---------+------------------------------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+--------------------+---------+---------+------------------------------+------+-------------+ | 1 | SIMPLE | t2p | ref | PRIMARY,rv_primary | PRIMARY | 4 | const | 9 | Using index | | 1 | SIMPLE | t2p2 | eq_ref | PRIMARY,rv_primary | PRIMARY | 8 | const,mysqlforge.t2p.project | 1 | Using index | | 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | mysqlforge.t2p2.project | 1 | Using where | +----+-------------+-------+--------+--------------------+---------+---------+------------------------------+------+-------------+ 3 rows in set (0.00 sec) 09/18/08 zendcon08 - join-fu 41
  • 42. understanding LEFT-join-fu CREATE TABLE Project ( CREATE TABLE Tags ( project_id INT UNSIGNED NOT NULL AUTO_INCREMENT tag_id INT UNSIGNED NOT NULL AUTO_INCREMENT , name VARCHAR(50) NOT NULL , tag_text VARCHAR(50) NOT NULL , url TEXT NOT NULL , PRIMARY KEY (tag_id) , PRIMARY KEY (project_id) , INDEX (tag_text) ) ENGINE=MyISAM; ) ENGINE=MyISAM; CREATE TABLE Tag2Project ( tag INT UNSIGNED NOT NULL , project INT UNSIGNED NOT NULL , PRIMARY KEY (tag, project) , INDEX rv_primary (project, tag) ) ENGINE=MyISAM; ● Get the tag phrases which are not related to any project ● Get the tag phrases which are not related to any project OR the tag phrase is related to project #75 ● Get the tag phrases not related to project #75 09/18/08 zendcon08 - join-fu 42
  • 43. LEFT join-fu: starting simple...the NOT EXISTS Get the tag phrases mysql> EXPLAIN SELECT -> t.tag_text ● -> FROM Tag t -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag -> WHERE t2p.project IS NULL -> GROUP BY t.tag_textG which are not related to any *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t project type: index possible_keys: NULL key: uix_tag_text key_len: 52 rows: 1126 Extra: Using index *************************** 2. row *************************** id: 1 select_type: SIMPLE ● LEFT JOIN ... WHERE table: t2p type: ref possible_keys: PRIMARY x IS NULL key: PRIMARY WHERE x IS NOT key_len: 4 ref: mysqlforge.t.tag_id ● rows: 1 Extra: Using where; Using index; Not exists NULL would yield tag 2 rows in set (0.00 sec) mysql> SELECT -> t.tag_text -> FROM Tag t -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag phrases that are related to a project -> WHERE t2p.project IS NULL -> GROUP BY t.tag_text; +--------------------------------------+ | tag_text | +--------------------------------------+ But, then, you'd want <snip> +--------------------------------------+ 153 rows in set (0.01 sec) – 09/18/08 zendcon08 - join-fu to use an INNER JOIN43
  • 44. LEFT join-fu: a little harder Get the tag phrases mysql> EXPLAIN SELECT -> t.tag_text ● -> FROM Tag t -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag -> WHERE t2p.project IS NULL -> OR t2p.project = 75 which are not related to any -> GROUP BY t.tag_textG *************************** 1. row *************************** id: 1 select_type: SIMPLE project OR the tag table: t type: index key: uix_tag_text key_len: 52 ref: NULL rows: 1126 Extra: Using index phrase is related to project #75 *************************** 2. row *************************** id: 1 select_type: SIMPLE table: t2p type: ref No more NOT EXISTS possible_keys: PRIMARY key: PRIMARY ● key_len: 4 ref: mysqlforge.t.tag_id rows: 1 Extra: Using where; Using index 2 rows in set (0.00 sec) optimization :( mysql> SELECT -> t.tag_text -> FROM Tag t -> LEFT JOIN Tag2Project t2p ● But, isn't this essentially a UNION? -> ON t.tag_id = t2p.tag -> WHERE t2p.project IS NULL -> OR t2p.project = 75 -> GROUP BY t.tag_text; +--------------------------------------+ | tag_text | +--------------------------------------+ <snip> +--------------------------------------+ 184 rows in set (0.00 sec) 09/18/08 zendcon08 - join-fu 44
  • 45. LEFT join-fu: a UNION returns us to optimization mysql> EXPLAIN SELECT *************************** 4. row *************************** -> t.tag_text id: 2 -> FROM Tag t select_type: UNION -> LEFT JOIN Tag2Project t2p table: t -> ON t.tag_id = t2p.tag type: eq_ref -> WHERE t2p.project IS NULL possible_keys: PRIMARY -> GROUP BY t.tag_text key: PRIMARY -> UNION ALL key_len: 4 -> SELECT ref: mysqlforge.t2p.tag -> t.tag_text rows: 1 -> FROM Tag t Extra: -> INNER JOIN Tag2Project t2p *************************** 5. row *************************** -> ON t.tag_id = t2p.tag id: NULL -> WHERE t2p.project = 75G select_type: UNION RESULT *************************** 1. row *************************** table: <union1,2> id: 1 5 rows in set (0.00 sec) select_type: PRIMARY table: t type: index mysql> SELECT key: uix_tag_text -> t.tag_text key_len: 52 -> FROM Tag t rows: 1126 -> LEFT JOIN Tag2Project t2p Extra: Using index -> ON t.tag_id = t2p.tag *************************** 2. row *************************** -> WHERE t2p.project IS NULL id: 1 -> GROUP BY t.tag_text select_type: PRIMARY -> UNION ALL table: t2p -> SELECT type: ref -> t.tag_text key: PRIMARY -> FROM Tag t key_len: 4 -> INNER JOIN Tag2Project t2p ref: mysqlforge.t.tag_id -> ON t.tag_id = t2p.tag rows: 1 -> WHERE t2p.project = 75; Extra: Using where; Using index; Not exists +--------------------------------------+ *************************** 3. row *************************** | tag_text | id: 2 +--------------------------------------+ select_type: UNION <snip> table: t2p +--------------------------------------+ type: ref 184 rows in set (0.00 sec) possible_keys: PRIMARY,rv_primary key: rv_primary key_len: 4 ref: const rows: 31 Extra: Using index 09/18/08 zendcon08 - join-fu 45
  • 46. LEFT join-fu: the trickiest part... Get the tag phrases mysql> SELECT -> t.tag_text ● -> FROM Tag t -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag -> WHERE t2p.tag IS NULL -> AND t2p.project= 75 which are not related to project -> GROUP BY t.tag_text; Empty set (0.00 sec) mysql> EXPLAIN SELECT #75 -> t.tag_text -> FROM Tag t -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag -> WHERE t2p.tag IS NULL -> AND t2p.project= 75 -> GROUP BY t.tag_textG *************************** 1. row *************************** id: 1 ● Shown to the left is select_type: SIMPLE table: NULL type: NULL the most common mistake made with possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: Impossible WHERE noticed after reading const tables 1 row in set (0.00 sec) LEFT JOINs ● The problem is where the filter on project_id is done... 09/18/08 zendcon08 - join-fu 46
  • 47. LEFT join-fu: the trickiest part...fixed Filters on the LEFT mysql> EXPLAIN SELECT -> t.tag_text ● -> FROM Tag t -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag -> AND t2p.project= 75 -> WHERE t2p.tag IS NULL joined set must be placed in the ON -> GROUP BY t.tag_textG *************************** 1. row *************************** id: 1 select_type: SIMPLE clause table: t type: index possible_keys: NULL key: uix_tag_text key_len: 52 rows: 1126 Extra: Using index *************************** 2. row *************************** id: 1 ● Filter is applied select_type: SIMPLE table: t2p type: eq_ref before the LEFT JOIN and NOT EXISTs is possible_keys: PRIMARY,rv_primary key: rv_primary key_len: 8 ref: const,mysqlforge.t.tag_id rows: 1 Extra: Using where; Using index; Not exists 2 rows in set (0.00 sec) evaluated, resulting mysql> SELECT -> t.tag_text -> FROM Tag t in fewer rows in the -> LEFT JOIN Tag2Project t2p -> ON t.tag_id = t2p.tag -> AND t2p.project= 75 -> WHERE t2p.tag IS NULL evaluation, and -> GROUP BY t.tag_text; +-----------------+ | tag_text | better performance +-----------------+ <snip> +----------------------------------------+ 674 rows in set (0.01 sec) 09/18/08 zendcon08 - join-fu 47
  • 48. resources and thank you! ● PlanetMySQL – 300+ writers on MySQL topics – http://planetmysql.com ● MySQL Forge – Code snippets, project listings, wiki, worklog Baron Schwartz http://xaprb.com – http://forge.mysql.org MySQL performance guru and co- author of High Performance MySQL, 2nd Edition (O'Reilly, 2008) “xarpb” is Baron spelled on a Dvorak keyboard...