
Recursive Query Throwdown

MySQL 8 introduces support for ANSI SQL recursive queries with common table expressions, a powerful method for working with recursive data references. Until now, MySQL application developers have had to use workarounds for hierarchical data relationships. It's time to write SQL queries in a more standardized way, and be compatible with other brands of SQL implementations. But as always, the bottom line is: how does it perform? This presentation will briefly describe how to use recursive queries, and then test the performance and scalability of those queries against other solutions for hierarchical queries.

Published in: Software

Recursive Query Throwdown

  1. 1. Recursive Query Throwdown in MySQL 8 BILL KARWIN PERCONA LIVE OPEN SOURCE DATABASE CONFERENCE 2017
  2. 2. Bill Karwin Software developer, consultant, trainer Using MySQL since 2000 Senior Database Architect at SchoolMessenger Author of SQL Antipatterns: Avoiding the Pitfalls of Database Programming Oracle ACE Director
  3. 3. How to Query a Tree?
      Hierarchical data:
      - Organization charts
      - Categories and sub-categories
      - Parts explosion
      - Threaded discussions
      Image: https://commons.wikimedia.org/wiki/File:Staff_Organisation_Diagram,_1896.jpg
  4. 4. Example: Threaded Comments
  5. 5. Adjacency List Example Data

          comment_id  parent_id  author  comment
          1           NULL       Fran    What’s the cause of this bug?
          2           1          Ollie   I think it’s a null pointer.
          3           2          Fran    No, I checked for that.
          4           1          Kukla   We need to check valid input.
          5           4          Ollie   Yes, that’s a bug.
          6           4          Fran    Yes, please add a check
          7           6          Kukla   That fixed it.
  6. 6. Can’t Easily Query Deep Trees

          SELECT * FROM Comments c1
          LEFT JOIN Comments c2 ON (c2.parent_id = c1.comment_id)
          LEFT JOIN Comments c3 ON (c3.parent_id = c2.comment_id)
          LEFT JOIN Comments c4 ON (c4.parent_id = c3.comment_id)
          LEFT JOIN Comments c5 ON (c5.parent_id = c4.comment_id)
          LEFT JOIN Comments c6 ON (c6.parent_id = c5.comment_id)
          LEFT JOIN Comments c7 ON (c7.parent_id = c6.comment_id)
          LEFT JOIN Comments c8 ON (c8.parent_id = c7.comment_id)
          LEFT JOIN Comments c9 ON (c9.parent_id = c8.comment_id)
          LEFT JOIN Comments c10 ON (c10.parent_id = c9.comment_id)
          ...
  7. 7. MySQL Workarounds
  8. 8. MySQL Workarounds
      MySQL lacked support for recursive queries, so workarounds were needed.
      These are all denormalized designs, and most don’t have referential integrity:
      - Path enumeration
      - Nested sets
      - Closure table
  9. 9. Path Enumeration Example Data

          comment_id  path      author  comment
          1           1/        Fran    What’s the cause of this bug?
          2           1/2/      Ollie   I think it’s a null pointer.
          3           1/2/3/    Fran    No, I checked for that.
          4           1/4/      Kukla   We need to check valid input.
          5           1/4/5/    Ollie   Yes, that’s a bug.
          6           1/4/6/    Fran    Yes, please add a check
          7           1/4/6/7/  Kukla   That fixed it.
  10. 10. Path Enumeration Example Queries
      Query ancestors of comment #7:

          SELECT * FROM Comments
          WHERE '1/4/6/7/' LIKE CONCAT(path, '%');

      Query descendants of comment #4:

          SELECT * FROM Comments
          WHERE path LIKE '1/4/%';
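The two path enumeration queries above can be tried without a MySQL server. The following is a minimal sketch that assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for MySQL; SQLite spells `CONCAT(path, '%')` as `path || '%'`, and the helper names `ancestors` and `descendants` are hypothetical, not from the slides.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, path TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [(1, "1/", "Fran"), (2, "1/2/", "Ollie"), (3, "1/2/3/", "Fran"),
     (4, "1/4/", "Kukla"), (5, "1/4/5/", "Ollie"), (6, "1/4/6/", "Fran"),
     (7, "1/4/6/7/", "Kukla")],
)

def ancestors(path):
    # A row qualifies when its stored path is a prefix of the given path
    # (this includes the node itself, as in the slide's query).
    rows = conn.execute(
        "SELECT comment_id FROM Comments WHERE ? LIKE path || '%' ORDER BY path",
        (path,),
    )
    return [r[0] for r in rows]

def descendants(path):
    # A row qualifies when the given path is a prefix of its stored path.
    rows = conn.execute(
        "SELECT comment_id FROM Comments WHERE path LIKE ? || '%' ORDER BY path",
        (path,),
    )
    return [r[0] for r in rows]

path_ancestors = ancestors("1/4/6/7/")
path_descendants = descendants("1/4/")
print(path_ancestors, path_descendants)
```

The trailing slash in each path matters: it keeps `1/4/` from matching a hypothetical `1/40/` by prefix.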
  11. 11. Path Enumeration Pros and Cons
      Pros:
      - Single non-recursive query to get a tree or a subtree
      Cons:
      - Complex updates to add or remove a node
      - Numbers are stored in a string, so there is no referential integrity
  12. 12. Nested Sets
      Each comment encodes its descendants using two numbers:
      - A comment’s left number is less than all numbers used by the comment’s descendants.
      - A comment’s right number is greater than all numbers used by the comment’s descendants.
      - A comment’s numbers are between all numbers used by the comment’s ancestors.
      References:
      - “Recursive Hierarchies: The Relational Taboo!” Michael J. Kamfonas, Relational Journal, Oct/Nov 1992
      - “Trees and Hierarchies in SQL For Smarties,” Joe Celko, 2004
      - “Managing Hierarchical Data in MySQL,” Mike Hillyer, 2005
  13. 13. Nested Sets Example
  14. 14. Nested Sets Example Data

          comment_id  nsleft  nsright  author  comment
          1           1       14       Fran    What’s the cause of this bug?
          2           2       5        Ollie   I think it’s a null pointer.
          3           3       4        Fran    No, I checked for that.
          4           6       13       Kukla   We need to check valid input.
          5           7       8        Ollie   Yes, that’s a bug.
          6           9       12       Fran    Yes, please add a check
          7           10      11       Kukla   That fixed it.
  15. 15. Nested Sets Example Queries
      Query ancestors of comment #7:

          SELECT ancestor.*
          FROM Comments child
          JOIN Comments ancestor
            ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
          WHERE child.comment_id = 7;

      Query subtree under comment #4:

          SELECT descendant.*
          FROM Comments parent
          JOIN Comments descendant
            ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
          WHERE parent.comment_id = 4;
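As a quick check of the nested sets queries above, here is a minimal sketch that loads the example `nsleft`/`nsright` numbers into SQLite (an assumption made for portability; the slides target MySQL) and runs both joins. Only the columns needed by the queries are created.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the BETWEEN joins are unchanged.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, nsleft INT, nsright INT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [(1, 1, 14), (2, 2, 5), (3, 3, 4), (4, 6, 13),
     (5, 7, 8), (6, 9, 12), (7, 10, 11)],
)

# Ancestors: every row whose [nsleft, nsright] interval contains the child's nsleft.
ns_ancestors = [r[0] for r in conn.execute(
    """
    SELECT ancestor.comment_id
    FROM Comments child
    JOIN Comments ancestor
      ON child.nsleft BETWEEN ancestor.nsleft AND ancestor.nsright
    WHERE child.comment_id = 7
    ORDER BY ancestor.nsleft
    """
)]

# Subtree: every row whose nsleft falls inside the parent's interval.
ns_subtree = [r[0] for r in conn.execute(
    """
    SELECT descendant.comment_id
    FROM Comments parent
    JOIN Comments descendant
      ON descendant.nsleft BETWEEN parent.nsleft AND parent.nsright
    WHERE parent.comment_id = 4
    ORDER BY descendant.nsleft
    """
)]
print(ns_ancestors, ns_subtree)
```

Ordering by `nsleft` returns the rows in depth-first order, which is one reason nested sets are convenient for rendering a whole thread.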
  16. 16. Nested Sets Pros and Cons
      Pros:
      - Single non-recursive query to get a tree or a subtree
      Cons:
      - Complex updates to add or remove a node
      - Numbers are not foreign keys, so there is no referential integrity
  17. 17. Closure Table
      - Many-to-many table
      - Stores every path from each node to each of its descendants
      - A node even connects to itself

          CREATE TABLE Closure (
            ancestor INT NOT NULL,
            descendant INT NOT NULL,
            length INT NOT NULL,
            PRIMARY KEY (ancestor, descendant),
            FOREIGN KEY(ancestor) REFERENCES Comments(comment_id),
            FOREIGN KEY(descendant) REFERENCES Comments(comment_id)
          );
  18. 18. Closure Table Example
  19. 19. Closure Table Example Data

          comment_id  author  comment
          1           Fran    What’s the cause of this bug?
          2           Ollie   I think it’s a null pointer.
          3           Fran    No, I checked for that.
          4           Kukla   We need to check valid input.
          5           Ollie   Yes, that’s a bug.
          6           Fran    Yes, please add a check
          7           Kukla   That fixed it.

          ancestor  descendant  length
          1         1           0
          1         2           1
          1         3           2
          1         4           1
          1         5           2
          1         6           2
          1         7           3
          2         2           0
          2         3           1
          3         3           0
          4         4           0
          4         5           1
          4         6           1
          4         7           2
          5         5           0
          6         6           0
          6         7           1
          7         7           0
  20. 20. Closure Table Example Queries
      Query ancestors of comment #7:

          SELECT c.*
          FROM Comments c
          JOIN Closure t ON (c.comment_id = t.ancestor)
          WHERE t.descendant = 7;

      Query subtree under comment #4:

          SELECT c.*
          FROM Comments c
          JOIN Closure t ON (c.comment_id = t.descendant)
          WHERE t.ancestor = 4;
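The closure table lookups above reduce to simple filters on the `Closure` table. This minimal sketch loads the 18 example rows into SQLite (assumed here as a stand-in for MySQL) and queries the table directly; the join back to `Comments` is omitted for brevity, and the direct-children query using `length = 1` is an illustrative addition rather than something shown on the slides.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Closure (ancestor INT NOT NULL, descendant INT NOT NULL, "
    "length INT NOT NULL, PRIMARY KEY (ancestor, descendant))"
)
conn.executemany(
    "INSERT INTO Closure VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 2, 1), (1, 3, 2), (1, 4, 1), (1, 5, 2), (1, 6, 2), (1, 7, 3),
     (2, 2, 0), (2, 3, 1), (3, 3, 0),
     (4, 4, 0), (4, 5, 1), (4, 6, 1), (4, 7, 2),
     (5, 5, 0), (6, 6, 0), (6, 7, 1), (7, 7, 0)],
)

# Ancestors of comment #7, root first (the longest path belongs to the root).
closure_ancestors = [r[0] for r in conn.execute(
    "SELECT ancestor FROM Closure WHERE descendant = 7 ORDER BY length DESC"
)]

# Subtree under comment #4, nearest nodes first.
closure_subtree = [r[0] for r in conn.execute(
    "SELECT descendant FROM Closure WHERE ancestor = 4 ORDER BY length, descendant"
)]

# Direct children only: paths of length exactly 1.
closure_children = [r[0] for r in conn.execute(
    "SELECT descendant FROM Closure WHERE ancestor = 4 AND length = 1 ORDER BY descendant"
)]
print(closure_ancestors, closure_subtree, closure_children)
```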
  21. 21. Closure Table Pros and Cons
      Pros:
      - Single non-recursive query to get a tree or a subtree
      - Referential integrity!
      Cons:
      - Extra table is required
      - Hierarchy is stored redundantly, too easy to mess up
      - Lots of joins to do most kinds of queries
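To illustrate why the redundant hierarchy is easy to get wrong, the sketch below shows one common way to keep a closure table consistent when adding a leaf: insert the self-referencing row, then copy every path that ends at the parent and extend it by one. The helper `add_comment` is hypothetical (the slides do not show an insert routine), and SQLite is assumed as a stand-in for MySQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Closure (ancestor INT NOT NULL, descendant INT NOT NULL, "
    "length INT NOT NULL, PRIMARY KEY (ancestor, descendant))"
)

def add_comment(comment_id, parent_id=None):
    """Hypothetical helper: register a new leaf node in the closure table."""
    # Every node connects to itself with a path of length 0.
    conn.execute("INSERT INTO Closure VALUES (?, ?, 0)", (comment_id, comment_id))
    if parent_id is not None:
        # Each ancestor of the parent (including the parent itself)
        # becomes an ancestor of the new node, one step further away.
        conn.execute(
            "INSERT INTO Closure (ancestor, descendant, length) "
            "SELECT ancestor, ?, length + 1 FROM Closure WHERE descendant = ?",
            (comment_id, parent_id),
        )

# Rebuild the example thread from its adjacency list (comment_id, parent_id).
for cid, parent in [(1, None), (2, 1), (3, 2), (4, 1), (5, 4), (6, 4), (7, 6)]:
    add_comment(cid, parent)

closure_row_count = conn.execute("SELECT COUNT(*) FROM Closure").fetchone()[0]
paths_to_7 = conn.execute(
    "SELECT ancestor, length FROM Closure WHERE descendant = 7 ORDER BY length"
).fetchall()
print(closure_row_count, paths_to_7)
```

Forgetting either of the two inserts leaves the table silently inconsistent, which is the "too easy to mess up" risk noted in the cons.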
  22. 22. ANSI SQL Recursive CTE
  23. 23. WITHer Recursive Queries in MySQL?
      SQL vendors gradually implemented SQL-99 WITH syntax:
      - IBM DB2 UDB 8 (Dec. 2002)
      - Microsoft SQL Server 2005 (Oct. 2005)
      - Sybase SQL Anywhere 11 (Aug. 2008)
      - Firebird 2.1 (Sep. 2008)
      - PostgreSQL 8.4 (Jul. 2009)
      - Oracle 11g release 2 (Sep. 2009)
      - Teradata (date and version of support unknown, at least 2009)
      - HSQLDB 2.3 (Jul. 2013)
      - SQLite 3.8.3.1 (Feb. 2014)
      - H2 (date and version unknown)
      https://www.percona.com/blog/2014/02/11/wither-recursive-queries/
  24. 24. ANSI SQL Recursive Common Table Expression

          WITH RECURSIVE cte_name (col_name, col_name, col_name) AS
          (
            subquery base case
            UNION ALL
            subquery referencing cte_name
          )
          SELECT ... FROM cte_name ...

      https://dev.mysql.com/doc/refman/8.0/en/with.html
  25. 25. Generating a Series of Numbers

          WITH RECURSIVE MySeries (n) AS
          (
            SELECT 1 AS n
            UNION ALL
            SELECT 1+n FROM MySeries WHERE n < 10
          )
          SELECT * FROM MySeries;

          +------+
          | n    |
          +------+
          |    1 |
          |    2 |
          |    3 |
          |    4 |
          |    5 |
          |    6 |
          |    7 |
          |    8 |
          |    9 |
          |   10 |
          +------+
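The number series query is standard enough to run unchanged on other engines. As a minimal sketch, assuming SQLite via Python's `sqlite3` module rather than MySQL 8, the hypothetical helper `series` below parameterizes the upper bound:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8; the CTE syntax is the same.
conn = sqlite3.connect(":memory:")

def series(limit):
    # Base case yields 1; the recursive case adds 1 until n reaches the limit.
    sql = """
        WITH RECURSIVE MySeries (n) AS (
            SELECT 1 AS n
            UNION ALL
            SELECT 1 + n FROM MySeries WHERE n < ?
        )
        SELECT n FROM MySeries ORDER BY n
    """
    return [row[0] for row in conn.execute(sql, (limit,))]

numbers = series(10)
print(numbers)
```

The `WHERE n < ?` predicate is what terminates the recursion; without it the query would keep generating rows until the engine's own limit stops it.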
  26. 26. Generating a Series of Dates

          WITH RECURSIVE MyDates (d) AS
          (
            SELECT CURRENT_DATE() AS d
            UNION ALL
            SELECT d + INTERVAL 1 DAY FROM MyDates
            WHERE d < CURRENT_DATE() + INTERVAL 7 DAY
          )
          SELECT * FROM MyDates;

          +------------+
          | d          |
          +------------+
          | 2017-04-24 |
          | 2017-04-25 |
          | 2017-04-26 |
          | 2017-04-27 |
          | 2017-04-28 |
          | 2017-04-29 |
          | 2017-04-30 |
          | 2017-05-01 |
          +------------+
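The date series can be reproduced deterministically by fixing the start date instead of calling `CURRENT_DATE()`. This minimal sketch assumes SQLite, where `d + INTERVAL 1 DAY` is written `date(d, '+1 day')`; the helper `date_series` and its parameters are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8.
conn = sqlite3.connect(":memory:")

def date_series(start, days):
    # SQLite's date() function applies modifiers such as '+7 day' to an ISO date.
    sql = """
        WITH RECURSIVE MyDates (d) AS (
            SELECT date(:start) AS d
            UNION ALL
            SELECT date(d, '+1 day') FROM MyDates
            WHERE d < date(:start, :span)
        )
        SELECT d FROM MyDates ORDER BY d
    """
    params = {"start": start, "span": f"+{days} day"}
    return [row[0] for row in conn.execute(sql, params)]

week = date_series("2017-04-24", 7)
print(week)
```

As on the slide, the series contains eight dates: the start date plus the seven days that follow it.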
  27. 27. Query ancestors of comment #7

          WITH RECURSIVE CommentTree
            (comment_id, parent_id, author, comment, depth)
          AS (
            SELECT comment_id, parent_id, author, comment, 0 AS depth
            FROM Comments
            WHERE comment_id = 7
            UNION ALL
            SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
            FROM CommentTree ct
            JOIN Comments c ON (ct.parent_id = c.comment_id)
          )
          SELECT * FROM CommentTree;
  28. 28. Query subtree under comment #4

          WITH RECURSIVE CommentTree
            (comment_id, parent_id, author, comment, depth)
          AS (
            SELECT comment_id, parent_id, author, comment, 0 AS depth
            FROM Comments
            WHERE comment_id = 4
            UNION ALL
            SELECT c.comment_id, c.parent_id, c.author, c.comment, ct.depth+1
            FROM CommentTree ct
            JOIN Comments c ON (ct.comment_id = c.parent_id)
          )
          SELECT * FROM CommentTree;
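Both recursive queries differ only in the direction of the join, which the following minimal sketch makes explicit. It assumes SQLite (via Python's `sqlite3`) as a stand-in for MySQL 8, trims the column list to what the test needs, and adds an `ORDER BY` so the output order is deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Comments (comment_id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES Comments(comment_id), author TEXT)"
)
conn.executemany(
    "INSERT INTO Comments VALUES (?, ?, ?)",
    [(1, None, "Fran"), (2, 1, "Ollie"), (3, 2, "Fran"), (4, 1, "Kukla"),
     (5, 4, "Ollie"), (6, 4, "Fran"), (7, 6, "Kukla")],
)

# Walk upward: join each found row's parent_id to the parent's comment_id.
ANCESTORS_SQL = """
    WITH RECURSIVE CommentTree (comment_id, parent_id, depth) AS (
        SELECT comment_id, parent_id, 0 FROM Comments WHERE comment_id = ?
        UNION ALL
        SELECT c.comment_id, c.parent_id, ct.depth + 1
        FROM CommentTree ct
        JOIN Comments c ON ct.parent_id = c.comment_id
    )
    SELECT comment_id, depth FROM CommentTree ORDER BY depth, comment_id
"""

# Walk downward: join each found row's comment_id to its children's parent_id.
SUBTREE_SQL = """
    WITH RECURSIVE CommentTree (comment_id, parent_id, depth) AS (
        SELECT comment_id, parent_id, 0 FROM Comments WHERE comment_id = ?
        UNION ALL
        SELECT c.comment_id, c.parent_id, ct.depth + 1
        FROM CommentTree ct
        JOIN Comments c ON ct.comment_id = c.parent_id
    )
    SELECT comment_id, depth FROM CommentTree ORDER BY depth, comment_id
"""

cte_ancestors = conn.execute(ANCESTORS_SQL, (7,)).fetchall()
cte_subtree = conn.execute(SUBTREE_SQL, (4,)).fetchall()
print(cte_ancestors, cte_subtree)
```

The upward walk stops on its own at the root because the root's `parent_id` is NULL and matches no row.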
  29. 29. Recursive CTE Pros and Cons
      Pros:
      - ANSI SQL-99 Standard
      - Compatible with other SQL implementations
      - Works with Adjacency List (single source of authority)
      - Referential integrity!
      Cons:
      - Not compatible with earlier MySQL versions
      - Use of materialized temporary tables may cause performance problems
  30. 30. MySQL CTE Implementation: 💯 Thanks to @MarkusWinand for his preview analysis based on 8.0.1-dmr http://modern-sql.com/feature/with
  31. 31. Big Hierarchies
  32. 32. ITIS: Sample Hierarchical Data
      Integrated Taxonomic Information System (https://www.itis.gov/)
      - Biological database of species of animals, plants, fungi
      - One big tree of 544,954 nodes
      - Data comes in adjacency list & path enumeration format
      - I converted to closure table for query tests
  33. 33. ITIS Data Model

          mysql> select * from longnames where completename = 'Eschscholzia californica';
          +-------+--------------------------+
          | tsn   | completename             |
          +-------+--------------------------+
          | 18956 | Eschscholzia californica |
          +-------+--------------------------+

          mysql> select * from hierarchy where TSN = '18956'\G
                       TSN: 18956
                Parent_TSN: 18954
                     level: 11
             ChildrenCount: 8
          hierarchy_string: 202422-954898-846494-954900-846496-846504-18063-846547-18409-18880-18954-18956
  34. 34. Indexes

          mysql> ALTER TABLE hierarchy ADD KEY (tsn, parent_tsn);
          Query OK, 0 rows affected (1.30 sec)
  35. 35. Breadcrumbs Query

          WITH RECURSIVE taxonomy AS (
            SELECT base.tsn, base.parent_tsn, 0 as depth
            FROM hierarchy base
            WHERE tsn = '18956'
            UNION ALL
            SELECT next.tsn, next.parent_tsn, t.depth+1
            FROM hierarchy next
            JOIN taxonomy t
            WHERE t.parent_tsn = next.tsn
          )
          SELECT * FROM taxonomy
          JOIN longnames USING (tsn)
          ORDER BY depth DESC;
  36. 36. Breadcrumbs Query Result

          +--------+------------+-------+--------------------------+
          | tsn    | parent_tsn | depth | completename             |
          +--------+------------+-------+--------------------------+
          | 202422 |          0 |    11 | Plantae                  |
          | 954898 |     202422 |    10 | Viridiplantae            |
          | 846494 |     954898 |     9 | Streptophyta             |
          | 954900 |     846494 |     8 | Embryophyta              |
          | 846496 |     954900 |     7 | Tracheophyta             |
          | 846504 |     846496 |     6 | Spermatophytina          |
          |  18063 |     846504 |     5 | Magnoliopsida            |
          | 846547 |      18063 |     4 | Ranunculanae             |
          |  18409 |     846547 |     3 | Ranunculales             |
          |  18880 |      18409 |     2 | Papaveraceae             |
          |  18954 |      18880 |     1 | Eschscholzia             |
          |  18956 |      18954 |     0 | Eschscholzia californica |
          +--------+------------+-------+--------------------------+
          12 rows in set (0.00 sec)
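The breadcrumbs query can be replayed on just the twelve rows shown in the result above. The sketch below is an illustration only: it assumes SQLite instead of MySQL 8, loads only those twelve rows (not the full ITIS data set), renames the `next` alias to `parent`, and joins the names into a display string with a separator chosen for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 8, loaded with only
# the twelve taxa that appear in the slide's result set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hierarchy (tsn INTEGER PRIMARY KEY, parent_tsn INTEGER)")
conn.execute("CREATE TABLE longnames (tsn INTEGER PRIMARY KEY, completename TEXT)")
taxa = [
    (202422, 0, "Plantae"), (954898, 202422, "Viridiplantae"),
    (846494, 954898, "Streptophyta"), (954900, 846494, "Embryophyta"),
    (846496, 954900, "Tracheophyta"), (846504, 846496, "Spermatophytina"),
    (18063, 846504, "Magnoliopsida"), (846547, 18063, "Ranunculanae"),
    (18409, 846547, "Ranunculales"), (18880, 18409, "Papaveraceae"),
    (18954, 18880, "Eschscholzia"), (18956, 18954, "Eschscholzia californica"),
]
conn.executemany("INSERT INTO hierarchy VALUES (?, ?)", [(t, p) for t, p, _ in taxa])
conn.executemany("INSERT INTO longnames VALUES (?, ?)", [(t, n) for t, _, n in taxa])

# Walk from the species up to the root, then list the root first.
names = [row[0] for row in conn.execute(
    """
    WITH RECURSIVE taxonomy (tsn, parent_tsn, depth) AS (
        SELECT tsn, parent_tsn, 0 FROM hierarchy WHERE tsn = ?
        UNION ALL
        SELECT parent.tsn, parent.parent_tsn, t.depth + 1
        FROM hierarchy parent
        JOIN taxonomy t ON t.parent_tsn = parent.tsn
    )
    SELECT completename FROM taxonomy
    JOIN longnames USING (tsn)
    ORDER BY depth DESC
    """,
    (18956,),
)]
breadcrumb = " > ".join(names)
print(breadcrumb)
```

The recursion ends at Plantae because its `parent_tsn` of 0 matches no row in `hierarchy`.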
  37. 37. Breadcrumbs Query EXPLAIN Plan
      - New note in Extra: "Recursive"
      - Using index (covering index) for both base case and recursive case
      - I can eliminate the filesort if I allow natural order (base case first)
      - No "Using Temporary"? Not so fast…

          +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
          | id | select_type | table      | type   | possible_keys | key     | key_len | ref          | rows | filtered | Extra                       |
          +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
          |  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL         |    4 |   100.00 | Using where; Using filesort |
          |  1 | PRIMARY     | longnames  | eq_ref | PRIMARY,tsn   | PRIMARY | 4       | taxonomy.tsn |    1 |   100.00 | NULL                        |
          |  2 | DERIVED     | base       | ref    | TSN           | TSN     | 4       | const        |    1 |   100.00 | Using index                 |
          |  3 | UNION       | t          | ALL    | NULL          | NULL    | NULL    | NULL         |    2 |   100.00 | Recursive; Using where      |
          |  3 | UNION       | next       | ref    | TSN           | TSN     | 4       | t.parent_tsn |    1 |   100.00 | Using index                 |
          +----+-------------+------------+--------+---------------+---------+---------+--------------+------+----------+-----------------------------+
  38. 38. Breadcrumbs Query Performance

          mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
                             query: WITH RECURSIVE `taxonomy` AS ( ... `tsn` ) ORDER BY `depth` DESC
                                db: itis
                        exec_count: 1
                     total_latency: 10.05 ms
                 memory_tmp_tables: 1
                   disk_tmp_tables: 0
          avg_tmp_tables_per_query: 1
            tmp_tables_to_disk_pct: 0
                        first_seen: 2017-04-24 22:07:56
                         last_seen: 2017-04-24 22:07:56
                            digest: 8438633360bedce178823bb868589fd0
  39. 39. Breadcrumbs Query Stages

          mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
          +------+--------------------------------+-------+---------------+-------------+
          | user | event_name                     | total | total_latency | avg_latency |
          +------+--------------------------------+-------+---------------+-------------+
          | root | stage/sql/System lock          |    40 | 6.62 ms       | 165.60 us   |
          | root | stage/sql/Opening tables       |   191 | 3.16 ms       | 16.52 us    |
          | root | stage/sql/checking permissions |    45 | 1.50 ms       | 33.44 us    |
          | root | stage/sql/Creating sort index  |     1 | 239.63 us     | 239.63 us   |
          | root | stage/sql/closing tables       |   191 | 191.03 us     | 1.00 us     |
          | root | stage/sql/starting             |     2 | 188.44 us     | 94.22 us    |
          | root | stage/sql/Sending data         |     6 | 138.96 us     | 23.16 us    |
          | root | stage/sql/statistics           |     4 | 122.42 us     | 30.60 us    |
          | root | stage/sql/query end            |   191 | 56.67 us      | 296.00 ns   |
          | root | stage/sql/preparing            |     4 | 33.57 us      | 8.39 us     |
          | root | stage/sql/freeing items        |     2 | 27.93 us      | 13.96 us    |
          | root | stage/sql/optimizing           |     5 | 20.03 us      | 4.01 us     |
          | root | stage/sql/executing            |     7 | 15.39 us      | 2.20 us     |
          | root | stage/sql/removing tmp table   |     4 | 9.35 us       | 2.34 us     |
          | root | stage/sql/init                 |     3 | 8.76 us       | 2.92 us     |
          | root | stage/sql/Sorting result       |     2 | 4.16 us       | 2.08 us     |
          | root | stage/sql/end                  |     3 | 1.93 us       | 644.00 ns   |
          | root | stage/sql/cleaning up          |     2 | 1.43 us       | 715.00 ns   |
          +------+--------------------------------+-------+---------------+-------------+
  40. 40. Tree Expansion Query Result See Demo
  41. 41. Tree Expansion Query

          WITH RECURSIVE ancestors (tsn, parent_tsn) AS (
            SELECT h.tsn, h.parent_tsn
            FROM hierarchy AS h
            WHERE h.tsn = %s
            UNION ALL
            SELECT h.tsn, h.parent_tsn
            FROM hierarchy AS h
            JOIN ancestors AS base ON h.tsn = base.parent_tsn
          ),
          breadcrumbs (tsn, parent_tsn, depth, breadcrumbs) AS (
            SELECT h.tsn, h.parent_tsn, 0 AS depth,
                   CAST(LPAD(h.tsn, 8, '0') AS CHAR(255)) AS breadcrumbs
            FROM hierarchy AS h
            WHERE h.parent_tsn = 0
            UNION ALL
            SELECT h.tsn, h.parent_tsn, base.depth+1 AS depth,
                   CONCAT(base.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
            FROM hierarchy AS h
            JOIN ancestors AS a ON h.tsn = a.tsn
            JOIN breadcrumbs AS base ON h.parent_tsn = base.tsn
          )
          SELECT l.tsn, l.completename, b.depth, b.breadcrumbs
          FROM breadcrumbs AS b
          JOIN longnames AS l ON b.tsn = l.tsn
          UNION
          SELECT l.tsn, l.completename, b.depth+1,
                 CONCAT(b.breadcrumbs, ',', LPAD(h.tsn, 8, '0'))
          FROM breadcrumbs AS b
          JOIN hierarchy AS h ON b.tsn = h.parent_tsn
          JOIN longnames AS l ON l.tsn = h.tsn
          ORDER BY breadcrumbs
  42. 42. Tree Expansion Query EXPLAIN

          +--------------+------------+--------+-------------+---------+-------------------+--------+----------+---------------------------------+
          | select_type  | table      | type   | key         | key_len | ref               | rows   | filtered | Extra                           |
          +--------------+------------+--------+-------------+---------+-------------------+--------+----------+---------------------------------+
          | PRIMARY      | <derived2> | ALL    | NULL        | NULL    | NULL              | 250230 |   100.00 | Using where                     |
          | PRIMARY      | l          | eq_ref | PRIMARY     | 4       | b.tsn             |      1 |   100.00 | NULL                            |
          | DERIVED      | h          | index  | TSN         | 9       | NULL              | 500466 |    10.00 | Using where; Using index        |
          | UNION        | base       | ALL    | NULL        | NULL    | NULL              |  50046 |   100.00 | Recursive; Using where          |
          | UNION        | <derived4> | ALL    | NULL        | NULL    | NULL              |      4 |   100.00 | Using where; Using join buffer  |
          | UNION        | h          | ref    | TSN         | 9       | a.tsn,base.tsn    |      1 |   100.00 | Using index                     |
          | DERIVED      | h          | ref    | TSN         | 4       | const             |      1 |   100.00 | Using index                     |
          | UNION        | base       | ALL    | NULL        | NULL    | NULL              |      2 |   100.00 | Recursive; Using where          |
          | UNION        | h          | ref    | TSN         | 4       | base.parent_tsn   |      1 |   100.00 | Using index                     |
          | UNION        | h          | index  | TSN         | 9       | NULL              | 500466 |   100.00 | Using where; Using index        |
          | UNION        | l          | eq_ref | PRIMARY     | 4       | itis.h.TSN        |      1 |   100.00 | NULL                            |
          | UNION        | <derived2> | ref    | <auto_key0> | 5       | itis.h.Parent_TSN |     10 |   100.00 | NULL                            |
          | UNION RESULT | <union1,8> | ALL    | NULL        | NULL    | NULL              |   NULL |     NULL | Using temporary; Using filesort |
          +--------------+------------+--------+-------------+---------+-------------------+--------+----------+---------------------------------+

      Maybe I need more indexes? Unfortunately I ran out of time to analyze.
  43. 43. Tree Expansion Query Performance

          mysql> SELECT * FROM SYS.STATEMENTS_WITH_TEMP_TABLES\G
                             query: WITH RECURSIVE `ancestors` ( ` ... `l` . `completename` , `b` .
                                db: itis
                        exec_count: 1
                     total_latency: 1.24 s
                 memory_tmp_tables: 3
                   disk_tmp_tables: 0
          avg_tmp_tables_per_query: 3
            tmp_tables_to_disk_pct: 0
                        first_seen: 2017-04-27 01:33:14
                         last_seen: 2017-04-27 01:33:14
                            digest: 86c1417d2ff3679863db754eff425e94
  44. 44. Tree Expansion Query Stages

          mysql> SELECT * FROM SYS.USER_SUMMARY_BY_STAGES;
          +------+--------------------------------+-------+---------------+-------------+
          | user | event_name                     | total | total_latency | avg_latency |
          +------+--------------------------------+-------+---------------+-------------+
          | root | stage/sql/Sending data         |    12 | 979.42 ms     | 81.62 ms    |
          | root | stage/sql/System lock          |    40 | 6.34 ms       | 158.52 us   |
          | root | stage/sql/Opening tables       |   191 | 3.34 ms       | 17.51 us    |
          | root | stage/sql/checking permissions |    53 | 1.35 ms       | 25.45 us    |
          | root | stage/sql/starting             |     2 | 356.31 us     | 178.16 us   |
          | root | stage/sql/statistics           |    12 | 271.01 us     | 22.58 us    |
          | root | stage/sql/closing tables       |   191 | 179.15 us     | 937.00 ns   |
          | root | stage/sql/preparing            |    12 | 98.18 us      | 8.18 us     |
          | root | stage/sql/query end            |   191 | 57.60 us      | 301.00 ns   |
          | root | stage/sql/freeing items        |     2 | 47.93 us      | 23.96 us    |
          | root | stage/sql/Creating sort index  |     1 | 37.38 us      | 37.38 us    |
          | root | stage/sql/optimizing           |    13 | 30.60 us      | 2.35 us     |
          | root | stage/sql/executing            |    13 | 30.27 us      | 2.33 us     |
          | root | stage/sql/removing tmp table   |    14 | 24.44 us      | 1.74 us     |
          | root | stage/sql/init                 |     3 | 14.78 us      | 4.93 us     |
          | root | stage/sql/cleaning up          |     2 | 11.66 us      | 5.83 us     |
          | root | stage/sql/Sorting result       |     2 | 3.67 us       | 1.84 us     |
          | root | stage/sql/end                  |     3 | 3.04 us       | 1.01 us     |
          +------+--------------------------------+-------+---------------+-------------+
  45. 45. Conclusions
  46. 46. Conclusions
      - Overall, MySQL 8 support for recursive CTE queries is worth the wait.
      - Exotic cases exist that are beyond any optimizer.
      - I'm excited to upgrade to MySQL 8.0.x ASAP!
      - Now that virtually all major SQL brands support recursive CTEs, we need developer tools and popular apps to use them!
  47. 47. License and Copyright Copyright 2017 Bill Karwin http://www.slideshare.net/billkarwin Released under a Creative Commons 3.0 License: http://creativecommons.org/licenses/by-nc-nd/3.0/ You are free to share—to copy, distribute, and transmit this work, under the following conditions: Attribution. You must attribute this work to Bill Karwin. Noncommercial. You may not use this work for commercial purposes. No Derivative Works. You may not alter, transform, or build upon this work.
