MySQL 8 is full of new features, among them CTEs (Common Table Expressions) and window functions. Common in data warehouses, these features can help you analyze what is going on in your business with a single query, no ETL needed! I am not saying you will never again have to export data to an OLAP database, but sometimes being able to run these analyses on the main database can save the day!
Query analytics for the day-to-day developer with MySQL 8.0
1. Query analytics for the day-to-day developer with MySQL 8.0
Gabriela D'Ávila Ferrara
@gabidavila
Developer Advocate @ Google Cloud
^(gabi|gabby).dev$
2. I try to solve all my problems with a single SQL query
4. FizzBuzz
WITH RECURSIVE fizz_buzz (sequence, modulo_3, modulo_5) AS (
    SELECT 1, CAST('' AS CHAR(4)), CAST('' AS CHAR(5))
    UNION ALL
    SELECT sequence + 1,
           IF(MOD(sequence + 1, 3) = 0, 'Fizz', ''),
           IF(MOD(sequence + 1, 5) = 0, 'Buzz', '')
    FROM fizz_buzz
    WHERE sequence < 100
)
SELECT
    IF(
        CONCAT(modulo_3, modulo_5) = '',
        sequence,
        CONCAT(modulo_3, modulo_5)
    ) AS fizzbuzz
FROM fizz_buzz;
5. Brief History
● Created to handle up to 10 to 100M rows, or around 100MB per table
● Now supports terabyte-sized databases
● Supports SQL standards as new as SQL:2016
● But... some features from SQL:1999 and SQL:2003 only just became available (CTEs, Window Functions)
8. Don’t do this in production!
● Just because you can, doesn’t mean you should!
● The cost of analytical queries is too high
● Run on a replica if you must
● Remember this is just to help you in an emergency
11. How often do you write in raw SQL…
● A CRUD operation?
● A DML operation?
● Create/edit a Function?
● Create/edit a Procedure?
● A View?
12. Have you ever…
● Generated reports using a scripting language? (Python, PHP)
● Done an ETL?
● Synced data across different types of databases? (e.g. full-text search)
14. Let’s say you own a store
🏬 💵 💻
Tables:
• products
• users
• orders
• order_items
16. And you want several reports
● Most expensive order per user
● The highest price each product was ever sold for, and the date of that sale
● The monthly amount sold in a year, together with the growth
20. Who has never done this?
SELECT users.id,
       users.username,
       (SELECT id FROM orders
        WHERE users.id = user_id
        ORDER BY total DESC LIMIT 1) AS order_id,
       (SELECT total FROM orders
        WHERE users.id = user_id
        ORDER BY total DESC LIMIT 1) AS order_total
FROM users
ORDER BY users.id
LIMIT 10;
The most expensive order for each user
21. LATERAL
SELECT users.id,
       users.username,
       total_orders.id AS order_id,
       total_orders.total AS order_total
FROM users,
     LATERAL(
         SELECT id, total FROM orders
         WHERE users.id = user_id
         ORDER BY total DESC LIMIT 1  -- most expensive order first
     ) AS total_orders
ORDER BY users.id
LIMIT 10;
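One detail worth noting about the comma join with LATERAL: it behaves like an inner join, so users who have no orders drop out of the result, while the correlated-subquery version on the previous slide returns them with NULLs. A minimal sketch of keeping those users follows. It assumes MySQL 8.0.14 or later (the release that added LATERAL) and the same `users` and `orders` columns as the slide; the alias `top_order` is only illustrative.

```sql
-- Sketch: keep users with no orders by using LEFT JOIN LATERAL.
-- Assumes MySQL 8.0.14+ and the users/orders columns used on the slide.
SELECT users.id,
       users.username,
       top_order.id    AS order_id,     -- NULL when the user has no orders
       top_order.total AS order_total
FROM users
LEFT JOIN LATERAL (
    SELECT id, total
    FROM orders
    WHERE orders.user_id = users.id
    ORDER BY total DESC                 -- most expensive order first
    LIMIT 1
) AS top_order ON TRUE                  -- LEFT JOIN needs a join condition
ORDER BY users.id
LIMIT 10;
```

Compared with the two scalar subqueries, the lateral derived table reads `orders` once per user and can return several columns at the same time.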
29. What do window functions do?
● Allow you to analyze the rows of a given result set
● Can behave like a GROUP BY without collapsing the rows of the result set
● Allow you to use a frame to "peek" OVER a PARTITION of a window
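To make the bullets above concrete, here is a minimal sketch of the three ideas (analyzing rows, GROUP BY-like aggregation that keeps every row, and a frame inside a partition). It assumes the `orders` table has `id`, `user_id`, `total` and `created_at` columns, as the other slides suggest; the output column names are made up for the example.

```sql
-- Sketch: aggregate per user without collapsing rows (unlike GROUP BY).
-- Assumes orders has id, user_id, total and created_at.
SELECT id,
       user_id,
       total,
       -- same value repeated on every row of the user's partition
       SUM(total) OVER (PARTITION BY user_id) AS user_lifetime_total,
       -- frame: from the first row of the partition up to the current row
       SUM(total) OVER (
           PARTITION BY user_id
           ORDER BY created_at
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS user_running_total,
       -- number the orders inside each partition, most expensive first
       ROW_NUMBER() OVER (
           PARTITION BY user_id
           ORDER BY total DESC
       ) AS rank_by_total
FROM orders
ORDER BY user_id, created_at
LIMIT 10;
```

Window functions cannot appear in a WHERE clause, so to keep only each user's most expensive order you would wrap this query in a derived table or CTE and filter on `rank_by_total = 1` from the outside.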
35. Previous and Next orders | LAG and LEAD
SELECT id, user_id, status,
       LAG(created_at) OVER(ORDER BY created_at)
           AS previous_order,
       created_at,
       LEAD(created_at) OVER(ORDER BY created_at)
           AS next_order
FROM orders
WHERE user_id = 654321
ORDER BY created_at
LIMIT 10;
37. Repetition?
SELECT id, user_id, status,
       LAG(created_at) OVER(ORDER BY created_at)
           AS previous_order,
       created_at,
       LEAD(created_at) OVER(ORDER BY created_at)
           AS next_order
FROM orders
WHERE user_id = 654321
ORDER BY created_at
LIMIT 10;
38. Named Windows!
SELECT id, user_id, status,
       LAG(created_at) OVER(dates)
           AS previous_order,
       created_at,
       LEAD(created_at) OVER(dates)
           AS next_order
FROM orders
WHERE user_id = 654321
WINDOW dates AS (ORDER BY created_at)
ORDER BY created_at
LIMIT 10;
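A named window can also carry a PARTITION BY, which removes the need to filter on a single user. The following is a sketch under the assumption that `orders.created_at` is a DATE, DATETIME or TIMESTAMP column; the window name `per_user` and the `days_since_previous` column are illustrative only.

```sql
-- Sketch: one named window reused by three window function calls,
-- partitioned so each user's orders are compared only with each other.
SELECT id,
       user_id,
       LAG(created_at)  OVER per_user AS previous_order,
       created_at,
       LEAD(created_at) OVER per_user AS next_order,
       -- days since the same user's previous order (NULL for the first one)
       DATEDIFF(created_at, LAG(created_at) OVER per_user) AS days_since_previous
FROM orders
WINDOW per_user AS (PARTITION BY user_id ORDER BY created_at)
ORDER BY user_id, created_at
LIMIT 10;
```

Defining the window once keeps the partitioning and ordering in a single place, so changing it later cannot leave the LAG and LEAD calls out of sync.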
41. Common Table Expressions
● Similar to CREATE [TEMPORARY] TABLE
● Doesn’t need CREATE privilege
● Can reference other CTEs (if those are already defined)
● Can be recursive
● Easier to read
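As an illustration of one CTE referencing another, here is a rough sketch of the "monthly amount sold in a year, together with the growth" report from the store example. It assumes `orders` has `created_at` and `total` columns; the year 2019 and the CTE names are placeholders, not something taken from the talk.

```sql
-- Sketch: two CTEs, the second one reading from the first.
-- Assumes orders has created_at and total; 2019 is only an example year.
WITH monthly_sales AS (
    SELECT DATE_FORMAT(created_at, '%Y-%m') AS sales_month,
           SUM(total)                       AS amount
    FROM orders
    WHERE created_at >= '2019-01-01'
      AND created_at <  '2020-01-01'
    GROUP BY sales_month
),
monthly_growth AS (
    SELECT sales_month,
           amount,
           LAG(amount) OVER (ORDER BY sales_month) AS previous_amount
    FROM monthly_sales              -- references the CTE defined above
)
SELECT sales_month,
       amount,
       previous_amount,
       -- percentage change versus the previous row; NULL for the first month
       ROUND((amount - previous_amount) / previous_amount * 100, 2) AS growth_pct
FROM monthly_growth
ORDER BY sales_month;
```

Months without any orders produce no row here, so the growth is computed against the previous month that had sales, which may not be the previous calendar month.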
42. Recursive CTE
● Useful with hierarchical data
● The recipe is:
● Base query comes first
● Second query comes after a UNION [ALL]
● And the stop condition should be in the recursive query
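The recipe can be sketched with a small annotated query that lists the first day of each month of a year. The CTE name `months`, the column `month_start` and the year 2019 are assumptions chosen for the example.

```sql
-- Sketch of the recipe: list the 12 months of an example year.
WITH RECURSIVE months (month_start) AS (
    -- 1. Base query: produces the first row
    SELECT CAST('2019-01-01' AS DATE)
    UNION ALL
    -- 2. Recursive query: builds the next row from the previous one
    SELECT month_start + INTERVAL 1 MONTH
    FROM months
    -- 3. Stop condition: no new row once December has been produced
    WHERE month_start < '2019-12-01'
)
SELECT month_start FROM months;
```

If the stop condition is missing or wrong, MySQL aborts the query once it exceeds `cte_max_recursion_depth` iterations (1000 by default), which works as a safety net rather than as a substitute for the WHERE clause.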
43. FizzBuzz
WITH RECURSIVE fizz_buzz (sequence, modulo_3, modulo_5) AS (
    SELECT 1, CAST('' AS CHAR(4)), CAST('' AS CHAR(5))
    UNION ALL
    SELECT sequence + 1,
           IF(MOD(sequence + 1, 3) = 0, 'Fizz', ''),
           IF(MOD(sequence + 1, 5) = 0, 'Buzz', '')
    FROM fizz_buzz
    WHERE sequence < 100
)
SELECT
    IF(
        CONCAT(modulo_3, modulo_5) = '',
        sequence,
        CONCAT(modulo_3, modulo_5)
    ) AS fizzbuzz
FROM fizz_buzz;
46. Recursive CTE
WITH RECURSIVE tree (depth_level, node, path, node_id) AS (
    SELECT 1,
           CAST('root' AS CHAR(255)),
           CAST('root' AS CHAR(65535)),
           0
)
SELECT * FROM tree;
47. Recursive CTE
WITH RECURSIVE tree (depth_level, node, path, node_id) AS (
    SELECT 1,
           CAST('root' AS CHAR(255)),
           CAST('root' AS CHAR(65535)),
           0
    UNION ALL
    SELECT tree.depth_level + 1,
           categories.name,
           CONCAT_WS('/', tree.path, categories.name),
           categories.id
    FROM tree
    INNER JOIN categories
        ON tree.node_id = categories.parent_category_id
    WHERE tree.depth_level < 5
)
SELECT * FROM tree ORDER BY path;
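To try the tree query, a `categories` table is needed. The slides do not show its definition, so the schema and rows below are purely hypothetical: they assume top-level categories store `0` in `parent_category_id`, which is what the join against the synthetic root row (`node_id = 0`) implies.

```sql
-- Hypothetical schema and sample data for the tree query above.
-- Top-level categories use parent_category_id = 0 to match the root row.
CREATE TABLE categories (
    id                 INT PRIMARY KEY,
    name               VARCHAR(255) NOT NULL,
    parent_category_id INT NOT NULL DEFAULT 0
);

INSERT INTO categories (id, name, parent_category_id) VALUES
    (1, 'Electronics', 0),
    (2, 'Laptops',     1),
    (3, 'Phones',      1),
    (4, 'Books',       0);

-- With these rows, the tree query should list five paths:
--   root
--   root/Books
--   root/Electronics
--   root/Electronics/Laptops
--   root/Electronics/Phones
```

The `WHERE tree.depth_level < 5` condition caps the result at five levels, root included, so deeper categories would be left out unless that limit is raised.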