SlideShare uma empresa Scribd logo
1 de 26
Baixar para ler offline
Efficient Pagination Using MySQL
                   Surat Singh Bhati (surat@yahoo-inc.com)
                    Rick James (rjames@yahoo-inc.com)


                               Yahoo Inc


Percona Performance Conference 2009
Outline

  1. Overview
     –    Common pagination UI pattern
     – Sample table and typical solution using OFFSET
     – Techniques to avoid large OFFSET
     – Performance comparison
     – Concerns




                                         -2-
Common Patterns




                  -3-
Basics


    First step toward having efficient pagination over large data set

     – Use index to filter rows (resolve WHERE)
     – Use same index to return rows in sorted order (resolve ORDER)




    Step zero
     – http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html
     – http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html
     – http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html




                                        -4-
Using Index


  KEY a_b_c (a, b, c)

  ORDER may get resolved using Index
       –   ORDER BY a
       –   ORDER BY a,b
       –   ORDER BY a, b, c
       –   ORDER BY a DESC, b DESC, c DESC


  WHERE and ORDER both resolved using index:
       –   WHERE a = const ORDER BY b, c
       –   WHERE a = const AND b = const ORDER BY c
       –   WHERE a = const     ORDER BY b, c
       –   WHERE a = const AND b > const ORDER BY b, c

  ORDER will not get resolved uisng index (file sort)
       –   ORDER BY a ASC, b DESC, c DESC /* mixed sort direction */
       –   WHERE g = const ORDER BY b, c        /* a prefix is missing */
       –   WHERE a = const ORDER BY c           /* b is missing */
       –   WHERE a = const ORDER BY a, d        /* d is not part of index */
                                               -5-
Sample Schema
  CREATE TABLE `message` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
    `user_id` int(11) NOT NULL,
    `content` text COLLATE utf8_unicode_ci NOT NULL,
    `create_time` int(11) NOT NULL,
    `thumbs_up` int(11) NOT NULL DEFAULT '0', /* Vote Count */
    PRIMARY KEY (`id`),
    KEY `thumbs_up_key` (`thumbs_up`,`id`)
  ) ENGINE=InnoDB

   mysql> show table status like 'message' G
           Engine: InnoDB
          Version: 10
       Row_format: Compact
             Rows: 50000040    /* 50 Million */
   Avg_row_length: 565
      Data_length: 28273803264 /* 26 GB */
     Index_length: 789577728   /* 753 MB */
        Data_free: 6291456
      Create_time: 2009-04-20 13:30:45

  Two use case:
  • Paginate by time, recent message one page one
  • Paginate by thumps_up, largest value on page one
                                          -6-
Typical Query

  1. Get the total records
     SELECT count(*) FROM message


  2. Get current page
     SELECT * FROM message
     ORDER BY id DESC LIMIT 0, 20


  •   http://domain.com/message?page=1
              • ORDER BY id DESC LIMIT 0,                 20

  •   http://domain.com/message?page=2
              • ORDER BY id DESC LIMIT 20,                 20

  •   http://domain.com/message?page=3
              • ORDER BY id DESC LIMIT 40,                 20


  Note: id is auto_increment, same as create_time order, no need to create index on create_time, save space



        –                                           -7-
Explain


  mysql> explain SELECT * FROM message
         ORDER BY id DESC
         LIMIT 10000, 20G
  ***************** 1. row **************
             id: 1
    select_type: SIMPLE
          table: message
           type: index
  possible_keys: NULL
            key: PRIMARY
        key_len: 4
            ref: NULL
           rows: 10020
          Extra:
  1 row in set (0.00 sec)
      – it can read rows using index scan and execution will stop as soon as it finds
        required rows.
      – LIMIT 10000, 20 means it has to read 10020 and throw away 10000 rows, then
        return next 20 rows.


                                           -8-
Performance Implications

     – Larger OFFSET is going to increase active data set, MySQL has to bring data
       in memory that is never returned to caller.
     – Performance issue is more visible when your have database that can't fit in
       main memory.
     – Small percentage of request with large OFFSET would be able to hit disk I/O
       Disk I/O bottleneck
     – In order to display “21 to 40 of 1000,000” , some one has to count 1000,000
       rows.




                                         -9-
Simple Solution

     – Do not display total records, does user really care?



     – Do not let user go to deep pages, redirect him
       http://en.wikipedia.org/wiki/Internet_addiction_disorder after certain number of
       pages




                                          - 10 -
Avoid Count(*)


  1. Never display total messages, let user see more message by clicking
     'next'



  2. Do not count on every request, cache it, display stale count, user do not
     care about 324533 v/s 324633


  3. Display 41 to 80 of Thousands


  4. Use pre calculated count, increment/decrement value as insert/delete
     happens.




                                       - 11 -
Solution to avoid offset

  1. Change User Interface
      – No direct jumps to Nth page



  2. LIMIT N is fine, Do not use LIMIT M,N
      – Provide extra clue about from where to start given page
      – Find the desired records using more restricted WHERE using given clue and
        ORDER BY and LIMIT N without OFFSET)




                                         - 12 -
Find the clue

   150
   111
   102                       Page One
   101
   100   <a href=”/page=2;last_seen=100;dir=next>Next</a>

   98    <a href=”/page=1;last_seen=98;dir=prev>Prev</a>
   97
   96                        Page Two
   95
   94    <a href=”/page=3;last_seen=94;dir=next>Next</a>

   93    <a href=”/page=3;last_seen=93;dir=prev>Prev</a>
   92
   91                         Page Three
   90
   89    <a href=”/page=4;last_seen=89;dir=prev>Next</a>




                                   - 13 -
Solution using clue

  Next Page:
  http://domain.com/forum?page=2&last_seen=100&dir=next

           WHERE id < 100 /* last_seen *
           ORDER BY id DESC LIMIT $page_size /* No OFFSET*/


  Prev Page:
  http://domain.com/forum?page=1&last_seen=98&dir=prev

           WHERE id > 98 /* last_seen *
           ORDER BY id ASC LIMIT $page_size /* No OFFSET*/


           Reverse given 10 rows before sending to user


                                 - 14 -
Explain

  mysql> explain
         SELECT * FROM message
         WHERE id < '49999961'
         ORDER BY id DESC LIMIT 20 G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: message
           type: range
  possible_keys: PRIMARY
            key: PRIMARY
        key_len: 4
            ref: NULL
           Rows: 25000020 /* ignore this */
          Extra: Using where
  1 row in set (0.00 sec)




                                   - 15 -
What about order by non unique values?


                         99
                         99
                         98    Page One
                         98
                         98
                         98
                         98
                         97    Page Two
                         97
                         10
  We can't do:
            WHERE thumbs_up < 98
            ORDER BY thumbs_up DESC /* It will return few seen rows */

  Can we say this:
            WHERE thumbs_up <= 98
            AND <extra_con>
            ORDER BY thumbs_up DESC

                                      - 16 -
Add more condition

  •   Consider thumbs_up as major number
       – if we have additional minor number, we can use combination of major & minor
         as extra condition




  •   Find additional column (minor number)
       – we can use id primary key as minor number




                                          - 17 -
Solution
                                     First Page
  SELECT thumbs_up, id
  FROM message
  ORDER BY thumbs_up DESC, id DESC
  LIMIT $page_size

  +-----------+----+
  | thumbs_up | id |
  +-----------+----+
  |        99 | 14 |
  |        99 | 2 |
  |        98 | 18 |
  |        98 | 15 |
  |        98 | 13 |
  +-----------+----+
                                     Next Page
  SELECT thumbs_up, id
  FROM message
  WHERE thumbs_up <= 98 AND (id < 13 OR thumbs_up < 98)
  ORDER BY thumbs_up DESC, id DESC
  LIMIT $page_size

  +-----------+----+
  | thumbs_up | id |
  +-----------+----+
  |        98 | 10 |
  |        98 | 6 |
  |        97 | 17 |
                                        - 18 -
Make it better..
   Query:


   SELECT * FROM message
   WHERE thumbs_up <= 98
            AND (id < 13 OR thumbs_up < 98)
   ORDER BY thumbs_up DESC, id DESC
   LIMIT 20


   Can be written as:


   SELECT m2.* FROM message m1, message m2
   WHERE m1.id = m2.id
            AND m1.thumbs_up <= 98
            AND (m1.id < 13 OR m1.thumbs_up < 98)
   ORDER BY m1.thumbs_up DESC, m1.id DESC
   LIMIT 20;

                                     - 19 -
Explain

  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: m1
           type: range
  possible_keys: PRIMARY,thumbs_up_key
            key: thumbs_up_key /* (thumbs_up,id) */
        key_len: 4
            ref: NULL
           Rows: 25000020 /*ignore this, we will read just 20 rows*/
          Extra: Using where; Using index /* Cover */
  *************************** 2. row ***************************
             id: 1
    select_type: SIMPLE
          table: m2
           type: eq_ref
  possible_keys: PRIMARY
            key: PRIMARY
        key_len: 4
            ref: forum.m1.id
           rows: 1
          Extra:

                                   - 20 -
Performance Gain (Primary Key Order)




                       - 21 -
Performance Gain (Secondary Key Order)




                      - 22 -
Throughput Gain


 •   Throughput Gain while hitting first 30 pages:

      – Using LIMIT OFFSET, N
          • 600 query/sec




      – Using LIMIT N (no OFFSET)
          • 3.7k query/sec




                                        - 23 -
Bonus Point

  Product issue with LIMIT M, N


     User is reading a page, in the mean time some records may be added to
     previous page.


     Due to insert/delete pages records are going to move forward/backward
     as rolling window:
      – User is reading messages on 4th page
      – While he was reading, one new message posted (it would be there on page
        one), all pages are going to move one message to next page.
      – User Clicks on Page 5
      – One message from page got pushed forward on page 5, user has to read it
        again


  No such issue with news approach

                                        - 24 -
Drawback

  Search Engine Optimization Expert says:
  Let bot reach all you pages with fewer number of deep dive


  Two Solutions:
  •   Read extra rows
       – Read extra rows in advance and construct links for few previous & next pages


  •   Use small offset
       – Do not read extra rows in advance, just add links for few past & next pages
         with required offset & last_seen_id on current page
       – Do query using new approach with small offset to display desired page
       –
                             file:///Users/surat/Desktop/Picture%2043.png




  Additional concern: Dynamic urls, last_seen is not constant over time.
                                                                            - 25 -
Thanks




  - 26 -

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

The MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer TraceThe MySQL Query Optimizer Explained Through Optimizer Trace
The MySQL Query Optimizer Explained Through Optimizer Trace
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
 
Mongo DB 완벽가이드 - 4장 쿼리하기
Mongo DB 완벽가이드 - 4장 쿼리하기Mongo DB 완벽가이드 - 4장 쿼리하기
Mongo DB 완벽가이드 - 4장 쿼리하기
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
Faster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDBFaster, better, stronger: The new InnoDB
Faster, better, stronger: The new InnoDB
 
Materialize: a platform for changing data
Materialize: a platform for changing dataMaterialize: a platform for changing data
Materialize: a platform for changing data
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Mysql Explain Explained
Mysql Explain ExplainedMysql Explain Explained
Mysql Explain Explained
 
MySQL Slow Query log Monitoring using Beats & ELK
MySQL Slow Query log Monitoring using Beats & ELKMySQL Slow Query log Monitoring using Beats & ELK
MySQL Slow Query log Monitoring using Beats & ELK
 
MySQL Indexing : Improving Query Performance Using Index (Covering Index)
MySQL Indexing : Improving Query Performance Using Index (Covering Index)MySQL Indexing : Improving Query Performance Using Index (Covering Index)
MySQL Indexing : Improving Query Performance Using Index (Covering Index)
 
Optimizing queries MySQL
Optimizing queries MySQLOptimizing queries MySQL
Optimizing queries MySQL
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 
あるmmapの話
あるmmapの話あるmmapの話
あるmmapの話
 
Data Warehouse on Kubernetes: lessons from Clickhouse Operator
Data Warehouse on Kubernetes: lessons from Clickhouse OperatorData Warehouse on Kubernetes: lessons from Clickhouse Operator
Data Warehouse on Kubernetes: lessons from Clickhouse Operator
 
How to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better PerformanceHow to Analyze and Tune MySQL Queries for Better Performance
How to Analyze and Tune MySQL Queries for Better Performance
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication TutorialThe Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
 

Destaque

CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...
CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...
CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...
af83
 
Elasticmeetup curiosity 20141113
Elasticmeetup curiosity 20141113Elasticmeetup curiosity 20141113
Elasticmeetup curiosity 20141113
Erwan Pigneul
 

Destaque (20)

Pagination Done the Right Way
Pagination Done the Right WayPagination Done the Right Way
Pagination Done the Right Way
 
Modern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial DatabasesModern SQL in Open Source and Commercial Databases
Modern SQL in Open Source and Commercial Databases
 
Mobile commerce km
Mobile commerce kmMobile commerce km
Mobile commerce km
 
Implications of 4G Deployments (MEF for MPLS World Congress Ethernet Wholesa...
Implications of 4G Deployments (MEF for MPLS World Congress  Ethernet Wholesa...Implications of 4G Deployments (MEF for MPLS World Congress  Ethernet Wholesa...
Implications of 4G Deployments (MEF for MPLS World Congress Ethernet Wholesa...
 
Seerus analytics or how integrate smart data in your company
Seerus analytics or how integrate smart data in your company Seerus analytics or how integrate smart data in your company
Seerus analytics or how integrate smart data in your company
 
Brand Positioning, a component of INDIGITAL BRANDING MODEL©
Brand Positioning, a component of INDIGITAL BRANDING MODEL©Brand Positioning, a component of INDIGITAL BRANDING MODEL©
Brand Positioning, a component of INDIGITAL BRANDING MODEL©
 
Zéphir, ERP dans le Cloud
Zéphir, ERP dans le CloudZéphir, ERP dans le Cloud
Zéphir, ERP dans le Cloud
 
CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...
CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...
CogLab | Imaginove | UI#02 – BCI : Usages et enjeux pour l’innovation et la c...
 
Growth hacking - Telecom bretagne - 2015-10-21
Growth hacking - Telecom bretagne - 2015-10-21Growth hacking - Telecom bretagne - 2015-10-21
Growth hacking - Telecom bretagne - 2015-10-21
 
CANDDi Insights
CANDDi InsightsCANDDi Insights
CANDDi Insights
 
Introduction to C#3
Introduction to C#3Introduction to C#3
Introduction to C#3
 
Elasticmeetup curiosity 20141113
Elasticmeetup curiosity 20141113Elasticmeetup curiosity 20141113
Elasticmeetup curiosity 20141113
 
TIBCO Loyalty Lab paris event
TIBCO Loyalty Lab paris eventTIBCO Loyalty Lab paris event
TIBCO Loyalty Lab paris event
 
sfPot aop
sfPot aopsfPot aop
sfPot aop
 
Big on Mobile, Big on Facebook. How the European super startups did it.
Big on Mobile, Big on Facebook. How the European super startups did it. Big on Mobile, Big on Facebook. How the European super startups did it.
Big on Mobile, Big on Facebook. How the European super startups did it.
 
IBM MQ Overview (IBM Message Queue)
IBM MQ Overview (IBM Message Queue)IBM MQ Overview (IBM Message Queue)
IBM MQ Overview (IBM Message Queue)
 
Indian IT industry analysis of 5 slides and company ( Infosys) analysis ( FY ...
Indian IT industry analysis of 5 slides and company ( Infosys) analysis ( FY ...Indian IT industry analysis of 5 slides and company ( Infosys) analysis ( FY ...
Indian IT industry analysis of 5 slides and company ( Infosys) analysis ( FY ...
 
Best Bourbons
Best BourbonsBest Bourbons
Best Bourbons
 
Performance and scalability for machine learning
Performance and scalability for machine learningPerformance and scalability for machine learning
Performance and scalability for machine learning
 
Maximizing information and communications technologies for development in fai...
Maximizing information and communications technologies for development in fai...Maximizing information and communications technologies for development in fai...
Maximizing information and communications technologies for development in fai...
 

Semelhante a Efficient Pagination Using MySQL

Covering indexes
Covering indexesCovering indexes
Covering indexes
MYXPLAIN
 
Database Oracle Basic
Database Oracle BasicDatabase Oracle Basic
Database Oracle Basic
Kamlesh Singh
 
What's New In MySQL 5.6
What's New In MySQL 5.6What's New In MySQL 5.6
What's New In MySQL 5.6
Abdul Manaf
 

Semelhante a Efficient Pagination Using MySQL (20)

PPC2009_yahoo_mysql_pagination
PPC2009_yahoo_mysql_paginationPPC2009_yahoo_mysql_pagination
PPC2009_yahoo_mysql_pagination
 
Covering indexes
Covering indexesCovering indexes
Covering indexes
 
MySQL for business developer - Titouan BENOIT
MySQL for business developer - Titouan BENOITMySQL for business developer - Titouan BENOIT
MySQL for business developer - Titouan BENOIT
 
Optimizando MySQL
Optimizando MySQLOptimizando MySQL
Optimizando MySQL
 
Oracle
OracleOracle
Oracle
 
Introduction into MySQL Query Tuning
Introduction into MySQL Query TuningIntroduction into MySQL Query Tuning
Introduction into MySQL Query Tuning
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
 
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, HerokuPostgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
Postgres & Redis Sitting in a Tree- Rimas Silkaitis, Heroku
 
Raj mysql
Raj mysqlRaj mysql
Raj mysql
 
PHP tips by a MYSQL DBA
PHP tips by a MYSQL DBAPHP tips by a MYSQL DBA
PHP tips by a MYSQL DBA
 
Troubleshooting MySQL Performance
Troubleshooting MySQL PerformanceTroubleshooting MySQL Performance
Troubleshooting MySQL Performance
 
SQL (1).pptx
SQL (1).pptxSQL (1).pptx
SQL (1).pptx
 
MariaDB Optimizer
MariaDB OptimizerMariaDB Optimizer
MariaDB Optimizer
 
MariaDB and Clickhouse Percona Live 2019 talk
MariaDB and Clickhouse Percona Live 2019 talkMariaDB and Clickhouse Percona Live 2019 talk
MariaDB and Clickhouse Percona Live 2019 talk
 
Database Oracle Basic
Database Oracle BasicDatabase Oracle Basic
Database Oracle Basic
 
What's New In MySQL 5.6
What's New In MySQL 5.6What's New In MySQL 5.6
What's New In MySQL 5.6
 
SQL: tricks and cheats
SQL: tricks and cheatsSQL: tricks and cheats
SQL: tricks and cheats
 
MySQL Query tuning 101
MySQL Query tuning 101MySQL Query tuning 101
MySQL Query tuning 101
 
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with ClickhouseWebinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
Webinar slides: Adding Fast Analytics to MySQL Applications with Clickhouse
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Efficient Pagination Using MySQL

  • 1. Efficient Pagination Using MySQL Surat Singh Bhati (surat@yahoo-inc.com) Rick James (rjames@yahoo-inc.com) Yahoo Inc Percona Performance Conference 2009
  • 2. Outline 1. Overview – Common pagination UI pattern – Sample table and typical solution using OFFSET – Techniques to avoid large OFFSET – Performance comparison – Concerns -2-
  • 4. Basics First step toward having efficient pagination over large data set – Use index to filter rows (resolve WHERE) – Use same index to return rows in sorted order (resolve ORDER) Step zero – http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html – http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html – http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html -4-
  • 5. Using Index KEY a_b_c (a, b, c) ORDER may get resolved using Index – ORDER BY a – ORDER BY a,b – ORDER BY a, b, c – ORDER BY a DESC, b DESC, c DESC WHERE and ORDER both resolved using index: – WHERE a = const ORDER BY b, c – WHERE a = const AND b = const ORDER BY c – WHERE a = const ORDER BY b, c – WHERE a = const AND b > const ORDER BY b, c ORDER will not get resolved uisng index (file sort) – ORDER BY a ASC, b DESC, c DESC /* mixed sort direction */ – WHERE g = const ORDER BY b, c /* a prefix is missing */ – WHERE a = const ORDER BY c /* b is missing */ – WHERE a = const ORDER BY a, d /* d is not part of index */ -5-
  • 6. Sample Schema CREATE TABLE `message` ( `id` int(11) NOT NULL AUTO_INCREMENT, `title` varchar(255) COLLATE utf8_unicode_ci NOT NULL, `user_id` int(11) NOT NULL, `content` text COLLATE utf8_unicode_ci NOT NULL, `create_time` int(11) NOT NULL, `thumbs_up` int(11) NOT NULL DEFAULT '0', /* Vote Count */ PRIMARY KEY (`id`), KEY `thumbs_up_key` (`thumbs_up`,`id`) ) ENGINE=InnoDB mysql> show table status like 'message' G Engine: InnoDB Version: 10 Row_format: Compact Rows: 50000040 /* 50 Million */ Avg_row_length: 565 Data_length: 28273803264 /* 26 GB */ Index_length: 789577728 /* 753 MB */ Data_free: 6291456 Create_time: 2009-04-20 13:30:45 Two use case: • Paginate by time, recent message one page one • Paginate by thumps_up, largest value on page one -6-
  • 7. Typical Query 1. Get the total records SELECT count(*) FROM message 2. Get current page SELECT * FROM message ORDER BY id DESC LIMIT 0, 20 • http://domain.com/message?page=1 • ORDER BY id DESC LIMIT 0, 20 • http://domain.com/message?page=2 • ORDER BY id DESC LIMIT 20, 20 • http://domain.com/message?page=3 • ORDER BY id DESC LIMIT 40, 20 Note: id is auto_increment, same as create_time order, no need to create index on create_time, save space – -7-
  • 8. Explain mysql> explain SELECT * FROM message ORDER BY id DESC LIMIT 10000, 20G ***************** 1. row ************** id: 1 select_type: SIMPLE table: message type: index possible_keys: NULL key: PRIMARY key_len: 4 ref: NULL rows: 10020 Extra: 1 row in set (0.00 sec) – it can read rows using index scan and execution will stop as soon as it finds required rows. – LIMIT 10000, 20 means it has to read 10020 and throw away 10000 rows, then return next 20 rows. -8-
  • 9. Performance Implications – Larger OFFSET is going to increase active data set, MySQL has to bring data in memory that is never returned to caller. – Performance issue is more visible when your have database that can't fit in main memory. – Small percentage of request with large OFFSET would be able to hit disk I/O Disk I/O bottleneck – In order to display “21 to 40 of 1000,000” , some one has to count 1000,000 rows. -9-
  • 10. Simple Solution – Do not display total records, does user really care? – Do not let user go to deep pages, redirect him http://en.wikipedia.org/wiki/Internet_addiction_disorder after certain number of pages - 10 -
  • 11. Avoid Count(*) 1. Never display total messages, let user see more message by clicking 'next' 2. Do not count on every request, cache it, display stale count, user do not care about 324533 v/s 324633 3. Display 41 to 80 of Thousands 4. Use pre calculated count, increment/decrement value as insert/delete happens. - 11 -
  • 12. Solution to avoid offset 1. Change User Interface – No direct jumps to Nth page 2. LIMIT N is fine, Do not use LIMIT M,N – Provide extra clue about from where to start given page – Find the desired records using more restricted WHERE using given clue and ORDER BY and LIMIT N without OFFSET) - 12 -
  • 13. Find the clue 150 111 102 Page One 101 100 <a href=”/page=2;last_seen=100;dir=next>Next</a> 98 <a href=”/page=1;last_seen=98;dir=prev>Prev</a> 97 96 Page Two 95 94 <a href=”/page=3;last_seen=94;dir=next>Next</a> 93 <a href=”/page=3;last_seen=93;dir=prev>Prev</a> 92 91 Page Three 90 89 <a href=”/page=4;last_seen=89;dir=prev>Next</a> - 13 -
  • 14. Solution using clue Next Page: http://domain.com/forum?page=2&last_seen=100&dir=next WHERE id < 100 /* last_seen * ORDER BY id DESC LIMIT $page_size /* No OFFSET*/ Prev Page: http://domain.com/forum?page=1&last_seen=98&dir=prev WHERE id > 98 /* last_seen * ORDER BY id ASC LIMIT $page_size /* No OFFSET*/ Reverse given 10 rows before sending to user - 14 -
  • 15. Explain mysql> explain SELECT * FROM message WHERE id < '49999961' ORDER BY id DESC LIMIT 20 G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: message type: range possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL Rows: 25000020 /* ignore this */ Extra: Using where 1 row in set (0.00 sec) - 15 -
  • 16. What about order by non unique values? 99 99 98 Page One 98 98 98 98 97 Page Two 97 10 We can't do: WHERE thumbs_up < 98 ORDER BY thumbs_up DESC /* It will return few seen rows */ Can we say this: WHERE thumbs_up <= 98 AND <extra_con> ORDER BY thumbs_up DESC - 16 -
  • 17. Add more condition • Consider thumbs_up as major number – if we have additional minor number, we can use combination of major & minor as extra condition • Find additional column (minor number) – we can use id primary key as minor number - 17 -
  • 18. Solution First Page SELECT thumbs_up, id FROM message ORDER BY thumbs_up DESC, id DESC LIMIT $page_size +-----------+----+ | thumbs_up | id | +-----------+----+ | 99 | 14 | | 99 | 2 | | 98 | 18 | | 98 | 15 | | 98 | 13 | +-----------+----+ Next Page SELECT thumbs_up, id FROM message WHERE thumbs_up <= 98 AND (id < 13 OR thumbs_up < 98) ORDER BY thumbs_up DESC, id DESC LIMIT $page_size +-----------+----+ | thumbs_up | id | +-----------+----+ | 98 | 10 | | 98 | 6 | | 97 | 17 | - 18 -
  • 19. Make it better.. Query: SELECT * FROM message WHERE thumbs_up <= 98 AND (id < 13 OR thumbs_up < 98) ORDER BY thumbs_up DESC, id DESC LIMIT 20 Can be written as: SELECT m2.* FROM message m1, message m2 WHERE m1.id = m2.id AND m1.thumbs_up <= 98 AND (m1.id < 13 OR m1.thumbs_up < 98) ORDER BY m1.thumbs_up DESC, m1.id DESC LIMIT 20; - 19 -
  • 20. Explain *************************** 1. row *************************** id: 1 select_type: SIMPLE table: m1 type: range possible_keys: PRIMARY,thumbs_up_key key: thumbs_up_key /* (thumbs_up,id) */ key_len: 4 ref: NULL Rows: 25000020 /*ignore this, we will read just 20 rows*/ Extra: Using where; Using index /* Cover */ *************************** 2. row *************************** id: 1 select_type: SIMPLE table: m2 type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: forum.m1.id rows: 1 Extra: - 20 -
  • 21. Performance Gain (Primary Key Order) - 21 -
  • 22. Performance Gain (Secondary Key Order) - 22 -
  • 23. Throughput Gain • Throughput Gain while hitting first 30 pages: – Using LIMIT OFFSET, N • 600 query/sec – Using LIMIT N (no OFFSET) • 3.7k query/sec - 23 -
  • 24. Bonus Point Product issue with LIMIT M, N User is reading a page, in the mean time some records may be added to previous page. Due to insert/delete pages records are going to move forward/backward as rolling window: – User is reading messages on 4th page – While he was reading, one new message posted (it would be there on page one), all pages are going to move one message to next page. – User Clicks on Page 5 – One message from page got pushed forward on page 5, user has to read it again No such issue with news approach - 24 -
  • 25. Drawback Search Engine Optimization Expert says: Let bot reach all you pages with fewer number of deep dive Two Solutions: • Read extra rows – Read extra rows in advance and construct links for few previous & next pages • Use small offset – Do not read extra rows in advance, just add links for few past & next pages with required offset & last_seen_id on current page – Do query using new approach with small offset to display desired page – file:///Users/surat/Desktop/Picture%2043.png Additional concern: Dynamic urls, last_seen is not constant over time. - 25 -
  • 26. Thanks - 26 -