Improving MySQL-based applications performance with Sphinx

Maciej Dobrzański
(Мачей Добжаньски)
Percona, Inc.
INTRODUCTION
Who am I?
  – Consultant at Percona, Inc.
  – What do I do?
     • Performance audits
     • Fix broken systems
     • Design architectures
  – Typically work from home
INTRODUCTION
What is Percona, Inc.?
   – Consulting company
   – Provides services for MySQL applications
   – Develops open-source software
      • Scalability patches for InnoDB
      • XtraDB storage engine for MySQL
      • Xtrabackup – free backup solution for InnoDB/XtraDB
WHAT IS MYSQL?
MySQL is...
   – Open-source relational database management system
   – Popular enough to assume everyone here knows it
WHAT IS SPHINX?
A standalone full-text search engine
   – Consists of two major applications
      • indexer
      • searchd
   – More efficient than MySQL FULLTEXT
      • On larger data sets
WHAT IS SPHINX?
A standalone full-text search engine
   – Can be easily scaled horizontally
      • Sphinx indexes can be distributed across many servers
      • Allows parallel searching
      • One instance becomes a dispatcher
          – Forwards queries to other instances
          – Combines results before sending them back to clients
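The dispatcher's "combine results" step can be pictured with a minimal Python sketch. This is not Sphinx's actual implementation: the function name merge_shard_results, the (weight, doc_id) tuples and the sample data are hypothetical, and each instance is assumed to return its matches already sorted by descending weight.

```python
import heapq
from itertools import islice

def merge_shard_results(shard_results, limit):
    """Merge per-instance result lists (each sorted by descending weight)
    into a single list holding the best `limit` matches overall."""
    # heapq.merge lazily merges already-sorted inputs; reverse=True tells
    # it that the inputs are in descending order.
    merged = heapq.merge(*shard_results, key=lambda match: match[0], reverse=True)
    return list(islice(merged, limit))

# Hypothetical results from two searchd instances: (weight, doc_id)
shard_a = [(9.5, 101), (7.0, 104), (2.5, 109)]
shard_b = [(8.0, 202), (6.5, 207)]
top3 = merge_shard_results([shard_a, shard_b], limit=3)
print(top3)
```

Because every instance only has to search its own part of the index, the searches can run in parallel and the merge itself stays cheap.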
WHAT IS SPHINX?
Many additional features beyond just full-text search
   – Indexable attributes for non-FTS filtering
      • numerical, multi-value and now also text
      • Example: limit results to rows which have
        article_score>=2
   – Sorting results by an attribute or an expression
      • Example: @weight+(article_score)*0.1
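A rough Python sketch of what the two examples above mean for a result set. It only illustrates the semantics, not how Sphinx evaluates filters or expressions internally; the function name filter_and_rank and the sample matches are made up for the illustration.

```python
def filter_and_rank(matches, min_score, limit):
    """Keep matches with article_score >= min_score, then order them by
    the expression weight + article_score * 0.1, highest value first."""
    kept = [m for m in matches if m["article_score"] >= min_score]
    kept.sort(key=lambda m: m["weight"] + m["article_score"] * 0.1, reverse=True)
    return kept[:limit]

# Hypothetical full-text matches with their relevance weight and attribute
matches = [
    {"id": 1, "weight": 1.0, "article_score": 5},
    {"id": 2, "weight": 2.0, "article_score": 1},
    {"id": 3, "weight": 1.2, "article_score": 2},
]
ranked_ids = [m["id"] for m in filter_and_rank(matches, min_score=2, limit=10)]
print(ranked_ids)
```

Document 2 has the best full-text weight but is dropped by the attribute filter; the remaining two are ordered by the combined expression.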
WHAT IS SPHINX?
Many additional features beyond just full-text search
   – Grouping results by an attribute
      • Additional support for timestamp attributes
      • Also returns the row count per group, which may be approximate
   – Calculating expressions
      • Much faster than in MySQL as per recent benchmarks
WHAT IS SPHINX?
Anything else?
   – On-line re-indexing
   – Live index updates
   – Extensive API available for many programming languages
      •   PHP
      •   Python
      •   Java
      •   many more
WHAT IS SPHINX?
There’s even more!
   – SphinxQL – MySQL server protocol compatible
      • Connect with any MySQL client
         – command line
         – API call, e.g. mysql_connect()
      • Run SQL-like queries
WHAT IS SPHINX?
Example use of SphinxQL
HOW DOES SPHINX WORK WITH MYSQL?
Sphinx is an external application, not part of MySQL
   – Uses own data files
   – Needs memory
   – Has to be queried separately
      • Sphinx API
      • SphinxQL
      • Sphinx Storage Engine for MySQL
HOW DOES SPHINX WORK WITH MYSQL?
Sphinx is an external application, not part of MySQL
   – Sphinx indexes have to be updated separately too
      • Periodic data re-indexing with indexer
          – Some information may be outdated for a while
          – Can be optimized by re-indexing only the latest changes
      • Live index updates from applications
          – Applications need to write twice, to both MySQL and Sphinx
          – Available only for attributes; full-text updates to come
HOW DOES MYSQL WORK WITH SPHINX?
Example data source for a Sphinx index
   sql_query = SELECT mi.id, mi.movie_id, t.production_year, \
      t.title, mi.info FROM movie_info mi JOIN title t \
      ON t.id = mi.movie_id
   sql_attr_uint = movie_id
   sql_attr_uint = production_year
• Notice the source can be any valid SQL query
   – Uses joins to denormalize data for Sphinx
• Two integer attributes – movie_id and production_year
HOW DOES SPHINX WORK WITH MYSQL?
Sphinx is not a full database (yet?)
   – It’s primarily a search engine
   – It can return values stored as attributes, e.g.:
     movie_id, production_year
   – …but not any full-text searchable columns
   – Results from Sphinx can be used to fetch full details from
     database
IMPORTANT FACTS TO KNOW ABOUT MYSQL
Uses B-TREE indexes to improve search performance
   – Works great for equality operator (=)
   – …and small range lookups: >, >=, <, <=, IN (list), LIKE 'prefix%'
      • Range size relative to table size, not an absolute value
      • Large range often turns into plain scan
IMPORTANT FACTS TO KNOW ABOUT MYSQL
MySQL can use any left-most part of an index
   – INDEX (a, b, c) can fully optimize both:
      (1) SELECT * FROM T WHERE a=9
      (2) SELECT * FROM T WHERE a=9 AND b IN (1,2) AND c=4
     …but not any of:
      (3) SELECT * FROM T WHERE b=7 AND c=1
      (4) SELECT * FROM T WHERE a=9 AND c=2 (may still use index for a=9 only)
   – No good indexes means you may need a new one
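The left-most prefix rule can be sketched in a few lines of Python. This is a deliberately simplified model, not MySQL's optimizer: it only handles equality and IN conditions, and the function name usable_index_prefix is hypothetical.

```python
def usable_index_prefix(index_columns, equality_columns):
    """Return the leading index columns a B-tree lookup can use when the
    WHERE clause has equality (or IN) conditions on `equality_columns`."""
    usable = []
    for column in index_columns:
        if column not in equality_columns:
            break  # the first column without a condition ends the usable prefix
        usable.append(column)
    return usable

index = ["a", "b", "c"]  # INDEX (a, b, c)
print(usable_index_prefix(index, {"a"}))            # query (1)
print(usable_index_prefix(index, {"a", "b", "c"}))  # query (2)
print(usable_index_prefix(index, {"b", "c"}))       # query (3)
print(usable_index_prefix(index, {"a", "c"}))       # query (4)
```

Query (3) gets an empty prefix, so the index is of no help for the lookup; query (4) can use it only for the condition on a.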
IMPORTANT FACTS TO KNOW ABOUT MYSQL
Each index slows down writes to a table
   – Index is an organized structure, it has to be maintained
   – There can’t be too many or performance will suffer
MySQL can typically use only one index per query
   – There are rare exceptions – index merge optimizations
   – In practice, index merges are often not efficient enough
IMPORTANT FACTS TO KNOW ABOUT MYSQL
These work great in MySQL
   – Index optimized searching
      • A query which uses indexes efficiently is fast enough
      • B-TREE lookups are typically very efficient
      • FULLTEXT indexes can be the exception
   – Index optimized sorting and grouping
      • Rows are read in the proper order
IMPORTANT FACTS TO KNOW ABOUT MYSQL
These can cause problems in MySQL
   – Full table scans
      • No index is used
      • Query reads entire table row by row checking for matches
   – Large scans related to poor selectivity
      • An index is used, but it is not selective enough
      • MySQL has to read a lot of rows and reject many of them
IMPORTANT FACTS TO KNOW ABOUT MYSQL
These can cause problems in MySQL
   – Search on many combinations of columns in a single table
      • Each combination may require new index
      • Can’t have too many indexes in table at the same time
   – Handling multi-value properties in searches
      • Keywords, tags
      • Such queries often can’t be optimized very well
IMPORTANT FACTS TO KNOW ABOUT MYSQL
These can cause problems in MySQL
   – Sorting or grouping not done through indexes
      • Requires rewriting rows into temporary storage
      • At least one additional pass over results to complete
      • LIMIT does not work until all matches are found and
        sorted/grouped
IMPORTANT FACTS TO KNOW ABOUT MYSQL
Indexes and data may be cached in memory
   – Key buffer (key_buffer_size) and filesystem cache for MyISAM tables
   – InnoDB buffer pool (innodb_buffer_pool_size) for InnoDB tables
   – No guarantees what is in RAM
      • MySQL has no option to lock certain data in buffers
IMPORTANT FACTS TO KNOW ABOUT MYSQL
Full-text support in MySQL
   – Available through FULLTEXT keys
   – Only supported by MyISAM engine
      • MyISAM uses table level locking
      • May become a showstopper for busy databases
   – Cannot be used together with any other index
      • Even index merge will not work
IMPORTANT FACTS TO KNOW ABOUT SPHINX
Search remembers no more than max_matches results
  | total           | 1000   |
  | total_found     | 2255   |
  – Results beyond that limit are discarded before anything is sent to the client
  – Saves some CPU and RAM
  – Returning all results is often unnecessary
  – Accuracy comes at a cost
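The difference between total and total_found can be illustrated with a bounded top-N collector. This is a minimal sketch under assumptions, not Sphinx's code: the function name search, the synthetic weights and the use of a heap are all choices made for the illustration.

```python
import heapq

def search(weights_by_doc, max_matches):
    """Return (ranked matches, total, total_found) while remembering at
    most `max_matches` results at any point in time."""
    best = []          # min-heap of (weight, doc_id), never larger than max_matches
    total_found = 0
    for doc_id, weight in weights_by_doc:
        total_found += 1
        if len(best) < max_matches:
            heapq.heappush(best, (weight, doc_id))
        elif (weight, doc_id) > best[0]:
            # Better than the weakest remembered match: replace it.
            heapq.heapreplace(best, (weight, doc_id))
    ranked = sorted(best, reverse=True)
    return ranked, len(ranked), total_found

# Synthetic stream of 2255 matching documents with made-up weights
stream = [(doc_id, (doc_id * 37) % 101) for doc_id in range(1, 2256)]
ranked, total, total_found = search(stream, max_matches=1000)
print(total, total_found)
```

Memory use is bounded by max_matches no matter how many documents match, which is where the CPU and RAM savings come from.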
IMPORTANT FACTS TO KNOW ABOUT SPHINX
Grouping is done in fixed memory
   – Results may be approximate
      • When the number of matches exceeds max_matches
   – Inaccuracy depends on the max_matches setting
      • The larger the value, the more accurate the grouping results
      • Increasing max_matches can reduce performance
   – Accuracy comes at a cost
IMPORTANT FACTS TO KNOW ABOUT SPHINX
MySQL
   SELECT ..., COUNT(1) _c
      FROM movie_info
   WHERE
      MATCH (info)
      AGAINST ('"story"' IN BOOLEAN MODE)
   GROUP BY movie_id
   ORDER BY _c DESC LIMIT 4
Sphinx (uses SphinxQL)
   SELECT *
      FROM movies
   WHERE
      MATCH ('@info "story"')
   GROUP BY movie_id
   ORDER BY @count DESC LIMIT 4
IMPORTANT FACTS TO KNOW ABOUT SPHINX
MySQL                     Sphinx
+----------+----------+   +----------+--------+
| movie_id | COUNT(1) |   | movie_id | @count |
+----------+----------+   +----------+--------+
|    30372 |       15 |   |    30372 |     15 |
|   855624 |       13 |   |   855624 |     13 |
|   590071 |       13 |   |   143384 |     12 |
|   143384 |       12 |   |   590071 |     12 |
+----------+----------+   +----------+--------+
IMPORTANT FACTS TO KNOW ABOUT SPHINX
Full copy of attributes is always kept in RAM
   –   Applies when attribute storage (docinfo) is set to 'extern', the typical setup
   –   Preloaded on start
   –   Never read from disk again once Sphinx is up
   –   Guarantees certain performance
   –   Calculate the storage requirements properly
        • Sphinx may want to allocate too much memory
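A back-of-the-envelope estimate of the attribute storage, as a Python sketch. It assumes the rule of thumb from the Sphinx documentation for extern docinfo, (1 + number_of_attrs) * number_of_docs * 4 bytes, which holds for 32-bit document IDs and plain 32-bit attributes; multi-value and string attributes need extra space that is not modelled here. The function name is hypothetical.

```python
def extern_docinfo_bytes(num_docs, num_attrs, bytes_per_value=4):
    """Approximate RAM needed for attributes: one slot for the document ID
    plus one slot per attribute, for every document."""
    return (1 + num_attrs) * num_docs * bytes_per_value

# 10 million documents with 3 integer attributes
estimate = extern_docinfo_bytes(num_docs=10_000_000, num_attrs=3)
print(f"{estimate / 1_000_000:.0f} MB")
```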
IMPORTANT FACTS TO KNOW ABOUT SPHINX
Sphinx stores rows in blocks
   – 64 rows per block
   – Meta data contains (min, max) range of every attribute
   – Allows quick rejection when filtering by attributes
      • No need to scan every row individually
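The idea of rejecting whole blocks from their (min, max) metadata can be sketched in Python. This only illustrates the technique; the data layout, the function names and the sample data (deliberately clustered so that each block holds a single year) are assumptions, not Sphinx internals.

```python
BLOCK_SIZE = 64  # rows per block, as stated on the slide

def build_blocks(values, block_size=BLOCK_SIZE):
    """Split attribute values into blocks and record each block's min/max."""
    blocks = []
    for start in range(0, len(values), block_size):
        rows = values[start:start + block_size]
        blocks.append({"min": min(rows), "max": max(rows), "rows": rows})
    return blocks

def count_in_range(blocks, low, high):
    """Count rows with low <= value <= high, skipping blocks whose
    metadata proves that no row inside them can match."""
    matched = 0
    scanned_blocks = 0
    for block in blocks:
        if block["max"] < low or block["min"] > high:
            continue  # whole block rejected without touching its rows
        scanned_blocks += 1
        matched += sum(1 for value in block["rows"] if low <= value <= high)
    return matched, scanned_blocks

# 120 blocks of 64 rows; block i holds only the year 1900 + i
years = [1900 + i // BLOCK_SIZE for i in range(BLOCK_SIZE * 120)]
blocks = build_blocks(years)
matched, scanned = count_in_range(blocks, 1990, 2000)
print(matched, scanned, len(blocks))
```

How many blocks can be skipped depends on how the attribute values are distributed across blocks; with randomly scattered values far fewer blocks are rejected.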
MYSQL VS. SPHINX PERFORMANCE
FULL-TEXT SEARCH PERFORMANCE
Uses the full IMDB database imported into MySQL and indexed with Sphinx
FULL-TEXT SEARCH PERFORMANCE
MySQL
   SELECT COUNT(1)
      FROM movie_info
   WHERE
      MATCH (info)
      AGAINST ('"james bond"' IN BOOLEAN MODE)
Sphinx (uses SphinxQL)
   SELECT *
      FROM movies
   WHERE
      MATCH ('@info "james bond"')
FULL-TEXT SEARCH PERFORMANCE
MySQL                     Sphinx
+----------+              +---------------+-------+
| COUNT(1) |              | Variable_name | Value |
+----------+              +---------------+-------+
|     2255 |              | total         | 1000  |
+----------+              | total_found   | 2255  |
1 row in set (0.13 sec)   | time          | 0.003 |
                          ...
SCAN PERFORMANCE
Uses the full IMDB database imported into MySQL and indexed with Sphinx
SCAN PERFORMANCE
MySQL
   SELECT COUNT(1)
      FROM title
   WHERE
      production_year >= 1990
      AND
      production_year <= 2000
Sphinx (uses SphinxQL)
   SELECT *
      FROM titles
   WHERE
      production_year >= 1990
      AND
      production_year <= 2000

No index on `production_year`
SCAN PERFORMANCE
MySQL                     Sphinx
+----------+              +---------------+--------+
| COUNT(1) |              | Variable_name | Value  |
+----------+              +---------------+--------+
|   239203 |              | total         | 1000   |
+----------+              | total_found   | 239203 |
1 row in set (1.09 sec)   | time          | 0.051  |
                          ...
MORE COMPLEX CASE: SEARCH BY KEYWORDS
Uses the full IMDB database imported into MySQL and indexed with Sphinx
SEARCH BY KEYWORDS
MySQL
   SELECT t.id FROM title t
      JOIN movie_keyword mk
      ON mk.movie_id = t.id
      JOIN keyword k
      ON k.id = mk.keyword_id
   WHERE
      k.keyword IN ('beautiful-woman', 'women', 'murder')
   GROUP BY t.id ORDER BY
      production_year DESC LIMIT 3
Sphinx (uses SphinxQL)
   SELECT *
      FROM keywords
   WHERE
      MATCH ('@keywords ("beautiful-woman"|"women"|"murder")')
   ORDER BY production_year DESC
      LIMIT 3
SEARCH BY KEYWORDS
MySQL                      Sphinx
+--------+                 +--------+
| id     |                 | id     |
+--------+                 +--------+
| 561959 |                 | 561959 |
|  74273 |                 |  74273 |
| 344814 |                 | 344814 |
+--------+                 +--------+
3 rows in set (1.84 sec)   time = 0.015
SEARCH BY KEYWORDS
Sphinx returns
   – Values of the indexed attributes
   – Meta information about the search and results
   – No text
      • Recent versions can actually store and return short strings
      • But only when defined as attributes, which are not full-text searchable
SEARCH BY KEYWORDS
Use that information to fetch full details from MySQL

mysql> SELECT t.id, t.title FROM title t WHERE
        t.id IN (561959, 74273, 344814);
   +--------+---------------------------------------+
   | id     | title                                 |
   +--------+---------------------------------------+
   |  74273 | Blue Silence                          |
   | 344814 | Marvin: The Life Story of Marvin Gaye |
   | 561959 | The Red Man's View                    |
   +--------+---------------------------------------+
SEARCH BY KEYWORDS
MySQL                            Sphinx
+--------+-------------------+   +--------+-----------------+
| id     | title             |   | id     | production_year |
+--------+-------------------+   +--------+-----------------+
|  74273 | Blue Silence      |   | 561959 |            2014 |
| 344814 | Marvin: The Li... |   |  74273 |            2013 |
| 561959 | The Red Man's ... |   | 344814 |            2012 |
+--------+-------------------+   +--------+-----------------+
       Notice that MySQL returned the rows in a different order!
SEARCH BY KEYWORDS
The order in SQL can only be guaranteed with ORDER BY!
What is the solution?
   – Append ORDER BY production_year DESC
        • Applies to only a small number of rows, so it's probably okay
   or
   – Remember the order of Sphinx results in the application
   – Restore it after receiving data from MySQL
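The second option, restoring the Sphinx order in the application, could look roughly like this in Python. It is a sketch under assumptions: rows are taken to be dictionaries with an "id" key, and restore_order is a hypothetical helper rather than part of any Sphinx or MySQL client library. The IDs and titles are the ones from the result sets shown earlier.

```python
def restore_order(sphinx_ids, mysql_rows):
    """Return the MySQL rows re-arranged to follow the order of the IDs
    returned by Sphinx; IDs missing from MySQL are skipped."""
    rows_by_id = {row["id"]: row for row in mysql_rows}
    return [rows_by_id[doc_id] for doc_id in sphinx_ids if doc_id in rows_by_id]

# IDs in the order Sphinx returned them (production_year DESC)
sphinx_ids = [561959, 74273, 344814]
# Rows in the arbitrary order MySQL returned them
mysql_rows = [
    {"id": 74273, "title": "Blue Silence"},
    {"id": 344814, "title": "Marvin: The Life Story of Marvin Gaye"},
    {"id": 561959, "title": "The Red Man's View"},
]
ordered = restore_order(sphinx_ids, mysql_rows)
for row in ordered:
    print(row["id"], row["title"])
```

Skipping IDs that MySQL no longer has also covers the case where a row was deleted after the last re-index.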
SEARCH BY KEYWORDS
What if "keywords" were numerical identifiers?
   – Create "fake keywords" and index them as text
   – Convert numbers into strings when building the index
     sql_query = SELECT t.id, \
       GROUP_CONCAT(CONCAT('KEY_', mk.keyword_id)) \
       FROM title t JOIN movie_keyword mk ON t.id = mk.movie_id \
       GROUP BY t.id

   – Run full-text searches using strings such as "KEY_1234"
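On the application side, the same KEY_ convention has to be applied when building the search string. A minimal Python sketch, assuming the indexed text field is called keywords (the field name is an assumption, as are the helper names fake_keyword and build_match_query):

```python
def fake_keyword(keyword_id, prefix="KEY_"):
    """Turn a numerical keyword ID into the text token stored in the index."""
    return f"{prefix}{keyword_id}"

def build_match_query(keyword_ids, field="keywords"):
    """Build a full-text query matching documents that carry ANY of the
    given keyword IDs in the assumed text field."""
    terms = "|".join(f'"{fake_keyword(keyword_id)}"' for keyword_id in keyword_ids)
    return f"@{field} ({terms})"

query = build_match_query([1234, 42])
print(query)
```

The resulting string is what would be passed to MATCH() in SphinxQL or to the query call of the Sphinx API.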
FLEXIBLE SEARCH
A data structure describing a user profile
CREATE TABLE `members` (
   `user_id` int(10) unsigned,
   `user_firstname` varchar(50),
   `user_surname` varchar(50),
   `user_dob` date,
   `user_lastvisit` datetime,
   `user_datetime` datetime,
   `user_bio` text, -- type missing in the source; text assumed
   `user_hasphoto` tinyint(2) unsigned,
   `user_hasvideo` tinyint(2) unsigned,
   ...
FLEXIBLE SEARCH
Flexible search typically means
   – Search conditions may involve any number of columns in
     any combination
   – Sorting may be done on one of many columns as well

Often impossible to add all necessary indexes in MySQL
FLEXIBLE SEARCH
Many columns may have very low cardinality
   – Example: user_gender
   – MySQL would not even consider using an index on such a
     column

It may be very difficult to make it work fast in MySQL
   – When tables or traffic are large enough
FLEXIBLE SEARCH
How does Sphinx help?
   –   Scans are optimized
   –   Optimizations apply to all columns
   –   Possibility to use "fake keywords"
   –   Data can be split across several instances
        • Parallel search
        • No extra application logic necessary to combine results
SUMMARY
Sphinx can be of great help to many MySQL-based apps
   – Developed to work better where MySQL performs poorly
      •   Text search
      •   Large scans
      •   Filtering on many combinations of columns
      •   Handling multi-value properties
SUMMARY
Sphinx can be of great help to many MySQL-based apps
   –   Comes with features that can actually replace a database
   –   Easily scalable
   –   Actively developed
   –   You can sponsor development and have the features you need
       done soon
        • No need to wait long until some functionality "appears"
Sphinx
http://www.sphinxsearch.com/

Percona Consulting
http://www.percona.com/
THANK YOU!

Mais conteúdo relacionado

Mais procurados

Spark Summit EU talk by Dean Wampler
Spark Summit EU talk by Dean WamplerSpark Summit EU talk by Dean Wampler
Spark Summit EU talk by Dean WamplerSpark Summit
 
Spark Summit EU talk by Jakub Hava
Spark Summit EU talk by Jakub HavaSpark Summit EU talk by Jakub Hava
Spark Summit EU talk by Jakub HavaSpark Summit
 
Pinterest hadoop summit_talk
Pinterest hadoop summit_talkPinterest hadoop summit_talk
Pinterest hadoop summit_talkKrishna Gade
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Spark Summit
 
Running Spark on Cloud
Running Spark on CloudRunning Spark on Cloud
Running Spark on CloudQubole
 
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015Iulia Emanuela Iancuta
 
An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon AthenaJulien SIMON
 
Apache Solr 5.0 and beyond
Apache Solr 5.0 and beyondApache Solr 5.0 and beyond
Apache Solr 5.0 and beyondAnshum Gupta
 
Hadoopsummit16 myui
Hadoopsummit16 myuiHadoopsummit16 myui
Hadoopsummit16 myuiMakoto Yui
 
Scala and jvm_languages_praveen_technologist
Scala and jvm_languages_praveen_technologistScala and jvm_languages_praveen_technologist
Scala and jvm_languages_praveen_technologistpmanvi
 
Solr Consistency and Recovery Internals - Mano Kovacs, Cloudera
Solr Consistency and Recovery Internals - Mano Kovacs, ClouderaSolr Consistency and Recovery Internals - Mano Kovacs, Cloudera
Solr Consistency and Recovery Internals - Mano Kovacs, ClouderaLucidworks
 
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Cohesive Networks
 
Using Elasticsearch for Analytics
Using Elasticsearch for AnalyticsUsing Elasticsearch for Analytics
Using Elasticsearch for AnalyticsVaidik Kapoor
 
Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...Spark Summit
 
Art of Feature Engineering for Data Science with Nabeel Sarwar
Art of Feature Engineering for Data Science with Nabeel SarwarArt of Feature Engineering for Data Science with Nabeel Sarwar
Art of Feature Engineering for Data Science with Nabeel SarwarSpark Summit
 
Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...
Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...
Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...DataWorks Summit/Hadoop Summit
 

Mais procurados (20)

Spark Summit EU talk by Dean Wampler
Spark Summit EU talk by Dean WamplerSpark Summit EU talk by Dean Wampler
Spark Summit EU talk by Dean Wampler
 
Spark Summit EU talk by Jakub Hava
Spark Summit EU talk by Jakub HavaSpark Summit EU talk by Jakub Hava
Spark Summit EU talk by Jakub Hava
 
Pinterest hadoop summit_talk
Pinterest hadoop summit_talkPinterest hadoop summit_talk
Pinterest hadoop summit_talk
 
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East ...
 
Running Spark on Cloud
Running Spark on CloudRunning Spark on Cloud
Running Spark on Cloud
 
Apis with dotnet postgreSQL and Apsaradb
Apis with dotnet postgreSQL and ApsaradbApis with dotnet postgreSQL and Apsaradb
Apis with dotnet postgreSQL and Apsaradb
 
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
In Memory Data Pipeline And Warehouse At Scale - BerlinBuzzwords 2015
 
An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon Athena
 
MySQL Query Optimization.
MySQL Query Optimization.MySQL Query Optimization.
MySQL Query Optimization.
 
Apache Solr 5.0 and beyond
Apache Solr 5.0 and beyondApache Solr 5.0 and beyond
Apache Solr 5.0 and beyond
 
Hadoopsummit16 myui
Hadoopsummit16 myuiHadoopsummit16 myui
Hadoopsummit16 myui
 
Scala and jvm_languages_praveen_technologist
Scala and jvm_languages_praveen_technologistScala and jvm_languages_praveen_technologist
Scala and jvm_languages_praveen_technologist
 
Solr Consistency and Recovery Internals - Mano Kovacs, Cloudera
Solr Consistency and Recovery Internals - Mano Kovacs, ClouderaSolr Consistency and Recovery Internals - Mano Kovacs, Cloudera
Solr Consistency and Recovery Internals - Mano Kovacs, Cloudera
 
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
Lessons Learned in Deploying the ELK Stack (Elasticsearch, Logstash, and Kibana)
 
Presto
PrestoPresto
Presto
 
Using Elasticsearch for Analytics
Using Elasticsearch for AnalyticsUsing Elasticsearch for Analytics
Using Elasticsearch for Analytics
 
Look Mom nosql
Look Mom nosqlLook Mom nosql
Look Mom nosql
 
Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...Building a Business Logic Translation Engine with Spark Streaming for Communi...
Building a Business Logic Translation Engine with Spark Streaming for Communi...
 
Art of Feature Engineering for Data Science with Nabeel Sarwar
Art of Feature Engineering for Data Science with Nabeel SarwarArt of Feature Engineering for Data Science with Nabeel Sarwar
Art of Feature Engineering for Data Science with Nabeel Sarwar
 
Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...
Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...
Open Source Ingredients for Interactive Data Analysis in Spark by Maxim Lukiy...
 

Semelhante a Sphinx new

MariaDB with SphinxSE
MariaDB with SphinxSEMariaDB with SphinxSE
MariaDB with SphinxSEColin Charles
 
Plugin Opensql2008 Sphinx
Plugin Opensql2008 SphinxPlugin Opensql2008 Sphinx
Plugin Opensql2008 SphinxLiu Lizhi
 
ElasticSearch as (only) datastore
ElasticSearch as (only) datastoreElasticSearch as (only) datastore
ElasticSearch as (only) datastoreTomas Sirny
 
Upgrade to MySQL 8.0!
Upgrade to MySQL 8.0!Upgrade to MySQL 8.0!
Upgrade to MySQL 8.0!Ted Wennmark
 
01 upgrade to my sql8
01 upgrade to my sql8 01 upgrade to my sql8
01 upgrade to my sql8 Ted Wennmark
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHPMike Lively
 
My sql crashcourse_intro_kdl
My sql crashcourse_intro_kdlMy sql crashcourse_intro_kdl
My sql crashcourse_intro_kdlsqlhjalp
 
MySQL :What's New #GIDS16
MySQL :What's New #GIDS16MySQL :What's New #GIDS16
MySQL :What's New #GIDS16Sanjay Manwani
 
MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0Ted Wennmark
 
Maria db 10 and the mariadb foundation(colin)
Maria db 10 and the mariadb foundation(colin)Maria db 10 and the mariadb foundation(colin)
Maria db 10 and the mariadb foundation(colin)kayokogoto
 
MySQL: Know more about open Source Database
MySQL: Know more about open Source DatabaseMySQL: Know more about open Source Database
MySQL: Know more about open Source DatabaseMahesh Salaria
 
Cassandra
CassandraCassandra
Cassandraexsuns
 
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdfMySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdfAlkin Tezuysal
 
Data Warehouse Logical Design using Mysql
Data Warehouse Logical Design using MysqlData Warehouse Logical Design using Mysql
Data Warehouse Logical Design using MysqlHAFIZ Islam
 
Membase East Coast Meetups
Membase East Coast MeetupsMembase East Coast Meetups
Membase East Coast MeetupsMembase
 
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp0220140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02Francisco Gonçalves
 
MySQL Ecosystem in 2020
MySQL Ecosystem in 2020MySQL Ecosystem in 2020
MySQL Ecosystem in 2020Alkin Tezuysal
 

Semelhante a Sphinx new (20)

MariaDB with SphinxSE
MariaDB with SphinxSEMariaDB with SphinxSE
MariaDB with SphinxSE
 
Plugin Opensql2008 Sphinx
Plugin Opensql2008 SphinxPlugin Opensql2008 Sphinx
Plugin Opensql2008 Sphinx
 
ElasticSearch as (only) datastore
ElasticSearch as (only) datastoreElasticSearch as (only) datastore
ElasticSearch as (only) datastore
 
Upgrade to MySQL 8.0!
Upgrade to MySQL 8.0!Upgrade to MySQL 8.0!
Upgrade to MySQL 8.0!
 
01 upgrade to my sql8
01 upgrade to my sql8 01 upgrade to my sql8
01 upgrade to my sql8
 
Using Sphinx for Search in PHP
Using Sphinx for Search in PHPUsing Sphinx for Search in PHP
Using Sphinx for Search in PHP
 
My sql crashcourse_intro_kdl
My sql crashcourse_intro_kdlMy sql crashcourse_intro_kdl
My sql crashcourse_intro_kdl
 
MySQL :What's New #GIDS16
MySQL :What's New #GIDS16MySQL :What's New #GIDS16
MySQL :What's New #GIDS16
 
MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0MySQL NDB Cluster 8.0
MySQL NDB Cluster 8.0
 
Maria db 10 and the mariadb foundation(colin)
Maria db 10 and the mariadb foundation(colin)Maria db 10 and the mariadb foundation(colin)
Maria db 10 and the mariadb foundation(colin)
 
MySQL: Know more about open Source Database
MySQL: Know more about open Source DatabaseMySQL: Know more about open Source Database
MySQL: Know more about open Source Database
 
Sql Server2008
Sql Server2008Sql Server2008
Sql Server2008
 
Cassandra
CassandraCassandra
Cassandra
 
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdfMySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
MySQL Ecosystem in 2023 - FOSSASIA'23 - Alkin.pptx.pdf
 
Breaking data
Breaking dataBreaking data
Breaking data
 
Data Warehouse Logical Design using Mysql
Data Warehouse Logical Design using MysqlData Warehouse Logical Design using Mysql
Data Warehouse Logical Design using Mysql
 
Membase East Coast Meetups
Membase East Coast MeetupsMembase East Coast Meetups
Membase East Coast Meetups
 
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp0220140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
20140128 webinar-get-more-out-of-mysql-with-tokudb-140319063324-phpapp02
 
MySQL Ecosystem in 2020
MySQL Ecosystem in 2020MySQL Ecosystem in 2020
MySQL Ecosystem in 2020
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
 

Mais de rit2010

Microsoft cluster systems ritconf
Microsoft cluster systems ritconfMicrosoft cluster systems ritconf
Microsoft cluster systems ritconfrit2010
 
анатомия интернет банка Publish
анатомия интернет банка Publishанатомия интернет банка Publish
анатомия интернет банка Publishrit2010
 
анатомия интернет банка Publish
анатомия интернет банка Publishанатомия интернет банка Publish
анатомия интернет банка Publishrit2010
 
Anatol filin pragmatic documentation 1_r
Anatol filin  pragmatic documentation 1_rAnatol filin  pragmatic documentation 1_r
Anatol filin pragmatic documentation 1_rrit2010
 
Ilia kantor паттерны серверных comet решений
Ilia kantor паттерны серверных comet решенийIlia kantor паттерны серверных comet решений
Ilia kantor паттерны серверных comet решенийrit2010
 
Alexei shilov 2010 rit-rakudo
Alexei shilov 2010 rit-rakudoAlexei shilov 2010 rit-rakudo
Alexei shilov 2010 rit-rakudorit2010
 
Alexandre.iline rit 2010 java_fxui_extra
Alexandre.iline rit 2010 java_fxui_extraAlexandre.iline rit 2010 java_fxui_extra
Alexandre.iline rit 2010 java_fxui_extrarit2010
 
Konstantin kolomeetz послание внутреннему заказчику
Konstantin kolomeetz послание внутреннему заказчикуKonstantin kolomeetz послание внутреннему заказчику
Konstantin kolomeetz послание внутреннему заказчикуrit2010
 
Bykov monitoring mailru
Bykov monitoring mailruBykov monitoring mailru
Bykov monitoring mailrurit2010
 
Alexander shigin slides
Alexander shigin slidesAlexander shigin slides
Alexander shigin slidesrit2010
 
иван василевич Eye tracking и нейрокомпьютерный интерфейс
иван василевич Eye tracking и нейрокомпьютерный интерфейсиван василевич Eye tracking и нейрокомпьютерный интерфейс
иван василевич Eye tracking и нейрокомпьютерный интерфейсrit2010
 
Andrey Petrov P D P
Andrey Petrov P D PAndrey Petrov P D P
Andrey Petrov P D Prit2010
 
Andrey Petrov методология P D P, часть 1, цели вместо кейсов
Andrey Petrov методология P D P, часть 1, цели вместо кейсовAndrey Petrov методология P D P, часть 1, цели вместо кейсов
Andrey Petrov методология P D P, часть 1, цели вместо кейсовrit2010
 
Dmitry lohansky rit2010
Dmitry lohansky rit2010Dmitry lohansky rit2010
Dmitry lohansky rit2010rit2010
 
Dmitry Lohansky Rit2010
Dmitry Lohansky Rit2010Dmitry Lohansky Rit2010
Dmitry Lohansky Rit2010rit2010
 
Related Queries Braslavski Yandex
Related Queries Braslavski YandexRelated Queries Braslavski Yandex
Related Queries Braslavski Yandexrit2010
 
молчанов сергей датацентры 10 04 2010 Light
молчанов сергей датацентры 10 04 2010  Lightмолчанов сергей датацентры 10 04 2010  Light
молчанов сергей датацентры 10 04 2010 Lightrit2010
 
Sergey Ilinsky Rit 2010 Complex Gui Development Ample Sdk
Sergey Ilinsky Rit 2010 Complex Gui Development Ample SdkSergey Ilinsky Rit 2010 Complex Gui Development Ample Sdk
Sergey Ilinsky Rit 2010 Complex Gui Development Ample Sdkrit2010
 
Serge P Nekoval Grails
Serge P  Nekoval GrailsSerge P  Nekoval Grails
Serge P Nekoval Grailsrit2010
 
Pavel Braslavski Related Queries Braslavski Yandex
Pavel Braslavski Related Queries Braslavski YandexPavel Braslavski Related Queries Braslavski Yandex
Pavel Braslavski Related Queries Braslavski Yandexrit2010
 

Sphinx new

  • 1. Improving MySQL-based applications performance with Sphinx Maciej Dobrzański (Мачей Добжаньски) Percona, Inc.
  • 2. INTRODUCTION Who am I? – Consultant at Percona, Inc. – What do I do? • Performance audits • Fix broken systems • Design architectures – Typically work from home
  • 3. INTRODUCTION What is Percona, Inc.? – Consulting company – Provides services for MySQL applications – Develops open-source software • Scalability patches for InnoDB • XtraDB storage engine for MySQL • Xtrabackup – free backup solution for InnoDB/XtraDB
  • 5. WHAT IS MYSQL? MySQL is... – Open-source relational database management system – Popular enough to assume everyone here knows it
  • 7. WHAT IS SPHINX? A standalone full-text search engine – Consists of two major applications • indexer • searchd – More efficient than MySQL FULLTEXT • On larger data sets
  • 8. WHAT IS SPHINX? A standalone full-text search engine – Can be easily scaled horizontally • Sphinx indexes can be distributed across many servers • Allows parallel searching • One instance becomes a dispatcher – Forwards queries to other instances – Combines results before sending them back to clients
  • 10. WHAT IS SPHINX? Many additional features beyond just full-text search – Indexable attributes for non-FTS filtering • numerical, multi-value and now also text • Example: limit results to rows which have article_score>=2 – Sorting results by an attribute or an expression • Example: @weight+(article_score)*0.1
  • 11. WHAT IS SPHINX? Many additional features beyond just full-text search – Grouping results by an attribute • Additional support for timestamp attributes • Returns also row count per group – may be approximate – Calculating expressions • Much faster than in MySQL as per recent benchmarks
  • 12. WHAT IS SPHINX? Anything else? – On-line re-indexing – Live index updates – Extensive API available for many programming languages • PHP • Python • Java • many more
  • 13. WHAT IS SPHINX? There’s even more! – SphinxQL – MySQL server protocol compatible • Connect with any MySQL client – command line – API call, e.g. mysql_connect() • Run SQL-like queries
  • 14. WHAT IS SPHINX? Example use of SphinxQL
  • 15. HOW DOES SPHINX WORK WITH MYSQL?
  • 16. HOW DOES SPHINX WORK WITH MYSQL? Sphinx is an external application; not part of MySQL – Uses own data files – Needs memory – Has to be queried separately • Sphinx API • SphinxQL • Sphinx Storage Engine for MySQL
  • 17. HOW DOES SPHINX WORK WITH MYSQL? Sphinx is an external application; not part of MySQL – Updating Sphinx indexes has to be done separately too • Periodic data re-indexing with indexer – Some information may be outdated for a while – Can be optimized through re-indexing the latest changes only • Live index updates from applications – Applications need to write twice to both MySQL and Sphinx – Available only for attributes; full-text updates to come
  • 18. HOW DOES MYSQL WORK WITH SPHINX? Example data source for Sphinx index sql_query = SELECT mi.id, mi.movie_id, t.production_year, t.title, mi.info FROM movie_info mi JOIN title t ON t.id = mi.movie_id sql_attr_uint = movie_id sql_attr_uint = production_year • Notice the source can be any valid SQL query – Uses joins to denormalize data for Sphinx • Two integer attributes – movie_id and production_year
  • 19. HOW DOES SPHINX WORK WITH MYSQL? Sphinx is not a full database (yet?) – It’s primarily a search engine – It can return values stored as attributes, e.g: movie_id, production_year – …but not any full-text searchable columns – Results from Sphinx can be used to fetch full details from database
  • 20. IMPORTANT FACTS TO KNOW ABOUT MYSQL
  • 21. IMPORTANT FACTS TO KNOW ABOUT MYSQL Uses B-TREE indexes to improve search performance – Works great for equality operator (=) – …and small range lookups: >, >=, <, <=, IN (list), LIKE • Range size relative to table size, not an absolute value • Large range often turns into plain scan
  • 22. IMPORTANT FACTS TO KNOW ABOUT MYSQL MySQL can use any left-most part of an index – INDEX (a, b, c) can fully optimize both: (1) SELECT * FROM T WHERE a=9 (2) SELECT * FROM T WHERE a=9 AND b IN (1,2) AND c=4 …but not any of: (3) SELECT * FROM T WHERE b=7 AND c=1 (4) SELECT * FROM T WHERE a=9 AND c=2 (may still use index for a=9 only) – No good indexes means you may need a new one
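The left-most prefix rule from slide 22 can be illustrated with a small model. This is only a sketch: the function name `usable_index_prefix` is hypothetical, it treats `=` and `IN (list)` alike as equality-style conditions, and it ignores range conditions and everything else the real MySQL optimizer considers.

```python
def usable_index_prefix(index_columns, equality_columns):
    """Return the leading index columns a query can use.

    Simplified model of the left-most prefix rule: walk the index
    columns in order and stop at the first one the query has no
    equality-style condition on.
    """
    usable = []
    for column in index_columns:
        if column not in equality_columns:
            break
        usable.append(column)
    return usable


index = ["a", "b", "c"]  # INDEX (a, b, c)

# Queries (1)-(4) from the slide, reduced to the columns they filter on.
queries = {
    "(1) a=9": {"a"},
    "(2) a=9 AND b IN (1,2) AND c=4": {"a", "b", "c"},
    "(3) b=7 AND c=1": {"b", "c"},
    "(4) a=9 AND c=2": {"a", "c"},
}

for label, columns in queries.items():
    print(label, "->", usable_index_prefix(index, columns))
```

Query (3) gets an empty prefix because it has no condition on `a`, and query (4) stops after `a` because `b` is missing, which matches the "may still use index for a=9 only" remark.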
  • 23. IMPORTANT FACTS TO KNOW ABOUT MYSQL Each index slows down writes to a table – Index is an organized structure, it has to be maintained – There can’t be too many or performance will suffer MySQL can typically use only one index per query – There are rare exceptions – index merge optimizations – Merges are often not good enough – an observation
  • 24. IMPORTANT FACTS TO KNOW ABOUT MYSQL These work great in MySQL – Index optimized searching • A query which uses indexes efficiently is fast enough • B-TREE lookups are typically very efficient • FULLTEXT indexes can be the exception – Index optimized sorting and grouping • Rows are read in the proper order
  • 25. IMPORTANT FACTS TO KNOW ABOUT MYSQL These can cause problems in MySQL – Full table scans • No index is used • Query reads entire table row by row checking for matches – Large scans related to poor selectivity • An index is used, but it is not selective enough • MySQL has to read a lot of rows and reject many of them
  • 26. IMPORTANT FACTS TO KNOW ABOUT MYSQL These can cause problems in MySQL – Search on many combinations of columns in a single table • Each combination may require new index • Can’t have too many indexes in table at the same time – Handling multi-value properties in searches • Keywords, tags • Such queries often can’t be optimized very well
  • 27. IMPORTANT FACTS TO KNOW ABOUT MYSQL These can cause problems in MySQL – Sorting or grouping not done through indexes • Requires rewriting rows into temporary storage • At least one additional pass over results to complete • LIMIT does not work until all matches are found and sorted/grouped
  • 28. IMPORTANT FACTS TO KNOW ABOUT MYSQL Indexes and data may be cached in memory – key_buffer and filesystem cache for MyISAM tables – innodb_buffer_pool for InnoDB tables – No guarantees what is in RAM • MySQL has no option to lock certain data in buffers
  • 29. IMPORTANT FACTS TO KNOW ABOUT MYSQL Full-text support in MySQL – Available through FULLTEXT keys – Only supported by MyISAM engine • MyISAM uses table level locking • May become a showstopper for busy databases – Cannot be used together with any other index • Even index merge will not work
  • 30. IMPORTANT FACTS TO KNOW ABOUT SPHINX
  • 31. IMPORTANT FACTS TO KNOW ABOUT SPHINX Search remembers no more than max_matches results (example meta output: total = 1000, total_found = 2255) – Other results are ignored before sending them to client – Saves some CPU and RAM – All results are often unnecessary – Accuracy costs
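The difference between `total` and `total_found` on slide 31 can be sketched as a bounded top-N collection. This is an illustration of the idea only, not Sphinx's actual implementation; the function `search` and the `(weight, doc_id)` tuple layout are assumptions made for the example.

```python
import heapq


def search(matches, max_matches):
    """Keep only the best `max_matches` results while counting all matches.

    `matches` is an iterable of (weight, doc_id) tuples (a hypothetical
    layout). Memory use is bounded by `max_matches`, not by the number
    of matching documents.
    """
    total_found = 0
    best = []  # min-heap holding the best matches seen so far
    for match in matches:
        total_found += 1
        if len(best) < max_matches:
            heapq.heappush(best, match)
        elif match > best[0]:
            # Better than the weakest retained match: swap it in.
            heapq.heapreplace(best, match)
    results = sorted(best, reverse=True)
    return {"total": len(results), "total_found": total_found, "results": results}


# 2255 synthetic matches with a retention limit of 1000, as on the slide.
all_matches = [(i % 7, i) for i in range(2255)]
meta = search(all_matches, max_matches=1000)
print(meta["total"], meta["total_found"])
```

Every match is still counted, but only the retained ones can be returned to the client, which is why a larger `max_matches` costs more CPU and RAM.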
  • 32. IMPORTANT FACTS TO KNOW ABOUT SPHINX
  • 33. IMPORTANT FACTS TO KNOW ABOUT SPHINX Grouping is done in fixed memory – Results may be approximate • When number of matches exceeds max_matches – Inaccuracy depends on max_matches setting • The larger the more accurate grouping results • Growing max_matches can reduce performance – Accuracy costs
  • 34. IMPORTANT FACTS TO KNOW ABOUT SPHINX MySQL: SELECT ..., COUNT(1) _c FROM movie_info WHERE MATCH (info) AGAINST ('"story"' IN BOOLEAN MODE) GROUP BY movie_id ORDER BY _c DESC LIMIT 4 Sphinx (uses SphinxQL): SELECT * FROM movies WHERE MATCH ('@info "story"') GROUP BY movie_id ORDER BY @count DESC LIMIT 4
  • 35. IMPORTANT FACTS TO KNOW ABOUT SPHINX MySQL (movie_id, COUNT(1)): (30372, 15), (855624, 13), (590071, 13), (143384, 12) Sphinx (movie_id, @count): (30372, 15), (855624, 13), (143384, 12), (590071, 12)
  • 36. IMPORTANT FACTS TO KNOW ABOUT SPHINX Full copy of attributes is always kept in RAM – If attribute storage was set to ‘extern’ – the typical use – Preloaded on start – Never read from disk again once Sphinx is up – Guarantees certain performance – Calculate the storage requirements properly • Sphinx may want to allocate too much memory
  • 37. IMPORTANT FACTS TO KNOW ABOUT SPHINX Sphinx stores rows in blocks – 64 rows per block – Meta data contains (min, max) range of every attribute – Allows quick rejection when filtering by attributes • No need to scan every row individually
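The block-level rejection described on slide 37 can be sketched as follows. This is a minimal model under stated assumptions: a single integer attribute, blocks of 64 rows with a stored (min, max) pair, and hypothetical helper names `build_blocks` and `filter_range`. It is not Sphinx's storage format.

```python
BLOCK_SIZE = 64  # rows per block, as stated on the slide


def build_blocks(values, block_size=BLOCK_SIZE):
    """Split attribute values into blocks and record each block's (min, max)."""
    blocks = []
    for start in range(0, len(values), block_size):
        rows = values[start:start + block_size]
        blocks.append({"min": min(rows), "max": max(rows), "rows": rows})
    return blocks


def filter_range(blocks, low, high):
    """Return the values in [low, high] and the number of blocks actually scanned."""
    matched = []
    scanned_blocks = 0
    for block in blocks:
        # Quick rejection: the whole block lies outside the requested range.
        if block["max"] < low or block["min"] > high:
            continue
        scanned_blocks += 1
        matched.extend(v for v in block["rows"] if low <= v <= high)
    return matched, scanned_blocks


# Synthetic production_year values, stored in ascending order.
years = [1900 + i // 10 for i in range(1280)]
blocks = build_blocks(years)
matched, scanned = filter_range(blocks, 1990, 2000)
print(len(blocks), scanned, len(matched))
```

In this model the saving depends on how the values are laid out: if every block spans the whole value range, no block can be rejected and the filter degrades to a plain scan.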
  • 38. MYSQL V SPHINX PERFORMANCE
  • 39. FULL-TEXT SEARCH PERFORMANCE USES FULL IMDB DATABASE IMPORTED INTO MYSQL AND INDEXED WITH SPHINX
  • 40. FULL-TEXT SEARCH PERFORMANCE MySQL: SELECT COUNT(1) FROM movie_info WHERE MATCH (info) AGAINST ('"james bond"' IN BOOLEAN MODE) Sphinx (uses SphinxQL): SELECT * FROM movies WHERE MATCH ('@info "james bond"')
  • 41. FULL-TEXT SEARCH PERFORMANCE MySQL: COUNT(1) = 2255; 1 row in set (0.13 sec) Sphinx: total = 1000; total_found = 2255; time = 0.003 ...
  • 42. SCAN PERFORMANCE USES FULL IMDB DATABASE IMPORTED INTO MYSQL AND INDEXED WITH SPHINX
  • 43. SCAN PERFORMANCE MySQL: SELECT COUNT(1) FROM title WHERE production_year >= 1990 AND production_year <= 2000 (no index on `production_year`) Sphinx (uses SphinxQL): SELECT * FROM titles WHERE production_year >= 1990 AND production_year <= 2000
  • 44. SCAN PERFORMANCE MySQL: COUNT(1) = 239203; 1 row in set (1.09 sec) Sphinx: total = 1000; total_found = 239203; time = 0.051 ...
  • 45. MORE COMPLEX CASE SEARCH BY KEYWORDS USES FULL IMDB DATABASE IMPORTED INTO MYSQL AND INDEXED WITH SPHINX
  • 46. SEARCH BY KEYWORDS MySQL: SELECT t.id FROM title t JOIN movie_keyword mk ON mk.movie_id = t.id JOIN keyword k ON k.id = mk.keyword_id WHERE k.keyword IN ('beautiful-woman', 'women', 'murder') GROUP BY t.id ORDER BY production_year DESC LIMIT 3 Sphinx (uses SphinxQL): SELECT * FROM keywords WHERE MATCH ('@keywords ("beautiful-woman"|"women"|"murder")') ORDER BY production_year DESC LIMIT 3
  • 47. SEARCH BY KEYWORDS MySQL: id = 561959, 74273, 344814; 3 rows in set (1.84 sec) Sphinx: id = 561959, 74273, 344814; time = 0.015
  • 48. SEARCH BY KEYWORDS Sphinx returns – Values of the indexed attributes – Meta information about search and results – No text • Recent versions can actually store and return short strings • But only when defined as attributes, not full-text searchable
  • 49. SEARCH BY KEYWORDS Use that information to fetch full details from MySQL mysql> SELECT t.id, t.title FROM title t WHERE t.id IN(561959, 74273, 344814) Result (id, title): (74273, Blue Silence), (344814, Marvin: The Life Story of Marvin Gaye), (561959, The Red Man's View)
  • 50. SEARCH BY KEYWORDS MySQL (id, title): (74273, Blue Silence), (344814, Marvin: The Li...), (561959, The Red Man's ...) Sphinx (id, production_year): (561959, 2014), (74273, 2013), (344814, 2012) Notice MySQL returned rows in different order!
  • 51. SEARCH BY KEYWORDS The order in SQL can only be guaranteed with ORDER BY! What is the solution? – Append ORDER BY production_year DESC • applies to only a small number of rows, so it’s probably okay or – Remember the order of Sphinx results in application – Restore it after receiving data from MySQL
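The second option on slide 51, remembering the Sphinx order in the application and restoring it after the MySQL lookup, could look roughly like this. The helper name `restore_order` and the dictionary row layout are assumptions; the ids and titles are the ones shown on the earlier slides.

```python
def restore_order(sphinx_ids, mysql_rows):
    """Reorder rows fetched from MySQL to follow the order Sphinx returned.

    `mysql_rows` is a list of dicts with an "id" key (a hypothetical
    layout). Ids that MySQL did not return are skipped.
    """
    by_id = {row["id"]: row for row in mysql_rows}
    return [by_id[doc_id] for doc_id in sphinx_ids if doc_id in by_id]


# Order returned by Sphinx (sorted by production_year DESC).
sphinx_ids = [561959, 74273, 344814]

# Rows as MySQL returned them for WHERE t.id IN (...), in a different order.
mysql_rows = [
    {"id": 74273, "title": "Blue Silence"},
    {"id": 344814, "title": "Marvin: The Life Story of Marvin Gaye"},
    {"id": 561959, "title": "The Red Man's View"},
]

for row in restore_order(sphinx_ids, mysql_rows):
    print(row["id"], row["title"])
```

Building the lookup dictionary once keeps the reordering linear in the number of rows, which is negligible for a page of results.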
  • 52. SEARCH BY KEYWORDS What if „keywords” were numerical identifiers? – Create „fake keywords” and index them as text – Convert numbers into strings when building index sql_query = SELECT t.id, GROUP_CONCAT(CONCAT('KEY_', mk.keyword_id)) FROM title t JOIN movie_keyword mk ON t.id = mk.movie_id GROUP BY t.id – Run full-text searches using strings such as "KEY_1234"
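On the application side, the "fake keywords" from slide 52 have to be generated the same way when building the search expression. A minimal sketch, assuming the `KEY_` prefix from the slide, a full-text field named `keywords` as in the earlier SphinxQL example, and hypothetical helper names:

```python
def fake_keyword(keyword_id, prefix="KEY_"):
    """Turn a numeric identifier into the text token that was indexed for it."""
    return f"{prefix}{int(keyword_id)}"


def build_match_expression(keyword_ids, field="keywords"):
    """Build an OR-style full-text expression over fake keywords.

    Mirrors the '@keywords ("a"|"b"|"c")' form used on slide 46;
    the field name is an assumption.
    """
    terms = "|".join(f'"{fake_keyword(k)}"' for k in keyword_ids)
    return f"@{field} ({terms})"


print(build_match_expression([12, 1234]))
```

The `int()` conversion rejects non-numeric input before it reaches the query string, which is a cheap safeguard when the ids come from user input.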
  • 54. FLEXIBLE SEARCH A data structure describing user profile CREATE TABLE `members` ( `user_id` int(10) unsigned, `user_firstname` varchar(50) unsigned, `user_surname` varchar(50) unsigned, `user_dob` date unsigned, `user_lastvisit` datetime unsigned, `user_datetime` datetime unsigned, `user_bio` unsigned, `user_hasphoto` tinyint(2) unsigned, `user_hasvideo` tinyint(2) unsigned, ...
  • 55. FLEXIBLE SEARCH Flexible search typically means – Search conditions may involve any number of columns in any combination – Sorting may be done on one of many columns as well Often impossible to add all necessary indexes in MySQL
  • 56. FLEXIBLE SEARCH Many columns may have very low cardinality – Example: user_gender – MySQL would not even consider using index for such column It may be very difficult to make it work fast in MySQL – When tables or traffic are large enough
  • 57. FLEXIBLE SEARCH How does Sphinx help? – Scans are optimized – Optimizations apply to all columns – Possibility to use „fake keywords” – Data can be split across several instances • Parallel search • No extra application logic necessary to combine results
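When data is split across several instances, the dispatcher combines the partial results so the application does not have to. Conceptually, that step resembles a merge of already-sorted lists. The sketch below only illustrates the idea under assumptions (each instance returns rows sorted by `production_year` descending; `merge_shard_results` is a hypothetical name) and is not how Sphinx is implemented internally.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, limit):
    """Merge per-instance result lists, each already sorted by
    production_year descending, and keep the first `limit` rows."""
    merged = heapq.merge(
        *shard_results,
        key=lambda row: row["production_year"],
        reverse=True,
    )
    return list(islice(merged, limit))


# Two hypothetical instances, each holding part of the index.
shard_a = [
    {"id": 561959, "production_year": 2014},
    {"id": 100, "production_year": 2010},
]
shard_b = [
    {"id": 74273, "production_year": 2013},
    {"id": 344814, "production_year": 2012},
]

for row in merge_shard_results([shard_a, shard_b], limit=3):
    print(row["id"], row["production_year"])
```

Because each instance only searches its own part of the data, the searches can run in parallel and the merge touches just the top rows from each one.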
  • 59. SUMMARY Sphinx can be of great help to many MySQL-based apps – Developed to work better where MySQL performs poorly • Text search • Large scans • Filtering on many combinations of columns • Handling multi-value properties
  • 60. SUMMARY Sphinx can be of great help to any MySQL-based apps – Comes with features that can actually replace database – Easily scalable – Actively developed – You can sponsor development and have features you need done soon • No need to wait long until some functionality „appears”