6. CONFIGURING INDEXES
• Add indexes on your AR classes using define_index
• Fields (indexes) contain text you can search
• Attributes (has) allow you to sort and constrain your searches
• Careful! Column names aren’t symbols

class Article < ActiveRecord::Base
  define_index do
    # fields
    indexes subject, :sortable => true
    indexes content
    indexes author.name, :as => :author, :sortable => true

    # attributes
    has author_id, created_at, updated_at
  end
end
8. MORE ABOUT INDEXING
Thinking Sphinx generates a config file for Sphinx, where indexes (aka “sources”) are defined. It’s a little complicated.

source twitterer_core_0
{
  type = mysql
  sql_host = 127.0.0.1
  sql_user = cheaptweet
  sql_pass = cheaptweet
  sql_db = cheaptweet_development2
  sql_query_pre = UPDATE `twitterer` SET `delta` = 0
  sql_query_pre = SET NAMES utf8
  sql_query = SELECT `twitterer`.`id` * 1 + 0 AS `id` , CAST(`twitterer`.`screen_name` AS CHAR) AS `screen_name`, CAST(`twitterer`.`name` AS CHAR) AS `name`, CAST(`twitterer`.`description` AS CHAR) AS `description`, CAST(`twitterer`.`url` AS CHAR) AS `url`, CAST(`twitterer`.`location` AS CHAR) AS `location`, `twitterer`.`id` AS `sphinx_internal_id`, 283224142 AS `class_crc`, '283224142' AS `subclass_crcs`, 0 AS `sphinx_deleted` FROM twitterer WHERE `twitterer`.`id` >= $start AND `twitterer`.`id` <= $end AND `twitterer`.`delta` = 0 GROUP BY `twitterer`.`id` ORDER BY NULL
  sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `twitterer` WHERE `twitterer`.`delta` = 0
  sql_attr_uint = sphinx_internal_id
  sql_attr_uint = class_crc
  sql_attr_uint = sphinx_deleted
  sql_attr_multi = uint subclass_crcs from field
  sql_query_info = SELECT * FROM `twitterer` WHERE `id` = (($id - 0) / 1)
}
index twitterer_core
{
  source = twitterer_core_0
  path = /Users/hayesdavis/Appozite/workspace/CheapTweet/data/sphinx/development/twitterer_core
  morphology = stem_en
  charset_type = utf-8
}
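The `id` * 1 + 0 expression in sql_query and the matching (($id - 0) / 1) in sql_query_info look like a multiplier-and-offset scheme for keeping document ids unique when several models are indexed. A minimal sketch of that arithmetic, assuming the multiplier is the number of indexed models and the offset is this model's position among them; the helper names are hypothetical and not part of Thinking Sphinx:

```ruby
# Hypothetical helpers illustrating the id arithmetic seen in the generated
# config. With a single indexed model the multiplier is 1 and the offset is 0,
# which matches "id * 1 + 0" on the slide.
def sphinx_document_id(record_id, multiplier, offset)
  record_id * multiplier + offset
end

# Inverse mapping, mirroring "(($id - 0) / 1)" in sql_query_info.
def record_id_from(document_id, multiplier, offset)
  (document_id - offset) / multiplier
end

# Assumed scenario: two models share one Sphinx setup, so rows with the same
# database id still get distinct document ids.
article_doc = sphinx_document_id(42, 2, 0)
author_doc  = sphinx_document_id(42, 2, 1)
puts article_doc
puts author_doc
puts record_id_from(author_doc, 2, 1)
```

With one model, as in the config above, both mappings are the identity, which is why the generated SQL looks redundant.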
10. SEARCHING
Use the search method on AR classes

# Searches all fields for "pants"
Article.search "pants"

# Conditions are allowed on fields but must be a hash
Article.search "pants", :conditions => {
  :subject => "How To Wear"
}

# Query attributes using :with
Article.search "pants", :with => {
  :author_id => 1, :created_at => 1.week.ago..Time.now
}
11. BUT WAIT
HOW DO I KEEP INDEXES (ESPECIALLY BIG ONES) UP TO DATE?
12. DELTA INDEXES TO THE RESCUE
• Mini index of only rows that have been updated
• Must merge into “core” index periodically or it’ll get slow
• Simplest approach: add a delta boolean column to the model
• Add set_property :delta => true to the define_index block
• The delta index is rebuilt on every model save, which can cause a performance hit
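The core/delta split above can be pictured with a toy in-memory model. This is only a sketch of the idea, not the Sphinx or Thinking Sphinx API; the class and method names are hypothetical:

```ruby
# Toy in-memory model of a core index plus a delta index.
# NOT the Sphinx or Thinking Sphinx API; every name here is hypothetical.
class ToyDeltaIndex
  def initialize(rows)
    @core  = rows.dup # id => text, as built by a full (slow) indexing run
    @delta = {}       # id => text, holds only rows changed since that run
  end

  # Saving a row only touches the small delta index.
  def save(id, text)
    @delta[id] = text
  end

  # Search both indexes; a delta entry supersedes the stale core entry.
  def search(term)
    @core.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end

  # Periodic merge: fold the delta into core and empty it again.
  def merge!
    @core = @core.merge(@delta)
    @delta = {}
  end

  def delta_size
    @delta.size
  end
end

index = ToyDeltaIndex.new({ 1 => "blue pants", 2 => "red shirt" })
index.save(2, "red pants")   # updated row lands in the delta index
index.save(3, "green pants") # new row also lands in the delta index
p index.search("pants")
index.merge!
p index.delta_size
```

The longer the delta goes unmerged, the more rows every save has to reprocess, which is why the periodic merge into core matters.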
13. DEPLOYMENT & PRODUCTION
• Must schedule full re-indexing periodically
• Have god or monit keep an eye on things
• Consider adding some cap tasks to help out with reindexing
and restarting
14. TIPS, TRICKS, GOTCHAS
• Simplest delta indexing can lead to performance issues
• Indexer assumes you have sequential ids on your DB rows and
iterates through them in chunks - very bad if you have big
gaps
• Run full indexing as often as you can without hurting
performance - it’s usually pretty fast
• You can hand-edit config files if you need to tune - but be
careful not to regenerate them, or your edits will be overwritten
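To see why big id gaps hurt, it helps to count the ranged queries the indexer issues as it walks from the smallest id to the largest (the $start and $end placeholders in sql_query). A rough sketch, assuming a fixed step of 1024 ids per query; the helper name is hypothetical:

```ruby
# Counts how many ranged queries an indexer would issue when it walks from
# the smallest id to the largest in fixed-size steps.
# Hypothetical helper; the step of 1024 is an assumption for illustration.
def ranged_query_count(min_id, max_id, step = 1024)
  ((max_id - min_id) / step) + 1
end

# Same number of rows in both tables, but very different id ranges.
dense  = ranged_query_count(1, 100_000)       # ids packed together
sparse = ranged_query_count(1, 4_000_000_000) # ids spread over a huge range
puts dense
puts sparse
```

The query count depends on the id range, not the row count, so a sparse table pays for many chunks that return nothing.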