Snapguide - Amazon CloudSearch
1. Share what you know
Sam Kimbrel sam@snapguide.com
Software Engineer
Monday, April 1, 13
2. What is Snapguide?
• 1.5 million uniques/month
• ~2000 reqs/min across app and web
• Python (Pyramid/uWSGI/nginx)
• MySQL/Redis
• Built primarily on AWS: EC2, RDS, S3, SQS, SNS, CloudSearch, CloudFront
daniel@snapguide.com • confidential do not distribute
6. Snapguide on CloudSearch
• Beta trial users after mentioning Solr on the phone (seriously!)
• Primary data set: guides
• Facets: guide topic, “featured” boolean, visibility/ACL flags
• “autocomplete” search (more later)
7. Sample guide document
{
  "lang": "en",
  "fields": {
    "step_count": "14",
    "author_external_id": "qS878yliQ4mxg_9uHt2AZg",
    "author": "Claire Hesseltine",
    "items": [
      "Preheat oven to 325 degrees Fahrenheit.",
      ...
    ],
    "title": "Make Brown Butter Sea Salt Cookies",
    "featured": 1,
    "summary": "The brown butter adds a nutty, caramel-like taste to these delicious cookies.",
    "topic": [
      "desserts"
    ],
    "main_image_uuid": "43d201c8fd4b4833b83d3f95d112f1c1",
    "like_count": 761,
    "public": "true"
  },
  "version": 1364333310,
  "type": "add",
  "id": "9eabff97e32c4244a8205da3fba442e9"
}
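A document shaped like the sample could be assembled in Python roughly as follows. This is a minimal sketch, not Snapguide's actual code: `build_guide_document`, `build_batch`, and the layout of the `guide` dict are hypothetical, and using a Unix timestamp for `version` is an assumption based on the sample value.

```python
import json
import time


def build_guide_document(guide, version=None):
    """Build one "add" operation shaped like the sample document.

    `guide` is a hypothetical dict; the real Snapguide data model is not shown.
    """
    if version is None:
        # Assumption: versions are Unix timestamps, as the sample value suggests.
        version = int(time.time())
    return {
        "type": "add",
        "id": guide["id"],
        "version": version,
        "lang": "en",
        "fields": {
            "title": guide["title"],
            "summary": guide["summary"],
            "author": guide["author"],
            "items": list(guide["items"]),
            "topic": list(guide["topics"]),
            # The sample stores step_count as a string and like_count as a number.
            "step_count": str(guide["step_count"]),
            "like_count": guide["like_count"],
            "featured": 1 if guide["featured"] else 0,
            "public": "true" if guide["public"] else "false",
        },
    }


def build_batch(guides, version=None):
    """Serialize several documents as one JSON array for a batch upload."""
    return json.dumps([build_guide_document(g, version) for g in guides])
```

Passing an explicit `version` keeps the output deterministic, which is convenient when re-sending the same batch or writing tests.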
8. Queries
• Guide text search:
  q=cookies
• Guide search with topic:
  q=cookies&facet=topic&bq=topic:'desserts'
• “Typeahead”/suggestion search:
  bq=(or 'paper flower' 'paper flower*')
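The three query shapes above could be generated from user input along these lines. This is a sketch under assumptions: the hostname in `SEARCH_URL` is made up, the `/2011-02-01/search` path assumes the 2011 search API that the `bq` syntax belongs to, and the helper names are hypothetical. Quotes and backslashes are simply dropped from user input instead of being escaped.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real search domain has its own hostname.
SEARCH_URL = (
    "http://search-example.us-west-1.cloudsearch.amazonaws.com"
    "/2011-02-01/search"
)


def quote_term(text):
    """Wrap text in single quotes, dropping characters that would end the literal."""
    cleaned = text.replace("\\", " ").replace("'", " ")
    return "'" + " ".join(cleaned.split()) + "'"


def text_search(query, topic=None):
    """Build a guide text search URL, optionally restricted to one topic."""
    params = [("q", query)]
    if topic is not None:
        params.append(("facet", "topic"))
        params.append(("bq", "topic:" + quote_term(topic)))
    return SEARCH_URL + "?" + urlencode(params)


def typeahead_bq(prefix):
    """Match the typed text exactly OR as a prefix of a longer term."""
    exact = quote_term(prefix)
    wildcard = exact[:-1] + "*'"
    return "(or {} {})".format(exact, wildcard)


def typeahead_search(prefix):
    """Build a typeahead/suggestion search URL."""
    return SEARCH_URL + "?" + urlencode([("bq", typeahead_bq(prefix))])
```

`urlencode` percent-encodes the colon, quotes, and parentheses, so the URLs look noisier than the slide's shorthand but carry the same parameters.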
9. Result Ranking
• Use “Compare Rank Expressions”
• text_relevance is your friend
• Goals:
  • Boost popular/featured guides
  • Make title/summary matches worth more than item (supplies, step text) matches
10. Rank expression
min(
  cs.text_relevance({
    "weights": {"title": 2.5, "author": 1.5, "items": 0.1, "summary": 1.5},
    "default_weight": 1
  }),
  1000)
+ min(200, like_count / 10)
+ 100 * featured
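To see how the three terms of this expression trade off, the arithmetic can be mirrored in plain Python. This is only an illustration: `rank_score` is a hypothetical name, the text relevance value is passed in as a number because the weighted score is computed by CloudSearch, and Python float division may not round the same way the service does.

```python
def rank_score(text_relevance, like_count, featured):
    """Mirror the rank expression's arithmetic outside CloudSearch.

    text_relevance: the weighted relevance score (supplied by the caller here)
    like_count:     number of likes on the guide
    featured:       1 for featured guides, 0 otherwise
    """
    relevance = min(text_relevance, 1000)    # relevance is capped at 1000
    popularity = min(200, like_count / 10)   # likes add at most 200 points
    featured_boost = 100 * featured          # flat bonus for featured guides
    return relevance + popularity + featured_boost
```

For a guide with 761 likes that is featured, and an assumed text relevance of 400, this gives 400 + 76.1 + 100, or about 576.1. The like bonus stops growing at 2000 likes, so popularity can lift a guide but never outweigh a strong text match.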
11. Offline index updates
• Extracting guide data to build an updated document is slow
• Move index updates out of the online web request path
• Internal-only API endpoints
• SQS
• queue_consumer daemon
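The overall pattern (the web request only enqueues a guide id, and a separate consumer does the slow work) can be sketched as below. This is not the actual `queue_consumer` daemon: the standard-library `queue.Queue` stands in for SQS so the example runs anywhere, and `load_document` and `upload_batch` are hypothetical callables standing in for the internal-only API call and the CloudSearch upload.

```python
import queue


def enqueue_update(update_queue, guide_id):
    """Called from the web request: record which guide changed and return."""
    update_queue.put(guide_id)


def drain(update_queue, max_messages=10):
    """Pull up to max_messages guide ids without blocking, dropping duplicates."""
    seen = []
    for _ in range(max_messages):
        try:
            guide_id = update_queue.get_nowait()
        except queue.Empty:
            break
        if guide_id not in seen:
            seen.append(guide_id)
    return seen


def consume_once(update_queue, load_document, upload_batch, max_messages=10):
    """One pass of the consumer loop; returns how many documents were sent.

    load_document(guide_id) -> dict   (the slow extraction step)
    upload_batch(list_of_dicts)       (sends the batch to the search index)
    """
    guide_ids = drain(update_queue, max_messages)
    if not guide_ids:
        return 0
    documents = [load_document(guide_id) for guide_id in guide_ids]
    upload_batch(documents)
    return len(documents)
```

A daemon would call `consume_once` in a loop, sleeping briefly when it returns 0. Dropping duplicate ids within a pass means a guide edited several times in quick succession is extracted and uploaded only once.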
12. Offline index updates
[Architecture diagram. Components: web server, SQS, queue consumer, web server (dedicated to queues), Snapguide DB/Redis, CloudSearch]
13. Performance
SSL is painful
14. Performance
but physical proximity (us-west-1) is awesome
15. Future work
• Add more domains (users, new features)
• Search-based suggestion engine
• Improved ranking/scoring — crawl our social graph
16. Questions?
www.snapguide.com