Using Solr in Online Travel Shopping to Improve User Experience
1. Using Solr in Online Travel to
Improve User Experience
Sudhakar Karegowdra, Esteban Donato
Travelocity, May 25th, 2011
{sudhakar.karegowdra, esteban.donato}@travelocity.com
2. What We Will Cover
§ Travelocity
§ Speakers Background
§ Merchandising & Solr
• Challenges
• Solution
• Sizing and performance data
• Take Away
§ Location Resolution & Solr
• Challenges
• Solution
• Sizing and performance data
• Take Away
§ Q&A
3. Travelocity
§ First online travel agency (OTA), launched in 1996
§ Grown to 3,000 employees and is one of the largest
travel agencies worldwide
§ Headquartered in Dallas/Fort Worth with satellite
offices in San Francisco, New York, London,
Singapore, Bangalore and Buenos Aires, to name a few
§ In 2004, the Roaming Gnome became the
centerpiece of marketing efforts and has become an
international pop icon
§ Owned by Sabre Holdings - sister companies include
Travelocity Business, IgoUgo.com, lastminute.com,
Zuji among others
4. Speakers Background
§ Sudhakar Karegowdra
• Principal Architect, Travelocity.com
§ My experience
– 13+ years
– Solr/Lucene: 3 years
– Implementing Hadoop, Pig and Hive for data warehouse
§ Topic: Merchandising
§ Esteban Donato
• Lead Architect, Travelocity.com
§ My experience
– 10+ years
– Solr: 2 years
– Analyzing Mahout and Carrot2 for document clustering engine
§ Topic: Location Resolution
6. The Challenge
§ Market Drivers
• Build Landing Pages with Faceted Navigation
• Enable Content Segmentation and delivery
• Support Roll out of Promotions
• Roll up Data to a higher level
§ E.g., "All 5-star hotels in California" brings together the
5-star hotels from SFO, LAX, SAN, etc.
• Faster time to market for new ideas
• Rapidly scale to accommodate global brands
with disparate data sources
7. The Challenge
§ Traditional Database approach
• Longer time to market
• Specialized skill set needed to design and optimize
database structures and queries
• Aggregating data and changing structures is
quite complex
• Building faceted navigation needs complex logic,
leading to high maintenance cost
8. Solution - Overview
§ Data from various sources aggregated and
ingested into Solr
• Core per Locale and Product Type
§ Wrapper service to combine some data across
product cores and manage configuration rules
§ Solr’s built-in search and faceting power the
navigation
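As an illustration of the overview above, here is a minimal sketch of how a landing page request might be turned into a faceted Solr query against one core per locale and product type. The core naming scheme (`en_US_hotel`), the field names (`state`, `star_rating`, `city`, `amenity`) and the host are assumptions for the example, not taken from the slides; the query parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`) are standard Solr request parameters.

```python
from urllib.parse import urlencode

def facet_query_url(base_url, locale, product, filters, facet_fields):
    """Build a select URL for a hypothetical locale/product core."""
    # Assumed naming: one core per locale and product type, e.g. "en_US_hotel".
    core = f"{locale}_{product}"
    params = [("q", "*:*"), ("rows", "10"), ("wt", "json"),
              ("facet", "true"), ("facet.mincount", "1")]
    # Each navigation value already selected becomes a filter query (fq).
    for field, value in sorted(filters.items()):
        params.append(("fq", f'{field}:"{value}"'))
    # Each facet.field asks Solr for value counts to render the navigation.
    for field in facet_fields:
        params.append(("facet.field", field))
    return f"{base_url}/{core}/select?{urlencode(params)}"

url = facet_query_url("http://localhost:8983/solr", "en_US", "hotel",
                      {"state": "CA", "star_rating": "5"},
                      ["city", "amenity"])
print(url)
```

The roll-up case from the challenge slide (all 5-star hotels in California) is then just two filter queries, with the facet counts driving the next level of navigation.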
10. Solution - Achievements
§ Millions of unique Long Tail Landing Pages
§ E.g.,
http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
§ Faster search across products
§ E.g., Beach Deals under $500
§ Segmented Content delivery through tagging
§ Scaled well to distribute the content to different
brands, partners and advertisers
§ Opened up for other innovative applications
§ Deals on Map, Deals on Mobile, Wizards, etc.
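A search such as "Beach Deals under $500" spans several product cores. The sketch below shows one way a wrapper service could express it: the same tag and price-range filters are sent to each product core, and the per-core results are merged by price. The field names (`tags`, `price`), the core names and the sample documents are all made up for illustration; only the `fq` range syntax and the `sort` parameter are standard Solr.

```python
def deal_filter_params(tag, max_price):
    """Solr parameters for 'deals tagged <tag> costing at most <max_price>'."""
    return [("q", "*:*"),
            ("fq", f"tags:{tag}"),                  # segment content by tag
            ("fq", f"price:[* TO {max_price}]"),    # inclusive range filter
            ("sort", "price asc")]

def merge_deals(results_by_core, limit=5):
    """Merge per-core result lists into one list, cheapest first."""
    merged = [dict(doc, product=core)
              for core, docs in results_by_core.items()
              for doc in docs]
    return sorted(merged, key=lambda doc: doc["price"])[:limit]

params = deal_filter_params("beach", 500)

# Hypothetical responses from two product cores.
sample = {
    "hotel": [{"id": "h1", "price": 320}, {"id": "h2", "price": 199}],
    "package": [{"id": "p1", "price": 450}],
}
deals = merge_deals(sample)
print([(d["product"], d["id"], d["price"]) for d in deals])
```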
11. Solution – Road Ahead
§ Migration to Solr 3.1
• Geospatial search
• CSV output format
§ Query boosting by search pattern
§ Near-real-time updates
§ Deal and user behavior mining in Hadoop –
MapReduce and Solr to Serve the Content
§ Move Slaves to Cloud
12. Sizing & Performance
§ Index Stats
§ Number of Cores : 25
§ Number of Documents : ~ 1 Million Records
§ Response
§ Requests : 70 tps
§ Average response time : 0.005 seconds (5 ms)
§ Software Versions
§ Solr Version 1.4.0
– filterCache size : 30000
§ Tomcat – 5.5.9
§ JDK1.6
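A quick sanity check on the response figures above: under a steady load, Little's law (average requests in flight = arrival rate × average time in system) gives the average concurrency implied by 70 tps at 5 ms. The steady-load assumption is ours; the two input numbers are from the slide.

```python
requests_per_second = 70        # from the slide
avg_response_seconds = 0.005    # 5 ms, from the slide

# Little's law: L = lambda * W
in_flight = requests_per_second * avg_response_seconds
print(f"average requests in flight: {in_flight:.2f}")
```

Under that assumption, well under one request is being served at any instant on average, which suggests the setup had ample headroom at this load.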
13. Take Away
§ Semi-structured storage in Solr helps
aggregate disparate sources easily
• Remember dynamic fields
§ Multiple Cores to manage multiple locale data
§ Solr is a great enabler of “Innovations”
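To make the dynamic fields point concrete, here is a minimal sketch of mapping records from disparate sources onto dynamic field names, so new attributes need no schema change. It assumes a schema that declares suffix-based dynamic fields such as `*_s`, `*_i`, `*_f` and `*_b` (as the Solr example schema does); the record contents are invented.

```python
def to_solr_doc(record, doc_id):
    """Map an arbitrary source record to a document using dynamic-field suffixes."""
    doc = {"id": doc_id}
    for key, value in record.items():
        # bool must be checked before int, since bool is a subclass of int.
        if isinstance(value, bool):
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        else:
            suffix, value = "_s", str(value)
        doc[key + suffix] = value
    return doc

# Two sources with different shapes end up in the same index structure.
hotel = to_solr_doc({"name": "Sample Hotel", "stars": 5,
                     "price": 129.5, "pool": True}, "hotel-1")
flight = to_solr_doc({"origin": "DFW", "destination": "LAS",
                      "price": 210.0}, "flight-1")
print(hotel)
print(flight)
```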
15. The Challenge
§ How to develop a global location resolution
service?
§ Flexible to change
§ General enough to cover everyone's needs
§ Multi-language
§ Performance and scalability
§ Configurable by site
16. Architecture of the solution
§ Master/Slave architecture
§ Multi-core: each core represents a language
§ Remote Streaming indexing
§ CSV format
§ SolrJ client
§ binary format
§ Solr response cache
[Diagram components: Auto-complete, Resolution, Solr Slave, Solr Master, Management Tool, Batch Job, Location DB]
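The indexing path (CSV loaded through remote streaming into one core per language on the master) can be sketched as a single request URL per language. This is an assumption-laden illustration: the host, core names and file path are hypothetical, and it presumes remote streaming is enabled in solrconfig.xml. The `/update/csv` handler and the `stream.file`, `stream.contentType` and `commit` parameters are standard Solr.

```python
from urllib.parse import urlencode

def csv_index_url(master_url, language, csv_path):
    """URL asking the master to load a CSV file that is local to the Solr server."""
    params = {
        "stream.file": csv_path,                          # file read by Solr itself
        "stream.contentType": "text/plain;charset=utf-8",
        "commit": "true",                                 # make the new docs searchable
    }
    # Assumed layout: one core per language, named by language code.
    return f"{master_url}/{language}/update/csv?{urlencode(params)}"

index_url = csv_index_url("http://solr-master:8983/solr", "es",
                          "/data/locations_es.csv")
print(index_url)
```

A batch job could export the location database to one CSV file per language and issue one such request per core.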
17. Auto-complete
§ System has to suggest options as the users
type their desired location
§ Examples “san” => San Francisco, “veg” =>
Las Vegas
§ Relevancy: not all the locations are equally
important. “par” => “Paris, France”; “Parana,
Argentina”
§ Users can search by various fields: location
code, location name, city code, city name,
state/province code, state/province name,
country code, country name.
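The auto-complete requirements above (prefix matching on several fields, ordered by importance) can be illustrated with a small in-memory sketch. This is not how the Solr-based service does it; it only shows the intended behaviour. The location list, the field set and the `weight` values are made up for the example.

```python
import re

# Hypothetical data: "weight" stands in for whatever relevancy signal is used.
LOCATIONS = [
    {"name": "Paris, France", "code": "PAR", "country": "France", "weight": 90},
    {"name": "Parana, Argentina", "code": "PRA", "country": "Argentina", "weight": 20},
    {"name": "San Francisco, United States", "code": "SFO",
     "country": "United States", "weight": 80},
    {"name": "Las Vegas, United States", "code": "LAS",
     "country": "United States", "weight": 85},
]
SEARCH_FIELDS = ("name", "code", "country")

def suggest(prefix, locations=LOCATIONS, limit=5):
    """Return location names where any word in any search field starts with prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    hits = []
    for loc in locations:
        text = " ".join(loc[field] for field in SEARCH_FIELDS).lower()
        words = re.findall(r"\w+", text)
        if any(word.startswith(prefix) for word in words):
            hits.append(loc)
    # More important locations come first.
    hits.sort(key=lambda loc: -loc["weight"])
    return [loc["name"] for loc in hits[:limit]]

print(suggest("par"))
print(suggest("veg"))
```

Matching on word starts rather than on the start of the whole name is what lets "veg" find Las Vegas.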
19. Resolution
§ System has to resolve the location requested
by the users.
§ Contemplates aliases. Big Apple => New York
§ Contemplates ambiguities.
§ Contemplates misspellings. Lomdon => London
§ NGramDistance algorithm.
§ How to combine distance with relevancy
§ Errors when suggesting the correct location when the input
is a prefix. Lond => London
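The resolution steps above (aliases first, then fuzzy matching combined with relevancy) can be sketched as follows. Note the swap: instead of Lucene's NGramDistance, this uses a simpler bigram Dice similarity as a stand-in, and the linear blend of similarity and relevancy is just one possible answer to the "how to combine" question. The alias table, relevancy values and `distance_weight` are invented for the example.

```python
ALIASES = {"big apple": "New York"}
# Hypothetical relevancy scores in [0, 1].
RELEVANCE = {"London": 1.0, "New York": 1.0, "Londrina": 0.3}

def bigrams(text):
    """Set of character bigrams, padded so word boundaries count."""
    padded = f" {text.lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def similarity(a, b):
    """Dice coefficient over bigram sets (a stand-in for NGramDistance)."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))

def resolve(query, distance_weight=0.8):
    """Resolve a user-typed location to a known location name."""
    key = query.strip().lower()
    if key in ALIASES:                      # exact alias match wins outright
        return ALIASES[key]

    def score(name):
        # Blend string similarity with relevancy of the candidate.
        return (distance_weight * similarity(key, name)
                + (1 - distance_weight) * RELEVANCE[name])

    return max(RELEVANCE, key=score)

print(resolve("Big Apple"))
print(resolve("Lomdon"))
print(resolve("Lond"))
```

With these made-up weights the prefix "Lond" still resolves to London rather than Londrina, because relevancy breaks the near-tie in string similarity.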