Reed Business Information reached out to Search Technologies for help with a migration project, to take them from FAST ESP to a Solr-based infrastructure running on Amazon AWS. Read how Search Technologies implemented this project. http://www.searchtechnologies.com/fast-solr-migration-case-study.html
US: Phone: 703-953-2791
UK: Phone: 01344 292 292
www.searchtechnologies.com
info@searchtechnologies.com
INTRODUCTION
Reed Business Information (RBI), a leading provider of
business information, data and marketing solutions,
produces industry critical data services and lead
generation tools, as well as online community and job
websites. RBI reached out to Search Technologies for
help with a migration project, to take them from FAST
ESP to a Solr-based infrastructure running on Amazon
AWS.
DETAILS
RBI has been using FAST to power search on a wide
range of websites since 2005. FAST ESP has proved to be
a highly reliable platform for this purpose. In addition to
very low down-time, it has provided large query
capacities, and has been used as a platform to develop
or deploy a range of sophisticated, multi-lingual
capabilities for entity extraction and categorization,
powering both search and the contextual serving of
widgetized links across hundreds of websites.
The future for the FAST technology, owned by Microsoft
since 2008, is SharePoint-centric, and as an agile
publisher of both business and consumer titles to niche
audiences, RBI decided that Solr was their preferred
alternative to replace FAST ESP. RBI contracted with
Search Technologies to deliver consulting and
implementation services, and to manage the FAST ESP
to Solr transition project.
The overall project objective was simply to emulate the
FAST-based service, without loss of functionality or
performance.
The websites served by this application use a range of
languages, including Chinese (simplified and traditional),
English, Dutch, Spanish, French, Italian and German.
The content sets to be indexed include the participating
websites, to enable sophisticated site search
functionality, plus numerous other content sources, to
provide supplementary information and news, usually
focused around specific industry verticals. The new
system also powers RBI’s business search portal,
Zibb.com.
SAME FUNCTIONALITY, MUCH LOWER COSTS
The key challenge set by RBI was to find a way to
maintain existing functionality, providing a highly
functional and reliable service to publishers within RBI,
and at the same time to substantially lower the overall
cost-of-ownership of the search infrastructure.
Some key aspects of the existing infrastructure used
FAST-specific methods. In addition, FAST was running on
a substantial, Microsoft-hosted facility involving more
than 90 servers.
THE APPROACH
A key aspect of the requirements was that publications
using the search service should not be required to
change anything in their configuration. This necessitated
emulating a number of FAST ESP methods, including:
- Transforming FAST FQL search requests into
  Solr’s query syntax
- Returning results in standard FAST ESP format
  (by manipulating Solr’s XML-based results)
- Making use of existing content processing
  capabilities, such as entity extraction and
  categorization
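The first of these emulations, translating FQL into Solr's query syntax, can be illustrated with a minimal sketch. It is not the parser built for RBI. It assumes a toy subset of FQL (only `and(...)`, `or(...)`, `not(...)`, `string("...")` and a `field:` scope), treats every `string()` as a quoted phrase, and uses hypothetical function names throughout.

```python
import re

# Tokens for a toy FQL subset (an assumption; real FQL has many more operators).
TOKEN = re.compile(
    r'\s*(?:(?P<str>"(?:[^"\\]|\\.)*")|(?P<name>[A-Za-z_]\w*)|(?P<punct>[():,]))'
)

def tokenize(fql):
    """Split an FQL string into quoted literals, names and punctuation."""
    fql = fql.strip()
    pos, tokens = 0, []
    while pos < len(fql):
        m = TOKEN.match(fql, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(m.lastgroup))
        pos = m.end()
    return tokens

def parse(tokens, i=0):
    """Recursive descent: return (solr_query, next_token_index)."""
    def expect(index, value):
        if index >= len(tokens) or tokens[index] != value:
            raise ValueError(f"expected {value!r} at token {index}")
        return index + 1

    if i >= len(tokens):
        raise ValueError("unexpected end of query")
    name = tokens[i]
    i += 1
    # Field scope, e.g. title:string("solr") -> title:"solr"
    if i < len(tokens) and tokens[i] == ":":
        inner, i = parse(tokens, i + 1)
        return f"{name}:{inner}", i
    i = expect(i, "(")
    if name == "string":
        if i >= len(tokens) or not tokens[i].startswith('"'):
            raise ValueError("string() needs a quoted literal")
        literal = tokens[i]
        return literal, expect(i + 1, ")")
    args = []
    while True:
        arg, i = parse(tokens, i)
        args.append(arg)
        if i < len(tokens) and tokens[i] == ",":
            i += 1
            continue
        break
    i = expect(i, ")")
    if name == "and":
        return "(" + " AND ".join(args) + ")", i
    if name == "or":
        return "(" + " OR ".join(args) + ")", i
    if name == "not" and len(args) == 1:
        # A purely negative clause needs a positive side in Solr.
        return f"(*:* NOT {args[0]})", i
    raise ValueError(f"unsupported FQL operator: {name}")

def fql_to_solr(fql):
    tokens = tokenize(fql)
    query, i = parse(tokens)
    if i != len(tokens):
        raise ValueError("trailing input after query")
    return query

print(fql_to_solr('and(title:string("solr"), or(string("fast"), not(string("esp"))))'))
# prints: (title:"solr" AND ("fast" OR (*:* NOT "esp")))
```

A production translator would also need to cover ranges, proximity operators, ranking hints and the other `string()` modes, none of which are attempted here.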
CASE STUDY:
Reed Business Information – A FAST ESP to Solr Migration
It was agreed that Search Technologies’ Aspire Content
Processing Platform, and QPL, the Query Processing
Language, should be used to effect these transitions.
Therefore, the overall solution comprised Solr, Aspire,
plus a number of existing technologies within RBI, many
of which are home-grown.
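To give a feel for the result-format side of this emulation, the sketch below reshapes a Solr XML response into a FAST-style result set. It is plain Python rather than Aspire or QPL, and the output element and attribute names (`RESULTSET`, `HIT`, `FIELD`, `FIRSTHIT`, `LASTHIT`, `TOTALHITS`) are simplified placeholders assumed for illustration, not a confirmed copy of the FAST ESP schema. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def solr_to_fast_style(solr_xml: str) -> str:
    """Convert a Solr XML response into a simplified FAST-style result set."""
    root = ET.fromstring(solr_xml)
    result = root.find("result")          # <result numFound=".." start="..">
    if result is None:
        raise ValueError("no <result> element in Solr response")
    docs = result.findall("doc")
    start = int(result.get("start", "0"))

    out = ET.Element("RESULTSET", {
        "FIRSTHIT": str(start + 1),               # assumed 1-based hit numbering
        "LASTHIT": str(start + len(docs)),
        "TOTALHITS": result.get("numFound", "0"),
    })
    for number, doc in enumerate(docs, start=start + 1):
        hit = ET.SubElement(out, "HIT", {"NO": str(number)})
        for field in doc:
            # Multi-valued Solr fields arrive as <arr>; emit one FIELD per value.
            if field.tag == "arr":
                values = [child.text or "" for child in field]
            else:
                values = [field.text or ""]
            for value in values:
                node = ET.SubElement(hit, "FIELD", {"NAME": field.get("name", "")})
                node.text = value
    return ET.tostring(out, encoding="unicode")

sample = (
    '<response><lst name="responseHeader"><int name="status">0</int></lst>'
    '<result name="response" numFound="2" start="0">'
    '<doc><str name="id">a1</str><arr name="tags"><str>x</str><str>y</str></arr></doc>'
    '<doc><str name="id">b2</str></doc>'
    '</result></response>'
)
print(solr_to_fast_style(sample))
```

Keeping the conversion in one place like this is what allows the consuming systems to stay unchanged: only the adapter needs to know that Solr is behind it.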
THE PROJECT
Over a period of a few months, with daily calls
between the RBI and Search Technologies teams, the
project was specified and carried out. Key decisions
included:
- The use of Amazon AWS to host the new search
  service
- The development of a query parser to translate
  FAST FQL into Solr search syntax, and to
  translate Solr search results into a FAST ESP
  format, so that the receiving Content
  Management Systems would not notice any
  changes and would function as normal
- The use of software-based load balancing to
  send queries to servers with spare capacity
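The load-balancing decision above can be sketched as a "least in-flight" picker, which routes each query to the server currently handling the fewest requests. This is one common way to approximate "spare capacity" and is an assumption for illustration. The case study does not say which balancing strategy or software was used, and the class and server names below are hypothetical.

```python
import threading

class LeastBusyBalancer:
    """Route each query to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self._in_flight = {server: 0 for server in servers}
        self._lock = threading.Lock()

    def acquire(self):
        """Pick the least busy server and count the new request against it."""
        with self._lock:
            server = min(self._in_flight, key=self._in_flight.get)
            self._in_flight[server] += 1
            return server

    def release(self, server):
        """Mark a request as finished so the server regains capacity."""
        with self._lock:
            self._in_flight[server] -= 1

balancer = LeastBusyBalancer(["solr-a", "solr-b"])
first = balancer.acquire()    # both idle, so the first listed server is chosen
second = balancer.acquire()   # solr-a is busy, so solr-b is chosen
balancer.release(first)       # solr-a finishes its query
third = balancer.acquire()    # solr-a is idle again and is chosen
print(first, second, third)
```

In practice each `acquire` would be paired with a `release` in a `try`/`finally` around the actual Solr request, so that a failed query does not leave a server looking permanently busy.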
The project also involved a significant amount of work to
re-create the FAST ESP index pipeline, including
interfaces to third party tools. This was achieved using
the Aspire Content Processing framework.
Solr’s native language processing capabilities coped
adequately with the multi-language demands of this
project.
IN SUMMARY
Graeme McCracken, CIO at Reed Business Information,
commented, “We progressively transitioned our sites
and services from FAST to Solr over a two-week period,
and nobody noticed. At the same time, we reduced our
on-going cost-of-ownership by more than half.”
Numerous Reed Business properties are now served by
this Solr-based service. Design specifications called for
an average search-time of less than 200 milliseconds.
The live system is consistently delivering an average of
70 milliseconds.
The new Solr search system has more than 30 million
documents under index, and it meets sustained capacity
demands of more than 300 queries-per-second, without
compromising search speed.
About Search Technologies
Leadership
The largest IT service company
dedicated to enterprise search
implementation, consulting, and
managed services
Independence
Working with all of the leading
search software vendors and
open source alternatives
Experience
400+ customers and more than
50,000 consultant days of expert
services delivered in the last four
years alone