Liferay's new search engine is extremely powerful. This session will show how the new OSGi infrastructure in DXP made it possible for us to introduce several new modular extension points in Search, so customers can add their own search engine mappings, build custom queries for individual fields, programmatically fine-tune boosting and relevance, and go beyond the Search Portlet with new Search Pages that can be assembled from both out-of-the-box portlets and brand-new components built from scratch, for the ultimate customer experience.
André "Arbo" Oliveira joined Liferay in early 2014 as a senior engineer and leads the Search Infrastructure team. He's been writing code for a living for 22 years, 14 of them as a Java developer and architect. Ever since discovering Elasticsearch, he's vowed never to write another SQL WHERE clause again.
Search Intelligently - Liferay Symposium North America 2016, Chicago, USA
1. Search Intelligently
Building the Digital Experience with
Liferay DXP, new queries and Elasticsearch
André Ricardo Barreto de Oliveira ("Arbo")
Search Infrastructure Lead - Liferay, Inc.
Chicago, USA
September 27, 2016
3. The classic approach to Search
● A tool for content retrieval, and little else
● Analyze - Index - Query - Display - Repeat
● Website finished: "add a Search Bar" as an afterthought
● No real correlation between Search and Business Case
@arbocombr
4. Meanwhile, in the Search landscape...
● Big Data
● Analytics and Statistics
● Post-text content
● Maps and Geolocation
● Natural Language Processing
● Bots
● Machine Learning
● Artificial Intelligence
5. Search meets Digital Transformation
● Well-designed Search, now central to digital business
● A search is often the starting point for the User Journey
● Results with nonlinear interaction, and action
● Impossible to tell apart: Search and the Digital Experience
45. The classic approach to queries
● Full text "bag of words" → all results with same relevance
● Substring (*wildcards*) → performance hit in large indices
● Limited flexibility for special parsing cases (e.g. emails)
● Manual configuration for fields with custom analyzers
46. Intelligent queries in Liferay DXP
● Modular OSGi extension points for query builders
● Construct sophisticated, compound queries per field
● Add custom analyzers and type mappings programmatically
● Design searchable assets and fine tune relevance
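The idea of pluggable, per-field query builders can be sketched in a few lines. This is only an illustration of the pattern, not Liferay's API: the `register` decorator, the `_builders` registry and `build_query` are hypothetical names standing in for OSGi service registration, and the clauses follow the Elasticsearch query DSL.

```python
from typing import Callable, Dict, Iterable

# Hypothetical registry: field name -> function that builds one query clause.
QueryBuilder = Callable[[str, str], dict]
_builders: Dict[str, QueryBuilder] = {}


def register(field: str) -> Callable[[QueryBuilder], QueryBuilder]:
    """Plug a builder in for one field (a stand-in for registering an OSGi service)."""
    def wrap(fn: QueryBuilder) -> QueryBuilder:
        _builders[field] = fn
        return fn
    return wrap


def default_builder(field: str, keywords: str) -> dict:
    # Fallback for fields without a custom builder: plain full text match.
    return {"match": {field: keywords}}


@register("title")
def title_builder(field: str, keywords: str) -> dict:
    # Example extension: titles are matched as a phrase prefix.
    return {"match_phrase_prefix": {field: keywords}}


def build_query(fields: Iterable[str], keywords: str) -> dict:
    """Combine one clause per field into a compound bool query."""
    should = [_builders.get(f, default_builder)(f, keywords) for f in fields]
    return {"bool": {"should": should}}
```

Calling `build_query(["title", "description"], "liferay dxp")` yields a `bool` query whose first clause comes from the registered title builder and whose second comes from the default.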
47. Rich data requires tailored search
● Title (short, autocomplete aware)
● Description (lengthy, full text)
● Email address (special formats, symbols)
● Geolocation (coordinates)
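As a rough sketch, the four kinds of data above could map to different Elasticsearch field types. The field names are assumptions made for illustration, and the type names follow Elasticsearch 5 and later (`text`, `keyword`); Elasticsearch 2.x spells both as `string`.

```python
import json

# Hypothetical "properties" section of an index mapping, one entry per field.
PROPERTIES = {
    "title": {"type": "text"},          # short, analyzed for full text
    "description": {"type": "text"},    # lengthy, analyzed for full text
    "email": {"type": "keyword"},       # kept whole, symbols preserved
    "location": {"type": "geo_point"},  # latitude/longitude coordinates
}


def mapping_json(properties: dict = PROPERTIES) -> str:
    """Serialize the properties for use in an index-creation request body."""
    return json.dumps({"properties": properties}, indent=2)
```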
48. Description field style - ready to use
● Classic full text search
● Match by any number of words (or phrases in quotes)
● Proximity: words near each other become top results
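One way to get this behavior in the Elasticsearch query DSL is a `bool` query: a `match` clause finds documents containing any of the words, and an optional `match_phrase` clause with `slop` raises the score when the words sit close together. This is a sketch of the technique, not necessarily what Liferay DXP generates; the field name and slop value are assumptions, and quoted phrases are not parsed here.

```python
def description_query(keywords: str, field: str = "description", slop: int = 50) -> dict:
    """Full text query where proximity between words improves relevance."""
    return {
        "bool": {
            # Required: at least one of the words must appear in the field.
            "must": [{"match": {field: keywords}}],
            # Optional: scores higher when the words occur within `slop` positions.
            "should": [{"match_phrase": {field: {"query": keywords, "slop": slop}}}],
        }
    }
```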
49. Title field style - ready to use
● Full text search like classic Description, with tweaks
● Autocomplete ready: match to start will boost relevance
● A perfect match to the exact title becomes a top result
● Ignore proximity, since titles are short: best performance
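The title behavior can be approximated with three alternative clauses of increasing boost: a plain `match`, a `match_phrase_prefix` for autocomplete-style matches at the start, and an exact `term` match. This is an illustrative sketch, not Liferay's generated query; the boost values and the `title.raw` sub-field (assumed to be indexed unanalyzed) are hypothetical.

```python
def title_query(keywords: str, field: str = "title",
                exact_field: str = "title.raw") -> dict:
    """Title search: prefix matches rank higher, exact matches rank highest."""
    return {
        "bool": {
            "should": [
                # Baseline full text match on the analyzed title.
                {"match": {field: keywords}},
                # Autocomplete: what the user typed so far is a prefix of the title.
                {"match_phrase_prefix": {field: {"query": keywords, "boost": 2.0}}},
                # Exact title typed in full (hypothetical unanalyzed sub-field).
                {"term": {exact_field: {"value": keywords, "boost": 10.0}}},
            ],
            "minimum_should_match": 1,
        }
    }
```

No phrase clause with `slop` appears here, in line with ignoring proximity for short fields.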
50. Substring field style (discouraged)
● Find anywhere in "keyword" field: "user@liferay.com"
● Not actually analyzed (full scan), which kills performance
● Kept for backward compatibility only ("wildcard", "like")
● Use intelligent mappings, analyzers and queries instead
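To make the contrast concrete, here is a sketch of the discouraged substring query next to one possible alternative for email addresses: an analyzer built on Elasticsearch's `uax_url_email` tokenizer, which keeps each address as a single token so an ordinary `match` query can find it. The analyzer name `email` and the function names are assumptions for illustration.

```python
def substring_query(field: str, fragment: str) -> dict:
    """Discouraged: a leading wildcard forces a scan over the field's terms."""
    return {"wildcard": {field: f"*{fragment}*"}}


# Index settings defining a custom analyzer (the name "email" is hypothetical).
EMAIL_ANALYSIS = {
    "analysis": {
        "analyzer": {
            "email": {
                "type": "custom",
                "tokenizer": "uax_url_email",  # keeps user@liferay.com as one token
                "filter": ["lowercase"],
            }
        }
    }
}


def email_query(field: str, address: str) -> dict:
    """Preferred: a match query against a field mapped with the analyzer above."""
    return {"match": {field: address}}
```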
52. An intelligent platform
● A well-designed User Journey: an increase in Search volume
● Content suggestion, incremental filters, advanced queries
● Liferay DXP: focus on application and content management
● Search engine: external scalability, dynamic advantages
54. Elasticsearch queries for any use case
● REST API for compound queries: Lucene with better syntax
● Match
● Multi Match
● Match All
● Query String
● Term / Terms
● Regexp
● Fuzzy
● Type
● Ids
● DisMax
● Range
● Exists
● Missing
● Prefix
● Wildcard
● Geo Distance
● Geo Distance Range
● Geo Bounding Box
● Geo Polygon
● More Like This
55. Fine-tune relevance rapidly...
... then bring it back into your Liferay search
GET /cars/transactions/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 10000
          }
        }
      }
    }
  },
  "aggs": {
    "single_avg_price": {
      "avg": { "field": "price" }
    }
  }
}
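To show what this request computes, the sketch below holds the same body as a Python dict and evaluates it against an in-memory list: keep transactions whose `price` is at least 10000, then average the `price` field. The helper `single_avg_price` and any sample data passed to it are hypothetical; a real cluster performs this work server-side.

```python
from statistics import mean
from typing import List, Optional

# The request body from the slide, as a Python dict.
REQUEST_BODY = {
    "query": {
        "constant_score": {
            "filter": {"range": {"price": {"gte": 10000}}}
        }
    },
    "aggs": {"single_avg_price": {"avg": {"field": "price"}}},
}


def single_avg_price(transactions: List[dict], body: dict = REQUEST_BODY) -> Optional[float]:
    """Emulate the range filter plus avg aggregation on local data."""
    range_clause = body["query"]["constant_score"]["filter"]["range"]
    range_field, bounds = next(iter(range_clause.items()))
    avg_field = body["aggs"]["single_avg_price"]["avg"]["field"]
    # Filter step: only documents at or above the lower bound survive.
    matching = [t for t in transactions if t[range_field] >= bounds["gte"]]
    # Aggregation step: average over the surviving documents, if any.
    return mean(t[avg_field] for t in matching) if matching else None
```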
56. Similarity suggestion: More Like This
● Liferay DXP: MoreLikeThisQuery
● User viewing blogs, documents, your own custom entities
● Automatically suggest related assets, based on content
● Full text and specific fields
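At the Elasticsearch level, this kind of suggestion corresponds to the `more_like_this` query. The sketch below builds one for the asset a user is currently viewing; the index name, document id, field names and tuning values are placeholders, and older Elasticsearch versions also expect a `_type` entry in each `like` item. How Liferay's `MoreLikeThisQuery` translates to this body is not shown in the slides.

```python
from typing import Sequence


def more_like_this(index: str, doc_id: str,
                   fields: Sequence[str] = ("title", "content")) -> dict:
    """Find documents whose text resembles an already indexed document."""
    return {
        "more_like_this": {
            "fields": list(fields),                       # fields to compare
            "like": [{"_index": index, "_id": doc_id}],   # the asset being viewed
            "min_term_freq": 1,       # consider terms that occur once in the source
            "max_query_terms": 12,    # cap on terms selected for the query
        }
    }
```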
63. The Digital Experience and Search
● Digital Transformation: not just index-and-find anymore
● Your User Journey will often start with a search
● Effective matches and refinements generate business
● With more user searches, underlying platform must scale
64. Liferay DXP: innovations in Search
● Elasticsearch: Lucene at core, improvements at every level
● Enterprise-grade Search with Shield, Marvel and Kibana
● Maximum scalability, decoupled from the DXP footprint
● Flexibility with modular API and OSGi extension points
65. Intelligent queries for all use cases
● DXP: many new filters and queries to mix and match
● Ultimate relevance with per-field analyzers, queries, boosts
● Geolocation and more native field type mappings
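A geolocation search can be sketched with the Elasticsearch `geo_distance` query used as a filter inside a `bool` query, so it narrows results without affecting relevance. The `location` and `title` field names and the default distance are assumptions for illustration; `location` would need a `geo_point` mapping.

```python
from typing import Optional


def near(lat: float, lon: float, distance: str = "10km",
         field: str = "location", keywords: Optional[str] = None) -> dict:
    """Restrict a search to documents within `distance` of a point."""
    geo = {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}
    # With keywords, score by text match; without, match everything in range.
    must = [{"match": {"title": keywords}}] if keywords else [{"match_all": {}}]
    return {"bool": {"must": must, "filter": [geo]}}
```

For example, `near(41.8781, -87.6298, keywords="symposium")` would look for matching titles within 10 km of downtown Chicago.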
66. Intelligent Search for your User Journey
● Leverage modularity for fast, extensible development
● Small components that modify queries and share results
● Aggregations, filters, boxes, lists, maps, custom UI portlets
● Search Pages tailored to the needs of your business
67. Thank you
- The Liferay Search Infrastructure Team -
@arbocombr
André de Oliveira ➤ Lead, Engineering (USA)
Tibor Lipusz ➤ SME, Support (Hungary)
Felipe Pires, Vitor Fernandes ➤ Design (Brazil)
Rodrigo Paulino ➤ Back-end (Brazil)
Jonathan Mak, Kevin Tan ➤ Front-end (USA)
Albert Lee, Brian Lee ➤ QA (USA)
Russell Bohl, Rich Sezov ➤ Tech writing (USA)
David Truong, Michael Han ➤ Product Management (USA)
http://j.mp/SearchLiferayNorthAmerica2016
andre.oliveira@liferay.com
github.com/arboliveira