My talk at the Swiss Elastic Meetup #20: https://www.meetup.com/elasticsearch-switzerland/events/237184939/
Elasticsearch (ES) is commonly known as a search and analytics engine. At the same time, the information retrieval techniques available in ES can deliver additional value to users by providing recommendations.
In my talk, I show how to employ ES to obtain various types of recommendations. We consider basic content-based techniques as well as hybrid ones involving automatic identification of user interests. Using our web app Graasp (http://graasp.net) as an example, I give ideas on how recommendations can be integrated into your product.
Multiple ways of building a recommender system with Elasticsearch - Elastic Meetup Switzerland - Andrii Vozniuk
1. The copyright of images belongs to their authors. Drop me a message at andrii@vozniuk.com to remove
Talk description: https://www.meetup.com/elasticsearch-switzerland/events/237184939/
MULTIPLE WAYS OF BUILDING A RECOMMENDER SYSTEM WITH ELASTICSEARCH
ANDRII VOZNIUK
REACT-EPFL
Elastic Meetup Lausanne, March 2017
3. WHY RECOMMENDATIONS
• Increase engagement
• Address information overload
• Improve information findability
  • Not aware of its existence
  • Do not know particular keywords
  • New content appearing
• Facilitate discovery of relevant content
  • Not only search or tags
10. GOALS
• Provide contextually relevant recommendations
• Should work for individual items and for spaces (collections of items)
• Will allow the user to discover contextually relevant content items or users
12. ELASTICSEARCH: COMPUTING RELEVANCE
STEP 1.
Represent each content item using the document vector model
STEP 2.
Compute TF-IDF for each term in the vectors
STEP 3.
Use vector cosine similarity for scoring and ranking
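The three steps above can be sketched in plain Python. This is a textbook TF-IDF and cosine similarity illustration, not Lucene's actual scoring formula (which adds further normalization, and Elasticsearch has used BM25 as its default similarity since version 5.0). The function names, the smoothed IDF variant, and the sample documents are assumptions made for this example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} vector per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption of this sketch) keeps weights positive.
        vectors.append({t: tf[t] * math.log(1 + n / df[t]) for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical content items, already tokenized.
docs = [
    "elasticsearch search engine".split(),
    "elasticsearch recommender system".split(),
    "cooking pasta recipe".split(),
]
vecs = tf_idf_vectors(docs)
# Rank the other items by similarity to the first one.
ranking = sorted(range(1, len(docs)), key=lambda i: cosine(vecs[0], vecs[i]), reverse=True)
```

Here the second item shares the term "elasticsearch" with the first and therefore ranks above the third, which shares no terms.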
14. ELASTICSEARCH: MORE LIKE THIS (MLT) QUERY
Source: More Like This Query https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-mlt-query.html
• Text-based
• Document Id-based
• Can be a combination of both
“The MLT query simply extracts the text from the input document, analyzes it, usually using the same analyzer at the field, then selects the top K terms with highest tf-idf to form a disjunctive query of these terms.”
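As a minimal sketch of the variants listed above, the helper below builds a More Like This request body that accepts free text, a document id, or both. The helper name `mlt_query`, the index name, and the field names are hypothetical; `fields`, `like`, `min_term_freq`, and `max_query_terms` are parameters of the `more_like_this` query. Older versions, such as the 2.0 reference linked on this slide, also expect a `_type` entry for document references, so it is left optional here.

```python
import json


def mlt_query(index, doc_id=None, text=None, doc_type=None,
              fields=("title", "description"), max_query_terms=25):
    """Build a More Like This request body from text, a document id, or both."""
    like = []
    if text is not None:
        like.append(text)  # text-based input
    if doc_id is not None:
        doc_ref = {"_index": index, "_id": doc_id}  # document id-based input
        if doc_type is not None:
            doc_ref["_type"] = doc_type  # only needed on older, typed versions
        like.append(doc_ref)
    if not like:
        raise ValueError("provide text, doc_id, or both")
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": like,
                "min_term_freq": 1,
                "max_query_terms": max_query_terms,
            }
        }
    }


# Combination of both inputs; "items" and "42" are made-up values.
body = mlt_query("items", doc_id="42", text="recommender systems")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request with any Elasticsearch client.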
15. ELASTICSEARCH: MORE LIKE THIS (MLT) LIMITATIONS
Source: Lucene MoreLikeThis.java
• Earlier (in 2016), when a document id was supplied, the text of all specified fields was concatenated and the search was run over all of those fields.
• There was no way to boost individual fields, although a match on the title can be more important than a match on the content.
• Now the query is done field by field: fields still cannot be boosted, and the desc field cannot be matched against the content field.
• We wanted cross-field matching with boosting.
16. USING SEARCH FOR RECOMMENDATIONS
Decided to concatenate the fields manually and use the match query
+ can boost fields
+ can do cross-field matching
+ can do cross-type matching
- slower
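One possible shape of this approach is sketched below. The slides do not show the actual query, so everything here is an assumption: the helper names, the field names, the boost values, and the choice to combine one boosted `match` clause per target field in a `bool` query. The concatenated text of the source item is matched against every target field, which is what allows a description to match against content.

```python
def concat_fields(item, fields):
    """Join the non-empty text fields of the source item into one string."""
    return " ".join(str(item[f]) for f in fields if item.get(f))


def recommendation_query(item, boosts, exclude_id=None):
    """Match the concatenated item text against each field with its own boost."""
    text = concat_fields(item, boosts.keys())
    should = [
        {"match": {field: {"query": text, "boost": boost}}}
        for field, boost in boosts.items()
    ]
    query = {"bool": {"should": should, "minimum_should_match": 1}}
    if exclude_id is not None:
        # Do not recommend the item the user is currently looking at.
        query["bool"]["must_not"] = [{"ids": {"values": [exclude_id]}}]
    return {"query": query}


# Hypothetical item and boosts: a title match counts more than a content match.
item = {"title": "Intro to ES", "description": "search engine", "content": ""}
boosts = {"title": 3.0, "description": 1.5, "content": 1.0}
body = recommendation_query(item, boosts, exclude_id="7")
```

Because every clause carries the full concatenated text, the query has more terms to evaluate than a field-by-field MLT query, which is consistent with the "slower" drawback noted above.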
18. GOALS
• Recommendations matching the user interests rather than the context
• The user should understand the recommender model (interpretability)
• The user should be able to adjust the recommender (interactive)
• In general, we wanted the user to understand and control the recommendations when needed
20. CONCEPT IDENTIFICATION PIPELINE
[Figure: concept identification pipeline]
Items on platform → Content Extraction → Extracted Text Content → Content Analysis → Identified Concepts → Content and Concepts Indexing → Indexed Identified Concepts and Text Content → Recommender System
Content types and extraction methods shown in the figure:
• Plain Text File
• Binary Text File (.pdf, .docx)
• Image with text (.png, .jpg, .tiff): Optical Character Recognition (Leptonica, Tesseract)
• Audio: Speech-To-Text
• Image: Visual Image Recognition
• Video: Visual Video Recognition
21. PROPOSAL: INTERESTS PROFILE
[Figure: building the user's interests profile]
• Content items: Pdf Report, Powerpoint Presentation, Image with Text, Youtube Video
• Identified Concepts (DC): the concepts identified in each content item
• Tracked Activities (UA): accessed, rated, commented, downloaded
• Identified User Concepts (UC): Σ w * UA * DC
• Example concepts: Education, Educational psychology, Knowledge, Learning, Knowledge Management, Human-Computer Interaction, Interdisciplinarity, Academia, Systems thinking, Scientific method, Educational technology, Virtual learning environment
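The slide gives only the formula Σ w * UA * DC, so the sketch below is one possible reading of it: each tracked activity (UA) on a content item contributes its weight (w) to every concept identified in that item (DC), and the sums form the user's concepts (UC). The weight values, the function name, and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights per activity type; the slides do not give values.
ACTIVITY_WEIGHTS = {"accessed": 1.0, "rated": 2.0, "commented": 2.0, "downloaded": 3.0}


def user_concepts(activities, doc_concepts, weights=ACTIVITY_WEIGHTS):
    """Aggregate concept scores over a user's tracked activities.

    activities:   list of (doc_id, activity_type) pairs (UA)
    doc_concepts: {doc_id: {concept: relevance}} (DC)
    Returns (concept, score) pairs sorted by descending score (UC).
    """
    scores = defaultdict(float)
    for doc_id, activity in activities:
        w = weights[activity]
        for concept, relevance in doc_concepts.get(doc_id, {}).items():
            scores[concept] += w * relevance
    # Sort by score, then by name so that ties are ordered deterministically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up content items with their identified concepts.
doc_concepts = {
    "report.pdf": {"Education": 1.0, "Learning": 1.0},
    "talk.pptx": {"Learning": 1.0, "Knowledge Management": 1.0},
}
activities = [("report.pdf", "accessed"), ("talk.pptx", "downloaded")]
profile = user_concepts(activities, doc_concepts)
```

Keeping the profile as a plain list of named concepts with scores is what makes it possible to show it to the user and let them adjust it, in line with the interpretability and interactivity goals.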
24. SUMMARY
DEMONSTRATED HOW TO USE ELASTICSEARCH FOR
• Contextual recommendations (relevant to the context)
• Personalized recommendations (relevant to the user)
• More Like This vs. common queries (e.g., match)
POSSIBLE EXTENSIONS
• Displaying highlights to explain the recommendations
• Using the Percolator to notify the user about new relevant content as it gets uploaded
• Alternative ways of constructing the user profile
• Trying collaborative filtering: user-user similarity can be implemented with Elasticsearch
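The Percolator extension listed above could look roughly like the sketch below: each user's interests are stored as a query, and every newly uploaded item is percolated against the stored queries to find the users to notify. This is an assumption about how the idea might be realized, not something shown in the talk. The helper names and field names are hypothetical, the index is assumed to map the `query` field with the `percolator` type, and `document_type` is left optional because only some older versions expect it.

```python
def interest_query_doc(user_id, concepts, field="content"):
    """Document to index: a stored query describing one user's interests."""
    return {"user_id": user_id, "query": {"match": {field: " ".join(concepts)}}}


def percolate_body(new_item, percolator_field="query", document_type=None):
    """Search body that finds the stored queries matching a new content item."""
    percolate = {"field": percolator_field, "document": new_item}
    if document_type is not None:
        percolate["document_type"] = document_type  # older, typed versions only
    return {"query": {"percolate": percolate}}


# Made-up user profile and newly uploaded item.
stored = interest_query_doc("u1", ["Learning", "Education"])
body = percolate_body({"content": "New course on educational technology"})
```

Each hit returned for `body` is a stored interest query, so its `user_id` field identifies a user who may want to hear about the new item.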