1. Query relaxation
A rewriting technique between search and recommendations
René Kriegler, @renekrie
Haystack - The Search Relevance Conference
24 April 2019
2. Query relaxation - a rewriting technique between search and recommendations, Haystack, 24 April 2019, © René Kriegler (@renekrie)
About me
More than 10 years of experience as a freelance search consultant, often in a role
for OpenSource Connections
Focus:
- Search relevance optimisation
- E-commerce search
- Solr
- Coaching teams to establish search within their organisation
Organiser of MICES - Mix-Camp E-commerce Search (Berlin, 19 June,
mices.co, right after Berlin Buzzwords)
Maintainer of Querqy (OSS query rewriting library - github.com/renekrie/querqy)
3. No results
4. No results - strategies
Apply synonyms and hyponyms (laptop = notebook; shoes => trainers)
Spelling correction (Did you mean ...? / We’ve searched for ...)
Also search in low-quality data fields
Loosen boolean constraints (AND -> OR, mm<100%)
Apply hypernyms (boots => shoes)
Use more distant semantic relation (beard balm => trimmer)
Show more general recommendations (related to user’s shopping history,
popular items)
5. No results - strategies
Apply synonyms and hyponyms
Spelling correction
Also search in low-quality data fields
Loosen boolean constraints
Apply hypernyms
Use more distant semantic relation
Show more general recommendations
Explainable? (in e-commerce search)
Don’t want to tell
mm: no; AND/OR: yes, but bad UX
Don’t need to tell
Can be hard
6. No results - Query relaxation
Explainable! (& conversational!)
7. Query relaxation
Which query term should be removed?
8. Query relaxation - intuition
iphone 9 => iphone 9
(*) iphone 9 => iphone 9
9. Query relaxation - intuition
iphone 9 plus => iphone 9 plus
(?) iphone 9 plus => iphone 9 plus
(?) iphone 9 plus => iphone 9 plus
10. Query relaxation - intuition
black boots => black boots
(*) black boots => black boots
11. Query relaxation - intuition
purple boots => purple boots
(?) purple boots => purple boots
12. Query relaxation - intuition
(?) usb charger 12v => usb charger 12v
(?) usb charger 12v => usb charger 12v
(?) usb charger 12v => usb charger 12v
13. Query intent & information need
Apply synonyms and hyponyms
Spelling correction
Also search in low-quality data fields
Loosen boolean constraints
Apply hypernyms
Use more distant semantic relation
Show more general recommendations
Trying to match original information need
Remotely related to user intent
Query relaxation
14. Query relaxation
“A popular approach to cope with empty-answers is query relaxation, which attempts to reformulate the
original query into a new query, by removing or relaxing conditions, so that the result of the new query
is likely to contain the items of interest for that user.” (Mottin et al., 2013)
“We present a method which we call relaxation for expanding deductive database and logic
programming queries. The set of answers obtained with the relaxation method includes both answers
deduced traditionally and answers related in some way with the original query. The relaxation method
expands the scope query by relaxing the constraints implicit in the query.” (Gaasterland et al., 1992)
“An extended query-document matching system is described in this study that relaxes the stringent
requirements of the conventional Boolean retrieval operations.” (Salton et al., 1983)
15. Query relaxation
=> How can we find the best query term to be removed from the query so that
“... the result of the new query is likely to contain the items of interest for that user”
“... answers [are] related in some way with the original query” ?
=> How can we test, compare and optimise solutions?
16. Online testing
Click-through-rate / hit rate
Exit rate / time spent on site
=> Do we manage to keep the user interacting with our site?
=> similar to recommendations / exploratory search
17. Finding the term to be dropped: data sets
Data sets for training and evaluation
Find pairs:
- a long query having 0 results
- a corresponding relaxed query having results
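The pair-finding step above could be sketched as follows. This is only an illustration: it assumes a hypothetical log format (a dict mapping each query string to its result count) and assumes that "relaxed" means exactly one term was dropped; neither detail is specified on the slide.

```python
def relaxed_candidates(query):
    """Return every query obtained by dropping exactly one term."""
    terms = query.split()
    return [" ".join(terms[:i] + terms[i + 1:]) for i in range(len(terms))]


def find_pairs(result_counts, min_terms=2):
    """Collect (zero-result query, relaxed query with results) pairs.

    result_counts is a hypothetical mapping: query string -> number of results.
    Queries missing from the mapping are treated as having no results.
    """
    pairs = []
    for query, hits in result_counts.items():
        if hits != 0 or len(query.split()) < min_terms:
            continue
        for candidate in relaxed_candidates(query):
            if result_counts.get(candidate, 0) > 0:
                pairs.append((query, candidate))
    return pairs


if __name__ == "__main__":
    counts = {"purple boots": 0, "boots": 120, "purple": 0}
    print(find_pairs(counts))
```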
18. Finding the term to be dropped: data sets
FREQ: Query frequencies
- Have we observed the original and the relaxed query before? (We want to
make sure that we produce a meaningful query.)
COOC: Query cooccurrences per session
- Have the original and rewritten query occurred together in a session?
=> Can we find the original/rewritten query pair in tracking data? How often?
(more often is better)
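A minimal sketch of how FREQ and COOC might be counted from tracking data. The input format (a list of sessions, each a list of query strings) and all function names are assumptions made for illustration; the slide does not describe the actual tracking schema.

```python
from collections import Counter


def query_frequencies(sessions):
    """FREQ: how often each query string was observed overall."""
    return Counter(query for session in sessions for query in session)


def session_cooccurrences(sessions):
    """COOC: in how many sessions two distinct queries occurred together."""
    cooc = Counter()
    for session in sessions:
        unique = sorted(set(session))
        for i, first in enumerate(unique):
            for second in unique[i + 1:]:
                cooc[(first, second)] += 1
    return cooc


def pair_stats(original, relaxed, freq, cooc):
    """Look up FREQ and COOC for one original/relaxed query pair."""
    key = tuple(sorted((original, relaxed)))
    return {
        "freq_original": freq[original],
        "freq_relaxed": freq[relaxed],
        "cooc": cooc[key],
    }
```

A pair that has never been observed simply gets zero counts, which fits the "more often is better" reading above.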
19. 0 - Drop random term (baseline)
Remove a random term from the query
20. 1 - Drop shortest term
Remove the shortest term from the query
21. 2 - Drop shortest non-alphabetical term
Remove the shortest term that doesn’t contain any alphabetical character
22. 3 - Combined 1 and 2
Remove the shortest term that doesn’t contain any alphabetical character, fall
back to removing shortest term if all terms have >=1 alphabetical character
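The heuristics 0 to 3 can be sketched in a few lines. This is an illustrative sketch, not the speaker's code: queries are assumed to be pre-tokenised lists of terms, and ties are broken in favour of the leftmost term, which the slides do not specify.

```python
import random


def drop_at(terms, index):
    """Return a copy of terms without the term at index."""
    return terms[:index] + terms[index + 1:]


def drop_random(terms, rng=random):
    """0 - baseline: remove a random term."""
    return drop_at(terms, rng.randrange(len(terms)))


def drop_shortest(terms):
    """1 - remove the shortest term (leftmost on ties, an assumption)."""
    index = min(range(len(terms)), key=lambda i: len(terms[i]))
    return drop_at(terms, index)


def drop_shortest_non_alpha(terms):
    """2 - remove the shortest term without any alphabetical character.

    Returns None if every term contains at least one alphabetical character.
    """
    candidates = [
        i for i, term in enumerate(terms)
        if not any(ch.isalpha() for ch in term)
    ]
    if not candidates:
        return None
    index = min(candidates, key=lambda i: len(terms[i]))
    return drop_at(terms, index)


def drop_combined(terms):
    """3 - try heuristic 2 first, fall back to heuristic 1."""
    relaxed = drop_shortest_non_alpha(terms)
    return relaxed if relaxed is not None else drop_shortest(terms)
```

For example, `drop_combined(["iphone", "9", "plus"])` removes `9`, while `drop_combined(["usb", "charger", "12v"])` falls back to the shortest term because `12v` contains a letter.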
23. 4/5 - Drop most/least frequent term
Remove the term with the highest/lowest index frequency
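A small sketch of approaches 4 and 5, assuming the index frequencies have already been exported into a plain dict (term -> frequency). The dict, the function name and the treatment of unknown terms as frequency 0 are assumptions for illustration only.

```python
def drop_by_frequency(terms, index_frequency, drop="most"):
    """Remove the term with the highest ("most") or lowest ("least") index frequency.

    index_frequency is a hypothetical mapping: term -> frequency in the index.
    Terms missing from the mapping count as frequency 0.
    """
    if drop not in ("most", "least"):
        raise ValueError("drop must be 'most' or 'least'")
    chooser = max if drop == "most" else min
    index = chooser(
        range(len(terms)),
        key=lambda i: index_frequency.get(terms[i], 0),
    )
    return terms[:index] + terms[index + 1:]
```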
24. 6/7 - Drop term with highest/lowest entropy
Remove the term with the highest/lowest entropy across navigational categories
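One way to read "entropy across navigational categories" is the Shannon entropy of the distribution of a term's matches over categories. The sketch below follows that reading; the per-term category counts, the base-2 logarithm and the function names are assumptions, not details given in the talk.

```python
import math


def category_entropy(category_counts):
    """Shannon entropy (in bits) of a term's distribution over categories."""
    total = sum(category_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in category_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def drop_by_entropy(terms, term_category_counts, drop="highest"):
    """Remove the term with the highest or lowest category entropy.

    term_category_counts is a hypothetical mapping:
    term -> {category name -> number of matching documents}.
    """
    if drop not in ("highest", "lowest"):
        raise ValueError("drop must be 'highest' or 'lowest'")
    chooser = max if drop == "highest" else min
    index = chooser(
        range(len(terms)),
        key=lambda i: category_entropy(term_category_counts.get(terms[i], {})),
    )
    return terms[:index] + terms[index + 1:]
```

A term such as a colour, which matches documents in many categories, gets a high entropy; a term that matches in one category only gets an entropy of 0.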
25. 8 - Keep most similar query (Word2vec)
Use the rewritten query that is most similar to the original query based on
Word2vec embeddings [as mentioned in D. Tunkelang, Query relaxation,
https://bit.ly/2ItxF3Z]
26. Word2vec (CBOW)
[Figure: CBOW over a sequence of words. Input: the context words w(t-2), w(t-1), w(t+1), w(t+2) (pepe, jeans, slim, cut); projection layer; output: w(t) (london).]
27. 8 - Keep most similar query (Word2vec)
Use the rewritten query that is most similar to the original query based on
Word2vec embeddings
Train Word2Vec embeddings
- word = query term, window = query
- 300 dimensions
Use sum of word(=term) vectors to represent the queries (original/rewritten)
Calculate cosine similarity between original query and each rewritten query
Use rewritten query that is most similar to the original query
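The scoring steps above (sum of term vectors, cosine similarity, pick the best candidate) could look roughly like this. The sketch assumes the embeddings are already trained and available as a plain dict of term -> vector, that each candidate drops exactly one term, and that unknown terms contribute nothing to the sum; the training itself is not shown.

```python
import math


def sum_vectors(terms, embeddings, dims):
    """Represent a query as the sum of its term vectors."""
    total = [0.0] * dims
    for term in terms:
        vector = embeddings.get(term)
        if vector is None:  # assumption: unknown terms are skipped
            continue
        for i, value in enumerate(vector):
            total[i] += value
    return total


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_relaxation(terms, embeddings, dims):
    """Return (relaxed terms, similarity) for the best one-term-dropped query."""
    original = sum_vectors(terms, embeddings, dims)
    best, best_similarity = None, -math.inf
    for i in range(len(terms)):
        candidate = terms[:i] + terms[i + 1:]
        similarity = cosine(original, sum_vectors(candidate, embeddings, dims))
        if similarity > best_similarity:
            best, best_similarity = candidate, similarity
    return best, best_similarity
```

In the talk the vectors have 300 dimensions; `dims` is a parameter here so that the sketch can be tried with toy vectors.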
28. 8 - Keep most similar query (Word2vec)
Use the rewritten query that is most similar to the original query based on
Word2vec embeddings
29. 9 - Keep most similar query (Query2vec)
Use the rewritten query that is most similar to the original query based on query
embeddings
[Grbovic et al., Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising. SIGIR
2016]
30. ‘Query2vec’ (CBOW)
[Figure: CBOW over the queries in a session. Input: the context queries q(t-2), q(t-1), q(t+1), q(t+2) (smartphone, smartphone 64g, iphone, iphone 64g); projection layer; output: q(t) (galaxy 64g).]
31. 9 - Keep most similar query (Query2vec)
Use the rewritten query that is most similar to the original query based on Query
embeddings
32. 10 - MNN with Word2vec input
Predict the term to be dropped using a multi-layer neural network (MNN) with
Word2vec embeddings as input.
33. 10 - MNN with Word2vec input
[Figure: network input for the query “nike boots 11”. Eight term slots, each holding one Word2vec vector (components labelled 0 to 300):
nike: 0: 0.01, 1: -0.94, ..., 300: 0.18
boots: 0: 0.63, 1: 0.56, ..., 300: 0.04
11: 0: -0.59, 1: 0.02, ..., 300: 0.77
five empty slots: 0: 0.00, 1: 0.00, ..., 300: 0.00
2 hidden layers. Output: 0 1 0 0 0 0 0 0 (one value per term slot)]
34. 10 - MNN with Word2vec input
Predict the term to be dropped using a multi-layer neural network (MNN) with
Word2vec embeddings as input
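The input and output layout shown in the diagram for “nike boots 11” might be prepared as below. This is a sketch of the encoding only, not of the network: the eight term slots are read off the diagram, and the zero vector for unknown terms, the function names and the `dims` parameter are assumptions.

```python
MAX_TERMS = 8  # assumption: number of term slots read off the slide diagram


def encode_query(terms, embeddings, dims, max_terms=MAX_TERMS):
    """Concatenate one embedding per term slot, zero-padding unused slots."""
    if len(terms) > max_terms:
        raise ValueError("query has more terms than available slots")
    flat = []
    for slot in range(max_terms):
        if slot < len(terms):
            # assumption: terms without an embedding become a zero vector
            flat.extend(embeddings.get(terms[slot], [0.0] * dims))
        else:
            flat.extend([0.0] * dims)
    return flat


def encode_target(drop_index, max_terms=MAX_TERMS):
    """One value per term slot; 1 marks the slot whose term is dropped."""
    return [1 if slot == drop_index else 0 for slot in range(max_terms)]
```

With this layout, the network's output has one unit per slot, so the predicted term to drop is the slot with the highest output value among the slots that actually hold a term.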
35. 11 - MNN / Word2vec plus wordshape
Predict the term to be dropped using a multi-layer neural network (MNN) with
Word2vec embeddings and wordshape features as input.
Add additional dimensions to the input vector:
- Word length
- Number of digits
- Does the word have an ‘e’ in the penultimate or ultimate position?
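The three wordshape features listed above are simple to compute. The sketch below assumes lower-cased query terms and returns the features as floats so they can be appended to a term's embedding; the function name is hypothetical.

```python
def wordshape_features(term):
    """Return [word length, number of digits, 'e' in last or second-to-last position]."""
    length = float(len(term))
    digits = float(sum(ch.isdigit() for ch in term))
    has_late_e = 1.0 if "e" in term[-2:] else 0.0
    return [length, digits, has_late_e]


if __name__ == "__main__":
    for term in ["nike", "boots", "11"]:
        print(term, wordshape_features(term))
```

For the example query “nike boots 11” this yields 4/0/1, 5/0/0 and 2/2/0, which agrees with the values shown for dimensions 301 to 303 in the diagram of the network input.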
36. 11 - MNN / Word2vec plus wordshape
[Figure: network input for the query “nike boots 11”, with three wordshape dimensions (301 to 303) appended to each of the eight term slots:
nike: 301: 4.00, 302: 0.00, 303: 1.00
boots: 301: 5.00, 302: 0.00, 303: 0.00
11: 301: 2.00, 302: 2.00, 303: 0.00
five empty slots: 301: 0.00, 302: 0.00, 303: 0.00
2 hidden layers. Output: 0 1 0 0 0 0 0 0 (one value per term slot)]
37. 11 - MNN / Word2vec plus wordshape
Predict the term to be dropped using a multi-layer neural network (MNN) with
Word2vec embeddings and wordshape features as input.
38. 11/12 - MNN / Word2vec plus term stats
Predict the term to be dropped using a multi-layer neural network (MNN) with
Word2vec embeddings and per-field DF or index frequency.
39. Conclusion
Query relaxation:
- best understood as a query recommendation
- information need not necessarily matched but relaxed query still related to
user intent
- can be communicated nicely to the user (‘conversational’)
Best approach to find term to be dropped:
- Multi-layer neural network with Word2vec plus wordshape features as inputs. It can be extended to incorporate further features and optimisation targets.
40. Thank you!
http://www.rene-kriegler.com
@renekrie