Gain insight into the state-of-the-art deep learning algorithms powering e-commerce search at Target, and learn how to customize Solr to blend multiple ML signals at scale.
Speakers:
Aashish Dattani, Lead Data Engineer, Target
Richard Wang, Principal AI Scientist, Target
Sunil Srinivasan, Lead Engineer, Target
3. Target
• 1,855 stores in the United States
• 39 distribution centers in the United States
• 350,000+ team members worldwide
• Online business at target.com
• Global offices in China, Hong Kong, and India
4. About us
Sunil Srinivasan, Lead Engineer
Aashish Dattani, Lead AI Engineer
Richard Wang, Principal AI Engineer
5. Agenda
• Solr at Target
• Architecture Overview
• Solr Components
• Deep Learning
6. Moved away from a proprietary engine to Solr
Growing index by the day
Highly performant engine
Customized for relevancy and store availability
5 YEARS ON SOLR
2+ MILLION SKUS
P95
8. Querying Solr
Searchable attributes using the eDisMax query parser
• Title - Women's Sling Backpack - Universal Thread
• Category - Women > Women's Accessories > Handbags > Fashion Backpacks
• Item Type - Backpacks
• Description - Keep your essentials close at hand with this Sling Backpack from Universal Thread™.
• Augmented/normalized data
  – feet to ft, quart to qt, in to inch, ″ to inch, etc.
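A minimal sketch of what such an eDisMax request might look like, together with the unit normalization step. The field names and boosts here are illustrative assumptions, not Target's actual schema:

```python
# Hypothetical eDisMax parameters querying the searchable attributes above.
# Field names (title, category, item_type, description) and boost weights
# are assumptions for illustration.
params = {
    "defType": "edismax",
    "q": "sling backpack",
    "qf": "title^4 category^2 item_type^2 description^1",
}

# Unit normalization applied to query tokens before searching
# (mapping table is illustrative, following the slide's examples).
synonyms = {"feet": "ft", "quart": "qt", "in": "inch", "\u2033": "inch"}
normalized = " ".join(synonyms.get(tok, tok) for tok in "2 quart pot".split())
# normalized == "2 qt pot"
```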
9. Querying Solr
RECALL AND PRECISION CONTROLLED BY A COMBINATION OF
Category/attribute classification (bq parameter)
– "student desk" belongs to the `desks` category/SKU hierarchy
Filtering based on attributes (fq parameter)
– "student desk" restricts to the `desks`, `hutch tops`, `kids desk` categories
Elevate to show a list of the most popular items (customized component)
– query to popular SKU based on ranking signal
Precision component that filters out SKUs based on a threshold
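The bq/fq mechanisms above can be sketched as plain Solr request parameters. Category tokens and boost values are illustrative assumptions:

```python
# Hypothetical Solr parameters for the query "student desk", combining
# the recall/precision controls above. Category names (with underscores)
# and boosts are assumptions.
params = {
    "defType": "edismax",
    "q": "student desk",
    # bq: classifier-predicted categories boost matching documents (recall)
    "bq": ["category:desks^10", "category:kids_desk^5"],
    # fq: hard filter restricting results to predicted categories (precision)
    "fq": "category:(desks OR hutch_tops OR kids_desk)",
}
```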
10. Solr Components
CUSTOM COMPONENTS
TO IMPROVE RELEVANCY, WE USE A COMBINATION OF CUSTOMIZED POST FILTERS AND COMPONENTS
• Precision Control (post filter)
• Score Combination Function (post filter)
• Custom Elevate (component)
11. Precision Control
TWO-PASS PROBLEM
Filter out documents based on score distribution. This requires us to do two passes!
SOLUTION
The post-filter API has collect() and finish() methods. Do the first pass in collect() and the second pass in finish().
[Chart: score vs. doc rank, with a 40% cutoff on the score distribution]
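The collect()/finish() idea can be sketched in Python. Solr's real post-filter API is a Java DelegatingCollector; this is only a sketch of the two-pass logic, with the 40% threshold taken from the slide's chart:

```python
# Python sketch of the two-pass precision post filter: collect() buffers
# each (doc, score) pair, and finish() computes the score-distribution
# cutoff and forwards only documents above it.
class PrecisionPostFilter:
    def __init__(self, fraction=0.4):
        self.fraction = fraction
        self.buffered = []

    def collect(self, doc_id, score):
        # First pass: just remember the document and its score.
        self.buffered.append((doc_id, score))

    def finish(self):
        # Second pass: keep docs scoring above fraction * max_score.
        if not self.buffered:
            return []
        cutoff = self.fraction * max(s for _, s in self.buffered)
        return [d for d, s in self.buffered if s >= cutoff]

f = PrecisionPostFilter(fraction=0.4)
for doc, score in [(1, 10.0), (2, 6.0), (3, 3.0), (4, 1.0)]:
    f.collect(doc, score)
survivors = f.finish()  # cutoff is 4.0, so docs 1 and 2 survive
```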
14. Combining scores
DIFFERENT SCORING FUNCTIONS
• Linear weighted combination: w₁s₁ + w₂s₂ + … + w_N s_N
• Polynomial combination: w₁s₁^n₁ + w₂s₂^n₂ + … + w_N s_N^n_N
• Step functions
  – Different functions based on score tier
  – Each tier optimizes for a different metric
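The three scoring families above can be sketched as follows; the weights, exponents, and tiers are illustrative, not the tuned production values:

```python
# Sketches of the three score-combination families.
def linear(scores, weights):
    return sum(w * s for w, s in zip(weights, scores))

def polynomial(scores, weights, exponents):
    return sum(w * s ** n for w, s, n in zip(weights, scores, exponents))

def step(score, tiers):
    # tiers: list of (threshold, fn), checked in order; first match wins,
    # so each tier can apply a function tuned for a different metric.
    for threshold, fn in tiers:
        if score >= threshold:
            return fn(score)
    return 0.0

combined = linear([0.8, 0.5], [2.0, 1.0])             # 2*0.8 + 1*0.5 = 2.1
boosted = polynomial([0.8, 0.5], [2.0, 1.0], [2, 1])  # 2*0.64 + 0.5 = 1.78
```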
15. Signal sources
LOOKING UP VALUES
• Category/Brand/Attribute boost → Reverse index
  – e.g. brand:goodfellow^20
• SKU-level query-dependent boost → Reverse index
  – e.g. sku:1145367 is the top-selling SKU for a given query
• SKU-level query-independent boost → Forward index (docValues)
  – e.g. sku:1145367 boosted based on newness
16. Elevate component
DESCRIPTION
• Force certain results to the top of the ranking order
• Takes precedence over other sort profiles (e.g. score)
LIMITATIONS
• Can only read from a static .xml file
• Does not allow for reading ranks from different sources
18. Custom Elevate
CUSTOMIZED FEATURES
• Bury SKUs to the bottom of the result list
• Input elevated values via URL parameters
  – e.g. …&elevate=sku:1,sku:2,sku:3&bury=sku:10,sku:11
• Read elevate signals from docValues (forward lookup)
  – e.g. store availability, etc.
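The elevate/bury reordering can be sketched in Python; the parameter format follows the slide's example, while the parsing details are assumptions:

```python
# Sketch of the custom elevate/bury logic: parse the URL-style parameters
# and reorder a scored result list of SKU ids.
def apply_elevate_bury(results, elevate_param, bury_param):
    elevated = [v.split(":")[1] for v in elevate_param.split(",") if v]
    buried = {v.split(":")[1] for v in bury_param.split(",") if v}
    # Elevated SKUs go first (in the order given), buried SKUs go last,
    # everything else keeps its relevance order in between.
    middle = [r for r in results if r not in elevated and r not in buried]
    bottom = [r for r in results if r in buried]
    return elevated + middle + bottom

ranked = ["10", "2", "7", "1", "11"]
out = apply_elevate_bury(ranked, "sku:1,sku:2", "sku:10,sku:11")
# out == ["1", "2", "7", "10", "11"]
```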
19. Query Understanding
Objective: to accurately and fully understand user intent (in terms of product attributes) based on the input search query.
Example query: "c9 running shoes for boys"
• Brand: C9 Champion
• Gender: male
• Item type: athletic shoes, sneakers
• Age group: kids, toddler, junior
• Material: polyester, plastic, nylon
We treat this as a classification problem, and we designed a classification framework that, for each product attribute, can automatically generate a model to classify any query into that attribute.
20. Query Classification Overview
First, we gather abundant training data:
1. User searches → behavior data (click, add to cart, purchase, etc.)
2. Product attributes (categories, colors, sizes, brands, gender, etc.)
Second, we train machine-learned models (per attribute).
Training data consists of a list of (query, attribute value) pairs:
• For the category attribute: ("shoes", athletic shoes), ("shoes", sneakers), etc.
During prediction (serving) time:
Input: any search query (e.g. "student desk")
Output: a list of predicted attribute values (e.g. desks, kids desk, hutch tops, etc.), each with a probability, that are passed to Solr via the bq, fq, and a custom parameter.
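Turning the classifier output into Solr parameters might look like the sketch below. The boost scaling (probability times a base weight) and the probability floor for filtering are assumptions, not the talk's actual formula:

```python
# Sketch: convert (attribute value, probability) predictions into bq/fq
# parameters. bq_weight and fq_floor are illustrative assumptions.
def predictions_to_params(predictions, bq_weight=20, fq_floor=0.3):
    # Soft boosts scaled by the classifier's confidence.
    bq = [f"category:{v}^{prob * bq_weight:.1f}" for v, prob in predictions]
    # Hard filter only on sufficiently confident predictions.
    confident = [v for v, prob in predictions if prob >= fq_floor]
    fq = "category:(" + " OR ".join(confident) + ")" if confident else None
    return {"bq": bq, "fq": fq}

params = predictions_to_params([("desks", 0.7), ("kids_desk", 0.2)])
# params["bq"] == ["category:desks^14.0", "category:kids_desk^4.0"]
# params["fq"] == "category:(desks)"
```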
22. Training Data Preparation
We use (1) clickstream and (2) product attribute data:
(1) Search query → product SKUs clicked/carted/purchased
  – Past 2 years of clickstream data, 1.5M+ unique queries post-filtering
(2) Product SKU → product attribute values
  – Attributes (categories, gender, brands, etc.) are from Target's item catalog (2M+ SKUs)
Combining (1) and (2) above, we get:
• Search query → list of attribute values, each with a score
• The score of an attribute value V given a query Q is:
  – P(V | Q) = (# of times V is clicked/carted/purchased for query Q) / (total # of occurrences of Q)
  – For the category attribute and the query "running shoes":
    athletic shoes (0.5), sneakers (0.2), … sandals (0.01)
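The join of clickstream and catalog data can be sketched as follows; the sample events and catalog here are made up for illustration:

```python
from collections import Counter, defaultdict

# Sketch of the training-data join: clickstream events give (query, sku)
# engagements, the catalog maps sku -> attribute value, and the score is
# the P(V | Q) fraction defined above. Data is illustrative.
catalog = {"sku1": "athletic shoes", "sku2": "sneakers", "sku3": "sandals"}
events = [("running shoes", "sku1"), ("running shoes", "sku1"),
          ("running shoes", "sku2"), ("running shoes", "sku3")]

def attribute_scores(events, catalog):
    per_query = defaultdict(Counter)
    totals = Counter()
    for query, sku in events:
        per_query[query][catalog[sku]] += 1
        totals[query] += 1
    return {q: {v: c / totals[q] for v, c in vals.items()}
            for q, vals in per_query.items()}

scores = attribute_scores(events, catalog)
# scores["running shoes"]["athletic shoes"] == 0.5  (2 of 4 events)
```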
23. Neural Model
Training
Our hyperparameters:
Embedding dimension: d = 100
Region sizes (n-grams): 1, 2, 3, 4, 5
Filters per region: 64
Drop-out rate: 0.2
Max tokens per query: 10
# of output classes: varies depending on attribute
[Diagram: CNN over the tokenized example query "room essentials full size bedding sheet set"]
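A forward-pass sketch of a text CNN with the hyperparameters listed above (random weights, so the output is meaningless; dropout applies only during training and is omitted here). This is an illustration of the architecture, not the trained model:

```python
import numpy as np

# Random-weight forward pass of the text CNN: embed up to 10 tokens at
# d=100, convolve with region sizes 1-5 (64 filters each), max-pool over
# positions, concatenate, and softmax over the output classes.
rng = np.random.default_rng(0)
d, max_tokens, filters = 100, 10, 64
regions, n_classes = (1, 2, 3, 4, 5), 4000  # ~4K category classes

vocab = {w: i for i, w in enumerate(
    "room essentials full size bedding sheet set".split())}
embeddings = rng.normal(size=(len(vocab), d))

def forward(query):
    ids = [vocab[w] for w in query.split()][:max_tokens]
    x = embeddings[ids]                          # (tokens, d)
    pooled = []
    for n in regions:
        w = rng.normal(size=(n * d, filters)) * 0.01
        # Slide an n-gram window over the token embeddings, then max-pool.
        windows = [x[i:i + n].reshape(-1) @ w
                   for i in range(len(ids) - n + 1)]
        pooled.append(np.max(windows, axis=0))   # (filters,)
    h = np.concatenate(pooled)                   # (len(regions)*filters,)
    logits = h @ (rng.normal(size=(h.size, n_classes)) * 0.01)
    p = np.exp(logits - logits.max())
    return p / p.sum()

probs = forward("room essentials full size bedding sheet set")
```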
24. Evaluation Metrics
Precision of a query: # of correct predicted attribute values over the total # of predictions for that query from the classifier
• The higher the precision, the more accurate the predictions are.
Recall of a query: # of correct predicted attribute values over the total # of attribute values for that query in the test set
• The higher the recall, the better the coverage of the attribute values in the test set.
Top-N accuracy:
• For a query, if any of the top N predictions is relevant, it scores 1; otherwise 0.
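The per-query metrics above can be sketched directly (sample predictions are illustrative):

```python
# Per-query evaluation metrics as defined above.
def precision(predicted, relevant):
    return len(set(predicted) & set(relevant)) / len(predicted)

def recall(predicted, relevant):
    return len(set(predicted) & set(relevant)) / len(relevant)

def top_n_accuracy(predicted, relevant, n):
    # 1 if any of the top N predictions is relevant, else 0.
    return 1 if set(predicted[:n]) & set(relevant) else 0

p = precision(["desks", "chairs"], ["desks"])            # 1/2 = 0.5
r = recall(["desks", "chairs"], ["desks", "hutch"])      # 1/2 = 0.5
acc = top_n_accuracy(["chairs", "desks"], ["desks"], 2)  # 1
```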
Experimental settings (category attribute): 1.5M train queries, 12K dev queries, 12K test queries, ~4K classes.
27. Evaluation Results
F1 score is the harmonic mean of precision and recall.
In our experiments, models with more parameters achieved better F1 scores.
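The F1 definition in one line:

```python
# F1 as the harmonic mean of precision and recall.
def f1(precision, recall):
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = f1(0.9, 0.9)  # 0.9
```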
28. Takeaway
• Our classifiers achieve precision and recall above 90%, and top-5 accuracy above 96%
• With the classification pipeline, a new model can be automatically generated for any attribute within 18 hours
• By using state-of-the-art neural network techniques in conjunction with customized Solr components, we have improved our search relevancy by more than 20%
31. STAY CONNECTED
Twitter @activate_conf
Facebook @activateconf
#Activate19
Log in to wifi, follow Activate on social media,
and download the event app where you can
submit an evaluation after the session
WIFI NETWORK: Activate2019
PASSWORD: Lucidworks
DOWNLOAD THE ACTIVATE 2019 MOBILE APP
Search Activate2019 in the App/Play store
Or visit: http://crowd.cc/activate19