● Search is one of the most important discovery tools in e-commerce.
● It powers other features such as merchandising (promotions) and recommendations.
● It accounts for a large fraction of units sold and gross merchandise value (GMV).
● Important signals that affect search: price, offers, popularity, availability, serviceability, etc.
● Used in ranking of products.
● Exposed as filters and sorts to end users.
● These signals are very dynamic, particularly during sales.
● E-commerce search != web search.
● Documents have a structure to them
● Queries have an implicit structure
● Challenges:
○ Large document collection with a long heavy tail
○ Extremely high rate of changes/updates (Thousands per sec)
○ Geo specific ranking
○ Multi-objective optimization (GMV, Units, Ads revenue, Long Term Value)
● Opportunities:
○ Broad queries: personalization can play a huge role
● Queries per day: XXX Millions / week
● Latencies:
○ Average: ~ 100 ms
○ Median: ~ 50 ms
○ 90th percentile: ~ 500 ms
● Documents retrieved and scored from index:
○ Median: 1K to 10K
○ 95th percentile: 200K to 500K
○ 99th percentile: 500K to 3M+
● Search CTR: Around 50%
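The latency figures above show an average that is roughly twice the median, which is what a long-tailed distribution looks like. A minimal sketch of how such a summary could be computed follows; the sample values are invented for illustration and `latency_summary` is a hypothetical helper, not part of the platform described in the slides.

```python
import statistics


def latency_summary(samples_ms):
    """Return mean, median and 90th percentile (nearest-rank) of latencies."""
    ordered = sorted(samples_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank method: smallest value with at least p% of samples at or below it.
        rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p90": percentile(90),
    }


# Invented samples: nine fast requests and one slow outlier.
samples = [40, 45, 50, 50, 50, 55, 60, 70, 80, 500]
summary = latency_summary(samples)
print(summary)
```

A single slow request out of ten is enough to pull the mean well above the median, which is why tail percentiles are usually reported alongside the average.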
● Architectural overview of the search platform
○ Serving and Ingestion
○ Serving functional view
○ Serving architectural view
○ Ingestion architectural view
○ Example ingestion topology
● Search quality
○ Challenges
○ Life of a query: Typical flow for query understanding
○ Illustrative problems
● 1,000,000 compute cores
● 2.56 petabytes RAM
● 120 petabytes disk storage
● 1 petabyte NVMe SSD
● 128 Tbps bisection bandwidth Clos network
Serving: Arch View
● Query Rewriter (Spell Check, Concept, NLP, Intent, Augmentation, Retrieval/Scoring query formulation)
○ Pluggable rewriter modules
● Reverse Proxy (Geo Coding, User Context, Caching, Isolation, Rate Limit, Tee-off test framework)
● Search Broker (Distributed search across shards, blending of results from shards)
● Searcher (Matching, Scoring, Faceting, Top-K Retrieval (pass-1 ranking))
○ Text index, NRT index, Metadata
● Re-ranking (Pass-2 Ranking) - ML model
○ Pluggable ranking models
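The broker and two-pass ranking described above can be sketched as a scatter-gather followed by a re-rank. This is only an illustration under stated assumptions: shards are modelled as in-memory dictionaries, and `search_shard`, `broker_search`, `rerank` and the toy model are hypothetical names, not the platform's API.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Hit = Tuple[float, str]  # (pass-1 score, product_id)


def search_shard(shard: Dict[str, float], k: int) -> List[Hit]:
    """Searcher: return this shard's local top-k by its cheap pass-1 score."""
    return heapq.nlargest(k, ((score, pid) for pid, score in shard.items()))


def broker_search(shards: List[Dict[str, float]], k: int) -> List[Hit]:
    """Broker: query every shard, then blend the partial lists into a global top-k."""
    partial_results = [search_shard(shard, k) for shard in shards]
    return heapq.nlargest(k, (hit for hits in partial_results for hit in hits))


def rerank(hits: List[Hit], model: Callable[[str, float], float]) -> List[Hit]:
    """Pass-2: reorder the small candidate set with a more expensive model."""
    return sorted(hits, key=lambda hit: model(hit[1], hit[0]), reverse=True)


# Toy data: two shards with precomputed pass-1 scores.
shards = [{"a": 0.9, "b": 0.2}, {"c": 0.7, "d": 0.8}]
candidates = broker_search(shards, k=2)
# Hypothetical pass-2 model that boosts product "d".
final = rerank(candidates, lambda pid, score: score + (0.5 if pid == "d" else 0.0))
print(candidates, final)
```

Asking each shard for its own top-k guarantees that the global top-k is contained in the union of the partial lists, so the expensive model only ever sees k candidates.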
● Marketplace
○ Catalog entries vary in quality from seller to seller. Spam is rampant.
● Diversity of users
● Mobile-heavy users: limited screen real estate on the UI
● Poor internet connectivity
● Literacy/Internet awareness
● Language
● Economic power
● Regional preferences
Abstraction: City-tier
● Query/Intent Solicitation
● Result Presentation
● Product Ranking
● 40% increase in proportion of tier-3 customers vis-a-vis metro
● Query: samsang. Relative ratio of query, Tier-3 vs Metro: 1.8
● Query: jins. Relative ratio of query, Tier-3 vs Metro: 2.2
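The slides do not define how the relative ratio is computed. One plausible reading, assumed here, is the query's share of traffic in tier-3 cities divided by its share in metros. The sketch below uses that assumption; the counts are invented, chosen only so the result matches the 1.8 quoted for "samsang".

```python
def relative_ratio(count_tier3, total_tier3, count_metro, total_metro):
    """Share of a query in tier-3 traffic divided by its share in metro traffic.

    Assumed definition: a value above 1 means the query is over-represented
    in tier-3 cities relative to metros.
    """
    share_tier3 = count_tier3 / total_tier3
    share_metro = count_metro / total_metro
    return share_tier3 / share_metro


# Invented counts for illustration only.
ratio = relative_ratio(count_tier3=180, total_tier3=10_000,
                       count_metro=100, total_metro=10_000)
print(round(ratio, 2))
```

Normalising by total traffic per tier matters because the tiers differ in size; raw counts alone would mostly reflect how many users each tier has.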
Query Understanding (from Query to Scoring)
● Init Context
● Normalisation (index time as well)
○ String clean-up
○ Lowercasing
● Spell Correction
○ Resource-based
○ Term -> term
○ Query -> query
○ Online
● Phrasing (index time as well)
○ Frequent bi/tri-grams
● Stemming (index time as well)
○ Core e-commerce stemmer
○ Plurals
● Synonyms
○ Resource-based
● Intent
○ Deductions
○ Tagging (CRF)
● Query Rewrite
○ Best query selection
○ Partial match
● Output Generator
● SOLR interface
● Retrieval and ranking logic
● Common MetaData Store (Query Level)
○ Raw data: metrics (CTR, Impression, NDCG…)
○ Derived data: Store, LM score, Features
● Supporting components: Store Classifier, Query LM, Feature Store, Classification
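The first few stages of this flow (normalisation, resource-based spell correction, phrasing, plural stemming) can be sketched as a chain of small functions. Everything below is a toy under stated assumptions: `SPELL_MAP` and `PHRASES` are hypothetical resources, and the plural stripper is deliberately naive rather than the "core e-commerce stemmer" the slides mention.

```python
import re

# Hypothetical resources; a real system would mine these from logs and catalog data.
SPELL_MAP = {"samsang": "samsung"}   # term -> term corrections
PHRASES = {("red", "tape")}          # frequent bigrams to keep together


def normalise(query):
    """String clean-up and lowercasing, then whitespace tokenisation."""
    cleaned = re.sub(r"[^a-z0-9\s-]", " ", query.lower())
    return cleaned.split()


def spell_correct(tokens):
    """Resource-based term -> term correction."""
    return [SPELL_MAP.get(token, token) for token in tokens]


def phrase(tokens):
    """Merge adjacent tokens that form a known frequent bigram."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            merged.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def stem(tokens):
    """Naive plural stripping; multi-word phrases are left untouched."""
    def strip_plural(token):
        if " " in token or len(token) <= 3:
            return token
        if token.endswith("s") and not token.endswith("ss"):
            return token[:-1]
        return token
    return [strip_plural(token) for token in tokens]


def understand(query):
    """Run the stages in the order listed on the slide."""
    tokens = normalise(query)
    for stage in (spell_correct, phrase, stem):
        tokens = stage(tokens)
    return tokens


print(understand("Red Tape SHOES!!"))
print(understand("samsang cases"))
```

Keeping each stage as a function from tokens to tokens mirrors the pluggable rewriter modules in the serving view: a stage can be added, removed or reordered without touching the others.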
• Special patterns:
– Segmented words: lgnexus5
• Counting: “samsang” & no-click followed by “samsung” & click, a million times
– Context-aware counting
• Language modeling and edit distance
• Term-to-vector models in deep learning
Spectrum: Specific → General
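The counting idea above (a query with no click, followed by a reformulated query with a click) can be sketched as follows. The session format, thresholds and function names are assumptions made for illustration; the edit-distance gate is a simple way to combine counting with the edit-distance signal the slide also lists.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance via the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def mine_corrections(sessions, min_count=2, max_distance=2):
    """Count (no-click query -> clicked query) pairs and keep plausible typos.

    Each session is a list of (query, clicked) tuples in time order.
    """
    counts = Counter()
    for events in sessions:
        for (first, first_clicked), (second, second_clicked) in zip(events, events[1:]):
            if not first_clicked and second_clicked and first != second:
                counts[(first, second)] += 1

    corrections = {}
    for (bad, good), n in counts.most_common():
        if n >= min_count and edit_distance(bad, good) <= max_distance:
            corrections.setdefault(bad, good)  # keep the most frequent target
    return corrections


# Invented sessions: a typo being fixed, and an unrelated change of mind.
sessions = (
    [[("samsang", False), ("samsung", True)]] * 3
    + [[("shoes", False), ("watches", True)]] * 3
)
print(mine_corrections(sessions))
```

Without the distance gate, the "shoes" to "watches" pair would also be learned as a correction, which is one reason raw counting needs context or a similarity check.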
● Intent: From query tokens to (implicit) attributes that are represented by those tokens
● Examples:
○ “red tape shoes” -> (brand) “red tape” (store) “shoes”
○ “kids party dress 4-5 years pack of 2” -> (ideal_for) “kids” (occasion) “party” (store) “dress” (size) “4-5 years” (pack_of) “pack of 2”
○ “samsung e6 cases” -> (compatible_with) “samsung e6” (store) “cases”
● Memorization, Language modeling, CRF
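Of the three techniques listed, memorization is the simplest to illustrate: look up the longest known span at each position in a lexicon of previously seen attribute values. The sketch below is an assumption-laden toy; `ATTRIBUTE_LEXICON` is built by hand from the slide's own examples, and `tag_intent` is a hypothetical name.

```python
# Hand-built lexicon taken from the examples on the slide (span -> attribute).
ATTRIBUTE_LEXICON = {
    "red tape": "brand",
    "shoes": "store",
    "kids": "ideal_for",
    "party": "occasion",
    "dress": "store",
    "4-5 years": "size",
    "pack of 2": "pack_of",
    "samsung e6": "compatible_with",
    "cases": "store",
}


def tag_intent(query, lexicon=ATTRIBUTE_LEXICON, max_span=3):
    """Greedy longest-match tagging of query tokens against a lexicon."""
    tokens = query.lower().split()
    tags, i = [], 0
    while i < len(tokens):
        # Try the longest span first so "red tape" wins over "red".
        for length in range(min(max_span, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + length])
            if span in lexicon:
                tags.append((lexicon[span], span))
                i += length
                break
        else:
            tags.append(("unknown", tokens[i]))
            i += 1
    return tags


print(tag_intent("red tape shoes"))
print(tag_intent("kids party dress 4-5 years pack of 2"))
```

A pure lookup cannot tell whether "samsung e6" is the product being searched for or the device an accessory must fit; that depends on the neighbouring token "cases", which is the kind of context a sequence model such as a CRF can use.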
Price Personalization
● Input: users' past activity on the platform (past orders, product views)
● 5 price ranges defined for each vertical (e.g. shoes, watches), from 1 (economical) to 5 (expensive)
● User segments based on price affinities
● Customised search ranking for each user segment
Figure: # of users across price ranges 1 to 5
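A minimal sketch of price-affinity segmentation, assuming each vertical is split into five ranges by four price boundaries and a user's segment is the range they interacted with most often. The boundaries, the boost formula and all function names are hypothetical; the slides do not say how segments feed into ranking.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical boundaries: four cut points give five price ranges per vertical.
PRICE_BOUNDS = {
    "shoes": [500, 1000, 2000, 4000],
    "watches": [1000, 2500, 5000, 10000],
}


def price_range(vertical, price):
    """Map a price to a range from 1 (economical) to 5 (expensive)."""
    return bisect_right(PRICE_BOUNDS[vertical], price) + 1


def user_segment(vertical, past_prices):
    """Segment = the price range the user ordered or viewed most often."""
    if not past_prices:
        return None
    ranges = Counter(price_range(vertical, price) for price in past_prices)
    return ranges.most_common(1)[0][0]


def personalised_score(base_score, vertical, price, segment, boost=0.1):
    """Boost products whose price range is close to the user's segment."""
    if segment is None:
        return base_score
    distance = abs(price_range(vertical, price) - segment)  # 0 to 4
    return base_score * (1 + boost * (4 - distance) / 4)


segment = user_segment("shoes", [300, 450, 2500])
print(segment, personalised_score(1.0, "shoes", 300, segment))
```

Working with coarse ranges rather than raw prices keeps the segments comparable across verticals, since an expensive pair of shoes and an expensive watch sit at very different absolute prices.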
E-commerce Search Platform Architecture and Quality

More Related Content

What's hot

Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...
Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...
Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...Sease
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information RetrievalRoi Blanco
 
Search, Discovery and Questions at Quora
Search, Discovery and Questions at QuoraSearch, Discovery and Questions at Quora
Search, Discovery and Questions at QuoraNikhil Dandekar
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Find it! Nail it! Boosting e-commerce search conversions with machine learnin...
Find it! Nail it!Boosting e-commerce search conversions with machine learnin...Find it! Nail it!Boosting e-commerce search conversions with machine learnin...
Find it! Nail it! Boosting e-commerce search conversions with machine learnin...Rakuten Group, Inc.
 
Learning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwiseLearning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwiseHasan H Topcu
 
An introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxAn introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxElasticsearch
 
AI SEO Presentation
AI SEO PresentationAI SEO Presentation
AI SEO Presentationaiseoadmin
 
Context Aware Recommendations at Netflix
Context Aware Recommendations at NetflixContext Aware Recommendations at Netflix
Context Aware Recommendations at NetflixLinas Baltrunas
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingDatabricks
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architectureLiang Xiang
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsLior Rokach
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated RecommendationsHarald Steck
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix ScaleJustin Basilico
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Saeedeh Shekarpour
 
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Sease
 
Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...
Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...
Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...Databricks
 
Consuming RealTime Signals in Solr
Consuming RealTime Signals in Solr Consuming RealTime Signals in Solr
Consuming RealTime Signals in Solr Umesh Prasad
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareJustin Basilico
 

What's hot (20)

Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...
Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...
Rated Ranking Evaluator Enterprise: the next generation of free Search Qualit...
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
 
Search, Discovery and Questions at Quora
Search, Discovery and Questions at QuoraSearch, Discovery and Questions at Quora
Search, Discovery and Questions at Quora
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Find it! Nail it! Boosting e-commerce search conversions with machine learnin...
Find it! Nail it!Boosting e-commerce search conversions with machine learnin...Find it! Nail it!Boosting e-commerce search conversions with machine learnin...
Find it! Nail it! Boosting e-commerce search conversions with machine learnin...
 
Learning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwiseLearning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwise
 
An introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxAn introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolbox
 
AI SEO Presentation
AI SEO PresentationAI SEO Presentation
AI SEO Presentation
 
Context Aware Recommendations at Netflix
Context Aware Recommendations at NetflixContext Aware Recommendations at Netflix
Context Aware Recommendations at Netflix
 
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce SettingApplied Machine Learning for Ranking Products in an Ecommerce Setting
Applied Machine Learning for Ranking Products in an Ecommerce Setting
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Recommendation at Netflix Scale
Recommendation at Netflix ScaleRecommendation at Netflix Scale
Recommendation at Netflix Scale
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
Tutorial on Question Answering Systems
Tutorial on Question Answering Systems Tutorial on Question Answering Systems
Tutorial on Question Answering Systems
 
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Eval...
 
Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...
Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...
Fact Store at Scale for Netflix Recommendations with Nitin Sharma and Kedar S...
 
Consuming RealTime Signals in Solr
Consuming RealTime Signals in Solr Consuming RealTime Signals in Solr
Consuming RealTime Signals in Solr
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 

Similar to E-commerce Search Platform Architecture and Quality

A Survey of Recommender System Techniques and the E-commerce Domain.pptx
A Survey of Recommender System Techniques and the E-commerce Domain.pptxA Survey of Recommender System Techniques and the E-commerce Domain.pptx
A Survey of Recommender System Techniques and the E-commerce Domain.pptxmansivekaria09
 
Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...
Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...
Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...Marketing Festival
 
Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...
Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...
Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...Lucidworks
 
Anatomy of Search Relevance: From Data To Action
Anatomy of Search Relevance: From Data To ActionAnatomy of Search Relevance: From Data To Action
Anatomy of Search Relevance: From Data To ActionSaïd Radhouani
 
Search analytics what why how - By Otis Gospodnetic
 Search analytics what why how - By Otis Gospodnetic  Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodnetic lucenerevolution
 
Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis GospodneticSearch analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodneticlucenerevolution
 
Kp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptxKp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptxCloudBusiness2
 
Personalized search
Personalized searchPersonalized search
Personalized searchToine Bogers
 
Nicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at TwitterNicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at TwitterDavid Garrison
 
Data Science Salon: Digital Transformation: The Data Science Catalyst
Data Science Salon: Digital Transformation: The Data Science CatalystData Science Salon: Digital Transformation: The Data Science Catalyst
Data Science Salon: Digital Transformation: The Data Science CatalystFormulatedby
 
Being a Data Science Product Manager
Being a Data Science Product ManagerBeing a Data Science Product Manager
Being a Data Science Product ManagerRam Narayan Subudhi
 
Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...
Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...
Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...William Renedo
 
Big Data in Ecommerce
Big Data in EcommerceBig Data in Ecommerce
Big Data in EcommerceTeguh Nugraha
 
TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...
TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...
TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...TAUS - The Language Data Network
 
The TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUS
The TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUSThe TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUS
The TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUSTAUS - The Language Data Network
 
Deepak Tiwari, Lyft
Deepak Tiwari, LyftDeepak Tiwari, Lyft
Deepak Tiwari, LyftHilary Ip
 
Computational Marketing at Groupon - JCSSE 2017
Computational Marketing at Groupon - JCSSE 2017Computational Marketing at Groupon - JCSSE 2017
Computational Marketing at Groupon - JCSSE 2017Clovis Chapman
 

Similar to E-commerce Search Platform Architecture and Quality (20)

A Survey of Recommender System Techniques and the E-commerce Domain.pptx
A Survey of Recommender System Techniques and the E-commerce Domain.pptxA Survey of Recommender System Techniques and the E-commerce Domain.pptx
A Survey of Recommender System Techniques and the E-commerce Domain.pptx
 
Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...
Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...
Matthias Bettag - Challenges for each the multi-channel, multi-device and mul...
 
Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...
Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...
Anatomy of Relevance - From Data to Action: Presented by Saïd Radhouani, Yell...
 
Anatomy of Search Relevance: From Data To Action
Anatomy of Search Relevance: From Data To ActionAnatomy of Search Relevance: From Data To Action
Anatomy of Search Relevance: From Data To Action
 
Search analytics what why how - By Otis Gospodnetic
 Search analytics what why how - By Otis Gospodnetic  Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodnetic
 
Search analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis GospodneticSearch analytics what why how - By Otis Gospodnetic
Search analytics what why how - By Otis Gospodnetic
 
Groupon at H2O World - London
Groupon at H2O World - LondonGroupon at H2O World - London
Groupon at H2O World - London
 
Kp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptxKp-Data Analytics-ts.pptx
Kp-Data Analytics-ts.pptx
 
Personalized search
Personalized searchPersonalized search
Personalized search
 
Big data: Bringing competition policy to the digital era – VARIAN – November ...
Big data: Bringing competition policy to the digital era – VARIAN – November ...Big data: Bringing competition policy to the digital era – VARIAN – November ...
Big data: Bringing competition policy to the digital era – VARIAN – November ...
 
Nicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at TwitterNicholas Gorski: Real-time revenue science at Twitter
Nicholas Gorski: Real-time revenue science at Twitter
 
Data Science Salon: Digital Transformation: The Data Science Catalyst
Data Science Salon: Digital Transformation: The Data Science CatalystData Science Salon: Digital Transformation: The Data Science Catalyst
Data Science Salon: Digital Transformation: The Data Science Catalyst
 
Dicon interactive
Dicon interactiveDicon interactive
Dicon interactive
 
Being a Data Science Product Manager
Being a Data Science Product ManagerBeing a Data Science Product Manager
Being a Data Science Product Manager
 
Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...
Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...
Estudio34 Presents- Dara Fitzgerald Brighton SEO-Next Gen Measurement With Go...
 
Big Data in Ecommerce
Big Data in EcommerceBig Data in Ecommerce
Big Data in Ecommerce
 
TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...
TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...
TAUS 2.0 and the Game Changers in Localization (Jaap van der Meer, director o...
 
The TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUS
The TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUSThe TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUS
The TAUS Translation Data Landscape Report, by Jaap van der Meer, TAUS
 
Deepak Tiwari, Lyft
Deepak Tiwari, LyftDeepak Tiwari, Lyft
Deepak Tiwari, Lyft
 
Computational Marketing at Groupon - JCSSE 2017
Computational Marketing at Groupon - JCSSE 2017Computational Marketing at Groupon - JCSSE 2017
Computational Marketing at Groupon - JCSSE 2017
 

More from Naresh Jain

Problem Solving Techniques For Evolutionary Design
Problem Solving Techniques For Evolutionary DesignProblem Solving Techniques For Evolutionary Design
Problem Solving Techniques For Evolutionary DesignNaresh Jain
 
Agile India 2019 Conference Welcome Note
Agile India 2019 Conference Welcome NoteAgile India 2019 Conference Welcome Note
Agile India 2019 Conference Welcome NoteNaresh Jain
 
Organizational Resilience
Organizational ResilienceOrganizational Resilience
Organizational ResilienceNaresh Jain
 
Improving the Quality of Incoming Code
Improving the Quality of Incoming CodeImproving the Quality of Incoming Code
Improving the Quality of Incoming CodeNaresh Jain
 
Agile India 2018 Conference Summary
Agile India 2018 Conference SummaryAgile India 2018 Conference Summary
Agile India 2018 Conference SummaryNaresh Jain
 
Agile India 2018 Conference
Agile India 2018 ConferenceAgile India 2018 Conference
Agile India 2018 ConferenceNaresh Jain
 
Agile India 2018 Conference
Agile India 2018 ConferenceAgile India 2018 Conference
Agile India 2018 ConferenceNaresh Jain
 
Agile India 2018 Conference
Agile India 2018 ConferenceAgile India 2018 Conference
Agile India 2018 ConferenceNaresh Jain
 
Pilgrim's Progress to the Promised Land by Robert Virding
Pilgrim's Progress to the Promised Land by Robert VirdingPilgrim's Progress to the Promised Land by Robert Virding
Pilgrim's Progress to the Promised Land by Robert VirdingNaresh Jain
 
Concurrent languages are Functional by Francesco Cesarini
Concurrent languages are Functional by Francesco CesariniConcurrent languages are Functional by Francesco Cesarini
Concurrent languages are Functional by Francesco CesariniNaresh Jain
 
Erlang from behing the trenches by Francesco Cesarini
Erlang from behing the trenches by Francesco CesariniErlang from behing the trenches by Francesco Cesarini
Erlang from behing the trenches by Francesco CesariniNaresh Jain
 
Setting up Continuous Delivery Culture for a Large Scale Mobile App
Setting up Continuous Delivery Culture for a Large Scale Mobile AppSetting up Continuous Delivery Culture for a Large Scale Mobile App
Setting up Continuous Delivery Culture for a Large Scale Mobile AppNaresh Jain
 
Towards FutureOps: Stable, Repeatable environments from Dev to Prod
Towards FutureOps: Stable, Repeatable environments from Dev to ProdTowards FutureOps: Stable, Repeatable environments from Dev to Prod
Towards FutureOps: Stable, Repeatable environments from Dev to ProdNaresh Jain
 
Value Driven Development by Dave Thomas
Value Driven Development by Dave Thomas Value Driven Development by Dave Thomas
Value Driven Development by Dave Thomas Naresh Jain
 
No Silver Bullets in Functional Programming by Brian McKenna
No Silver Bullets in Functional Programming by Brian McKennaNo Silver Bullets in Functional Programming by Brian McKenna
No Silver Bullets in Functional Programming by Brian McKennaNaresh Jain
 
Functional Programming Conference 2016
Functional Programming Conference 2016Functional Programming Conference 2016
Functional Programming Conference 2016Naresh Jain
 
Agile India 2017 Conference
Agile India 2017 ConferenceAgile India 2017 Conference
Agile India 2017 ConferenceNaresh Jain
 
Unleashing the Power of Automated Refactoring with JDT
Unleashing the Power of Automated Refactoring with JDTUnleashing the Power of Automated Refactoring with JDT
Unleashing the Power of Automated Refactoring with JDTNaresh Jain
 
Getting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo Kim
Getting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo KimGetting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo Kim
Getting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo KimNaresh Jain
 

More from Naresh Jain (20)

Problem Solving Techniques For Evolutionary Design
Problem Solving Techniques For Evolutionary DesignProblem Solving Techniques For Evolutionary Design
Problem Solving Techniques For Evolutionary Design
 
Agile India 2019 Conference Welcome Note
Agile India 2019 Conference Welcome NoteAgile India 2019 Conference Welcome Note
Agile India 2019 Conference Welcome Note
 
Organizational Resilience
Organizational ResilienceOrganizational Resilience
Organizational Resilience
 
Improving the Quality of Incoming Code
Improving the Quality of Incoming CodeImproving the Quality of Incoming Code
Improving the Quality of Incoming Code
 
Agile India 2018 Conference Summary
Agile India 2018 Conference SummaryAgile India 2018 Conference Summary
Agile India 2018 Conference Summary
 
Agile India 2018 Conference
Agile India 2018 ConferenceAgile India 2018 Conference
Agile India 2018 Conference
 
Agile India 2018 Conference
Agile India 2018 ConferenceAgile India 2018 Conference
Agile India 2018 Conference
 
Agile India 2018 Conference
Agile India 2018 ConferenceAgile India 2018 Conference
Agile India 2018 Conference
 
Pilgrim's Progress to the Promised Land by Robert Virding
Pilgrim's Progress to the Promised Land by Robert VirdingPilgrim's Progress to the Promised Land by Robert Virding
Pilgrim's Progress to the Promised Land by Robert Virding
 
Concurrent languages are Functional by Francesco Cesarini
Concurrent languages are Functional by Francesco CesariniConcurrent languages are Functional by Francesco Cesarini
Concurrent languages are Functional by Francesco Cesarini
 
Erlang from behing the trenches by Francesco Cesarini
Erlang from behing the trenches by Francesco CesariniErlang from behing the trenches by Francesco Cesarini
Erlang from behing the trenches by Francesco Cesarini
 
Setting up Continuous Delivery Culture for a Large Scale Mobile App
Setting up Continuous Delivery Culture for a Large Scale Mobile AppSetting up Continuous Delivery Culture for a Large Scale Mobile App
Setting up Continuous Delivery Culture for a Large Scale Mobile App
 
Towards FutureOps: Stable, Repeatable environments from Dev to Prod
Towards FutureOps: Stable, Repeatable environments from Dev to ProdTowards FutureOps: Stable, Repeatable environments from Dev to Prod
Towards FutureOps: Stable, Repeatable environments from Dev to Prod
 
Value Driven Development by Dave Thomas
Value Driven Development by Dave Thomas Value Driven Development by Dave Thomas
Value Driven Development by Dave Thomas
 
No Silver Bullets in Functional Programming by Brian McKenna
No Silver Bullets in Functional Programming by Brian McKennaNo Silver Bullets in Functional Programming by Brian McKenna
No Silver Bullets in Functional Programming by Brian McKenna
 
Functional Programming Conference 2016
Functional Programming Conference 2016Functional Programming Conference 2016
Functional Programming Conference 2016
 
Agile India 2017 Conference
Agile India 2017 ConferenceAgile India 2017 Conference
Agile India 2017 Conference
 
The Eclipse Way
The Eclipse WayThe Eclipse Way
The Eclipse Way
 
Unleashing the Power of Automated Refactoring with JDT
Unleashing the Power of Automated Refactoring with JDTUnleashing the Power of Automated Refactoring with JDT
Unleashing the Power of Automated Refactoring with JDT
 
Getting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo Kim
Getting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo KimGetting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo Kim
Getting2Alpha: Turbo-charge your product with Game Thinking by Amy Jo Kim
 

Recently uploaded

Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
E-commerce Search Platform Architecture and Quality

  • 3.
    ● Search is one of the most important discovery tools in e-commerce.
    ● Powers other features such as merchandising (promotions) and recommendations.
    ● Accounts for a big fraction of the units sold and GMV.
  • 4.
    ● Important signals that affect search: price, offers, popularity, availability, serviceability, etc.
    ● Used in ranking of products.
    ● Exposed as filters and sorts to end users.
    ● These signals are very dynamic, particularly during sales.
  • 5.
    ● E-commerce search != web search.
    ● Documents have a structure to them.
    ● Queries have an implicit structure.
    ● Challenges:
      ○ Large document collection with a long heavy tail
      ○ Extremely high rate of changes/updates (thousands per second)
      ○ Geo-specific ranking
      ○ Multi-objective optimization (GMV, units, ads revenue, long-term value)
    ● Opportunities:
      ○ Broad queries: personalization can play a huge role
  • 6.
    ● Queries per day: XXX Millions / week
    ● Latencies:
      ○ Average: ~ 100 ms
      ○ Median: ~ 50 ms
      ○ 90th percentile: ~ 500 ms
    ● Documents retrieved and scored from index:
      ○ Median: 1K to 10K
      ○ 95th percentile: 200K to 500K
      ○ 99th percentile: 500K to 3M+
    ● Search CTR: Around 50%
  • 7.
    ● Architectural overview of the search platform
      ○ Serving and Ingestion
      ○ Serving functional view
      ○ Serving architectural view
      ○ Ingestion architectural view
      ○ Example ingestion topology
    ● Search quality
      ○ Challenges
      ○ Life of a query: Typical flow for query understanding
      ○ Illustrative problems
  • 8.
    ● 1,000,000 Compute Cores
    ● 2.56 Petabytes RAM
    ● 120 Petabytes Disk Storage
    ● 1 Petabyte NVMe SSD
    ● 128 Tbps bisection bandwidth Clos network
  • 10.
    ● Query Rewriter (Spell Check, Concept, NLP, Intent, Augmentation, Retrieval/Scoring query formulation)
    ● Reverse Proxy (Geo Coding, User Context, Caching, Isolation, Rate Limit, Tee-off test framework)
    ● Search Broker (Distributed Search across shards, Blending of Results from shards)
    ● Searcher (Matching, Scoring, Faceting, Top-K Retrieval (pass-1 ranking))
    ● Text index, NRT index, Metadata
    ● Re-ranking (Pass-2 Ranking): ML Model
    ● Pluggable Ranking Models
    ● Pluggable Rewriter Modules
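
The Search Broker and Searcher roles on this slide (distributed search across shards, per-shard top-K retrieval, blending of results) can be illustrated with a minimal scatter-gather sketch. Everything here is an assumption made for illustration: the in-memory shard layout, the term-count score, and the function names `search_shard` and `broker_search` are hypothetical and are not taken from the deck.

```python
import heapq

def search_shard(shard, query_terms, k):
    """Pass-1 on one shard: match, score, and keep the top-k (score, doc_id) pairs.

    The score is a toy term-count match, an assumption standing in for the
    real matching and scoring logic.
    """
    scored = []
    for doc_id, doc in shard.items():
        score = sum(doc["terms"].count(term) for term in query_terms)
        if score > 0:
            scored.append((score, doc_id))
    return heapq.nlargest(k, scored)

def broker_search(shards, query, k=3):
    """Fan the query out to every shard, then blend the partial top-k lists."""
    terms = query.lower().split()
    partial_hits = []
    for shard in shards:  # sequential here; a real broker would fan out in parallel
        partial_hits.extend(search_shard(shard, terms, k))
    return [doc_id for _score, doc_id in heapq.nlargest(k, partial_hits)]

# Two hypothetical shards holding tokenised documents.
SHARDS = [
    {"a": {"terms": ["red", "shoes"]}, "b": {"terms": ["blue", "shoes"]}},
    {"c": {"terms": ["red", "red", "shoes"]}, "d": {"terms": ["watch"]}},
]
```

Because each shard returns its own top-k, the broker only merges at most `k` hits per shard, which is what keeps blending cheap even when shards score very many documents.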
  • 15.
    ● Marketplace
      ○ Catalog entries vary in quality from seller to seller. Spam is rampant.
    ● Diversity of users
    ● Mobile-heavy users: Real estate on UI
    ● Poor internet connectivity
  • 16.
    ● Literacy/Internet awareness
    ● Language
    ● Economic power
    ● Regional preferences
    Abstraction: City-tier
    ● Query/Intent Solicitation
    ● Result Presentation
    ● Product Ranking
  • 17. 40% increase in proportion of tier-3 customers vis-a-vis metro
  • 18.
    ● Query: samsang. Relative ratio of query, Tier-3 vs. Metro: 1.8
    ● Query: jins. Relative ratio of query, Tier-3 vs. Metro: 2.2
  • 20.
    ● Query
    ● Scoring
    ● Normalisation (index time as well)
      ○ String clean-up
      ○ Lower-casing
    ● Spell Correction
      ○ Resource-based
      ○ term->term
      ○ Query->query
      ○ Online
    ● Init Context
    ● Phrasing (index time as well)
      ○ Frequent bi/tri grams
    ● Stemming (index time as well)
      ○ Core e-commerce stemmer
      ○ Plurals
    ● Common MetaData Store (query level)
      ○ Raw data: metrics (CTR, Impression, NDCG…)
      ○ Derived data: Store, LM score, Features
    ● Synonyms
      ○ Resource-based
    ● Intent
      ○ Deductions
      ○ Tagging (CRF)
    ● Query Rewrite
      ○ Best query selection
      ○ Partial match
    ● SOLR interface
    ● Query Understanding Output Generator
    ● Retrieval ranking logic
    ● Store Classifier
    ● Query LM
    ● Feature Store
    ● Classification
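
The normalisation, stemming, and phrasing stages named on this slide can be sketched as a small preprocessing chain. This is only an illustration under stated assumptions: the plural rules are a stand-in for the "core e-commerce stemmer", the stage order is a guess, and the bigram set and the function names (`normalise`, `stem_plural`, `phrase`, `preprocess`) are hypothetical.

```python
import re

def normalise(query):
    """String clean-up and lower-casing: drop punctuation, collapse whitespace."""
    query = query.lower()
    query = re.sub(r"[^a-z0-9\s-]", " ", query)
    return re.sub(r"\s+", " ", query).strip()

def stem_plural(token):
    """Tiny plural stemmer (an assumption, not the stemmer from the deck)."""
    if len(token) > 4 and token.endswith("ies"):
        return token[:-3] + "y"
    if token.endswith(("ches", "shes", "sses", "xes", "zes")):
        return token[:-2]
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def phrase(tokens, frequent_bigrams):
    """Join adjacent tokens that form a known frequent bigram into one unit."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in frequent_bigrams:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def preprocess(query, frequent_bigrams=frozenset({("red", "tape")})):
    """Normalise, stem plurals, then apply phrasing (order is an assumption)."""
    tokens = [stem_plural(token) for token in normalise(query).split()]
    return phrase(tokens, frequent_bigrams)
```

Since the slide marks these stages "index time as well", the same functions would be applied to catalog text at indexing so that query and document tokens line up.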
  • 21.
    ● Special patterns:
      ○ Segmented words: lgnexus5
    ● Counting: “samsang” & no-click followed by “samsung” & click, a million times
      ○ Context aware counting
    ● Language modeling and edit distance
    ● Term to vector models in deep learning
    (Diagram axis: Specific to General)
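
The counting idea on this slide (a misspelt query with no click, followed by a corrected query with a click) combined with an edit-distance filter can be sketched as below. The session format, the thresholds, and the name `mine_corrections` are assumptions made for illustration, not details from the deck.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance using the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def mine_corrections(sessions, max_distance=2, min_count=2):
    """Count (no-click query, next clicked query) pairs that are close in spelling.

    `sessions` is an iterable of event lists; each event is a hypothetical
    (query, clicked) tuple in time order.
    """
    pair_counts = Counter()
    for events in sessions:
        for (q1, clicked1), (q2, clicked2) in zip(events, events[1:]):
            if (not clicked1 and clicked2 and q1 != q2
                    and edit_distance(q1, q2) <= max_distance):
                pair_counts[(q1, q2)] += 1
    best = {}
    for (wrong, right), count in pair_counts.items():
        if count >= min_count and count > best.get(wrong, ("", 0))[1]:
            best[wrong] = (right, count)
    return {wrong: right for wrong, (right, _count) in best.items()}

# Toy session log; real counts would be in the millions, as the slide notes.
SESSIONS = (
    [[("samsang", False), ("samsung", True)]] * 3
    + [[("jins", False), ("jeans", True)]] * 2
    + [[("samsang", False), ("shoes", True)]] * 5
)
```

The edit-distance filter is what separates a spelling fix from an unrelated follow-up search: the "samsang" then "shoes" pair is frequent in the toy log but is too far apart in spelling to count as a correction.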
  • 22.
    ● Intent: From query tokens to (implicit) attributes that are represented by those tokens
    ● Examples:
      ○ “red tape shoes” -> (brand) “red tape” (store) “shoes”
      ○ “kids party dress 4-5 years pack of 2” -> (ideal_for) “kids” (occasion) “party” (store) “dress” (size) “4-5 years” (pack_of) “pack of 2”
      ○ “samsung e6 cases” -> (compatible_with) “samsung e6” (store) “cases”
    ● Memorization, Language modeling, CRF
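
Of the three techniques listed, memorization is the simplest to illustrate: a lookup of known phrases, matched greedily from the longest phrase down. The sketch below is an assumption about how such a tagger might look; the `LEXICON` entries are copied from the slide's examples, while `tag_query` and the `"unknown"` label are hypothetical. It does not cover language modeling or CRF tagging.

```python
# Hypothetical memorised phrase -> attribute lexicon, built from the slide's examples.
LEXICON = {
    "red tape": "brand",
    "shoes": "store",
    "kids": "ideal_for",
    "party": "occasion",
    "dress": "store",
    "4-5 years": "size",
    "pack of 2": "pack_of",
}

def tag_query(query, lexicon=LEXICON, max_phrase_len=3):
    """Greedy longest-match tagging of query tokens against the lexicon."""
    tokens = query.lower().split()
    tags = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink.
        for n in range(min(max_phrase_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                tags.append((lexicon[candidate], candidate))
                i += n
                break
        else:
            # No memorised phrase starts here; emit the token untagged.
            tags.append(("unknown", tokens[i]))
            i += 1
    return tags
```

Trying longer phrases first is what keeps “red tape” together as a brand instead of tagging “red” and “tape” separately. Unseen phrases fall through as `"unknown"`, which is where a learned tagger would have to take over.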
  • 23.
    ● Users’ activity on the platform: past orders, product views
    ● Customised Search Ranking for User-segment
  • 24. Price Personalization
    ● Users’ past activity on the platform: past orders, product views
    ● 5 price ranges (1 to 5, economical to expensive) defined for each vertical (e.g. shoes, watches)
    ● User-Segments based on price affinities
    ● Customised Search Ranking for each User-segment
    (Chart axes: price range vs. # of users)
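
The price-affinity segmentation described here can be sketched as follows: each vertical has five price ranges, and a user's segment is the range they buy or view in most often. The boundary values in `PRICE_BOUNDS`, the tie-break rule, and the function names are assumptions for illustration only; the deck does not say how the ranges or segments are actually computed.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical boundaries separating the 5 price ranges of each vertical.
PRICE_BOUNDS = {
    "shoes": [500, 1000, 2000, 4000],
    "watches": [1000, 2500, 5000, 15000],
}

def price_band(vertical, price):
    """Return the price range, from 1 (most economical) to 5 (most expensive)."""
    return bisect_right(PRICE_BOUNDS[vertical], price) + 1

def user_segment(vertical, past_prices):
    """Most frequent price range in a user's past orders/views for a vertical.

    Returns None when there is no history. Ties go to the cheaper range,
    which is an arbitrary choice made for this sketch.
    """
    bands = Counter(price_band(vertical, price) for price in past_prices)
    if not bands:
        return None
    top = max(bands.values())
    return min(band for band, count in bands.items() if count == top)
```

A ranking layer could then use the segment as a feature, for example by boosting products whose `price_band` matches the user's segment for that vertical.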