Related Searches at LinkedIn
Mitul Tiwari
Joint work with Azarias Reda, Yubin Park, Christian Posse, and Sam Shah
LinkedIn
SIGIR Industrial Track 2013
Who am I
Outline
• About LinkedIn
• Related Searches
‣ Design
‣ Implementation
‣ Evaluation
LinkedIn by the numbers
• 175M+ members
• 2+ new user registrations per second
• 4.2 Billion people searches in 2011
• 9.3 Billion page views in Q2 2012
• 100+ million monthly active users in Q2 2012
Broad Range of Products
• Profile
• Search
• Hiring Solutions
• People You May Know
• Skills
• News
Related Searches at LinkedIn
• Millions of searches every day
• Goal: build a related searches system at LinkedIn
• Help users explore and refine their queries
Related Searches
• Design
• Implementation
• Evaluation
Design
• Signals
‣ Collaborative Filtering
‣ Query-Result Click graph
‣ Overlapping terms
• Length-bias
• Ensemble approach for unified recommendation
• Practical considerations
Design: Collaborative Filtering
• Searches correlated by time
‣ Searches done in the same session by the same user
‣ Collaborative filtering: implicit feedback
‣ TF-IDF scoring to discount popular queries (e.g. 'Obama')
[Figure: queries Q1, Q2, Q3, Q4 placed along a time axis]
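
The session-based signal above can be sketched as follows. This is a minimal illustration, not the production scoring: the function name `related_by_cooccurrence`, the input shape (one list of queries per session), and the exact TF-IDF formula (co-occurrence count times log inverse session frequency) are all assumptions.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def related_by_cooccurrence(sessions):
    """Score query pairs that co-occur in a session, damped by an IDF term.

    `sessions` is a list of query lists, one list per user session (assumed shape).
    Returns {query: [(related_query, score), ...]} sorted by descending score.
    """
    n_sessions = len(sessions)
    session_freq = Counter()           # number of sessions containing each query
    pair_count = defaultdict(Counter)  # pair_count[q][r]: sessions containing both

    for session in sessions:
        unique = set(session)
        session_freq.update(unique)
        for q, r in permutations(unique, 2):
            pair_count[q][r] += 1

    related = {}
    for q, neighbours in pair_count.items():
        scored = [
            # Co-occurrence count (TF) times inverse session frequency (IDF):
            # a query present in almost every session gets a weight near zero.
            (r, count * math.log(n_sessions / session_freq[r]))
            for r, count in neighbours.items()
        ]
        related[q] = sorted(scored, key=lambda item: (-item[1], item[0]))
    return related
```

With this weighting, a query that shows up in most sessions contributes little even when it co-occurs often, which is the role the slide gives to TF-IDF.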
Design: Query-Result Clicks
• Searches correlated by result clicks
[Figure: bipartite graph linking queries Q1 ... Qn to clicked results R1 ... Rm]
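
One way to read the click-graph signal: two queries are related when users clicked the same results for both. The sketch below is illustrative only. The name `related_by_clicks`, the `(query, result_id)` log format, and the 1/fanout weighting of each result are assumptions rather than details given in the slides.

```python
from collections import defaultdict


def related_by_clicks(click_log):
    """Relate queries that led to clicks on the same results.

    `click_log` is an iterable of (query, clicked_result_id) pairs (assumed shape).
    Returns {query: [(related_query, score), ...]} sorted by descending score.
    """
    queries_by_result = defaultdict(set)
    for query, result in click_log:
        queries_by_result[result].add(query)

    scores = defaultdict(lambda: defaultdict(float))
    for queries in queries_by_result.values():
        # Assumed weighting: a result reached from many different queries
        # (high fanout) is weaker evidence that any two of them are related.
        weight = 1.0 / len(queries)
        for q in queries:
            for r in queries:
                if q != r:
                    scores[q][r] += weight

    return {
        q: sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
        for q, neighbours in scores.items()
    }
```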
Design: Overlapping Terms
• Searches with overlapping terms
‣ TF-IDF scoring to give importance to terms
[Figure: queries Q1 and Q2, "Software Developer" and "Software Engineer", which share the term "Software"]
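
A small sketch of term-overlap scoring, assuming whitespace tokenization and an IDF computed over the query log itself. The helper names `term_idf` and `overlap_score` are hypothetical, and the formula is one plausible reading of "TF-IDF scoring to give importance to terms".

```python
import math
from collections import Counter


def term_idf(queries):
    """Inverse document frequency of each term, treating every query as a document."""
    n = len(queries)
    doc_freq = Counter(term for q in queries for term in set(q.lower().split()))
    return {term: math.log(n / count) for term, count in doc_freq.items()}


def overlap_score(q1, q2, idf):
    """Sum the IDF weights of the terms two queries share (0 if none are shared)."""
    shared = set(q1.lower().split()) & set(q2.lower().split())
    return sum(idf.get(term, 0.0) for term in shared)
```

Under this weighting, sharing a rare term counts for more than sharing a common one, so "Software Developer" and "Software Engineer" score lower than two queries that overlap on a rarer term.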
Design: Length Bias
• Insight: users often click suggestions that are one term longer than their query
• Corresponds to refining the initial query
• Statistical biasing model to score a longer query higher
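
The slides do not spell out the biasing model, so the following is only a stand-in: a fixed multiplicative boost for suggestions exactly one term longer than the original query. The function name `apply_length_bias` and the `boost` value of 1.5 are hypothetical.

```python
def apply_length_bias(query, candidates, boost=1.5):
    """Re-rank suggestions, favouring those one term longer than `query`.

    `candidates` is a list of (suggestion, score) pairs. `boost` is a
    hypothetical constant; a fitted statistical model would replace it.
    """
    base_len = len(query.split())
    rescored = []
    for suggestion, score in candidates:
        if len(suggestion.split()) == base_len + 1:
            score *= boost  # refinement of the initial query: score it higher
        rescored.append((suggestion, score))
    return sorted(rescored, key=lambda item: (-item[1], item[0]))
```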
Design: Ensemble Approach
• Need to generate a unified recommendation dataset
• Analysis to determine the engagement of each signal
• Attempted an ML approach
‣ Minimal overlap across different signals
Design: Ensemble Approach
• Step-wise union of the signals
• Importance based on individual signal performance
‣ First, collaborative filtering
‣ Second, queries correlated by query-result clicks
‣ Third, queries with overlapping terms
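
The step-wise union can be sketched as a priority merge: take suggestions from the strongest signal first, then fill the remaining slots from the next signal, skipping duplicates. The name `stepwise_union` and the cutoff `k` are assumptions for illustration.

```python
def stepwise_union(ranked_lists, k=10):
    """Merge per-signal suggestion lists in priority order.

    `ranked_lists` is ordered by signal importance, e.g. collaborative
    filtering, then query-result clicks, then overlapping terms. Each inner
    list is already ranked. Returns at most `k` distinct suggestions.
    """
    merged, seen = [], set()
    for suggestions in ranked_lists:
        for suggestion in suggestions:
            if suggestion not in seen:
                seen.add(suggestion)
                merged.append(suggestion)
                if len(merged) == k:
                    return merged
    return merged
```

Because the signals have minimal overlap, a merge like this mostly extends coverage: lower-priority signals only contribute when the higher-priority ones leave slots unfilled.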
Design: Practical Considerations
• System designed for public consumption
‣ Strong profanity filters
‣ Need to deal with misspellings
‣ Need to handle multiple languages
‣ Remove spammy search queries
Implementation Challenge
• Scale
‣ 175M+ members
‣ Billions of searches
‣ Terabytes of data to process
Implementation
• Kafka: publish-subscribe messaging system
• Hadoop: MapReduce data processing system
• Azkaban: Hadoop workflow management tool
• Voldemort: Key-value store
Implementation: Workflow
Evaluation
• Performance of each signal and combination
• How does the system scale?
Evaluation Cont’d
• Offline evaluation
‣ Precision-Recall
• Online evaluation
‣ A/B testing to measure engagement
‣ Performance evaluation
Offline Evaluation
• Correct set: set of searches performed by a user in the following K minutes, here K=10
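
A minimal sketch of this offline measure, assuming a per-user search log of `(timestamp_in_seconds, query)` pairs. The helper names `correct_set` and `precision_recall` are hypothetical.

```python
def correct_set(search_log, start_time, k_minutes=10):
    """Queries one user issued in the K minutes after `start_time`.

    `search_log` is a list of (timestamp_in_seconds, query) pairs (assumed shape).
    """
    end_time = start_time + k_minutes * 60
    return {query for ts, query in search_log if start_time < ts <= end_time}


def precision_recall(recommended, correct):
    """Precision and recall of the recommended queries against the correct set."""
    recommended, correct = set(recommended), set(correct)
    hits = len(recommended & correct)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall
```

For example, if a user searches "web developer" at time 0 and then "html" and "css" within ten minutes, a recommendation list containing both plus two other queries has precision 0.5 and recall 1.0.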
Online Evaluation
• Used A/B testing
• Metrics
‣ Coverage: queries with recommendations
‣ Impressions: # of recommendations shown
‣ Clicks: Clicks on recommendations
‣ Click-through rate (CTR): Clicks per impression
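
These metrics can be computed from a simple per-search log. The sketch below assumes each search is recorded as a dict with the number of recommendations shown and the number clicked, and it treats coverage as the fraction of searches that received at least one recommendation. The record format and the name `online_metrics` are hypothetical.

```python
def online_metrics(searches):
    """Aggregate coverage, impressions, clicks, and CTR for one A/B bucket.

    `searches` is a list of dicts with integer keys 'shown' and 'clicks'
    (assumed shape), one dict per search performed.
    """
    total = len(searches)
    with_recs = sum(1 for s in searches if s["shown"] > 0)
    impressions = sum(s["shown"] for s in searches)
    clicks = sum(s["clicks"] for s in searches)
    return {
        "coverage": with_recs / total if total else 0.0,
        "impressions": impressions,
        "clicks": clicks,
        "ctr": clicks / impressions if impressions else 0.0,
    }
```

Running the same function on the control and treatment buckets gives directly comparable numbers for each variant.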
Evaluation: System Runtime
Details
• Metaphor: A System for Related Search Recommendations. Azarias Reda, Yubin Park, Mitul Tiwari, Christian Posse, and Sam Shah. In Proceedings of CIKM, 2012.


Editor's Notes

  1. first context of related searches at LinkedIn then design, implementation and evaluation of our related searches system
  2. Slow down. Searches per second: 130; per minute: 8,000; per hour: 480,000; per day: 11.5M. Cut down.
  3. Research problems
  4. discovery, exploration, refine
  5. a screenshot of search result page
  6. explore more candidates and scoring
  7. For example, web developer -> HTML. Why collaborative filtering? Elaborate on "session"; replace with "within a time window".
  8. Elaborate - across individual put a real example importance of each click query fanout, popular result
  9. mechanical engineer across individual
  10. highlight - next
  11. show evaluation/analysis result about clicks on queries one term longer skip the second equation
  12. show evaluation/analysis result about clicks on queries one term longer skip the second equation
  13. describe signals
  14. mention evaluation later
  15. high level design Kafka, Voldemort citations, url to Azkaban
  16. more time here
  17. Scientific and easily repeatable; a fast, iterative way to tune parameters and performance. P decreases with the size of the window; R increases with the time window K. P/R are low: this predicts future searches, so it is a conservative measure that judges whether a signal can predict future behavior. CF has an advantage in this measure. Top-10 recommendations.
  18. why CTR is not the only metric: hadoop->mapreduce
  19. normalized legends: elaborate CF, QRQ, partial break down figures: bigger chart
  20. For all possible queries. Quadratic: why? 80 nodes, each with 2 quad-core CPUs: 640 cores.
  21. questions, details, hiring