SlideShare uma empresa Scribd logo
1 de 38
Recruiting SolutionsRecruiting SolutionsRecruiting Solutions
Daniel Tunkelang
Head, Query Understanding
better search through
query understanding
overview
 query understanding: what is it?
 how we do query understanding at LinkedIn
 some other thoughts from search in the wild
what I’m not going to cover:
2
Information need query select from results
rank using IR model
user:
system:
tf-idf PageRank
bird’s-eye view of how a search engine works
3
Information need query select from results
rank using IR model
user:
system:
tf-idf PageRank
query understanding
4
search is a communication problem
5
6
tag: skill OR title
related skills:
search, ranking, …
tag: company
id: 1337
industry: internet
verticals:
people, jobs
intent: exploratory
query understanding pipeline
7
spellcheck
query tagging
vertical intent prediction
query expansion
raw query
structured query
+
annotations
query understanding pipeline
8
spellcheck
query tagging
vertical intent prediction
query expansion
raw query
structured query
+
annotations
9
fix obvious typos
help users spell names
spelling correction
spelling out the details
10
PEOPLE NAMES
COMPANIES
TITLES
PAST QUERIES
n-grams
marissa => ma ar ri is ss sa
metaphone
mark/marc => MRK
co-occurrence counts
marissa:mayer = 1000
marisa meyer yahoo
marissa
marisa
meyer
mayer
yahoo
spelling out the details
11
problem: corpus as well as query logs contain many spelling errors
certain spelling errors are quite frequent
while genuine words (especially names) might be infrequent
spelling out the details
12
problem: corpus & query logs contain spelling errors
solution: use query chains to infer correct spelling
[product manger] [product manager] CLICK
[marissa mayer] CLICK
query understanding pipeline
13
spellcheck
query tagging
vertical intent prediction
query expansion
raw query
structured query
+
annotations
query tagging: identifying entities in the query
14
TITLE CO GEO
TITLE-237
software engineer
software developer
programmer
…
CO-1441
Google Inc.
Industry: Internet
GEO-7583
Country: US
Lat: 42.3482 N
Long: 75.1890 W
(RECOGNIZED TAGS: NAME, TITLE, COMPANY, SCHOOL, GEO, SKILL )
query tagging: identifying entities in the query
15
TITLE CO GEO
MORE PRECISE MATCHING WITH DOCUMENTS
entity-based filtering
16
BEFORE
entity-based filtering
17
AFTER
BEFORE
entity-based filtering
18
BEFORE
entity-based filtering
19
AFTER
BEFORE
entity-based suggestions
20
entity-based suggestions
21
query tagging: sequential model
22
EMISSION PROBABILITIES
(learned from user profiles)
TRANSITION PROBABILITIES
(learned from query logs)
TRAINING
query tagging: sequential model
23
INFERENCE
given a query, find the most likely sequence of tags
query understanding pipeline
24
spellcheck
query tagging
vertical intent prediction
query expansion
raw query
structured query
+
annotations
vertical intent prediction: distribution
25
JOBS
PEOPLE
COMPANIES
(probability distribution over verticals)
vertical intent prediction: relevance
26
[company]
[employees]
[jobs]
[name search]
query understanding pipeline
27
spellcheck
query tagging
vertical intent prediction
query expansion
raw query
structured query
+
annotations
28
query expansion: name synonyms
29
query expansion: job title synonyms
30
query expansion: signals
[jon] [jonathan] CLICK
trained using query chains:
[programmer] [developer] CLICK
symmetric but not transitive!
[francis] ⇔ [frank]
[franklin] ⇔ [frank]
[francis] ≠ [franklin]
[software engineer] [software developer] CLICK
context based!
[software engineer] => [software developer]
[civil engineer] ≠ [civil developer]
query understanding pipeline
31
spellcheck
query tagging
vertical intent prediction
query expansion
raw query
structured query
+
annotations
32
what else can we learn from search in the wild?
don’t guess when it’s better to ask
33
vs.
clarify then refine
34
computers books
give users transparency, guidance, and control
35
think beyond individual search queries
36
Gene Golovchinsky, FXPAL
know when you don’t know
37
Claudia Hauff, Query Difficulty for Digital Libraries [2009]
38
Daniel Tunkelang
dtunkelang@linkedin.com
https://linkedin.com/in/dtunkelang

Mais conteúdo relacionado

Mais procurados

10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how
RecruitingDaily.com LLC
 
stackconf 2021 | Weaviate Vector Search Engine – Introduction
stackconf 2021 | Weaviate Vector Search Engine – Introductionstackconf 2021 | Weaviate Vector Search Engine – Introduction
stackconf 2021 | Weaviate Vector Search Engine – Introduction
NETWAYS
 
stackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviatestackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviate
NETWAYS
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
Trey Grainger
 

Mais procurados (20)

10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how
 
Netflix Recommendations Feature Engineering with Time Travel
Netflix Recommendations Feature Engineering with Time TravelNetflix Recommendations Feature Engineering with Time Travel
Netflix Recommendations Feature Engineering with Time Travel
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial
Learning to Rank for Recommender Systems -  ACM RecSys 2013 tutorialLearning to Rank for Recommender Systems -  ACM RecSys 2013 tutorial
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial
 
How to Build your Training Set for a Learning To Rank Project
How to Build your Training Set for a Learning To Rank ProjectHow to Build your Training Set for a Learning To Rank Project
How to Build your Training Set for a Learning To Rank Project
 
Deep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender SystemsDeep Learning for Personalized Search and Recommender Systems
Deep Learning for Personalized Search and Recommender Systems
 
Knowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender SystemsKnowledge Graph Embeddings for Recommender Systems
Knowledge Graph Embeddings for Recommender Systems
 
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
 
Making Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms ReliableMaking Netflix Machine Learning Algorithms Reliable
Making Netflix Machine Learning Algorithms Reliable
 
Search summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSearch summit-2018-ltr-presentation
Search summit-2018-ltr-presentation
 
stackconf 2021 | Weaviate Vector Search Engine – Introduction
stackconf 2021 | Weaviate Vector Search Engine – Introductionstackconf 2021 | Weaviate Vector Search Engine – Introduction
stackconf 2021 | Weaviate Vector Search Engine – Introduction
 
Semantic search
Semantic searchSemantic search
Semantic search
 
Recommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filteringRecommender systems: Content-based and collaborative filtering
Recommender systems: Content-based and collaborative filtering
 
stackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviatestackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviate
 
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python – using Pandas, scikit-learn, IPython Notebook ...
 
Strata sf - Amundsen presentation
Strata sf - Amundsen presentationStrata sf - Amundsen presentation
Strata sf - Amundsen presentation
 
Thought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchThought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered Search
 
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-InformationMeta-Prod2Vec: Simple Product Embeddings with Side-Information
Meta-Prod2Vec: Simple Product Embeddings with Side-Information
 
Past, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry PerspectivePast, Present & Future of Recommender Systems: An Industry Perspective
Past, Present & Future of Recommender Systems: An Industry Perspective
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 

Semelhante a Better Search Through Query Understanding

Keynote Peter Skomoroch - skills, reputation, and search
Keynote   Peter Skomoroch - skills, reputation, and searchKeynote   Peter Skomoroch - skills, reputation, and search
Keynote Peter Skomoroch - skills, reputation, and search
lucenerevolution
 
Find and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInFind and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedIn
Daniel Tunkelang
 
Path 101 Opportunity
Path 101 OpportunityPath 101 Opportunity
Path 101 Opportunity
path101
 

Semelhante a Better Search Through Query Understanding (20)

Hr analytics and graphs : job recommendations
Hr analytics and graphs : job recommendationsHr analytics and graphs : job recommendations
Hr analytics and graphs : job recommendations
 
Next generation linked in talent search
Next generation linked in talent searchNext generation linked in talent search
Next generation linked in talent search
 
AI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge Management
 
Skills, Reputation, and Search
Skills, Reputation, and SearchSkills, Reputation, and Search
Skills, Reputation, and Search
 
Keynote Peter Skomoroch - skills, reputation, and search
Keynote   Peter Skomoroch - skills, reputation, and searchKeynote   Peter Skomoroch - skills, reputation, and search
Keynote Peter Skomoroch - skills, reputation, and search
 
KEYNOTE: Skills, Reputation and Search
KEYNOTE: Skills, Reputation and SearchKEYNOTE: Skills, Reputation and Search
KEYNOTE: Skills, Reputation and Search
 
Find and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedInFind and be Found: Information Retrieval at LinkedIn
Find and be Found: Information Retrieval at LinkedIn
 
Learn to Rank search results
Learn to Rank search resultsLearn to Rank search results
Learn to Rank search results
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Path 101 Opportunity
Path 101 OpportunityPath 101 Opportunity
Path 101 Opportunity
 
How to Build a Semantic Search System
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search System
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Search Ranking Across Heterogeneous Information Sources
Search Ranking Across Heterogeneous Information SourcesSearch Ranking Across Heterogeneous Information Sources
Search Ranking Across Heterogeneous Information Sources
 
Alok das cv updated
Alok das cv updatedAlok das cv updated
Alok das cv updated
 
A Recruiter's Checklist (Strings Attached)
A Recruiter's Checklist (Strings Attached)A Recruiter's Checklist (Strings Attached)
A Recruiter's Checklist (Strings Attached)
 
Activate 2019 Opening Keynote, Will Hayes, CEO, Lucidworks
Activate 2019 Opening Keynote, Will Hayes, CEO, LucidworksActivate 2019 Opening Keynote, Will Hayes, CEO, Lucidworks
Activate 2019 Opening Keynote, Will Hayes, CEO, Lucidworks
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
 
Gary Conaway 6-21-11
Gary Conaway 6-21-11Gary Conaway 6-21-11
Gary Conaway 6-21-11
 
You Don't Know SEO
You Don't Know SEOYou Don't Know SEO
You Don't Know SEO
 
Shall we search? Lviv.
Shall we search? Lviv. Shall we search? Lviv.
Shall we search? Lviv.
 

Mais de Daniel Tunkelang

Enterprise Intelligence
Enterprise IntelligenceEnterprise Intelligence
Enterprise Intelligence
Daniel Tunkelang
 
My Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine LearningMy Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine Learning
Daniel Tunkelang
 
Web science - How is it different?
Web science - How is it different?Web science - How is it different?
Web science - How is it different?
Daniel Tunkelang
 
Search as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal JourneySearch as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal Journey
Daniel Tunkelang
 
Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?
Daniel Tunkelang
 
Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem
Daniel Tunkelang
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The People
Daniel Tunkelang
 

Mais de Daniel Tunkelang (20)

Query Understanding and Ecommerce
Query Understanding and EcommerceQuery Understanding and Ecommerce
Query Understanding and Ecommerce
 
Semantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce QueriesSemantic Equivalence of e-Commerce Queries
Semantic Equivalence of e-Commerce Queries
 
Helping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query UnderstandingHelping Searchers Satisfice through Query Understanding
Helping Searchers Satisfice through Query Understanding
 
MMM, Search!
MMM, Search!MMM, Search!
MMM, Search!
 
Enterprise Intelligence
Enterprise IntelligenceEnterprise Intelligence
Enterprise Intelligence
 
Where should you put your data scientists?
Where should you put your data scientists?Where should you put your data scientists?
Where should you put your data scientists?
 
Data Science: A Mindset for Productivity
Data Science: A Mindset for ProductivityData Science: A Mindset for Productivity
Data Science: A Mindset for Productivity
 
My Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine LearningMy Three Ex’s: A Data Science Approach for Applied Machine Learning
My Three Ex’s: A Data Science Approach for Applied Machine Learning
 
Web science - How is it different?
Web science - How is it different?Web science - How is it different?
Web science - How is it different?
 
Social Search in a Professional Context
Social Search in a Professional ContextSocial Search in a Professional Context
Social Search in a Professional Context
 
Search as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal JourneySearch as Communication: Lessons from a Personal Journey
Search as Communication: Lessons from a Personal Journey
 
Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?Enterprise Search: How do we get there from here?
Enterprise Search: How do we get there from here?
 
Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem Big Data, We Have a Communication Problem
Big Data, We Have a Communication Problem
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data Scientist
 
Information, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of NeedsInformation, Attention, and Trust: A Hierarchy of Needs
Information, Attention, and Trust: A Hierarchy of Needs
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The People
 
Content, Connections, and Context
Content, Connections, and ContextContent, Connections, and Context
Content, Connections, and Context
 
Scale, Structure, and Semantics
Scale, Structure, and SemanticsScale, Structure, and Semantics
Scale, Structure, and Semantics
 
Strata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of MicroworkStrata 2012: Humans, Machines, and the Dimensions of Microwork
Strata 2012: Humans, Machines, and the Dimensions of Microwork
 
Recommendations as a Conversation with the User
Recommendations as a Conversation with the UserRecommendations as a Conversation with the User
Recommendations as a Conversation with the User
 

Último

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Better Search Through Query Understanding