Inside Search
Metro, NYC
November 15, 2017
Hello
EXERCISE
Exercise
Tell me about you.
1. What’s your position?
2. What kind of organization are you from?
3. Why are you interested in search?
Agenda
don’t worry
1. IR Basics
2. Search in the Modern Era
Agenda
Information Retrieval 101
Information User
IR 101
“pluto” Documents
IR 101
pluto?
Inverted Index
Inverted Index
1
2
3
“pluto and goofy...”
“pluto the dwarf planet...”
“stranger things...”
“pluto and goofy”
pluto
and
goofy
Inverted Index
pluto → 1, 2
planet → 2
stranger → 3
Inverted Index
1 2
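The index above can be sketched in a few lines of Python. This is a minimal illustration, not how a production engine stores postings: the document texts are the truncated slide examples, and `build_inverted_index` is a hypothetical helper name that uses plain whitespace splitting as its tokenizer.

```python
from collections import defaultdict

# The three example documents from the slides, keyed by document ID
# (texts truncated where the slides show "...").
docs = {
    1: "pluto and goofy",
    2: "pluto the dwarf planet",
    3: "stranger things",
}

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():  # naive whitespace tokenization
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

index = build_inverted_index(docs)
print(index["pluto"])     # [1, 2]
print(index["planet"])    # [2]
print(index["stranger"])  # [3]
```

Looking up "pluto" is then a single dictionary access rather than a scan over every document.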
Ranking
Term Frequency
Inverse Document Frequency
TFIDF = Term Frequency / Document Frequency
1
2
“pluto and goofy...”
“pluto the dwarf planet...pluto”
Ranking
1
2
Term Frequency = 1
Term Frequency = 2
Document Frequency = 2
Ranking
1
2
TFIDF = 1/2 = 0.5
TFIDF = 2/2 = 1
Ranking
1
2
“pluto and goofy...”, TFIDF = 0.5
“pluto the dwarf planet...pluto”, TFIDF = 1
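A small sketch of the ranking arithmetic above, assuming the simplified score used on these slides (term frequency divided by document frequency). Search engines typically use a logarithmic inverse document frequency instead; the function names here are illustrative only.

```python
# Example documents from the ranking slides (texts truncated at "...").
docs = {
    1: "pluto and goofy",
    2: "pluto the dwarf planet pluto",
}

def term_frequency(term, text):
    """How many times the term occurs in one document."""
    return text.split().count(term)

def document_frequency(term, docs):
    """How many documents contain the term at least once."""
    return sum(1 for text in docs.values() if term in text.split())

def tfidf(term, doc_id, docs):
    """Simplified slide formula: term frequency / document frequency."""
    df = document_frequency(term, docs)
    if df == 0:
        return 0.0
    return term_frequency(term, docs[doc_id]) / df

print(tfidf("pluto", 1, docs))  # 0.5
print(tfidf("pluto", 2, docs))  # 1.0

# Rank document IDs by descending score for the query "pluto".
ranked = sorted(docs, key=lambda d: tfidf("pluto", d, docs), reverse=True)
print(ranked)  # [2, 1]
```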
Ranking
IR 101
Index Time Query Time
Forward Index
Tokenize
Inverted Index
Matching
Ranking
Multiple Terms
IR 101
Multiple Terms
“pluto planet”
“pluto planet”
pluto
planet
Multiple Terms
pluto → 1, 2
planet → 2
stranger → 3
Multiple Terms
OR AND
Multiple Terms
pluto → 1, 2
planet → 2
1 2
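OR and AND matching reduce to set union and set intersection over the postings lists. A minimal sketch, assuming the index from the slides is held as a plain dictionary; `match_or` and `match_and` are hypothetical helper names.

```python
# Postings lists from the slides: term -> document IDs.
index = {"pluto": [1, 2], "planet": [2], "stranger": [3]}

def match_or(terms, index):
    """Documents containing at least one of the terms (union)."""
    result = set()
    for term in terms:
        result |= set(index.get(term, []))
    return sorted(result)

def match_and(terms, index):
    """Documents containing every term (intersection)."""
    postings = [set(index.get(term, [])) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

query = "pluto planet".split()
print(match_or(query, index))   # [1, 2]
print(match_and(query, index))  # [2]
```

OR returns more documents and AND returns fewer, which is the same trade-off the Measurement slides describe as recall versus precision.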
Multiple Terms
∑ (Term Frequency / Document Frequency)
Multiple Terms
score = tf-idf(pluto) + tf-idf(planet)
1
2
“pluto and goofy...”
“pluto the dwarf planet...”
Multiple Terms
1
2
tf-idf(pluto) = ½, tf-idf(planet) = 0/1
tf-idf(pluto) = ½, tf-idf(planet) = 1/1
Multiple Terms
Document Frequency(pluto) = 2
Document Frequency(planet) = 1
1
2
½ + 0 = 0.5
½ + 1 = 1.5
Multiple Terms
1
2
“pluto and goofy...”, TFIDF = 0.5
“pluto the dwarf planet...”, TFIDF = 1.5
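The multi-term score can be reproduced with a short sketch, again assuming the simplified tf/df formula from the slides and whitespace tokenization. The `score` function is an illustrative name, not part of any library.

```python
# Example documents from the slides (texts truncated at "...").
docs = {
    1: "pluto and goofy",
    2: "pluto the dwarf planet",
}

def score(query, doc_id, docs):
    """Sum of (term frequency / document frequency) over the query terms."""
    total = 0.0
    doc_terms = docs[doc_id].split()
    for term in query.split():
        df = sum(1 for text in docs.values() if term in text.split())
        if df:  # skip terms that appear in no document
            total += doc_terms.count(term) / df
    return total

print(score("pluto planet", 1, docs))  # 0.5
print(score("pluto planet", 2, docs))  # 1.5
```

"planet" appears in only one document, so it contributes more to the score than "pluto", which appears in both.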
Multiple Terms
Measurement
IR 101
Measurement
Precision: How good are the results?
Recall: How many of the good documents are in the results?
Measurement
Precision = # Good Results / # Results
Recall = # Good Results / # Good Documents
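Both ratios are easy to compute once you have a set of returned results and a set of documents judged good. In this sketch the relevance judgments are made up for illustration; deciding what counts as "good" is the hard part in practice.

```python
def precision(results, good_docs):
    """Fraction of returned results that are good."""
    results = set(results)
    if not results:
        return 0.0
    return len(results & set(good_docs)) / len(results)

def recall(results, good_docs):
    """Fraction of good documents that were returned."""
    good_docs = set(good_docs)
    if not good_docs:
        return 0.0
    return len(set(results) & good_docs) / len(good_docs)

# Hypothetical judgments: the engine returned documents 2 and 1,
# but the good documents for this query are 1 and 5.
results = [2, 1]
good_docs = {1, 5}

print(precision(results, good_docs))  # 0.5
print(recall(results, good_docs))     # 0.5
```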
Measurement
What’s good?
2
1
“pluto and goofy…”
“pluto the dwarf planet...”
Measurement
“pluto planet”
5
1
“Pluto lost its status…”
“pluto the dwarf planet...”
Measurement
“pluto”
Measurement
Precision Recall
EXERCISE
Exercise
Find a precision and a recall problem.
Analysis
IR 101
5
1
“Pluto lost its status…”
“pluto the dwarf planet...”
Analysis
“pluto”
“Pluto lost its status...”
Pluto
lost
its
status
Analysis
Pluto
lost
its
status
Analysis
pluto
lost
its
status
Analysis
Analysis
1. Tokenize
2. Transform each Token
Analysis
What happens if I search for Pluto?
pluto → 1, 2
Analysis
Index Time Query Time
Forward Index
Tokenize, Analysis
Inverted Index
Tokenize, Analysis
Matching
Ranking
Analysis
What happens if I search for planets?
planet → 1, 2
Analysis
planetsplanets
Analysis
precision recall
stemming
lemmatization
aggressive
stemming
planetsjournalism
Analysis
Whitespace Tokenizer → Lowercase → Stem
Analysis Chain
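A toy version of this chain, to make the steps concrete. The stemmer below is a deliberately crude assumption (it only strips a trailing "s") and stands in for a real stemming algorithm; all function names are illustrative.

```python
def whitespace_tokenize(text):
    """Split text into tokens on whitespace."""
    return text.split()

def lowercase(token):
    """Normalize case so "Pluto" and "pluto" match."""
    return token.lower()

def naive_stem(token):
    """Toy stemmer: strip one trailing "s" from tokens longer than 3 characters."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def analyze(text):
    """Run the chain: tokenize, then lowercase and stem each token."""
    return [naive_stem(lowercase(t)) for t in whitespace_tokenize(text)]

print(analyze("Pluto lost its status"))  # ['pluto', 'lost', 'its', 'statu']
print(analyze("planets"))                # ['planet']
```

The same `analyze` function has to run at index time and at query time, otherwise a search for "Planets" would never reach the indexed term "planet". The mangled "statu" shows how even a simple stemming rule trades precision for recall.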
EXERCISE
Exercise
Think about
1. Tokenization
2. Normalization
3. Stemming
Find 3 analysis-related problems.
Beyond the 70s
Machine-Learned Relevance
Beyond the 70s
Machine-Learned Relevance
1. Supervised (Learning to Rank)
2. Unsupervised
Learning to Rank
TF-IDF ≠ relevance
Learning to Rank
“pluto”
0.8
Document
Model
Learning to Rank
“pluto”
Yes/No
Document
Learning to Rank
Problems
1. Require lots of data
2. Difficult to train
3. Difficult to administer/serve
4. Bias
SKIN WINS
Unsupervised
Unsupervised
How do you know it’s any good?
Search is UX
Beyond the 70s
Search is UX
Search ≠ Ranking
Search is UX
Tools
1. Facets
2. Results
3. Autosuggest
4. Suggestions
Search is UX
Reasons to focus on UX
1. Low-Intent Traffic
2. Leverage
3. Mobile
4. Lower-Stakes
EXERCISE
Exercise
Suggest 3 UX enhancements.
Think about
1. Exploration
2. Disambiguation
3. Mobile
Tools
1. Facets
2. Results
3. Autosuggest
4. Suggestions
Query Understanding
Beyond the 70s
Query Understanding
keywords ≠ ideas
Search results for “dress shirt”
Query Understanding
keywords ideas
Query Understanding
“dress shirt”
Category:
Clothing > Shirts > Dress Shirts
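One very simple way to map a query to a category is a phrase lookup that prefers the longest match. This is a rule-based sketch under stated assumptions, not the approach of any particular engine: the rule table is hypothetical, and only the "dress shirt" category comes from the slide.

```python
# Hypothetical rule table: phrase -> category path.
CATEGORY_RULES = {
    "dress shirt": "Clothing > Shirts > Dress Shirts",  # from the slide
    "dress": "Clothing > Dresses",                      # assumed for contrast
}

def classify_query(query, rules=CATEGORY_RULES):
    """Return the category of the longest phrase in the query that has a rule."""
    tokens = query.lower().split()
    # Try the longest phrases first so "dress shirt" wins over "dress".
    for size in range(len(tokens), 0, -1):
        for start in range(len(tokens) - size + 1):
            phrase = " ".join(tokens[start:start + size])
            if phrase in rules:
                return rules[phrase]
    return None

print(classify_query("dress shirt"))  # Clothing > Shirts > Dress Shirts
print(classify_query("red dress"))    # Clothing > Dresses
print(classify_query("fabric"))       # None
```

Matching on the keyword "dress" alone would send a "dress shirt" search to dresses, which is the gap between keywords and ideas.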
Query Understanding
Tokenization
Stemming
Synonyms
Language Detection
Spelling Correction
Entity Recognition
Query Expansion
Query Relaxation
Query Classification
Query Parsing
Query Segmentation
Knowledge Graphs
Query Understanding
Too many tools.
Too many theoretical problems.
Query Understanding
Start with data:
1. What are people looking for?
2. When are they not successful?
Query Understanding
Search Exit
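A search exit rate per query is one way to find where people are not successful. The sketch below assumes a hypothetical log of (query, exited) pairs, where `exited` means the visitor left right after seeing the results; real analytics schemas will differ.

```python
from collections import defaultdict

# Hypothetical search log: (query, did the visitor leave after the results?).
log = [
    ("dress", True),
    ("dress", True),
    ("dress", False),
    ("dress shirt", False),
    ("fabric", True),
]

def exit_rate_by_query(log):
    """Share of searches for each query that ended in an exit."""
    searches = defaultdict(int)
    exits = defaultdict(int)
    for query, exited in log:
        searches[query] += 1
        if exited:
            exits[query] += 1
    return {query: exits[query] / searches[query] for query in searches}

rates = exit_rate_by_query(log)

# List the worst-performing queries first.
for query, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{query}: {rate:.2f}")
```

Sorting by exit rate, ideally weighted by search volume, gives a short list of queries worth fixing first.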
Search results for “dress”
Query Understanding
“fabric” → Model → Supply!
Query Understanding
“fabric” → You → Supply!
Search results for “fanny pack”
Search results for “bum bag”
Users and documents speak different languages.
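A synonym map is one common way to bridge that gap. This is a minimal sketch of query expansion, assuming a hand-maintained, hypothetical synonym table built from the "fanny pack" / "bum bag" example.

```python
# Hypothetical synonym table: normalized query -> equivalent phrasings.
SYNONYMS = {
    "fanny pack": ["bum bag"],
    "bum bag": ["fanny pack"],
}

def expand_query(query, synonyms=SYNONYMS):
    """Return the normalized query followed by any known synonyms."""
    normalized = query.lower().strip()
    return [normalized] + synonyms.get(normalized, [])

print(expand_query("Fanny Pack"))  # ['fanny pack', 'bum bag']
print(expand_query("hat"))         # ['hat']
```

The expanded variants would then be matched with OR, so a document that only says "bum bag" can still be found by someone who types "fanny pack".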
Hat?
Query Understanding
User Data Metadata
Query Understanding
Search results for “red nascar”
EXERCISE
Exercise
Optimize some important or underperforming queries.
Problems
1. Precision
2. Recall
3. Exploration
Tools
1. Analysis
2. Understanding
3. Metadata
4. UX
Search as a System
gio@relatedworks.io
www.relatedworks.io
