#pubcon
Semantics and Search
Presented by:
Upasna Gautam
aka Pas
Objectives
•What is semantic search?
•What is NOT semantic search?
•How does Google make it work?
•How can you make it work?
SEO: Then & Now
Back then:
•Keyword-focused:
• Text retrieval systems relied on exact-match keywords
• Documents were weighted by keyword frequency
•Unable to distinguish synonyms and homographs
• Synonym: words that share the same meaning (e.g. “car” and “automobile”)
• Homograph: a word with more than one meaning depending on context (e.g. “charge”)
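To make the limitation concrete, here is a minimal sketch of exact-match, frequency-weighted scoring. The function name `keyword_score` and the sample documents are hypothetical and only illustrate the idea; this is not how any particular search engine was implemented.

```python
from collections import Counter

def keyword_score(query: str, document: str) -> int:
    """Score a document by how often the exact query terms appear in it."""
    doc_counts = Counter(document.lower().split())
    return sum(doc_counts[term] for term in query.lower().split())

# Hypothetical documents: "a" repeats the keyword, "b" uses a synonym.
docs = {
    "a": "used car for sale cheap car dealer",
    "b": "used automobile for sale by owner",
}

scores = {name: keyword_score("car", text) for name, text in docs.items()}
print(scores)  # document "b" gets no credit for the synonym "automobile"

# Homographs are not told apart either: both senses of "charge" score the same.
battery = keyword_score("charge", "phone battery charge")
legal = keyword_score("charge", "criminal charge filed")
print(battery, legal)
```

Document "b" is just as relevant to a person searching for a car, but a pure keyword counter cannot see that.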
SEO: Then & Now
Now:
•Driven by intent and context
•Provides relevant answers to complex and vague queries
SEO: Then & Now
Now:
•“best vegan tacos austin”
•“late night texmex delivery austin”
•“best happy hour margaritas 78701”
SEO: Then & Now
Now:
Search Experience Optimization
SEO: Then & Now
What enabled search engines to understand our queries on an intelligent level?
Hummingbird
2013
What is Semantic Search
Semantics:
A branch of linguistics that studies the relationship between words and sentences and their actual meanings.
Semantic Search:
The improvement of search accuracy by understanding intent and context, using various on-site elements to crawl, index, and serve relevant results.
What is Semantic Search
•Entity Optimization
•Knowledge Graph
•Structured Data
•Information Architecture
•Co-occurrence and Clustering
What is Semantic Search:
Entity Optimization
Paul Haahr – Google Ranking Engineer – SMX 2016
What is Semantic Search:
Knowledge Graph
•Understands relationships between things
•Stores and understands the intelligence between different entities
•Not just a catalog of objects, but a data model for inter-relationships
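One common way to picture "a data model for inter-relationships" is a set of (subject, relation, object) triples. The sketch below is a toy illustration under that assumption; the entities, relation names and helper functions are all hypothetical, and it says nothing about how Google's Knowledge Graph is actually stored.

```python
from collections import defaultdict

# Hypothetical facts expressed as (subject, relation, object) triples.
triples = [
    ("Example Taco House", "located_in", "Austin"),
    ("Example Taco House", "serves", "tacos"),
    ("Austin", "located_in", "Texas"),
    ("tacos", "is_a", "Tex-Mex dish"),
]

# Index the triples so relationships can be followed from any entity.
graph = defaultdict(lambda: defaultdict(list))
for subj, rel, obj in triples:
    graph[subj][rel].append(obj)

def objects(entity, relation):
    """Return a copy of everything `entity` points to via `relation`."""
    return list(graph.get(entity, {}).get(relation, []))

def containing_places(entity):
    """Follow `located_in` edges transitively (restaurant -> city -> state)."""
    places = []
    frontier = objects(entity, "located_in")
    while frontier:
        place = frontier.pop(0)
        places.append(place)
        frontier.extend(objects(place, "located_in"))
    return places

print(containing_places("Example Taco House"))
```

Because relationships are stored rather than just objects, the restaurant can be connected to "Texas" even though no single fact states that directly.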
What is Semantic Search:
Structured Data
•Google is a data-driven machine that needs to be fed in order for it to learn
•Feed it structured data – it’s a piece of intelligence the crawler uses to build semantic relevance and authority
•This is how entities are indexed!
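As an illustration of what "feeding structured data" can look like, the sketch below builds a JSON-LD snippet using the schema.org vocabulary. The slides do not name a format, so JSON-LD and schema.org are assumptions here, and the business details are made up.

```python
import json

# Hypothetical business described with schema.org types and properties.
restaurant = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Taco House",
    "servesCuisine": ["Tex-Mex", "Vegan"],
    "address": {
        "@type": "PostalAddress",
        "addressLocality": "Austin",
        "addressRegion": "TX",
        "postalCode": "78701",
    },
}

# Serialize and wrap in the script tag that would be placed in the page's HTML.
json_ld = json.dumps(restaurant, indent=2)
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'
print(script_tag)
```

The markup states the entity type, cuisine and location explicitly, so a crawler does not have to infer them from the page copy.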
What is Semantic Search:
Information Architecture
•Allows a crawler to clearly understand content and how it’s connected
•Provides a clear and hierarchical path of information
•Lends itself to a good UX
•The RIGHT approach is the most LOGICAL approach
•Must read: Information Architecture for the World Wide Web [3rd Edition, by Peter Morville and Louis Rosenfeld]: https://www.amazon.com/Information-Architecture-World-Wide-Web/dp/0596527349
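A "clear and hierarchical path of information" can be pictured as a tree in which every page has exactly one route from the home page. The sketch below is a hypothetical site map and a helper, `breadcrumb`, that recovers that route; the page names are invented for illustration.

```python
# Hypothetical site hierarchy: each key is a page, each value its child pages.
site = {
    "Home": {
        "Menu": {"Tacos": {}, "Margaritas": {}},
        "Locations": {"Austin": {}},
    }
}

def breadcrumb(tree, target, path=()):
    """Return the path from the root to `target`, or None if it is absent."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return current
        found = breadcrumb(children, target, current)
        if found:
            return found
    return None

print(" > ".join(breadcrumb(site, "Tacos")))
```

If every page resolves to one unambiguous breadcrumb like this, both users and crawlers can tell where a piece of content sits and what it relates to.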
What is Semantic Search:
Co-Occurrence and Clustering
Word Co-Occurrence Clustering
• Generates topics from words frequently occurring together
Weighted Bigraph Clustering
• Uses URLs from Google search results to induce query similarity and generate topics
The combination of these two methods demonstrated greater usefulness and accuracy when compared to Latent Semantic Analysis.
Read the paper here:
https://pdfs.semanticscholar.org/dcf7/05ba07ee1b73fda0c94e9d01b2474173e470.pdf
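The first idea, topics from words that frequently occur together, starts from simple pair counts. The sketch below only counts word pairs across a handful of made-up queries and keeps the pairs seen at least twice; it is a simplified illustration, not the clustering procedure from the linked document, and the threshold of 2 is an arbitrary assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical search queries.
queries = [
    "vegan tacos austin",
    "best vegan tacos",
    "vegan tacos delivery",
    "happy hour margaritas",
    "best happy hour austin",
]

# Count every unordered word pair that appears within the same query.
pair_counts = Counter()
for q in queries:
    words = sorted(set(q.split()))
    pair_counts.update(combinations(words, 2))

# Pairs that co-occur in at least two queries hint at a shared topic.
frequent = [pair for pair, n in pair_counts.items() if n >= 2]
print(frequent)
```

In this toy data, "tacos"/"vegan" and "happy"/"hour" stand out as candidate topic seeds, while one-off pairs are ignored.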
What is Semantic Search:
Co-Occurrence and Clustering
Word Co-Occurrence
• A set of anchor words serves as initial topics, which are then generalized to other words co-appearing in the same queries.
• Topics are created using hierarchical clustering on query similarity, which measures to what extent two queries agree on their intersections with the list of words in each topic.
Bigraph Clustering
• Uses organic results to create a bigraph with a set of queries and a set of URLs as nodes. Edge weights are computed from impression and click data.
• Bigraph clustering works very well even if the queries do not share common words
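The bigraph idea can be sketched by treating each query as a vector of weights over the URLs it leads to, then grouping queries whose vectors point the same way. Everything below is an assumption made for illustration: the click counts and URLs are invented, only clicks (not impressions) are used as weights, and the greedy threshold grouping stands in for whatever clustering the original method uses.

```python
from math import sqrt

# Hypothetical query -> {url: clicks} edges of the query/URL bigraph.
clicks = {
    "car insurance quote": {"insure.example/auto": 40, "compare.example/auto": 10},
    "auto coverage rates": {"insure.example/auto": 30, "compare.example/auto": 20},
    "vegan tacos austin": {"eats.example/tacos": 50},
}

def cosine(a, b):
    """Cosine similarity between two sparse URL-weight vectors."""
    dot = sum(w * b.get(url, 0) for url, w in a.items())
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def cluster(edges, threshold=0.5):
    """Greedily group queries that are similar to any member of a group."""
    clusters = []
    for query in edges:
        for group in clusters:
            if any(cosine(edges[query], edges[other]) >= threshold for other in group):
                group.append(query)
                break
        else:
            clusters.append([query])
    return clusters

print(cluster(clicks))
```

The first two queries share no words at all, yet they land in the same group because searchers click the same URLs for both.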
Latent Semantic Indexing is NOT Semantic Search
BUT…
• Learning the underlying mathematics helps you understand how search works on a functional level
• LSI uses Singular Value Decomposition (SVD), a matrix factorization from linear algebra that underlies many of our modern algorithms
• It is not a way to “do SEO”
• LSI KEYWORDS ARE NOT A THING
Latent Semantic Indexing
Latent Semantic Indexing (LSI):
•Mathematical algorithm based on Singular Value Decomposition (SVD)
•Text indexing and retrieval method
•Models how terms and concepts are related
Latent Semantic Indexing
•LSI works by projecting a large multi-dimensional space down into a smaller number of dimensions
•Semantically similar words get bunched together
•Boundary blurring allows LSI to go beyond exact keyword matching
Latent Semantic Indexing
•LSI uses Singular Value Decomposition (SVD) to decompose the term-document matrix
•Preserves information about relative distances between document vectors
•The matrix is collapsed into fewer dimensions
•Information is lost and words are superimposed on one another
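The effect of collapsing dimensions can be seen on a tiny example. The sketch below assumes NumPy is available and uses a made-up 5-term by 4-document count matrix; keeping only the top `k = 2` singular values is an arbitrary choice for illustration.

```python
import numpy as np

terms = ["car", "automobile", "engine", "taco", "salsa"]
# Columns are four tiny hypothetical documents:
# d1 = "car engine", d2 = "automobile engine", d3 = d4 = "taco salsa"
A = np.array([
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # taco
    [0, 0, 1, 1],  # salsa
], dtype=float)

def cosine(u, v):
    """Cosine similarity, returning 0.0 for zero vectors."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Decompose the term-document matrix and keep only the k strongest dimensions.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
term_vectors = U[:, :k] * s[:k]  # each row is a term in the k-dimensional latent space

car, automobile, taco = (term_vectors[terms.index(t)] for t in ("car", "automobile", "taco"))
raw_similarity = cosine(A[0], A[1])          # "car" vs "automobile" before reduction
latent_similarity = cosine(car, automobile)  # the same pair after reduction
unrelated_similarity = cosine(car, taco)

print(raw_similarity, latent_similarity, unrelated_similarity)
```

"car" and "automobile" never share a document, so their raw similarity is zero, but both co-occur with "engine" and end up superimposed in the reduced space, while "taco" stays apart.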
Latent Semantic Indexing
•Noise reduction
•Reveal similarities that were latent
•Similar terms become more similar, while dissimilar terms remain distinct
This method is a widely used technique to unveil latent themes in text data, as these models learn the hidden topics by understanding document-level word co-occurrence patterns.
Latent Semantic Indexing
Short texts, such as search queries, tweets, or instant messages, suffer from data sparsity, which causes problems for traditional topic modeling techniques. Unlike proper documents, short text snippets do not provide enough word counts for models to learn how words are related and to disambiguate multiple meanings of a single word.
*This is why the binary co-occurrence/clustering model works better*
Key Takeaways
•Craft and optimize content for topics and concepts, not just keywords
•Use structured data to feed the crawler the semantic intelligence it needs to understand your site better
•Align the information architecture of your website to the consumer journey
• Navigation, sitemaps, page structure, content organization
•Stop saying/using “LSI keywords”
•The best approach is the most logical approach!
The End

Semantics and Search by Upasna Gautam at PubCon Austin 2018
