Upasna Gautam, Manager, Search, Ziff Davis
Become fluent in voice search form, function, and success. Learn how Google processes sound and conducts speech modeling; the four voice search quality metrics Google applies; and how to enhance your own strategy with tactics for targeting content by searcher need states.
5. Keyword-Focused
• Text retrieval system
• Relied on exact-match
• Weighted documents by keyword frequency
Unable to Distinguish Synonyms and Homographs
• Synonym: Words that share the same meaning (“car” and “automobile”)
• Homograph: Words having more than one meaning depending on context (“charge”)
SEO: Then and Now
Back Then:
@upasnagautam#C3NY |
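A minimal sketch of the keyword-focused retrieval described above, assuming a toy scoring function (`keyword_score`) and a two-document corpus invented for illustration. It is not how any real search engine was implemented; it only shows why exact-match frequency counting cannot connect “car” with “automobile” or tell apart two senses of “charge”.

```python
from collections import Counter


def keyword_score(query: str, document: str) -> int:
    """Score a document by how often the exact query terms appear in it."""
    term_counts = Counter(document.lower().split())
    return sum(term_counts[term] for term in query.lower().split())


# Hypothetical corpus, invented for illustration.
documents = {
    "doc_car": "used car prices and car reviews",
    "doc_automobile": "used automobile prices and automobile reviews",
}

# Synonyms: the "automobile" page scores zero for the query "car".
scores = {name: keyword_score("car", text) for name, text in documents.items()}
print(scores)

# Homographs: both senses of "charge" receive the same score.
battery = keyword_score("charge", "battery charge time")
legal = keyword_score("charge", "criminal charge filed")
print(battery, legal)
```

Both documents cover the same topic, yet only the one repeating the literal keyword is rewarded.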
6. Driven by Intent and Context
Relevant Answers to Specific
and Vague Queries
SEO: Then and Now
Now:
8. “best vegan tacos austin”
“late night tex mex delivery austin”
“best happy hour margaritas 78701”
SEO: Then and Now
Now:
9. SEO: Then and Now
Now:
Search Experience Optimization
10. SEO: Then and Now
What enabled search engines
to understand our queries
on an intelligent level?
11. SEO: Then and Now
The Hummingbird Update
(2013)
12. What is Semantic Search
(and What is it Not)?
13. A branch of linguistics that studies the relationship between words and
sentences and their actual meanings.
What is Semantic Search?
Semantics:
The improvement of search accuracy by understanding intent and
context, using various on-site elements to crawl, index, and serve
relevant results.
Semantic Search:
14. What is Semantic Search?
ENTITY OPTIMIZATION
KNOWLEDGE GRAPH
STRUCTURED DATA
INFORMATION ARCHITECTURE
CO-OCCURRENCE & CLUSTERING
15. What is Semantic Search?
ENTITY OPTIMIZATION
Paul Haahr, Google Ranking Engineer, SMX 2016
16. What is Semantic Search?
KNOWLEDGE GRAPH
Understands relationships between things
Stores and understands the intelligence
between different entities
Not just a catalog of objects, but a
data model for inter-relationships
Why don’t you explain this to me like I’m 5?
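One way to picture “a data model for inter-relationships” is a list of (subject, relation, object) triples. The sketch below is an assumption-laden toy, not Google’s Knowledge Graph: the entities, the relation names, and the `follow` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical facts stored as (subject, relation, object) triples.
triples = [
    ("Taco Stand", "located_in", "Austin"),
    ("Taco Stand", "serves", "vegan tacos"),
    ("Austin", "located_in", "Texas"),
    ("Texas", "located_in", "United States"),
]

# Index the triples by subject so relationships can be traversed.
outgoing = defaultdict(list)
for subject, relation, obj in triples:
    outgoing[subject].append((relation, obj))


def follow(entity: str, relation: str) -> list:
    """Follow one relation repeatedly (assumes the data has no cycles)."""
    chain = []
    current = entity
    while True:
        targets = [o for r, o in outgoing.get(current, []) if r == relation]
        if not targets:
            return chain
        current = targets[0]
        chain.append(current)


path = follow("Taco Stand", "located_in")
print(path)
```

A flat catalog could list “Taco Stand” and “Texas” as separate objects; the triples let the model infer that one is inside the other.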
17. What is Semantic Search?
STRUCTURED DATA
• Google is a data-driven machine that needs to be fed in order for it to learn
• Pieces of intelligence the crawler uses to build semantic relevance & authority
• This is how entities are indexed
• Speakable Schema is HERE & it’s just the beginning for voice search markup
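A hedged sketch of what Speakable markup can look like, built as JSON-LD from Python. The `speakable` property and the `SpeakableSpecification` type with `cssSelector` come from schema.org; the page name, URL, CSS selectors, and the `speakable_jsonld` helper are hypothetical placeholders.

```python
import json


def speakable_jsonld(name: str, url: str, css_selectors: list) -> dict:
    """Build a schema.org WebPage object that flags speakable sections."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors pointing at the parts of the page suited to being read aloud.
            "cssSelector": css_selectors,
        },
    }


# Hypothetical page details, for illustration only.
markup = speakable_jsonld(
    "Best Vegan Tacos in Austin",
    "https://example.com/vegan-tacos-austin",
    [".headline", ".summary"],
)
print(json.dumps(markup, indent=2))
```

The printed JSON would typically be embedded in the page inside a `<script type="application/ld+json">` element.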
18. What is Semantic Search?
INFORMATION ARCHITECTURE
• Allows for a crawler to clearly understand content and how it’s connected
• Provides a clear and hierarchical path of information
• Lends itself to a good UX
• The RIGHT approach is the most LOGICAL approach
• Must read: Information Architecture for the Web and Beyond [4th Edition, by Louis Rosenfeld, Peter Morville, and Jorge Arango]: https://www.amazon.com/Information-Architecture-Beyond-Louis-Rosenfeld/dp/1491911689
19. What is Semantic Search?
CO-OCCURRENCE & BIGRAPH CLUSTERING
Word Co-Occurrence Clustering
Generates topics from words frequently occurring together
Weighted Bigraph Clustering
Uses URLs from Google search results to induce query similarity &
generate topics
The combination of these two methods demonstrated greater usefulness and
accuracy when compared to Latent Semantic Analysis.
Read the patent here:
https://pdfs.semanticscholar.org/dcf7/05ba07ee1b73fda0c94e9d01b2474173e470.pdf
20. What is Semantic Search?
CO-OCCURRENCE & BIGRAPH CLUSTERING
Word Co-Occurrence
A set of anchor words serves as the initial topics, which are then generalized to other
words co-appearing in the same queries.
Topics are created using hierarchical clustering on query similarity, which
measures to what extent two queries agree on their intersections with the list of
words in each topic.
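The anchor-word step described above can be sketched roughly as follows. This is a simplified illustration only: the queries, the anchors, and the `min_count` threshold are invented, and the hierarchical clustering on query similarity is not reproduced here.

```python
from collections import Counter, defaultdict

# Hypothetical query log, invented for illustration.
queries = [
    "best vegan tacos austin",
    "vegan tacos delivery",
    "late night tacos delivery",
    "happy hour margaritas austin",
    "best margaritas happy hour",
]
anchors = ["tacos", "margaritas"]  # assumed seed words for the initial topics

# Count the words that co-appear with each anchor in the same query.
topics = defaultdict(Counter)
for query in queries:
    words = query.split()
    for anchor in anchors:
        if anchor in words:
            topics[anchor].update(w for w in words if w != anchor)


def expand(anchor: str, min_count: int = 2) -> list:
    """Generalize a topic to words seen with the anchor at least min_count times."""
    return sorted(w for w, count in topics[anchor].items() if count >= min_count)


taco_topic = expand("tacos")
margarita_topic = expand("margaritas")
print(taco_topic, margarita_topic)
```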
Bigraph Clustering
Uses organic results to create a bigraph with a set of queries and a set of URLs as
nodes. Weights of the graph are computed with the impression and click data.
Bigraph clustering works very well even if the queries do not share common words.
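A rough sketch of the bigraph idea, assuming hypothetical click counts as the edge weights (the slide mentions impression and click data; only clicks are modeled here). Cosine similarity is used as a stand-in measure, not necessarily the one in the cited work.

```python
import math

# Hypothetical bigraph: query nodes linked to URL nodes, weighted by clicks.
clicks = {
    "vegan tacos austin": {"tacos.example/vegan": 8, "eats.example/austin": 2},
    "plant based tex mex": {"tacos.example/vegan": 5, "eats.example/austin": 1},
    "happy hour margaritas": {"bars.example/happy-hour": 9},
}


def query_similarity(first: str, second: str) -> float:
    """Cosine similarity between the URL-weight vectors of two queries."""
    a, b = clicks[first], clicks[second]
    dot = sum(weight * b.get(url, 0) for url, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)


related = query_similarity("vegan tacos austin", "plant based tex mex")
unrelated = query_similarity("vegan tacos austin", "happy hour margaritas")
print(round(related, 3), unrelated)
```

The first two queries share no words at all, yet they score as near-identical because searchers click the same URLs for both.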
21. What is Semantic Search?
LATENT SEMANTIC INDEXING
IS NOT SEMANTIC SEARCH
23. What is Semantic Search?
Learning the mathematical relevance
helps to understand search on a
functional level
LSI uses Singular Value Decomposition (SVD),
a linear-algebra matrix factorization
that underlies many modern algorithms
It is not a way to “do SEO”
LSI KEYWORDS ARE NOT A THING
24. What is Semantic Search?
Latent Semantic Indexing (LSI):
Mathematical algorithm based on Singular Value
Decomposition (SVD)
Text indexing and retrieval method
How terms and concepts are related
Projects a large multi-dimensional space down into a
smaller number of dimensions
Semantically similar words get bunched together
Boundary blurring allows LSI to go beyond exact
keyword matching
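To make the dimensionality reduction concrete, here is a small pure-Python sketch. It assumes a toy term-document matrix and uses power iteration with deflation as a simple stand-in for a full SVD routine. All names and data are illustrative, and it is not meant as an SEO technique.

```python
import math

# Rows are terms, columns are three hypothetical documents:
# d1 = "car engine", d2 = "automobile engine", d3 = "flower petal"
terms = ["car", "automobile", "engine", "flower", "petal"]
A = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def top_right_singular_vectors(matrix, k, iterations=200):
    """Approximate the top-k right singular vectors via power iteration on A^T A."""
    n = len(matrix[0])
    gram = [[sum(row[i] * row[j] for row in matrix) for j in range(n)]
            for i in range(n)]
    vectors = []
    for _ in range(k):
        v = normalize([1.0 / (i + 1) for i in range(n)])  # fixed, non-random start
        for _step in range(iterations):
            v = normalize([dot(row, v) for row in gram])
        eigenvalue = dot(v, [dot(row, v) for row in gram])
        # Deflate: remove the component just found before searching for the next.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
                for i in range(n)]
        vectors.append(v)
    return vectors


# Project every term from 3 document dimensions down to 2 latent dimensions.
basis = top_right_singular_vectors(A, k=2)
latent = {term: [dot(row, v) for v in basis] for term, row in zip(terms, A)}

raw_similarity = cosine(A[0], A[1])  # "car" vs "automobile" before reduction
latent_similarity = cosine(latent["car"], latent["automobile"])
print(raw_similarity, round(latent_similarity, 6))
```

“car” and “automobile” never appear in the same document, so their raw similarity is zero; in the reduced space they land together because both co-occur with “engine”.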
25. What is Semantic Search?
Latent Semantic Indexing (LSI):
Noise reduction
Reveal similarities that were latent
Similar terms become more similar, while dissimilar things remain distinct
This method is a widely used technique to unveil latent themes in text data, as these models learn
the hidden topics by understanding document level word co-occurrence patterns.
26. What is Semantic Search?
Latent Semantic Indexing (LSI):
Short texts, such as search queries, tweets, or instant messages, suffer from data
sparsity, which causes problems for traditional topic modeling techniques. Unlike
proper documents, short text snippets do not provide enough word counts for
models to learn how words are related and to disambiguate multiple meanings of a
single word.
*This is why the bigraph co-occurrence/clustering model works better*
28. The Voice Search Framework
Automatic Speech Recognition
Automatic Speech Recognition (ASR),
powered by deep neural networks,
is the system behind applications like
speech transcription and voice search.
29. The Voice Search Framework
Automatic Speech Recognition
ASR is the FORM behind the voice search FUNCTION.
30. The Voice Search Framework
Automatic Speech Recognition
How do humans do it?
Human articulation produces sound waves which
the ear conveys to the brain for processing.
New phone who dis
31. The Voice Search Framework
Automatic Speech Recognition
How Do Machines Do it?
32. The Voice Search Framework
Google’s Voice Search Quality Metrics
“We strive to find metrics that illuminate the end-user experience, to make sure that we
optimize the most important aspects and make effective tradeoffs. We also design metrics
which can bring to light specific issues with the underlying technology.” -GOOG
• Google has defined and uses a set of metrics
to track the quality of its voice search system.
• They use these metrics to drive their
research directions as well as provide insight
and guidance for solving specific problems
and tuning system performance.
33. The Voice Search Framework
Google’s Voice Search Quality Metrics
• Word Error Rate (WER)
• Semantic Quality (Webscore)
• Perplexity (PPL)
• Out-of-Vocabulary Rate (OOV)
• Latency
Google Voice Search Case Study
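Three of the metrics listed above have standard textbook definitions, sketched below with invented inputs. This shows the general formulas, not Google’s internal implementation; the function names and example data are assumptions, and each function expects a non-empty input.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    previous = list(range(len(hyp) + 1))  # edit distances against an empty reference
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,                           # deletion
                current[j - 1] + 1,                        # insertion
                previous[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        previous = current
    return previous[-1] / len(ref)


def oov_rate(words: list, vocabulary: set) -> float:
    """Share of spoken words missing from the recognizer's vocabulary."""
    return sum(word not in vocabulary for word in words) / len(words)


def perplexity(word_probabilities: list) -> float:
    """exp of the average negative log-probability the model gave each word."""
    total_log = sum(math.log(p) for p in word_probabilities)
    return math.exp(-total_log / len(word_probabilities))


# Hypothetical examples, for illustration only.
wer = word_error_rate("best vegan tacos in austin", "best vegan taco austin")
oov = oov_rate(["late", "night", "tex", "mex"], {"late", "night", "delivery"})
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
print(wer, oov, round(ppl, 6))
```

In the WER example, one substitution (“tacos” to “taco”) and one deletion (“in”) against five reference words give 0.4.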
34. The Voice Search Framework
Google’s Voice Search Quality Metrics
The SERP has evolved into a
dynamic, purchase-driven environment,
with the integration of product carousels,
featured snippets with product rankings,
research carousels,
and of course,
the shopping carousel.
35. The Voice Search Framework
Google’s Voice Search Quality Metrics
A High-Quality UX is a Fast UX
From the time it takes to detect end-of-speech to the time it takes to
render search results, time is of the essence for speech processing.
“It is generally desirable to reduce any user noticeable latency, and in certain
circumstances, may be desirable to reduce latency even if improved speed
comes at the cost of reduced quality ASR results.” -GOOG
37. Tactical Takeaways
• Craft and optimize content for topics and concepts, not just keywords
• Use structured data to feed the crawler the semantic intelligence it needs to
understand your site better
--Speakable Schema is HERE and it’s just the beginning
• Align the information architecture of your website to the consumer
journey
--Navigation, sitemaps, page structure, content organization
• Invest in speed optimization
• Provide answers to SPECIFIC questions about your products and services
(Featured/Rich Snippets!)