Google is using Large Language Models and Machine Learning in the algorithms that rank your sites and show them to users.
This talk will help you better understand these systems, from BERT to Rank Brain to Neural Matching and SGE: how they work and what you should do about them.
2. @schachin
Kristine Schachinger
• Started as a front-end dev & designer
Claim to Fame – Designed Reba McEntire’s site
• Started in SEO 2005
• Consultant 2009 – Present
• Some sites I have worked with:
GoodRx, Vice Media, Zappos, Instacart, Healthline, Jack in the Box, Discover,
USA.gov, Salon.com, Paychex.com, AndroidHeadlines.com, Patch Media, etc.
• Judge: US Search Awards, UK Search Awards, EU Search Awards
and since I said yes to all the Search Awards during the pandemic, there might be more.
• Specialties: Site Auditing, Site Recoveries, Technical SEO, and all the rest.
• Articles in: WIX SEO, Search Engine Journal, Marketing Land, Search Engine Land,
and Search Engine Watch -- among others.
• Speaker: BrightonSEO San Diego, iGaming, Affiliate Summit West, BarbadosSEO,
Ungagged UK/US, State of Search, SearchLeeds, Pubcon, SMX, RIMC, SXSWi -- and others.
9. @schachin
In ONE SECOND today, there were…
[live counter of Google searches per second]
http://www.internetlivestats.com/google-search-statistics/
15. @schachin
Google Myth: AI, machine learning, & deep learning are all the same thing
While artificial intelligence (AI) is a convenient and commonplace term, it has no widely agreed-upon technical definition. One helpful way to think about AI is as the science of making things smart. Much of the recent progress we've seen in AI is based on machine learning (ML), a subfield of AI where computers learn and recognize patterns from examples, rather than being programmed with specific rules. There are many different ML techniques, but deep learning is a particularly popular one right now. Deep learning is based on neural network technology, an algorithm whose architecture is inspired by the human brain and can learn to recognize pretty complex patterns, such as what "hugs" are or what a "party" looks like.
https://ai.google/static/documents/exploring-6-myths.pdf
16. @schachin
Google Myth: AI is approaching human intelligence
"While AI systems are nearing or outperforming human beings at increasingly complex tasks like generating musical melodies or playing the game of Go, they remain narrow and brittle, and lack true agency or creativity."
https://ai.google/static/documents/exploring-6-myths.pdf
17. @schachin
THERE ARE THREE PLACES GOOGLE APPLIES MACHINE LEARNING
IN THE ORGANIC SEARCH ENGINE.
+ PRE-SCORING
LANGUAGE MODELS
+ AD HOC POST-SCORING
RANK BRAIN
NEURAL MATCHING
+ LIVE RANKING FACTORS
HELPFUL CONTENT UPDATE
THE BIG DADDIES! SGE AND MUM ARE IN A CLASS BY THEMSELVES.
19. @schachin
In the beginning there was…
Word2Vec, the word embedding model.
Semantic Search.
https://www.tensorflow.org/tutorials/representation/word2vec
20. @schachin
Word Embedding
Vector space models (VSMs) represent (embed) words in a continuous vector space where semantically similar words are mapped to nearby points ('are embedded nearby each other').
Word2Vec
https://www.tensorflow.org/tutorials/representation/word2vec
23. @schachin
• Words go in.
• Words get assigned a mathematical address in a vector.
• Similar and related words sit close to each other in the vector space.
• Words are retrieved by matching your query against the words in the "best fit" (nearest) vectors.
• These word "interpretations" are used to return results.
The beginning of Semantic Search.
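To make the "mathematical address" idea concrete, here is a minimal sketch of nearest-neighbor lookup in a vector space. This is not Google's system; the tiny 3-dimensional vectors are invented for illustration (real Word2Vec embeddings have hundreds of dimensions learned from text):

```python
import numpy as np

# Toy 3-d embeddings -- the numbers are made up purely for illustration.
embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.75, 0.2]),
    "apple": np.array([0.1, 0.2, 0.9]),
    "fruit": np.array([0.15, 0.25, 0.85]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 = same direction, 0.0 = unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(word, k=2):
    # Rank every other word by similarity to the query word.
    q = embeddings[word]
    scores = {w: cosine(q, v) for w, v in embeddings.items() if w != word}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

print(nearest("king"))   # "queen" ranks first: it sits nearby in the space
print(nearest("apple"))  # "fruit" ranks first
```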
27. @schachin
Sesame Street and Search
What is BERT?
A Natural Language Processing pre-training technique called Bidirectional Encoder Representations from Transformers, or BERT.
Moving from NLP into early NLU.
28. @schachin
Google
https://searchengineland.com/how-google-uses-artificial-intelligence-in-google-search-379746
BERT. "BERT, Bidirectional Encoder Representations from Transformers, came in 2019. It is a neural
network-based technique for natural language processing pre-training. It looks at the sequence of words
on a page, so even seemingly unimportant words in your queries are accounted for in the result."
• Year Launched: 2019
• Used For Ranking: No
• Looks at the query and content language
• All languages
• Language training model: used in pre-scoring
• Very commonly used for many queries
• Can you optimize for it? No
30. @schachin
Sesame Street and Search: BERT Definition
https://bensen.ai/elmo-meet-bert-recent-advances-in-natural-language-embeddings/
BERT, or Bidirectional Encoder Representations from Transformers, improves upon
standard Transformers by removing the unidirectionality constraint by using a masked language
model (MLM) pre-training objective. The masked language model randomly masks some of the tokens
from the input, and the objective is to predict the original vocabulary id of the masked word based only
on its context. Unlike left-to-right language model pre-training, the MLM objective enables the
representation to fuse the left and the right context, which allows us to pre-train a deep bidirectional
Transformer. In addition to the masked language model, BERT uses a next sentence prediction task
that jointly pre-trains text-pair representations.
There are two steps in BERT: pre-training and fine-tuning. During pre-training, the model is trained on
unlabeled data over different pre-training tasks. For fine-tuning, the BERT model is first initialized with
the pre-trained parameters, and all of the parameters are fine-tuned using labeled data from the
downstream tasks. Each downstream task has separate fine-tuned models, even though they are
initialized with the same pre-trained parameters.
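A quick way to see masked-token prediction in action is the Hugging Face transformers library (an assumption for illustration; this public bert-base-uncased checkpoint is not Google Search's internal model):

```python
from transformers import pipeline

# Load a public pre-trained BERT checkpoint with a fill-mask head.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# BERT uses BOTH the left context ("The man went to the") and the right
# context ("to buy milk") to predict the masked token.
for prediction in unmasker("The man went to the [MASK] to buy milk."):
    print(prediction["token_str"], round(prediction["score"], 3))
# Top predictions are words like "store" or "market".
```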
31. @schachin
LLM Transformers are Bidirectional
LLMs can go forward and backward to predict an unknown (masked) term and/or sentence.
They also use root words, so "play" covers player/playing/played (see the stemming sketch below).
This allows them to derive context for what is being written.
Previous models were based on word vectors (entities and knowledge graphs).
https://blog.google/products/search/search-language-understanding-bert/
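The "root words" point above is essentially stemming. A minimal sketch with NLTK's PorterStemmer (an assumption: the deck names no tool, and BERT-style models actually learn this effect from shared subword tokens rather than running a stemmer):

```python
from nltk.stem import PorterStemmer

# Reduce inflected forms toward a shared root ("play").
stemmer = PorterStemmer()
for word in ["play", "plays", "playing", "played"]:
    print(word, "->", stemmer.stem(word))
# All four reduce to the same stem: play
```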
32. @schachin
Sesame Street and Search: Why is BERT Special?
BERT can disambiguate words in a sentence and apply meaning forward and backward to those
words in order to predict a masked word from those applied contexts. This is SUPER EFFICIENT!
34. @schachin
Why are LLMs So Special?
Large language models can determine the meaning of words in context,
so they can better predict the next word in the sentence.
[Slide example: sentences that mean two different things read forward and backward.]
35. @schachin
How does this work? Transformers
What are transformers?
A transformer in language processing is a type of computer program that is designed to understand and generate text. It does this by using a special type of algorithm called self-attention.
Self-attention allows the program to look at all the words in a sentence or a piece of text at once, and understand how they relate to each other, rather than just one word at a time like traditional methods. This way it can better understand the meaning of the text, and can generate text that is more similar to how a human would write.
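Under the hood, self-attention is a small amount of linear algebra: softmax(QK^T / sqrt(d)) V. A minimal numpy sketch with made-up sizes follows (real transformers learn the projection matrices during training and use many attention heads):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Project each token's embedding into query, key, and value vectors.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    # Every token scores every other token at once -- this is how the model
    # relates all the words in a sentence simultaneously.
    weights = softmax(Q @ K.T / np.sqrt(d))
    # Each output is a weighted mix of all tokens' value vectors.
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))  # 5 tokens, 8-dim embeddings (toy sizes)
Wq, Wk, Wv = [rng.normal(size=(8, 8)) for _ in range(3)]
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8): one vector per token
```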
37. @schachin
Simply put, BERT, or language modeling, is:
“Language modeling – although it sounds formidable –
is essentially just predicting words in a blank.”
38. @schachin
Why does it matter to us as SEOs?
It mostly doesn’t.
It was a breakthrough in language model
processing because it is …
+ VERY Fast
+ Uses fewer resources
+ Provides better understanding of content
42. @schachin
Rank Brain.
Rank Brain & Neural Matching & the Deep Relevance Matching Model (DRMM)
"Document relevance ranking, also known as ad-hoc retrieval, is the task of ranking documents from a large collection using the query and the text of each document only."
43. @schachin
Rank Brain vs Neural Matching.
Both are used to re-order the results post-retrieval
according to "ad hoc retrieval" methods and "dynamic relevancy."
Ranking uses ONLY the document text.
• https://www.searchenginejournal.com/google-neural-matching/271125/
• http://www2.aueb.gr/users/ion/docs/emnlp2018.pdf
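This is not Google's algorithm (Rank Brain and neural matching use learned neural representations), but a toy sketch of the "ad hoc retrieval" idea the quoted paper describes: scoring documents with nothing but the query text and each document's text. Plain TF-IDF stands in for the learned model here:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Candidate documents already fetched by a first-stage retrieval step.
docs = [
    "How to change a flat bicycle tire at home",
    "Bicycle tire pressure guide for road bikes",
    "History of the penny-farthing bicycle",
]
query = "fix a flat tire on my bike"

# Score each candidate using only the query text and the document text.
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(docs)
query_vec = vectorizer.transform([query])
scores = cosine_similarity(query_vec, doc_matrix).ravel()

# Re-order the candidates by relevance score, best first.
for score, doc in sorted(zip(scores, docs), reverse=True):
    print(round(score, 3), doc)
```

Note that plain TF-IDF cannot see that "bike" and "bicycle" mean the same thing; closing exactly that lexical gap is what the neural approaches are for.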
59. @schachin
Rank Brain.
• When do you see it? When relationships between entities & search intent are weak or unknown -- enter Rank Brain.
• Behind the scenes, data is continually fed into the machine learning process to make results more relevant the next time.
• Can be combined with other algorithms such as neural matching.
• No way to optimize for it.
• BUT you can help prevent your page from getting one of these results: check the results for your queries.
Make sure Google is NOT CONFUSED.
63. @schachin
Google
https://searchengineland.com/how-google-uses-artificial-intelligence-in-google-search-379746
Neural matching. Neural matching was released in 2018 and expanded to the local search results in 2019.
Neural matching specifically helps Google rank search results and is part of the post-scoring ad hoc
ranking algorithms.
Links CANNOT affect this ranking sort.
• Year Launched: 2018
• Used For Ranking: Yes (but post scoring)
• Looks at the query and content language
• Works for all languages
• Very commonly used for many queries
• Applied post scoring ad hoc
• Can you optimize for it? Yes and No
70. @schachin
Rank Brain vs Neural Matching.
RankBrain helps Google better relate pages to concepts.
Neural Matching helps Google better relate words to searches.
• Rank Brain = page concepts
• Neural Matching = linking words to the page concepts
"…neural matching – AI method to better connect words to concepts." - Google
https://www.seroundtable.com/google-explains-neural-matching-vs-rankbrain-27300.html
73. @schachin
Google Helpful Content Update
“Our classifier for this update runs continuously, allowing it to monitor newly-launched sites and
existing ones. As it determines that the unhelpful content has not returned in the long-term, the
classification will no longer apply.
This classifier process is entirely automated, using a machine-learning model.”
https://developers.google.com/search/blog/2022/08/helpful-content-update
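Google has not published how this classifier works. Purely as a hypothetical sketch of what "an entirely automated classifier using a machine-learning model" means, here is a tiny supervised text classifier; the labels, examples, and features are all invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training examples -- Google's real signals and labels are unknown.
pages = [
    "Step-by-step repair guide with photos, tools list, and safety notes",
    "Original test results comparing ten products we actually bought",
    "Top 10 best things ever, click here, as seen on other top 10 lists",
    "Content written only to rank, repeating the keyword in every sentence",
]
labels = ["helpful", "helpful", "unhelpful", "unhelpful"]

# TF-IDF features + logistic regression: a classic minimal text classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(pages, labels)

print(model.predict(["A hands-on guide with our own measurements and photos"]))
```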
75. @schachin
Google Helpful Content Update
Main Points
• Ranking signal, NOT an update
• First known ranking signal that uses machine learning
• Continually rolling but with delays, so it can take 2-3 months to catch up with your site
• Sitewide, but severity is based on the number of affected pages
• Other factors can lessen the devaluation (like content quality on other pages)
• Seems to target what Panda and Penguin did, with an additional focus on the quality of "usefulness" or "helpfulness"
• Is your content differentiating itself?
[DALL-E image: "Angry SEO"]
76. @schachin
Helpful Content + Page Experience
“Helpful content generally offers a good page
experience. That's why today, we've added a
section on page experience to our guidance on
creating helpful content and revised our help
page about page experience. We think this all will
help site owners consider page experience more
holistically as part of the content creation
process…”
https://developers.google.com/search/blog/2023/04/page-experience-in-search
78. @schachin
Google Myth: can't detect AI content.
AI systems can predict that content is likely
created by AI.
How?
AI cannot create anything. It is only able to
use what it knows to detect patterns and then,
in the case of content, use those patterns to
"write content."
So, AI can recognize the patterns of how AI would
"write" and determine a likelihood that a given
item was written by AI.
It is not 100% accurate, but it can be done.
Google has an algorithm that detects
AI-repurposed scraped content.
https://ai.google/static/documents/exploring-6-myths.pdf
https://www.seroundtable.com/google-ai-plagiarized-content-34495.html
80. @schachin
AI Content, Google, and the HCU.
Google says AI content is okay IF it provides value and is not "spammy."
But since it is writing from what it was trained on, how does it provide value?
81. @schachin
AI Content, Google, and the HCU.
How does Google define "spammy" content?
82. @schachin
AI Content, Google, and the HCU.
Google and the Helpful Content Update.
https://developers.google.com/search/blog/2022/08/helpful-content-update
85. @schachin
Google MUM (Multitask Unified Model)
“…has the potential to transform how Google helps you with complex tasks. MUM
uses the T5 text-to-text framework and is 1,000 times more powerful than BERT.
MUM not only understands language, but also generates it.”
Built on top of BERT.
____________
Possible related patent
https://www.searchenginejournal.com/what-is-google-mum/407844/
https://blog.google/products/search/introducing-mum/
https://www.fastcompany.com/90681337/google-mum-search
86. @schachin
“The choice of multimodal models fits Google because of the increased number of non-text
based sources, such as video in the form of livestreams or similar, and audio files, as in the
case of podcasts. To develop MUM, Google trained the algorithm "across 75 different
languages and many different tasks at once" to refine its comprehension of information and
digital details.
MUM also considers knowledge across languages, comparing a query to sources that aren’t
written in the user's native language to bring better information accuracy.
As a result Google claims MUM is 1,000 times more powerful than
BERT.”
https://www.cmswire.com/digital-marketing/what-marketers-can-expect-from-google-mum/
Google MUM (Multitask Unified Model)
87. @schachin
Reid acknowledges that MUM carries its own risks. "Any time you're training a model based on
humans, if you're not thoughtful, you'll get the best and worst parts," she says. She emphasizes
that Google uses human raters to analyze the data used to train the algorithm and then assess
the results, based on extensive published guidelines.
“Our raters help us understand what is high quality content, and that’s what we use as
the basis,” she says. “But even after we’ve built the model, we do extensive testing, not
just on the model overall, but trying to look at slices so that we can ensure that there is
no bias in the system.”
The importance of this step is one reason why Google isn’t
deploying all its MUM-infused features today.”
https://www.cmswire.com/digital-marketing/what-marketers-can-expect-from-google-mum/
Google MUM (Multitask Unified Model)
91. @schachin
Do you optimize for Machine Learning?
AI is ever-changing and unfixed.
Don't waste time and resources on gaming it.
But you can make it easier for the machine
learning to get it right.
97. @schachin
Simple answer to a very complex issue?
Do your normal query research,
check the SERPs for Rank Brain issues
and then just write naturally.
Use specificity (topical hubs) PLUS
depth & breadth to create holistic content.
98. @schachin
Write holistic content? Does your content have depth, breadth, & semantic relationships?
Use terms that are semantically related. Image search is great for showing related terms.
102. @schachin
What is Structured Data?
Structured data, for SEO purposes, is on-page markup that
enables search engines to better understand the information
on your site's web pages, and then use this information
to improve search result listings by better matching user intent.
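Structured data is most often added as schema.org markup in JSON-LD. A minimal sketch that emits Article markup (the field values are placeholders to swap for your page's real details):

```python
import json

# Placeholder values -- swap in your page's real details.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How Google Uses Machine Learning in Search",
    "author": {"@type": "Person", "name": "Kristine Schachinger"},
    "datePublished": "2023-06-01",
}

# Paste the output into the page inside:
# <script type="application/ld+json"> ... </script>
print(json.dumps(article, indent=2))
```

Google's structured data documentation lists which types and properties are eligible for rich results.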
108. @schachin
We can help give Google a clearer understanding.
That helps us help Google better answer
the questions users ask
and better surface our content for those users.
We give our data meaning.
Google understands.
112. @schachin
Well Formed Text & Parsey McParseFace.
http://www.kurzweilai.net/google-open-sources-natural-language-understanding-tools
Ray Kurzweil on Google NLU
113. @schachin
Questions = Well Formed Text
https://ai.google/research/pubs/pub47323
“Understanding natural language queries is fundamental to many practical NLP
systems. Often, such systems comprise of a brittle processing pipeline, that is not
robust to "word salad" text ubiquitously issued by users. However, if a query
resembles a grammatical and well-formed question, such a pipeline is able to
perform more accurate interpretation, thus reducing downstream compounding
errors.”
117. @schachin
Takeaways.
• Think Search Queries NOT Simple Keywords
• Write in natural language
• Write using holistic content
• Focus on depth and breadth with related terms
• Add Structured Data
• Use well-formed text (i.e., questions) when you can.