Under Sundar Pichai, Google is doubling down on machine learning and artificial intelligence. Computer capabilities are improving at a frightening rate, and there are already parts of our jobs that would be better off done by robots. In this talk, Will is going to highlight the areas where humans are falling behind and give you some tips on what to do about it.
SearchLove Boston 2017 | Will Critchlow | Building Robot Allegiances
1. Knowing ranking factors won’t be enough
How to avoid losing your job to a robot
@willcritchlow
2. I’m going to tell you about a robot that understands ranking factors better than any of you
...but before I get to that, let’s look at a bit of history...
10. My mental model for ~2009 ranking factors had three different modes:
11. One in the hyper-competitive head, one in the competitive mid-tail, and one in the long-tail
13. In the hyper-competitive head: tons of perfectly on-topic pages to choose from
14. So pick only perfectly-on-topic pages...
15. ...and rank by authority (*)
(*) Page authority, but the domain inevitably factors into that calculation. This is why so many homepages ranked.
16. This resulted in a mix of homepages of mid-size sites, and inner pages on huge sites
17. But the general way to move up was through increased authority
18. Head: homepages of mid-size sites and inner pages of massive sites, all perfectly-targeted. To move up: improve authority.
23. In the competitive mid-tail: move up with better targeting or more authority
24. Mid-tail: perfectly on-topic pages on relatively weak sites, plus roughly on-topic pages on bigger sites. To move up: improve targeting or authority.
29. Kind of search result, pages ranking, and how to move up:
Head: homepages of mid-size sites and inner pages of massive sites, all perfectly-targeted. To move up: improve authority.
Mid-tail: perfectly on-topic pages on relatively weak sites plus roughly on-topic pages on bigger sites. To move up: improve targeting or authority.
Long-tail: arbitrarily-weak on-topic pages and roughly-targeted deep pages on massive sites. To move up: improve targeting.
30. So that was ~2009
31. It’s not so simple any more.
Google is harder to understand these days.
37. “I was thinking about it like it was a math puzzle and if I just thought really hard it would all make sense.” -- Kevin Lacker (@lacker)
38. “Hey why don’t you take the square root?” -- Amit Singhal, according to Kevin Lacker (@lacker)
39. “oh... am I allowed to write code that doesn’t make any sense?” -- Kevin Lacker (@lacker)
40. “Multiply by 2 if it helps, add 5, whatever, just make things work and we can make it make sense later.” -- Amit Singhal, according to Kevin Lacker (@lacker)
46. You might know what any one of the levers does, but they can interact with each other in complex ways
This is what a high-dimensional function looks like
48. “We sell custom cigar humidors. Our custom cigar humidors are handmade. If you’re thinking of buying a custom cigar humidor, please contact our custom cigar humidor specialists at custom.cigar.humidors@example.com”
What this needs is another mention of [cigar humidors]
49. With no mentions of [cigar] or [humidor], this page would be unlikely to rank. And yet you can clearly go too far, and have the effect turn negative. This is called nonlinearity.
The cigar example is taken directly from Google’s quality guidelines.
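To make the nonlinearity concrete, here is a toy scoring curve. It is an invented illustration, not Google’s actual scoring: extra mentions of a term help at first with diminishing returns, and past a threshold a penalty dominates and the effect turns negative.

```python
import math

def toy_relevance_score(mentions: int) -> float:
    """Invented, illustrative scoring curve -- NOT Google's algorithm.
    Benefit grows sub-linearly with each mention; past a threshold an
    over-optimization penalty dominates and the score turns negative."""
    gain = math.log1p(mentions)                  # diminishing returns
    penalty = 0.1 * max(0, mentions - 5) ** 2    # kicks in after 5 mentions
    return gain - penalty

# The curve rises, flattens, then falls:
for n in (0, 1, 3, 5, 8, 12):
    print(n, round(toy_relevance_score(n), 2))
```

The point is only the shape: any single “ranking factor” lever can help, then stop helping, then hurt.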
56. No, but I’m still pretty good at this
You’re thinking this to yourself right now.
73. I promised to tell you about a robot that is better than even experienced SEOs...
Well. It turns out all we needed was a coin to flip. You’re all fired.
80. John Giannandrea - Google’s head of search
Sundar’s choice to lead search after Amit. Previously running machine learning.
81. ...and of course Jeff Dean is doing Jeff Dean things
(c.f. Chuck Norris)
82. “Jeff Dean puts his pants on one leg at a time, but if he had more legs, you would see that his approach is O(log n).” Source: Jeff Dean facts
83. “Once, in early 2002, when the search back-ends went down, Jeff Dean answered user queries manually for two hours. Result quality improved markedly during this time.”
84. “When Jeff Dean goes on vacation, production services across Google mysteriously stop working within a few days.”
This was reportedly actually true
85. The original Google Translate was the result of the work of hundreds of engineers over 10 years.
86. Director of Translate, Macduff Hughes, said that it sounded to him as if maybe they could pull off a neural-network-based replacement in three years.
87. Jeff Dean said “we can do it by the end of the year, if we put our minds to it”.
88. Hughes: “I’m not going to be the one to say Jeff Dean can’t deliver speed.”
89. A month later, the work of a team of 3 engineers was tested against the existing system. The improvement was roughly equivalent to the improvement of the old system over the previous 10 years.
90. Hughes sent his team an email. All projects on the old system were to be suspended immediately.
[Read the whole story ]
97. Computers are better than humans at classification, but struggle with adversaries. Read more about this here -- Cheetah, Leopard, Jaguar
98. We don’t fully understand all ML mistakes. See: adversarial AI
99. And when you’re trying to fool the machine, you get some really wild examples. See: adversarial AI
102. Lesson: we expect the machines’ adversarial robustness to take a step backwards. They will remain good at classifying bad links, but are likely to fall prey to weird outcomes in adversarial situations.
104. Rules of ML [PDF] outlines engineering lessons from getting ML into production at Google
105. That document also has a section on trying to understand what the machines are doing
106. But human explainability may not even be possible. Not every concept a neural network uses fits neatly into a concept for which we have a word. It’s not clear this is a weakness per se, but...
107. ...this means that engineers won’t always know more than we do about why a page does or doesn’t rank. The big knowledge gap of the future is data: clickthrough rates, bounce rates etc.
108. Check out Tom Capper’s presentation on how engineers’ statements can be misleading
109. ...and remember the confounding split-tests. It’s already not always as simple as “feature X is good”. Which all means we may need to be more independent-minded and do more of our own research.
111. Michael Lewis’ latest book is about Kahneman and Tversky. It recounts a story about a piece of medical software that existed in the 1960s.
112. It was designed to encapsulate how a range of doctors diagnosed stomach cancer from x-rays.
113. It proceeded to outperform those same doctors, despite only containing their expertise. Real people have biases, and fool themselves. Encapsulate your own expert knowledge.
114. At Distilled, we use a methodology we call the balanced digital scorecard. This encapsulates our beliefs about how to build a high-performing business. Applying it helps avoid our own biases.
115. Also, while we are talking about books, The Checklist Manifesto is an important part of avoiding the same cognitive biases.
116. Focus on consulting skills. I’ve written a few things about this (DistilledU module, writing better business documents, using split-tests to consult better). Use case studies and creativity. Computers are better at diagnosis than cure. This means: getting things done, convincing organizations, applying general knowledge, learning new things.
117. We are going to need to be better than ever at debugging things. I wrote about debugging skills for non-developers here. A lot of the story of enterprise consulting is going to be about figuring out why things have gone wrong in the face of sparse or incorrect information from Google.
119. Disregard expert surveys. Firstly, there are all the problems outlined in the search result pairs study: both in the ability of experts to understand factors, and in your ability to use the information even if they do. Secondly, they are broken with another bias called the “law of small numbers” from Lewis’ book.
PS - I say this as a participant in many of them
120. Equally, building your digital strategy on what Google tells you to do will become an even worse idea than it already is.
121. This is why we have been investing so much in split-testing. Check out odn.distilled.net if you haven’t already. The team will be happy to demo for you. We served ~5 billion requests last quarter and recently published everything from response times to our +£100k / month split test.
122. Let’s recap
1. Even in a world of 200+ “classical” ranking factors, humans were bad at understanding the algorithm
2. Machine learning will make this worse, and is accelerating under Sundar
3. There are things computers remain bad at, and rankings will become more opaque even to Google engineers
4. We remain relevant by:
a. Using methodologies and checklists to capture human capabilities and avoid our biases
b. Becoming great consultants and change agents
c. Debugging the heck out of everything
d. Avoiding being misled by experts or Google
Testing!
131. What about that robot I promised you? The coin flip wasn’t really it
135. The specifics of DeepRank: gather and process training data
We started with a broad range of unbranded keywords from our STAT rank tracking.
For each of the URLs ranking in the top 10, we gathered key metrics about the domain and page - both from direct crawling and various APIs.
We turned this into a set of pairs of URLs {A,B} with their associated keyword, metrics, and their rank ordering.
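The pairing step might be sketched like this. The field names and the idea of emitting every ordered top-10 pair are my assumptions for illustration, not Distilled’s actual pipeline:

```python
from itertools import combinations

def build_training_pairs(serp):
    """Turn one keyword's top-10 results into ordered {A, B} training pairs.
    `serp` is a list of dicts with hypothetical fields: url, rank (1 = best),
    and metrics (the page/domain feature vector)."""
    ordered = sorted(serp, key=lambda r: r["rank"])
    pairs = []
    for a, b in combinations(ordered, 2):
        # By construction a ranks above b, so label every pair 1 ("A beats B").
        pairs.append({"a": a["metrics"], "b": b["metrics"], "label": 1})
    return pairs

serp = [{"url": f"https://example.com/page{i}", "rank": i, "metrics": [i, 10 - i]}
        for i in range(1, 11)]
pairs = build_training_pairs(serp)
print(len(pairs))  # 10 choose 2 = 45 pairs per keyword
```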
137. The specifics of DeepRank: train the model
We have so far trained on just 10 metrics for a relatively small sample (hundreds) of keywords.
Our current version is only a few layers deep, with only 10 hidden dimensions.
The current training samples 30 pairs at a time and trains against them for 500 epochs.
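A minimal sketch of this kind of pairwise ranking model: a tiny RankNet-style scorer in plain NumPy. The single hidden layer of 10 units, the 30-pair samples, and the 500 epochs mirror the numbers above, but everything else (the synthetic data, learning rate, and loss) is my illustrative assumption, not DeepRank’s actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

# One hidden layer of 10 units, per the "few layers, 10 hidden dimensions" above.
N_METRICS, HIDDEN = 10, 10
W1 = rng.normal(0, 0.1, (N_METRICS, HIDDEN))
w2 = rng.normal(0, 0.1, HIDDEN)

def score(x):
    """Scalar 'rank score' for one page's metric vector."""
    return np.tanh(x @ W1) @ w2

def train_step(a, b, lr=0.1):
    """One pairwise (RankNet-style) update: push score(a) above score(b),
    given that page A outranked page B."""
    global W1, w2
    h_a, h_b = np.tanh(a @ W1), np.tanh(b @ W1)
    p = 1.0 / (1.0 + np.exp(-(h_a @ w2 - h_b @ w2)))  # P(model ranks A over B)
    g = p - 1.0                                       # d(-log p) / d(score diff)
    grad_W1 = np.outer(a, (1 - h_a**2) * w2) - np.outer(b, (1 - h_b**2) * w2)
    w2 -= lr * g * (h_a - h_b)
    W1 -= lr * g * grad_W1

# Synthetic stand-in data: a page's "true quality" is just the sum of its metrics.
pages = rng.normal(size=(200, N_METRICS))
quality = pages.sum(axis=1)

for _ in range(500):          # 500 epochs...
    for _ in range(30):       # ...of 30 sampled pairs each
        i, j = rng.integers(0, 200, 2)
        if i == j:
            continue
        a, b = (pages[i], pages[j]) if quality[i] > quality[j] else (pages[j], pages[i])
        train_step(a, b)

# To "run the model", feed it a new pair of pages and compare scores.
test_pages = rng.normal(size=(100, N_METRICS))
wins, total = 0, 0
for x, y in zip(test_pages[::2], test_pages[1::2]):
    better, worse = (x, y) if x.sum() > y.sum() else (y, x)
    wins += int(score(better) > score(worse))
    total += 1
print(f"held-out pairwise accuracy: {wins / total:.0%}")
```

On this easy synthetic task the scorer should order the large majority of unseen pairs correctly; real ranking data is far noisier.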
138. The specifics of DeepRank: next steps for the model
The next task is to get way more metrics for thousands of keywords.
This will enable us to train a much deeper model for much longer without overfitting.
We also have some more hyperparameter tuning to do.
139. To run the model, we input a new pair of pages with their associated metrics.