3. So... What is it?
IS: Topical relevance
Topic Models are algorithms used to uncover
hidden thematic structures in a collection of data
… statistical model for abstract themes.
Every document has several topics, and together
they create the document's theme.
IS NOT: Keyword usage, TF*IDF or co-occurrence
www.Virante.org
@JakeBohall
4. So... What is it?
PubCon Vegas
Marketing     Vegas
SEO           Drinking
SEM           Gambling
Networking    Convention Center
Analytics     Money
Affiliates    Late Nights
5. How it works…
Create a topical model of the English language using a
dictionary-restricted sample of 1,000,000 random Wikipedia articles
Accept a keyword and build an ideal document based on content that
ranks for that term
Accept content and compare it to the ideal model
Build a confidence score that your content is more related to the
keyword than two randomly selected Wikipedia articles are
related to one another.
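The comparison and confidence steps above can be sketched roughly as follows. This is a minimal illustration, not the actual nTopic or SEOmoz implementation: it assumes every document has already been reduced to a vector of topic weights, uses cosine similarity as the relatedness measure, and treats the confidence score as the share of random background pairs that are less related to each other than the content is to the ideal document. The names `cosine` and `confidence`, and all the vectors, are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two topic-weight vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def confidence(content, ideal, background, pairs=1000, seed=0):
    """Share of random background pairs that are less similar to each
    other than the content is to the ideal document.

    `background` stands in for topic vectors of random Wikipedia
    articles (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    target = cosine(content, ideal)
    beaten = 0
    for _ in range(pairs):
        a, b = rng.sample(background, 2)  # two distinct background documents
        if target > cosine(a, b):
            beaten += 1
    return beaten / pairs


# Hypothetical topic vectors over three topics.
background = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.4, 0.3, 0.3],
]
ideal = [0.6, 0.3, 0.1]    # built from content that ranks for the keyword
content = [0.7, 0.2, 0.1]  # the page being scored

print(confidence(content, ideal, background))
```

A score near 1.0 would mean the content is closer to the ideal document than almost any two random articles are to each other; a score near 0.0 would mean it is no more related than chance.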
6. SEOmoz deserves a bunch of credit!
Ben Hendrickson
… researched this as a Senior Scientist at SEOmoz and is
now a Software Development Engineer at Google
… just a coincidence?
They built a model and gave us an LDA (Latent Dirichlet Allocation) scoring tool
What they got right
… High LDA scores do correlate with rankings
7. What happened?
Data was reported incorrectly.
… reported .32 vs .17, and suddenly the ‘buzz’ was about an error as
opposed to this amazing discovery
No experimental validation.
… Did not run experiments to determine the method through
which this impacts rankings or traffic
(keyword breadth / long tail being the primary vector)
Did not refine the model based on experiments
SEOmoz had too many other awesome projects happening
8. What did we do about it?
Built a tool using collocation to increase LDA scores
Spent 1.5 years in R&D fleshing out our own LDA model 4x
Did an organic traffic study to try to find causation
1. nTopic-modified content saw an organic traffic lift of 17.5%,
random-keyword-modified content saw a 10% drop, and unmodified content
saw a 15% drop in traffic
2. We now know unequivocally that improving topical relevancy can
increase organic traffic.
10. What this doesn’t mean
This does not prove that LDA or topic modeling is used in Google’s
algorithm
We cannot determine the exact mechanism by which inserting
nTopic-recommended terms increases Google traffic
BUT …
We can provide evidence that nTopic-recommended terms do
increase Google traffic.
… http://www.ntopic.org/causal-study.php