An infinite loop occurs when a sequence of instructions in a computer program loops endlessly without a terminating condition. The document discusses issues with having too many URLs on a site that can lead to crawl budget problems with search engines like Google. It provides tips on using XML sitemaps, internal linking strategies, canonicalization, and keeping content fresh to help search engines better crawl and index a large site within the finite crawl budget.
8. Infinite Loop Definition:
An infinite loop is a sequence of
instructions in a computer program which
loops endlessly, either due to the loop
having no terminating condition, having
one that can never be met, or one that
causes the loop to start over.
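For a crawler, the same problem shows up as a URL space that never ends. A minimal sketch, assuming a hypothetical site where every page links to a "next" page forever; `next_links`, `crawl` and the `max_pages` cap are illustrative names, not anything from the original slides:

```python
from collections import deque

def next_links(url):
    """Hypothetical site: every page links to one more 'next' page, forever."""
    base, _, page = url.rpartition("=")
    return [f"{base}={int(page) + 1}"]

def crawl(start, max_pages=5):
    """Breadth-first crawl that stops at a page cap instead of looping endlessly."""
    seen, queue = set(), deque([start])
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue  # already fetched, do not queue its links again
        seen.add(url)
        queue.extend(next_links(url))
    return sorted(seen)
```

Without the `max_pages` cap this loop would have no terminating condition that can be met, because the queue never empties.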
12. Rank
CRAWL
A ranking metric for ‘no’ to ‘low’ PageRank
pages??
Pages crawled more often rank higher
Get ‘low’ to ‘no’ PageRank pages crawled more
than competitors = YOU WIN
16. DON’T BE AFRAID
of hard 404s
Use 410s where
you can
AVOID
soft 404s
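One way to picture the choice between these codes is a small lookup. This is only a sketch: `status_for`, `live_pages` and `retired_pages` are hypothetical names, and a real site would consult its own routing or database.

```python
from http import HTTPStatus

def status_for(path, live_pages, retired_pages):
    """Pick a response code for a requested path (illustrative logic only)."""
    if path in live_pages:
        return HTTPStatus.OK
    if path in retired_pages:
        return HTTPStatus.GONE       # 410: removed on purpose, not coming back
    return HTTPStatus.NOT_FOUND      # 404: unknown URL, never a 200 "not found" page
```

For example, `status_for("/old-offer", {"/"}, {"/old-offer"})` returns `HTTPStatus.GONE`.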
17. ENSURE THAT
Dynamic variables / parameters
are checked for validation
Don’t render to just any old
thing with a ‘200 OK’ response
code or return a soft 404
HOW WILL YOU KNOW IF
THERE’S A PROBLEM?
You won’t
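A minimal sketch of parameter validation, assuming a hypothetical listing URL with `category` and `page` parameters; the allowed categories and the `respond` function are invented for illustration:

```python
import re
from urllib.parse import urlsplit, parse_qs

VALID_CATEGORIES = {"shoes", "hats"}  # assumption: the only categories that exist

def respond(url):
    """Return 200 only for parameter values that map to real content."""
    query = parse_qs(urlsplit(url).query)
    category = query.get("category", [""])[0]
    page = query.get("page", ["1"])[0]
    if category not in VALID_CATEGORIES:
        return 404  # unknown category: hard 404, not a rendered empty page
    if not re.fullmatch(r"[1-9][0-9]{0,3}", page):
        return 404  # page must be a positive integer of at most four digits
    return 200
```

Requesting a deliberately nonsensical parameter value and checking the status code that comes back is one way to spot a soft 404 before a crawler does.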
20. Get Those Low Level
Pages Crawled - Often
Whichever way
you can
Pass equity to
Siblings as
Well as children
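A rough sketch of what "siblings as well as children" could mean for a page's link list, assuming the site hierarchy is available as a hypothetical parent-to-children dictionary (`tree`); `internal_links` is an illustrative helper, not a tool from the slides:

```python
def internal_links(tree, page):
    """List the pages a given page should link to: its children, then its siblings."""
    children = tree.get(page, [])
    siblings = [
        other
        for kids in tree.values() if page in kids  # find the parent's child list
        for other in kids if other != page         # everything in it except the page itself
    ]
    return children + siblings
```

With `tree = {"/": ["/a", "/b"], "/a": ["/a/1", "/a/2"]}`, the page `/a` would link to `/a/1`, `/a/2` and its sibling `/b`.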
21. Visit the internal links section in GWT (Google Webmaster Tools)
Most Important Page 1
Most Important Page 2
Most Important Page 3
IS THIS YOUR
BLOG?? HOPE
NOT
22. CANONICALISATION
In web search and search engine optimization (SEO), URL
canonicalization deals with web content that has more than one
possible URL. Having multiple URLs for the same web content
can cause problems for search engines - specifically in
determining which URL should be shown in search results.
Example:
• http://wikipedia.com
• http://www.wikipedia.com
• http://www.wikipedia.com/
• http://www.wikipedia.com/?source=asdf
All of these URLs point to the homepage of Wikipedia,
but a search engine will only consider one of them to
be the canonical form of the URL. (Source: Wikipedia)
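The four example URLs can be collapsed to one form with a small normalizer. This is a sketch under stated assumptions: the `www.` host is treated as preferred, `source` is treated as a tracking parameter that does not change the content, and `canonical_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PARAMS = {"source"}  # assumption: parameters that do not change the page

def canonical_url(url):
    """Normalize host, trailing slash and tracking parameters to one canonical URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host              # assumption: the www form is preferred
    path = parts.path or "/"              # an empty path and "/" are the same page
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))
```

The result is the value a page would typically expose in its `rel="canonical"` link, or the target of a 301 redirect from the other variants.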
23. Deal Well With
Duplicate & near
duplicate content
Via
canonicalization,
301s & Content
Build Out
24. STOP LYING & ‘GET
FRESH’
Genuine ‘last
modified dates’
are ALL important
- FORGET PRIORITY
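A minimal sketch of an XML sitemap that carries only real last-modified dates and no priority element. `build_sitemap` is a hypothetical helper; the dates are assumed to come from wherever the site records genuine content changes.

```python
import xml.etree.ElementTree as ET
from datetime import date

def build_sitemap(pages):
    """Build a sitemap from (url, last_modified_date) pairs, with no <priority>."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for loc, modified in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod should reflect the real change date, not the time of generation
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")
```

Stamping every URL with today's date on each rebuild is the "lying" the slide warns about; passing the stored modification date avoids it.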
25. "It's not that Google will
penalize you, it's the
opportunity cost for dirty
architecture based on a finite
crawl budget" (A.J. Kohn,
Blind Five Year Old)
REMEMBER THIS