Optimize Your On-Page – Robots and Sitemaps (Part II of II)
virginiaseo.org/blog/optimize-your-on-page-robots-and-sitemaps-part-2/
Chris Dill
This post is part two in a series. Please read part one first: Optimize Your On-Page – Meta Tags, Technical Analysis (Part I of II).
On-Site Content
We have already discussed on-site content in our previous video series sessions, so I will not dive deep into this. When I analyze a website, I start at its highest page (normally the home page) and take a look at the content. I then navigate through the site and absorb the content as I go. After doing this for 10 or so minutes, I ask myself a couple of questions:
Is the content consistent throughout the site?
Is all of the content on-topic?
Does the content match the meta tags and keywords the site is using?
Is the content hard to understand?
By answering these questions, we can get a feel for the overall content of the website, and develop or modify a strategy from there. If all of your meta tags are aimed at optimizing your business for local traffic, but all of your content is optimized for national traffic, there is going to be a conflict of interest, so to speak.
Sitemap.xml
A sitemap is literally a map of your site. It lists all of your pages and is organized in a logical structure. When search engines crawl your site, they look at your sitemap for information and to make sure all of your pages get crawled. Sitemaps can also contain information about how frequently each page is updated. This hints to Google that it should crawl pages that change a lot (like your blog roll) more often than static pages (like your about page).
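To make the structure concrete, here is a minimal Python sketch of what such a file contains. It is only an illustration, not how any particular generator works: the `build_sitemap` helper, the `domain.com` URLs, and the update frequencies chosen for each page are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, changefreq) pairs (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, changefreq in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # changefreq is a hint to crawlers, not a command.
        ET.SubElement(entry, "changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example pages: the blog changes often, the about page rarely does.
pages = [
    ("http://domain.com/", "weekly"),
    ("http://domain.com/blog/", "daily"),
    ("http://domain.com/about/", "yearly"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

In practice a generator or plugin writes this file for you, but it helps to know that it is nothing more than a list of `<url>` entries.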
There are a lot of sitemap generators online. We will discuss one free web-based generator, and then we will look at a few WordPress addons.
XML-Sitemaps.com: http://www.xml-sitemaps.com/
This site is self-explanatory: plug in your website, hit start, and it will generate the sitemap you need. This tool is great for a one-time look or creation, but you will have to manually update the sitemap as you add content, which is a little bit tedious. So I prefer the use of addons, and every major and minor CMS has one or more.
Once you generate your sitemap with this tool, it needs to be located at the root of your website. Speak with your web hosting company or web designer if you’re not comfortable with this, but uploading the XML file using FTP to the site root is pretty quick and painless. If you need us to help you out with this we will do it free of charge, just send us a note! The end result needs to be domain.com/sitemap.xml
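If you want to double-check where the file has to end up, the following Python sketch works out the root sitemap address from any page on the site and can then test whether the file answers there. The function names are made up for this example, and the reachability check is only a rough test under the assumption that your server responds to HEAD requests.

```python
from urllib.error import URLError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen


def root_sitemap_url(any_page_url):
    """Return the sitemap.xml address at the root of the site serving this page."""
    parts = urlsplit(any_page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/sitemap.xml", "", ""))


def is_reachable(url, timeout=10):
    """Return True if the server answers a HEAD request for url with a 2xx status."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError):
        return False


expected = root_sitemap_url("http://domain.com/blog/some-post/?page=2")
print(expected)
# To test a live site after uploading, call: is_reachable(expected)
```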
The best tool for WordPress sitemaps is of course Yoast. Install the Yoast WordPress plugin and go to the sitemap tab. Configure the few options there and you are off and running.
The generated sitemap can then be viewed by clicking the link at the top of the settings page.
Another tool, if you are looking ONLY for sitemaps (let’s say you already use another SEO suite addon), is Google XML Sitemaps, by Arnee. Install this plugin, go to the plugin settings page, and you will be walked through how to set it up.
By using these addons, you generate a sitemap which is automatically placed at the root of your site. Now, in order to make sure the major search engines (in this case Google and Bing) crawl your sitemap, you need to submit it.
Go to Google Webmaster Tools > Crawl > Sitemaps. To add your sitemap, click the Add/Test Sitemap button and then tell Google where your sitemap is. Give Google 2 days to crawl it, then come back and you will see metrics about how many pages have been submitted to the Google Index and how many Google actually has in the index. Ideally these values should be the same. If we look at Colonial Driving School's sitemap, we see that there are many more pages submitted than are in the index. This is due to the event system which Colonial uses, and should be solved with some exclusions for the sitemap.xml and also robots.txt, which we will discuss now.
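To get a feel for what an exclusion does to the submitted count, here is a small Python sketch that reads a sitemap, counts its URLs, and separates out the ones under a path you plan to exclude. The sample sitemap, the `/events/` path, and both helper functions are assumptions for illustration only; they are not Colonial's actual URLs or setup.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap standing in for a site with an event system.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://domain.com/</loc></url>
  <url><loc>http://domain.com/about/</loc></url>
  <url><loc>http://domain.com/events/2014-04-07/</loc></url>
  <url><loc>http://domain.com/events/2014-04-08/</loc></url>
</urlset>"""


def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


def split_by_exclusion(urls, excluded_prefixes):
    """Split URLs into (kept, excluded) based on path prefixes."""
    kept, excluded = [], []
    for url in urls:
        path = urlsplit(url).path
        if any(path.startswith(prefix) for prefix in excluded_prefixes):
            excluded.append(url)
        else:
            kept.append(url)
    return kept, excluded


urls = sitemap_urls(SAMPLE_SITEMAP)
kept, excluded = split_by_exclusion(urls, ["/events/"])
print(f"submitted: {len(urls)}, kept: {len(kept)}, excluded: {len(excluded)}")
```

Running something like this against your own sitemap before you change plugin settings gives you a rough idea of how far the submitted number should drop.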
Robots.txt
Robots.txt is a special file that Google and other search engines read and respect. If you use this document to say “Do not crawl pages A, B, and C”, Google makes a very good attempt at following those rules. This is important for keeping certain pages from showing up in search. (Strictly speaking, robots.txt blocks crawling rather than indexing, so a blocked URL can still occasionally appear in results if other sites link to it.) Some great examples of pages that you might not want to show up in search are:
Login Pages
Registration Pages
Cart Pages
Certain Forms
Pages that are necessary for your site but not accessed by visitors
These pages might be needed for your site. For instance, my site needs a login page in order for customers to log in. Without it the UX is severely hampered. But I do not want Google to put my login page into search results, so I ask nicely using my robots.txt. Every website should have one of these, even if it is not used to restrict anything. You can find an awesome post about robots.txt on Yoast.com.
Google has a help post on this, and they also offer you some pointers on making a “default” robots.txt. Designing a robots.txt is done on a per-site basis, but here is the sample that I use for Colonial Driving School:
# robots.txt
User-agent: *
Disallow: /wp-content/plugins/*
Disallow: /login/
Disallow: /login/*
Sitemap: http://colonialdrivingschool.com/sitemap_index.xml
As you can see, I am blocking crawlers from reaching the login page, I am blocking plugins from presenting web pages to the crawler, and I am also declaring where the sitemap is. Let’s say that I do some research and I notice that a certain page of the site is being crawled but coming up with errors. Typically you see this with e-Commerce sites which have dynamic cart and product creation: you might see a bunch of 404 errors for a product sorting query page. This sort of issue can be solved by adding that page to robots.txt so Google knows not to crawl it.
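Before uploading a changed robots.txt, you can sanity-check the rules with Python's standard `urllib.robotparser` module. The sketch below is a rough check under a few assumptions: the `/products/sort` rule is a hypothetical sorting page added for illustration, and the trailing `*` wildcards from my sample are dropped because Python's parser only does plain prefix matching (under Google's matching a trailing `*` is redundant anyway).

```python
from urllib.robotparser import RobotFileParser

# Rules modeled on the sample above, plus a hypothetical sorting page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/plugins/
Disallow: /login/
Disallow: /products/sort
Sitemap: http://colonialdrivingschool.com/sitemap_index.xml
"""

BASE = "http://colonialdrivingschool.com"


def make_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# Paths to test; the plugin and sorting paths are made up for this example.
for path in ("/", "/login/", "/wp-content/plugins/example/page.php",
             "/products/sort?by=price"):
    allowed = parser.can_fetch("Googlebot", BASE + path)
    print(f"{path}: {'allowed' if allowed else 'blocked'}")

# site_maps() needs Python 3.8 or newer.
print(parser.site_maps())
```

This only tells you how Python's parser reads the file. Google's own robots.txt tester in Webmaster Tools remains the final word on how Google reads it.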
SEO Expert at Virginia SEO
Christopher Dill is a Christian entrepreneur who loves web design, marketing, and anything on a computer. He is the creator and author of The Dill Design, a local Virginia web design company. He also runs Virginia SEO, which is an SEO and inbound marketing company. Chris is currently finishing up his Master of Information Systems at University of Phoenix, and works by day as a Senior Network Engineer.
Latest posts by Chris Dill (see all)
A Fight To The Death: Inbound Vs. Outbound Marketing - April 13, 2014
Optimize Your On-Page – Robots and Sitemaps (Part II of II) - April 7, 2014
SEO Video Training – Session 3: On-Page, Meta Tags, Robots, and Sitemap.xml - March 31, 2014