1. Konferenz Suchmaschinen-Optimierung
Kongresshaus Zürich
Website Architecture for Search Engines
Joe Spencer
Spencer e-Strategies
13/10/2010
2. Common SEO Strategy
On-Page Optimization
• Keyword in Title
• Keyword in Meta Tags
• Keyword in Content
• Unique Content Per Page
• SEO Landing Pages
Off-Page Optimization
• Backlinks
• More Backlinks
3. Advanced SEO Strategy
On-Page Optimization
• Keyword in Title
• Keyword in Meta Tags
• Keyword in Content
• Unique Content Per Page
• SEO Landing Pages
Off-Page Optimization
• Backlinks
• More Backlinks
Technical Optimization
• URL Structure
• Code Optimization
• HTTP Headers
• Robots.txt
• XML Sitemap Files
4. Website Architecture for SEO
Technical Optimization
• Site Structure
• Multi-Language Websites
• Duplicate Content Issues
• URL Structure
• URL Canonicalization
• JavaScript
• W3C Validation
• Website Navigation Features
• Restricting Indexing
• HTTP Headers
5. HTML Code Requirements
• The content area should be positioned high in the HTML code
• W3C Validation
http://validator.w3.org/
• All HTML should be lower case
• Remove Comments
• Avoid Frames and iFrames
• Use External CSS and JavaScript files
• The uncompressed size of HTML files should be 25 KB or less.
• Use Gzip compression to compress the files
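Gzip can be enabled at the server; a minimal sketch for Apache with mod_deflate, assuming the module is available (the list of file types is illustrative):
# Enable Gzip compression for common text-based responses
AddOutputFilterByType DEFLATE text/html text/css application/javascript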
6. CSS Code Requirements
• Avoid Inline Styles
• Use external CSS files
• Position the external CSS file links in the HTML Header
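For example, a single external stylesheet referenced in the HTML header; the file path is hypothetical:
<head>
<link rel="stylesheet" type="text/css" href="/css/styles.css">
</head>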
7. JavaScript Code Requirements
• Avoid inline JavaScript
• Use external JavaScript files
• Limit the size of JavaScript files to 50 KB
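For example, inline code can be moved into one external file; the path is hypothetical:
<script type="text/javascript" src="/js/site.js"></script>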
8. Max Number of HTTP Requests
When a page loads in a browser, an HTTP request is sent to the server for each file
that must be downloaded to display the page.
The number of HTTP requests should be 20 or less per page to reduce loading time.
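One common way to stay under that limit is to combine several small files into one; a sketch with hypothetical stylesheet names:
<!-- Before: three HTTP requests for CSS -->
<link rel="stylesheet" type="text/css" href="/css/reset.css">
<link rel="stylesheet" type="text/css" href="/css/layout.css">
<link rel="stylesheet" type="text/css" href="/css/theme.css">
<!-- After: the same rules merged into one file, one request -->
<link rel="stylesheet" type="text/css" href="/css/combined.css">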
9. URL Structure
• Avoid dynamic URLs
• Use lower-case characters
• Best to use dashes (-) rather than underscores (_)
• Directories should contain an index.html or default.html file for the default page.
Avoid using intro.html or other generic names for the default page.
• Use URL rewrites for creating search engine friendly URLs
10. Flat URL Structure
Home Page
├─ Category 1: Page 1, Page 2, Page 3
├─ Category 2: Page 1, Page 2, Page 3
└─ Category 3: Page 1, Page 2, Page 3
• Don't go more than 2-3 levels deep in your category structure.
• Include targeted keywords for the categories and page names.
Example:
The URL for page three in category three would look like:
http://www.yourdomain.com/theme3/page3.htm
11. URL Rewrites
• Allows the placement of targeted keywords in the URLs
Example: http://www.mydomain.com/targeted-keyword/
• Ensure that all pages load from a single URL, otherwise this will create
URL canonicalization issues
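A minimal sketch of such a rewrite with Apache mod_rewrite; the keyword path and the underlying script name are hypothetical:
RewriteEngine On
# Serve the dynamic page under a keyword-rich, search engine friendly URL
RewriteRule ^targeted-keyword/$ /index.php?page=1 [L]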
12. URLs for Multi-Language Websites
Each language should be placed in a separate directory.
Examples:
Default Language: http://www.mydomain.com/
English Language: http://www.mydomain.com/en/
German Language: http://www.mydomain.com/de/
13. URL Canonicalization
Web pages can often be indexed under several different URLs, which creates
duplicate content issues.
Common Homepage Example:
http://www.yourdomain.com/
http://yourdomain.com/
http://www.yourdomain.com/index.html
http://yourdomain.com/index.html
Common Dynamic URL Example:
http://www.yourdomain.com/index.php?&page=1
http://www.yourdomain.com/index.php?page=1&parameter=123
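For the homepage example, the duplicates can also be resolved server-side with a 301 redirect to the preferred host name; a minimal Apache mod_rewrite sketch, assuming the www version is preferred:
RewriteEngine On
# Redirect yourdomain.com to www.yourdomain.com with a 301
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=301,L]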
14. URL Canonicalization Tag
The URL Canonicalization Tag allows you to specify the preferred version of a URL.
<link rel="canonical" href="http://www.mydomain.com/">
15. Controlling Web Crawlers
3 Ways to Control Web Crawlers
• Robots.txt files
• Robots Meta Tags
• NoFollow Tags
16. Robots.txt Files
The Robots.txt file is used to restrict search engine spiders from crawling pages.
17. Robots.txt Files
Robots.txt Command Examples
• User-agent: *
Defines which search spider the rules apply to (* matches all spiders)
• Disallow: /form
Defines a restriction for the directory /form
• Disallow: /*ln0
Restricts all URLs containing ln0
• Disallow: /*utm_source=
Restricts all URLs containing utm_source=
• Disallow: /*feed.xml
Restricts all URLs containing a file called feed.xml
• Disallow: /*.pdf$
Restricts all URLs ending with the .pdf file extension
For more information: http://www.robotstxt.org/
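Put together, a complete robots.txt file combining the commands above might look like this (the Sitemap line is an optional directive that points spiders to the XML sitemap; the domain is illustrative):
User-agent: *
Disallow: /form
Disallow: /*utm_source=
Disallow: /*.pdf$
Sitemap: http://www.mydomain.com/sitemap.xml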
18. Robots Meta Tags
The Robots Meta Tags are used to control which pages are indexed and followed by
search engine spiders.
1st Option: By default, a page without a robots meta tag will be indexed into the
cache and its links will be followed.
2nd Option: <meta name="robots" content="noindex,nofollow">
This restricts the spider from indexing the page into the cache and from following
the links on the page.
3rd Option: <meta name="robots" content="noindex">
This only restricts the spider from indexing the page into the cache.
4th Option: <meta name="robots" content="nofollow">
This restricts the spider from following the links on the page.
19. Examples of Pages to Restrict
Examples of Types of Pages to Restrict from Robots
• HTML Sitemaps = NoIndex/Follow
Restricts the page from being indexed but allows robots to follow the links on the page.
• About Us = NoIndex/NoFollow
Restricts the page from being indexed and restricts robots from following the links on the page.
• Privacy Policy = NoIndex/Follow
Restricts the page from being indexed but allows robots to follow the links on the page.
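In the robots meta tag syntax from the previous slide, the NoIndex/Follow combination looks like this:
<meta name="robots" content="noindex,follow">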
20. NoFollow Tags
• The rel="nofollow" attribute is used to restrict web crawlers from following links.
Some external links and navigational links may require the nofollow attribute.
• The NoFollow attribute doesn't by itself prevent the linked page from being
indexed, since spiders can still reach it through other links.
<a href="url" title="title" rel="nofollow">link text</a>
21. NoFollow Links Example
Example (screenshot): footer links from Google Sites. All of the links marked in
red use NoFollow tags.
22. Type of Links to NoFollow
• Navigational links which are on every page
Examples: Contact Us, About Company, Privacy Policy, pages using SSL, etc.
• Cross Domain Links
Any link to a website sharing the same C-Class IP.
• Advertisements
Affiliate or other forms of advertising links.
• External Links
External links which are not involved in a link partnership.
23. XML Sitemap Files
XML Sitemaps Deliver URLs to Search Engines
http://www.seostrategyworkshop.ch/sitemap.xml
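A minimal sitemap.xml sketch following the sitemaps.org protocol; the URL and values are illustrative:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.mydomain.com/</loc>
    <lastmod>2010-10-13</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>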
24. HTTP Headers for SEO
HTTP headers can be used to inform search engine spiders of the purpose of a page.
• HTTP 301 Permanent Redirect
• HTTP 302 Temporary Redirect
• HTTP 404 Page Not Found Error
• HTTP 503 Service Unavailable
26. HTTP 301 Redirect Headers
301 redirect headers are used to inform search engines that a page has
permanently moved to a new URL.
• Always use 301 redirects when moving pages to new URLs.
• Limit the number of 301 redirects to 1 per URL
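A minimal sketch of issuing a 301 on Apache using the mod_alias Redirect directive; the paths are hypothetical:
Redirect 301 /old-page.html http://www.mydomain.com/new-page.html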
27. 404 Page Not Found Errors
404 HTTP header responses inform search engine spiders that no page exists at the
requested URL.
• Use a custom 404 Page
• Include a search feature and other useful content on the custom 404 page
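On Apache, a custom 404 page can be wired up with a single directive; the page path is hypothetical:
ErrorDocument 404 /404.html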
28. 503 Service Unavailable
503 HTTP header responses inform search engine spiders that a URL is temporarily
unavailable.
• Use during release process
• Use during maintenance
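A minimal maintenance-mode sketch for Apache, assuming mod_rewrite and mod_headers are available; the allowed IP and page path are placeholders:
ErrorDocument 503 /maintenance.html
RewriteEngine On
# Let the maintenance page and your own IP through; everyone else gets a 503
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.1$
RewriteCond %{REQUEST_URI} !^/maintenance\.html$
RewriteRule ^.*$ - [R=503,L]
# Tell spiders how long to wait before retrying (in seconds)
Header always set Retry-After "3600"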
29. For more information about SEO
www.seo-netzwerk.com www.seostrategyworkshop.com
March 2 & 3, 2011
This presentation is available at:
http://www.spencerestrategies.com/seo-konferenz/