There are quite a few duplicate content misconceptions circulating in the SEO community.
Even though Google's Matt Cutts has said a lot about the exaggerated fear some people have of a few lines of duplicate content on their sites, many still do not understand what content duplication is or whether their site is at risk.
So, let's tackle certain tricky questions that concern duplicate content and put some common myths to rest.
http://webmeup.com/blog/duplicate-content-myths.html
Image credits (all used under CC license)
http://www.flickr.com/photos/wjhleonard/
http://www.flickr.com/photos/allaneroc/
http://www.flickr.com/photos/outdoorstudios/
http://www.flickr.com/photos/70276469@N00/
5. Website owners who are less experienced with web
design often think that the only way to produce
duplicate content is to purposefully replicate a
piece of text on multiple pages.
6. What they don't realize is that some of their
site's pages may be accessible via multiple URLs
(which may happen for various reasons), which, in
turn, automatically leads to content duplication.
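To make this concrete, here is a minimal Python sketch of how several addresses can resolve to the same page. The domain, the list of ignored parameters, and the normalization rules (forcing https, dropping "www.", treating index.html as the directory URL) are all assumptions chosen for illustration, not rules any search engine is known to apply.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that do not change what the page shows.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

def normalize(url: str) -> str:
    """Collapse common URL variants into one form (illustrative rules only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    if not path:
        path = "/"
    # Keep only parameters that affect content, in a stable order.
    query = urlencode(sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS
    ))
    # Assumption: the https version is the preferred one.
    return urlunsplit(("https", host, path, query, ""))

variants = [
    "http://www.example.com/shoes/",
    "https://example.com/shoes/index.html",
    "https://EXAMPLE.com/shoes/?utm_source=newsletter",
    "https://example.com/shoes/?sessionid=123",
]
unique = {normalize(u) for u in variants}
print(unique)
```

All four variants collapse to one address here, which is the sense in which a single page ends up with several URLs without anyone copying text on purpose.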
7. Ideally, each piece of content should have only
one URL associated with it:
11. Hence, if there are pages on your site that have
multiple URLs pointing to them, you need to
take care of that!
12. To solve that, one should use canonical tags, an
XML sitemap, a robots.txt file or other means
that aid the canonicalization process.
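As a rough illustration of the canonical tag, the Python sketch below builds the link element that would go in the head of every duplicate variant of a page. The helper name canonical_link and the example URL are hypothetical; only the rel="canonical" link element itself is standard.

```python
from html import escape

def canonical_link(preferred_url: str) -> str:
    """Return a rel="canonical" link element for the page <head>."""
    # escape() keeps characters such as & valid inside the attribute value.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'

tag = canonical_link("https://example.com/shoes/?color=red&size=9")
print(tag)
```

Every variant of the page would carry the same tag, pointing at the one URL you want search engines to treat as the original.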
13. Also, more information on how to tackle these
structural issues is given in this guide to SEO-friendly URL architecture.
15. If you have duplicate URLs on a site,
blocking the duplicates from getting indexed with a
robots.txt file is a bad idea.
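One way to see the problem is to check what a well-behaved crawler is allowed to fetch. The sketch below uses Python's standard urllib.robotparser; the robots.txt rules, the /print/ path, and the "ExampleBot" user agent are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that hides printer-friendly duplicates.
robots_lines = [
    "User-agent: *",
    "Disallow: /print/",
]

parser = RobotFileParser()
parser.parse(robots_lines)

duplicate = "https://example.com/print/shoes"
original = "https://example.com/shoes"

# A crawler that honors robots.txt may not request the blocked duplicate at all.
duplicate_crawlable = parser.can_fetch("ExampleBot", duplicate)
original_crawlable = parser.can_fetch("ExampleBot", original)
print(duplicate_crawlable, original_crawlable)
```

Because the blocked page is never fetched, a crawler cannot read anything on it, including any hint that it is a copy of the original.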
16. A better solution is to allow search engines to
crawl these URLs, but mark them as duplicates.
17. That can be done by using
the rel="canonical" link element, the URL
parameter handling tool, or 301 redirects.
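For the 301 option, a minimal Python sketch of the decision a server makes might look like this. The REDIRECTS table, the paths, and the respond function are hypothetical; a real site would normally configure redirects in its web server or framework rather than in a standalone function.

```python
from http import HTTPStatus

# Hypothetical map of duplicate paths to the preferred path.
REDIRECTS = {
    "/shoes/index.html": "/shoes/",
    "/Shoes/": "/shoes/",
}

def respond(path: str):
    """Return (status, headers) for a request path."""
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells clients and crawlers the move is permanent.
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": target}
    return HTTPStatus.OK, {}

status, headers = respond("/shoes/index.html")
print(int(status), headers)
```

Unlike a canonical tag, a redirect also sends visitors to the preferred URL, so it suits duplicates that nobody needs to view directly.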
19. Some SEOs truly believe that having even a small
amount of duplicate content on your site can
lead to a penalty.
In an overwhelming number of cases, however,
it can't.
20. According to Matt Cutts, having a Terms and
Conditions template or a Disclaimer message
across all pages of your site won't get you
penalized.
21. Check out this video to learn more:
http://www.youtube.com/watch?v=ViwkEeOKxM
22. NB!
At the same time, Google still advises one to
keep the amount of text in that repeated
message to a minimum.
24. Although Google penalizes sites for duplicate
content quite seldom (usually such sites are pure
spam), it could easily dish out a penalty to a site
that:
25. Has nothing but scraped content
Scrapes images, auto-translates pages, or uses automated apps/software to spin content prior to publication
Purposefully creates pages with nearly identical content to rank them for various locations/keywords
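To get a feel for what "nearly identical" means on your own pages, here is a rough Python sketch that compares two texts by their shared three-word sequences (word shingles with Jaccard similarity). This is a generic text-similarity technique picked for illustration; it is not a description of how Google detects duplicates, and the sample page texts are invented.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a: str, b: str, size: int = 3) -> float:
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

# Hypothetical location pages that differ only in the city name.
page_a = "Best plumber in Austin call us today for fast and reliable service"
page_b = "Best plumber in Dallas call us today for fast and reliable service"
unrelated = "Our guide explains how to choose hiking boots for rocky mountain trails"

score = jaccard(page_a, page_b)
baseline = jaccard(page_a, unrelated)
print(round(score, 2), round(baseline, 2))
```

Pages that only swap a city or keyword share most of their word sequences, so they score far higher than unrelated pages; the longer the shared template, the closer the score gets to 1.0.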
26. In all other cases, your site is unlikely to get
penalized for duplicate content.
After all, 25-30% of the Web is duplicate
content because people quote other people, and
the same information gets shared a lot.
28. There's been a lot of discussion on the Web
about whether Google is able to tell who the
original creator of a piece of content is.
29. Some people would say Google relies on
publication date to identify the original author,
BUT multiple instances of hijacked search results
(a scraper site outranking the original) disprove
that.
30. Thus, according to Dan Petrovic, there are
certain signals you can send Google to let it
know you're the original author.
31. These are:
Claiming your Google Authorship
Specifying canonical URLs
Sharing a newly published piece on Google+, etc.
33. Type 1.
These are legitimate news sites/information hubs
that sometimes feature previously published
content.
They often provide original commentary and
analysis of the piece they cover. Such sites
always credit the original content creator.
34. Type 2.
Content syndication sites that produce no
content of their own.
They scrape content off multiple websites
(often it is imagery) and give no credit to the
original content creators whatsoever.
36. So, if your site belongs to the 1st type and you
have syndicated content on it, you have nothing
to worry about.
If you are type 2, getting a penalty is just a
matter of time!
38. You may think that translating the copy from
your English-language site and publishing it on a
regional domain/subdomain is never a problem.
39. Well, sometimes it is.
40. These are the cases when Google can classify a
translated copy as duplicate content:
41. You translated it with an automatic tool and just dumped it on your site (in which case it would qualify as automatically generated content);
You copied your English-language content without change to the regional site.
42. So, when creating a foreign site for your business,
tailor its content for the segment of users you
are trying to reach with it.
Most likely, they would want a slightly different
message than the one you have for English-speaking audiences.