4 Steps to Remove WordPress Duplicate Content



What is Duplicate Content?

Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Most of the time when we see this, it’s unintentional or at least not malicious in origin: forums that generate both regular and stripped-down mobile-targeted pages, store items shown (and, worse yet, linked) via multiple distinct URLs, and so on. In some cases, content is duplicated across domains in an attempt to manipulate search engine rankings or garner more traffic via popular or long-tail queries.

What is not considered Duplicate Content?

Google does not consider translations to be duplicate content.

What Does Google Suggest?

Understand your CMS: Make sure you’re familiar with how content is displayed on your Web site, particularly if it includes a blog, a forum, or related system that often shows the same content in multiple formats.

If you are ever tempted to use article distribution as a technique for building inbound links to your website, forget about it. For many, this strategy has run into problems because of the Google duplicate content filter.

WordPress has a big problem when it comes to duplicating its own content.

Normally the number of pages Google indexes should equal the number of posts plus the number of pages. But WordPress also creates pagination, search, trackback, author, category, and archive pages that contain excerpts or even the full content of each post.

This has a bad effect on your Google search rankings.

Remove this duplicate content and watch your organic traffic increase.

1. With www or without www.

You should choose your preferred canonical URL (with www or without www).

Google indexes both www.cucirca.com and cucirca.com and gives them two different PageRanks.

The best way is to modify your .htaccess file to make a server-side 301 redirect from the non-www to the www version.

To do this add the following lines to your .htaccess file:

# Redirect the bare domain to the www version with a permanent (301) redirect.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^cucirca\.com$ [NC]
RewriteRule ^(.*)$ http://www.cucirca.com/$1 [L,R=301]

Replace cucirca.com with your own domain.

2. With / or without /

I took a close look at Google Analytics. In the Top Content tab I see that the top two URLs are almost the same:

http://www.cucirca.com/2007/02/21/13-places-to-watch-tv-online-for-free/

http://www.cucirca.com/2007/02/21/13-places-to-watch-tv-online-for-free

So this is duplicate content again :(

What to do?

Use the .htaccess file to make all URLs end with a “/”.

Add the following code to your .htaccess file, just under the www rule:

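RewriteBase /
# Skip real files, and skip URLs that already end with a slash.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !/$
# Permanently redirect everything else to the same URL plus a trailing slash.
RewriteRule ^(.*)$ http://www.cucirca.com/$1/ [L,R=301]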
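
To make the combined effect of the www rule and the trailing-slash rule concrete, here is a small PHP sketch that maps a URL to the form those rules should redirect to. It runs on its own, outside Apache. The function name canonical_url is hypothetical, and the sketch assumes the path does not point to a real file, since the !-f check cannot be modeled without the server's filesystem.

```php
<?php
// Hypothetical helper: returns the URL the two .htaccess rules should end up at.
// Assumption: the path does not name a real file (the !-f condition is not modeled).
function canonical_url(string $url, string $domain = 'cucirca.com'): string
{
    $parts = parse_url($url);
    $host  = strtolower($parts['host'] ?? $domain);
    $path  = $parts['path'] ?? '/';

    // Rule 1: the bare domain is redirected to the www version.
    if ($host === $domain) {
        $host = 'www.' . $domain;
    }

    // Rule 2: the path must end with a slash.
    if (substr($path, -1) !== '/') {
        $path .= '/';
    }

    // Apache keeps the query string on these redirects, so keep it here too.
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    return 'http://' . $host . $path . $query;
}

echo canonical_url('http://cucirca.com/2007/02/21/13-places-to-watch-tv-online-for-free'), "\n";
```

Feeding it the URLs from your own Top Content report is a quick way to see which ones collapse into the same canonical address.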

3. Use robots.txt to remove duplicate content

See the post “How to Get More Natural Search Traffic With robots.txt”.

4. Noindex Follow

Add the following PHP code to your theme's header.php, just before the </head> tag.

<?php
// Tell spiders not to index paginated, author, trackback, search,
// and date-archive pages, but still follow the links on them.
// A single check avoids printing the tag twice on, say, page 2 of an author archive.
if ( is_paged() || is_author() || is_trackback() || is_search() || is_date() ) {
    echo '<meta name="robots" content="noindex,follow" />' . "\n";
}
?>

It is wise to let the spiders index your category pages: they show only excerpts of the posts, so they’re not considered duplicate content.
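
If you would rather not edit header.php, the same check could probably live in your theme's functions.php, hooked to wp_head. This is only a sketch: it assumes your theme calls wp_head() inside <head>, and the names dupcontent_needs_noindex and dupcontent_print_robots_meta are made up for this example. The decision is kept in a plain function so it can be tried outside WordPress, and category pages are deliberately left out of the list.

```php
<?php
// Hypothetical helper: decides from plain booleans whether a view should be noindexed.
function dupcontent_needs_noindex(array $page): bool
{
    // Views treated as duplicates in this post; categories are intentionally absent.
    $duplicate_views = array('paged', 'author', 'trackback', 'search', 'date');
    foreach ($duplicate_views as $view) {
        if (!empty($page[$view])) {
            return true;
        }
    }
    return false;
}

// Hypothetical wp_head callback that feeds WordPress conditional tags into the helper.
function dupcontent_print_robots_meta(): void
{
    $page = array(
        'paged'     => is_paged(),
        'author'    => is_author(),
        'trackback' => is_trackback(),
        'search'    => is_search(),
        'date'      => is_date(),
    );
    if (dupcontent_needs_noindex($page)) {
        echo '<meta name="robots" content="noindex,follow" />' . "\n";
    }
}

// Register the callback only when running inside WordPress.
if (function_exists('add_action')) {
    add_action('wp_head', 'dupcontent_print_robots_meta');
}
```

Use either this or the header.php snippet, not both, or the tag will be printed twice.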

This is my side of the story. I’ve tested all the above tips and the results are starting to show.
If you have any suggestions I’ll be glad to discuss them.

Resources for this post: Google Webmaster Central

A little test for me to see how duplicate content works (I will share the results in a future post):

  • SEO test on duplicate content
  • SEO test on duplicate content
  • SEO test on duplicate content
