Something that I’ve wanted to get off my chest for some time is all of the contradictory hearsay about the duplicate content theory. For the longest time it bewildered me, and I would shift back and forth on what to believe about it.
From one individual I would hear that it exists, and that if you submit the same article to several article directories without editing the copies (especially the anchor text), your rankings will sink quickly into the dark green ocean of Google. From others, I heard the complete opposite.
Many webmasters would say that it does not exist at all: that duplicate content was just a myth developed by Google to scare webmasters away from submitting the same content to Google, and to discourage unethical webmasters from copying the work of others, so that Google can retain quality within its network.
I heard from the gurus, and I heard from everyday webmasters who swore that their websites had dropped in rankings because of duplicate content and that they were being punished unfairly by Google.
Since there were so many contradictory statements, I was baffled about who to believe. I decided not to take anyone’s word for it, no matter who they were, and to carry out my own testing to see what all of the fuss was about and, more importantly, what the truth is.
So over time I have conducted a good number of case studies to discover everything I possibly could about this, because Google can be very fickle at times.
So, take this from a guy who has reluctantly spent lots of torturous nights trying to figure out all of this nonsense. The duplicate content myth DOES and DOES NOT exist.
Why it Does Exist
You can’t have duplicate content coming from the same exact server. That’s why, in early 2007, when I started heavily deploying new WordPress sites in new niche markets, I would sometimes get indexing problems within the search engines.
At first I thought that I was building links too fast. Then I thought that Google was punishing sites that get links solely from web 2.0 sites. Then I looked more closely and noticed that only my WordPress sites were having funny stuff happen to them within the search engines.
When I analyzed the source code, I found out what it was. I had duplicate content flowing from the same server, and that will get your site punished and beaten up badly by Google.
What took me so long to figure out was that, as far as I could tell, I was just publishing content on my site.
I didn’t notice that I was accidentally creating duplicate content right under my nose.
Here is how it goes. With WordPress, when you make a new blog entry you have the option to categorize and tag your content. Categories and tags are more web 2.0 concepts.
With web 1.0 you could only post content; web 2.0 has a whole bunch of cooler features than the boring 1.0 stuff. However, what I didn’t notice was that when an individual clicks on an archive, category, or tag URL on your blog, they are taken to the SAME content from your blog, just under a different URL.
For example, let’s say that site A has the imaginary URL below (for illustrative purposes only), and that you post an entry under a category in your blog known as “credit card” and also under a tag known as “credit card”. Here is how the separate URLs will look:
1. The site itself: www.websiteurlA.com
2. With the category function using the keyword “credit card”, the URL generated will look something like this: www.websiteurlA.com/category/credit-card/
3. With the tag function using the keyword “credit card”, the URL generated will look something like this: www.websiteurlA.com/tag/credit-card/
All of these URLs are different, but every one of them displays the SAME EXACT content, which causes some strange stuff to happen in the search engine results. In my experience that mainly means indexing conflicts, but the good news is that I have never gotten banned for it.
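To check whether a blog has this kind of overlap, one option is to compare the text served at each URL. The snippet below is a minimal sketch, not the method I used back then: the function name `group_duplicate_urls`, the sample post URL, and the sample text are all made up for illustration, and it only catches pages whose text is identical after whitespace and case are normalized.

```python
import hashlib
from collections import defaultdict


def group_duplicate_urls(pages):
    """Group URLs whose text is identical after normalizing whitespace and case.

    `pages` maps a URL to the text served at that URL.
    Returns one sorted list of URLs per piece of duplicated text.
    """
    by_digest = defaultdict(list)
    for url, text in pages.items():
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        by_digest[digest].append(url)
    return [sorted(urls) for urls in by_digest.values() if len(urls) > 1]


# Hypothetical data: the post URL and the texts are invented for this example.
post = "Five tips for choosing a credit card ..."
pages = {
    "www.websiteurlA.com/five-credit-card-tips/": post,
    "www.websiteurlA.com/category/credit-card/": post,
    "www.websiteurlA.com/tag/credit-card/": post,
    "www.websiteurlA.com/about/": "About this blog.",
}
duplicate_groups = group_duplicate_urls(pages)
for group in duplicate_groups:
    print("Same content served at:", ", ".join(group))
```

In practice the text would come from fetching each URL and stripping the HTML, which is left out here to keep the sketch short.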
So, now you may wonder what I did to correct this. Simple: I made the category and tag links nofollow within WordPress. By marking the links to this section of my site “nofollow”, I basically told Google to skip over this part of my site and not to pass any PageRank on to it.
This prevents duplicate content from leaking through your server, which means that you shouldn’t have any funny stuff happening with your URLs in Google, and that is exactly what you want.
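The idea behind that fix can be sketched in a few lines. In WordPress itself this would be done in the theme or through a plugin; the Python below is only an illustration, and the `/category/` and `/tag/` prefixes, the `add_nofollow` function name, and the sample HTML are assumptions made for the example. The regular expression is deliberately simple and only handles double-quoted `href` attributes.

```python
import re

# Assumed WordPress-style archive paths; adjust to match the real permalink setup.
ARCHIVE_PREFIXES = ("/category/", "/tag/")

# Matches an opening <a ...> tag with a double-quoted href attribute.
ANCHOR_RE = re.compile(r'<a\s+([^>]*?)href="([^"]*)"([^>]*)>', re.IGNORECASE)


def add_nofollow(html):
    """Add rel="nofollow" to links that point at category or tag archives."""

    def fix(match):
        before, href, after = match.groups()
        is_archive = any(prefix in href for prefix in ARCHIVE_PREFIXES)
        has_rel = "rel=" in (before + after).lower()
        # Leave other links, and links that already carry a rel attribute, untouched.
        if not is_archive or has_rel:
            return match.group(0)
        return f'<a {before}href="{href}" rel="nofollow"{after}>'

    return ANCHOR_RE.sub(fix, html)


sample = (
    '<a href="/tag/credit-card/">credit card</a> '
    '<a href="/about/">About</a>'
)
print(add_nofollow(sample))
```

Another commonly used option is a robots meta tag with `noindex` on the archive pages themselves. That is a different technique from the nofollow approach described above, and I have not compared the two here.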
If you have duplicate content coming from the same server, then Google will penalize you for it. The content has to be in the form of text; Google’s bots haven’t yet evolved to the point where they can detect duplicate content in the form of audio, video, and images very well.
At the rate that technology is going, perhaps they will in the future, but I don’t see them building an advanced algorithm to detect this anytime soon, because it would be an extremely difficult task.
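To give a feel for why text is the easy case, here is a minimal sketch of one generic way to score how similar two pieces of text are: split each into overlapping word sequences (“shingles”) and compare the two sets with Jaccard similarity. Google’s actual method is not public, so this is not a description of what Google does; the function names and sample sentences are made up for illustration.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all shingles: 1.0 is identical, 0.0 is disjoint."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


original = "the quick brown fox jumps over the lazy dog"
near_copy = "the quick brown fox jumps over the lazy cat"
unrelated = "credit cards come with many different fees"

print(jaccard_similarity(original, original))
print(jaccard_similarity(original, near_copy))
print(jaccard_similarity(original, unrelated))
```

Nothing this simple exists for audio, video, or images, where two files can look or sound the same while sharing almost none of the same bytes.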
Why it Does Not Exist
One of the most popular hoaxes in the webmaster community is that Google will burn a hole through your site if it has several listings of the same exact content within its system.
In other words, if site A takes site B’s content and puts it on their own site, Google will penalize site A for plagiarism, ban them, and sink them down into its dark green ocean of drowning websites. NOT SO FAST!
I have run multiple tests to verify this and haven’t found enough concrete evidence to declare this theory true. Don’t get me wrong, websites do get punished in Google, but it may just be a coincidence that some of the sites that get punished have duplicate content on them. It could be that the webmasters are carrying out practices that are against Google’s TOS, and that is what gets them imprisoned in Google’s database of confined websites.
What my tests showed me
I have run multiple link baits in several niche markets. Some were very successful and some not so successful.
The ones that were successful created a lot of natural one-way links to my site. However, a webmaster will not always just link to a site.
Sometimes they copy ALL of a webmaster’s content, put it on their own site, and then link back to the original source.
Some don’t link back at all. That is stealing content, which is very bad marketing ethics and will probably always be a problem on the internet. Anyhow, the ones that copy content and link back to the source do just fine in the search engines and go about their daily routines unpunished.
I constantly hear webmasters give out bad advice to the public. Another myth that I constantly hear is that you should not submit to multiple article directories, because your site will get banned from Google for duplicate content. From my testing, no way!
I have done this time after time without any problems. Remember, the duplicate content can’t come from your own server, but I find it OK if it is spread across different servers with unique IPs. Just think about it: if a website automatically got penalized by Google for having multiple copies of its content in the search engines, then the guys at the top of the search results wouldn’t be there very long.
If that were the case, a black hat marketer could just take an article from a competitor’s site, submit it to 1000s of directories using automated software, and have all of their competitors vanish in a heartbeat.
Or, they could use their competitor’s URL in some type of mass blog commenting software and get them banned that way. Google is a lot smarter than that; they have some of the best engineers in the world working on their side.
Google has to leave some space for “natural duplicate content”, because sometimes content is so attractive that it can’t help but be copied and linked to.
If the duplicate content algorithm operated in a simplistic way, then content that gets copied onto an online forum and linked back to the original source would get the original webmaster’s site punished, for example. That would make search engine optimization that much more difficult, and luckily it doesn’t work that way.

July 13th, 2009 at 11:00 am
I used to publish my articles, but now I wonder whether I should stop doing this because of the risk of a duplicate content penalty. Should I stop publishing my articles on article directories?
August 16th, 2009 at 2:37 pm
Cool article about the duplicate content paradox. Guess I need to test drive some of your advice. Thanks mate.