DMN3 Blog

DMN3 Blog - written & maintained by Robert M Brecht, Ph.D.

Website Duplicate Content Poorly Understood

Monday, September 14, 2009

There are essentially two issues that one should be concerned with regarding Duplicate Content on the Internet:

  1. When is content recognized as the “same content”?
  2. What action will the search engines take against “same content” sites?

Authoring Internet and website copy can be exasperating when someone plagiarizes your content. In my last post, Website Content Optimization Can Be Frustrating, I discussed theft of content and what little can be done to protect it. Today I would like to blog about how such theft might create a duplicate content issue, and clarify its place in web copy.

Let’s begin by defining Duplicate Content as it relates to the online world and, in particular, search engine rankings. Duplicate content is essentially the same information appearing multiple times, either within a domain or across multiple domains or websites. When that happens, one or more of the sites where it appears can face repercussions from a search engine.

When is it recognized as duplicate content: The following is not an exhaustive treatment of this subject, but rather my understanding of what I have read on credible blogs on the subject, e.g. those sanctioned by Google (Matt Cutts and others). Remember that the content must be substantially the same to be considered duplicate content. Obviously the search engines are not about to tell us what “substantially” really means. What the search engines are attempting to do is index and serve up distinct information when someone searches on a topic.

Let’s begin with what duplicate content is not:

  • Excerpts, quotes and summaries taken from other websites that are combined with additional original content, so that the excerpt, quote or summary is only part of a larger context. Believe it or not, I still hear people who should know better caution against using such things. In both my experience and my reading on the subject, I cannot find an instance in which such content, combined with original content, has resulted in a search engine taking action regarding duplicate content. If someone else has had a contradictory experience, I would be very interested in learning the particulars.
  • Translations of the same content into another language will not be interpreted as duplicate content by the search engines.
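
To make “substantially the same” a little more concrete, here is a minimal sketch of one textbook way to score overlap between two texts: Jaccard similarity over three-word shingles. This is only an illustration, not how Google or any other search engine actually decides; the function names, the sample texts and the shingle size of three are all my own assumptions.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b, k=3):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

# Hypothetical sample texts, for illustration only.
original = ("Duplicate content is when essentially the same information "
            "appears multiple times either within a domain or across "
            "multiple domains")
# A near copy with a single word swapped.
scraped = original.replace("essentially", "basically")
# A short quote embedded in otherwise original writing.
quoted = ('As one blogger put it, "the same information appears multiple '
          'times", but my own view after years of testing is that context '
          'matters far more than any single borrowed phrase')

near_copy_score = jaccard(original, scraped)
excerpt_score = jaccard(original, quoted)
print(f"near copy: {near_copy_score:.2f}, excerpt: {excerpt_score:.2f}")
```

Under this measure the near copy scores high while the page that merely quotes a phrase scores low, which matches the point above about excerpts inside a larger original context. Note that a word-overlap measure like this would also score a translation as different text.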


Actions Taken: Google has two methods to deal with duplicate content. It can apply a penalty, which either lowers the site’s rankings or removes the domain from the index entirely, or it can apply a duplicate content filter.

A duplicate content filter is used for sites that are attempting to manipulate rankings and deceive users by reusing content from other sites. When the search engine indexes pages with duplicate content, it chooses one version to display in its search results.

When Google discovers such duplicate content on different sites, it uses various parameters to determine which site is the original one. The result is that Google serves what it determines to be the original site in its search engine rankings and filters out the duplicates (the sites that have stolen your content). Assuming that Google indexed your content before it was stolen, the consequences of such theft for your search engine rankings should be negligible.
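
The filtering idea can be sketched in a few lines. This is a toy model under loud assumptions: Google does not publish the parameters it uses, so the sketch simply treats the page that was indexed first as the original. The `Page` class, the `fingerprint` field (a stand-in for “substantially the same” content) and the example URLs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Page:
    url: str
    first_indexed: date   # assumed signal: when the engine first saw the page
    fingerprint: str      # hypothetical label shared by duplicate pages

def filter_duplicates(pages):
    """Keep one page per fingerprint: the one that was indexed earliest."""
    kept = {}
    for page in pages:
        current = kept.get(page.fingerprint)
        if current is None or page.first_indexed < current.first_indexed:
            kept[page.fingerprint] = page
    return sorted(kept.values(), key=lambda p: p.first_indexed)

# Hypothetical example data.
pages = [
    Page("https://scraper.example/copy", date(2009, 9, 20), "post-a"),
    Page("https://author.example/post", date(2009, 9, 14), "post-a"),
    Page("https://other.example/unrelated", date(2009, 8, 1), "post-b"),
]

shown = [p.url for p in filter_duplicates(pages)]
print(shown)
```

In this toy run the scraper’s later copy is dropped and the author’s page stays in the results, which is the outcome described above when the original was indexed before the theft.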

While it is certainly annoying to have your work plagiarized or just stolen by scrapers, it is comforting to know that in almost all cases it will not have a negative impact on your search engine ranking for the page on which the content resides. …Perhaps there is a little justice out there in the Internet world.

