First, you need to understand that not all duplicate content is the same. The four kinds described
here differ in how they arise and in how search engines treat them.
- Reprints: This is duplicate content published on multiple sites with the permission of the
copyright holder. These are the articles that you or others create and then distribute to
create links back to your site or to other sites relevant to your content. Reprints
are not bad duplicate content, but they can get your site relegated to Similar
Pages, which means your pages will be buried behind other results.
- Site Mirroring: This is the kind of duplication that can cause one or more of your sites
to be delisted from a search engine. Site mirroring means keeping exact copies of your
web site in two different places on the Internet. Web sites used to practice site mirroring
all the time as a way to avoid downtime when one site crashed. These days, server capabilities
are such that site mirroring isn’t as necessary as it once was, and search engines
now exclude mirrored content from their indexes because of the spamming implications it can have.
Spammers have been known to mirror sites to create a false Internet for the purpose of
stealing user names, passwords, account numbers, and other personal information.
- Content Scraping: Content scraping is taking the content from one site and reusing it
on another site with nothing more than cosmetic changes. This is another tactic used by
spammers, and it’s also often a source of copyright infringement.
- Same Site Duplication: If you duplicate content across your own web site, you could
also be penalized for duplicate content. This becomes especially troublesome with blogs,
because there is often a full blog post on the main page and then an archived blog post
on another page of your site. This type of duplication can be managed by simply using a
partial post, called a snippet, that links to the full post in a single place on your web site.
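The snippet approach for same-site duplication can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any particular blogging platform: the function name `make_snippet`, the 40-word default, and the link text are all assumptions chosen for the example.

```python
import html


def make_snippet(post_text: str, post_url: str, max_words: int = 40) -> str:
    """Return a short teaser for a post, followed by a link to the full post.

    Hypothetical helper: the main page shows only this teaser, so the
    complete text lives at a single URL on the site.
    """
    words = post_text.split()
    if len(words) <= max_words:
        # Short posts fit entirely inside the teaser.
        teaser = html.escape(" ".join(words))
    else:
        # Keep the first max_words words and mark the cut with an ellipsis.
        teaser = html.escape(" ".join(words[:max_words])) + "..."
    link = '<a href="{}">Read the full post</a>'.format(html.escape(post_url, quote=True))
    return teaser + " " + link
```

Calling `make_snippet(post_text, "/blog/full-post")` on the main page, and serving the complete text only at that URL, keeps the full post in one place.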
Of these types of duplicate content, two are especially harmful to your site: site mirroring and content
scraping. If you’re using site mirroring, you should consider using a different backup method
for your web site. If you’re using content scraping, you could be facing legal action for copyright
infringement. Content scraping is a practice that’s best avoided completely.
Reprints and same-site duplication are less harmful, but they are not helpful either, and they can
hurt you if they’re handled in the wrong way. You won’t win any points with
a search engine crawler if your site is full of content that’s used elsewhere on the Web. Reprints, especially
those that are repeated often on the Web, will eventually cause a search engine crawler to
take notice.
Once it takes notice, the crawler will try to find the original location of the reprint. It does this by
looking at where the content appeared first. It also looks at which copy of an article the most links
point to and which versions of the article are the result of content scraping. Through a process of
elimination, the crawler narrows the field until a determination can be made. If it’s still too difficult
to tell where the content originated, the crawler will select a copy from a trusted domain.
Once the crawler has determined which copy is the original, all of the other reprints fall into order
beneath it or are eliminated from the index.
If you must use content that’s not original, or if you must have multiple copies of content on your
web site, there is a way to keep those duplications from adversely affecting your search rankings.
By using a robots meta tag with the noindex directive, you can prevent duplicated pages from being
indexed by the search engine.
The tag should be placed in the page header (the head section) of the page that you don’t want to be
indexed. It’s also a good idea to allow the crawler that finds the tag to follow links that might be
on the page. To do that, your code (which is a meta tag) should look like this:
<meta name="robots" content="noindex, follow">
That small piece of code tells the search engine not to index the page, but to follow the links on the
page. It can help you quickly solve the problem of search engines indexing
your duplicate content.
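To confirm that a duplicate page actually carries the tag, you can inspect its HTML. The following is a minimal sketch in Python using only the standard library. It assumes the tag in question is the standard robots meta tag (`<meta name="robots" content="noindex, follow">`). The names `RobotsMetaParser`, `robots_directives`, and `is_noindex_follow` are hypothetical helpers invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html_text: str) -> set:
    """Return the set of robots directives found in html_text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.directives


def is_noindex_follow(html_text: str) -> bool:
    """True if the page asks not to be indexed but lets links be followed."""
    found = robots_directives(html_text)
    # "follow" is the default, so only an explicit "nofollow" rules it out.
    return "noindex" in found and "nofollow" not in found
```

Passing the HTML of each duplicate page to `is_noindex_follow` shows which copies are still open to indexing.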