What Is Duplicate Content, And How To Avoid It In Your Blog?

What Is Duplicate Content?

Duplicate content, sometimes called double content, means that the same content is available at several URLs on the web. It can occur internally, within your own website, or across different websites (domains and subdomains).

It does not matter whether the URLs belong to the same website or to two or more different websites: in both cases it counts as duplicate content.

And this, my friend, may harm your project if you don’t control it. Even if you assume you are not creating duplicated content, other websites may be doing it, “drawing inspiration” from your content or outright plagiarizing it.

So let’s review several methods you can use to avoid it. But first, you are probably wondering what happens when it occurs, so that is what we will look at first 😉

How duplicate content may affect you

Apart from the obvious point that copying and distributing your content on another website is full-blown plagiarism and copyright infringement, duplicate content may also affect you at the SEO level.

If Google detects that your website has duplicate content, either within the same website or because the content of one or more URLs is similar to that of other websites, two things can happen:

  • Either it does not show that content in the search results,
  • Or it penalizes the website if it detects this as a common practice.

When search engines find more than one URL with the same (or similar) content, they cannot be sure which one to display. They may therefore show none of them, or only the first one they indexed.

When does duplicate content occur?

Internal duplicate content

Duplicate content can be caused by consciously or unconsciously reusing your texts across pages. It can be, for example, category texts, product texts, or website texts that describe your company and your services.

It is a huge job to make these kinds of texts unique and, at the same time, interesting. You can outsource it to your SEO agency or a communication agency if you have limited time.

The advantage of choosing an SEO agency is that the texts are written based on what people are searching for. The disadvantage, if the copywriter at your SEO agency is not skilled enough, is that the text can end up anywhere from uninspiring and boring to talk for talk's sake. You should, of course, avoid that.

When prioritizing your texts, start with the most important products and services. It can be texts for the pages that potentially bring you the most visits, texts on the products or services that sell best, or something else entirely.

Just remember: Rome wasn’t built in a day.

Technical errors in your system can also cause internal duplicate content. I describe some of these problems later in the article.

External duplicate content

In addition, duplicate content can also occur across different websites, e.g., if you publish your content on pages other than your own. I would call it external duplicate content.

A classic case is publishing a blog post both on your own blog and on external blog platforms such as LinkedIn or Medium. This does not necessarily cause you problems.

External duplicate content can also arise when you use product texts from your supplier that your competitors have also used, or when your product texts are included in a product feed that, for example, affiliates use. This can be a problem because the same content can be syndicated on many different pages.

Finally, there is the grim and unfortunately well-known problem of others copying your website’s content. If they do it without asking permission, it is illegal, and you are within your rights to claim compensation for copyright infringement.

If you suspect this is happening, try a service like WhoCopied.me. Once you have added this service to your page, you will be notified if someone copies your text.

How to prevent duplicate content

There are several techniques you can use to prevent duplicate content.

·        The first is obvious: do not copy texts you find on other websites.

But no matter how obvious it is, you would not believe how many people do it, and not just newcomers to the Internet but people who supposedly provide web design services, internet marketing, and so on. Unbelievable, really.

·        Another method to prevent it is to be cautious with the links that lead to the various URLs of your website.

If your website is built using WordPress, by default, the URLs of your website will load with the slash at the end.

Example: https://financebode.com/what-is-a-subdomain/

If you click on the link, you will notice that it opens with the trailing slash (/). Now, copy and paste this URL in the browser bar: https://financebode.com/what-is-a-subdomain

As you can see, it does not have the slash at the end, but when the page is accessed, it is immediately redirected to the same URL with the slash at the end.

This is acceptable. However, there are circumstances in which the URL loads both ways: with the slash and without the slash. Google and other search engines might treat this as duplicate content, since there are two copies of the same content.

In such a situation, you will have to set up a redirect so that the URLs load either with or without the slash, but not both ways.
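To make the redirect rule concrete, here is a minimal Python sketch of the decision a server would make. It assumes you have chosen the trailing-slash form (the WordPress default); the function names with_trailing_slash and redirect_for are made up for illustration, and on a real site you would normally set this up in your web server or CMS rather than write it yourself.

```python
from urllib.parse import urlsplit, urlunsplit


def with_trailing_slash(url: str) -> str:
    """Return the URL in its trailing-slash form (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    # Leave file-like paths such as /page/index.htm untouched.
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def redirect_for(url: str):
    """Return (301, target) if the URL should be redirected, else None."""
    target = with_trailing_slash(url)
    return None if target == url else (301, target)


# The version without the slash is redirected; the version with it is served as is.
print(redirect_for("https://financebode.com/what-is-a-subdomain"))
print(redirect_for("https://financebode.com/what-is-a-subdomain/"))
```

The important design choice is that only one form ever answers with the page itself; the other form always answers with a permanent (301) redirect.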

You also have to be careful when creating internal links on your website. If your URLs have the slash at the end, build the link with it, and vice versa.

As Google says: “Do not link to https://www.example.com/page/, https://www.example.com/page, and https://www.example.com/page/index.htm.” Always create your links in the same format. The same goes for the websites that link to you.
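If you want to check your own internal links for this kind of inconsistency, a rough Python sketch could look like the following. It assumes you already have a list of the link URLs used on your site (how you collect them is up to you); link_key and inconsistent_links are illustrative names, not part of any tool.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def link_key(url: str):
    """Reduce a URL to a key that ignores trailing slashes and index files."""
    parts = urlsplit(url)
    path = parts.path
    for index_name in ("index.html", "index.htm"):
        if path.endswith("/" + index_name):
            path = path[: -len(index_name)]
            break
    return (parts.netloc.lower(), path.rstrip("/"))


def inconsistent_links(links):
    """Group links by page and keep only pages linked in more than one format."""
    variants = defaultdict(set)
    for link in links:
        variants[link_key(link)].add(link)
    return {key: sorted(urls) for key, urls in variants.items() if len(urls) > 1}


links = [
    "https://www.example.com/page/",
    "https://www.example.com/page",
    "https://www.example.com/page/index.htm",
    "https://www.example.com/other/",
]
for page, formats in inconsistent_links(links).items():
    print(page, "is linked in", len(formats), "different formats")
```

Any page that shows up in the result is being linked in more than one way, so pick one format and update the other links.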

·        Another related issue is that, after installing an SSL certificate, your website may load with both http:// and https://, which creates a duplicate of every URL.

Ideally, in this scenario, it should only load with https://. If you don’t know how to do it, I suggest you look at this post.
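The logic behind forcing https:// is simple enough to sketch in a few lines of Python. This is only an illustration of the rule, with a made-up function name (https_redirect); in practice the redirect is usually configured in the web server, the hosting panel, or a plugin.

```python
from urllib.parse import urlsplit, urlunsplit


def https_redirect(url: str):
    """Return (301, https_url) for an http:// URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        return 301, urlunsplit(parts._replace(scheme="https"))
    return None


print(https_redirect("http://financebode.com/what-is-a-subdomain/"))
print(https_redirect("https://financebode.com/what-is-a-subdomain/"))
```

As with the trailing slash, the point is that the http:// version never serves the page itself; it only points permanently to the https:// version.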

·        Finally, and harder to control: make sure that other websites do not publish content taken from your articles.

There are plenty of “smart” people who, instead of developing their own content, devote themselves to stealing the content of others and posting it as is on their websites. Or they even alter it by an astounding 1% to try to make it “not obvious” (I’ve experienced both).

As I said, this is harder to control, since it is impossible to monitor every website that exists to find out whether one of them duplicates your content. Still, there are several techniques to detect it, which I will cover below.

Tools to identify duplicate content

There are various tools and approaches for finding duplicate content, either inside your website or outside of it. I will cover some of them.

SEMrush

Among the tools SEMrush offers is an SEO audit, which tells you whether duplicate content has been found within your website.

Although SEMrush is a premium product (and not cheap), you can take advantage of the free 14-day trial you will find in this post.

Copyscape

Copyscape is a well-known tool that, given a URL, searches the Internet for texts that match the one found at that URL and gives you a list of websites that have the same or very similar content.

While writing this, I entered the URL of my web design service page in Copyscape and found a handful of websites that have taken my material and copied it almost as is onto their service pages.

Grammarly

Grammarly is another popular tool for detecting duplicate content on the Internet. However, instead of entering a URL as above, you paste the piece of text you want to locate, and the tool searches the Internet for the same or similar text.

Other approaches to identifying duplicate content

Apart from the tools described above (and others that exist) that specialize in finding duplicate text, there are other ways to detect it, and I am going to tell you about a few of the ways in which I have found this sort of content.

One of these is using Google Analytics. While checking the websites that were sending me traffic, I opened one of them and discovered that the owner had cloned an article from my blog, with identical internal links.

Another way I have spotted this sort of content is by using a tool to examine the backlinks my website gets (SEMrush, Ahrefs, Search Console…).

Although this approach mainly detects duplicate content on the websites that link to you, rather than copies of your own content, that is also worth detecting.

This way, I have noticed at least a handful of times that new websites were linking to the same URL of mine, with the same anchor text, as older websites that had been linking to me for some time.

When I opened these new websites… bingo! They had duplicated the article from the websites linking to me (one of them was a piece I had published as a guest), preserving the same links, of course.
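If you export your backlinks from one of those tools, the same check can be roughed out in Python. This is only a sketch under my own assumptions: the tuple layout (source site, target URL, anchor text, first-seen date in YYYY-MM-DD format), the sample data, and the name suspicious_backlinks are all hypothetical, and real exports will have different columns.

```python
from collections import defaultdict


def suspicious_backlinks(backlinks):
    """Find newer sites that reuse the target URL and anchor text of an older link."""
    groups = defaultdict(list)
    for source, target, anchor, first_seen in backlinks:
        groups[(target, anchor.strip().lower())].append((first_seen, source))

    flagged = {}
    for key, entries in groups.items():
        if len(entries) > 1:
            entries.sort()  # ISO dates sort chronologically as plain strings
            oldest, *newer = entries
            flagged[key] = {"oldest": oldest[1], "newer": [site for _, site in newer]}
    return flagged


# Made-up sample data for illustration only.
backlinks = [
    ("old-blog.example", "https://mysite.example/post/", "duplicate content guide", "2021-03-01"),
    ("new-site.example", "https://mysite.example/post/", "Duplicate content guide", "2023-07-15"),
    ("other.example", "https://mysite.example/about/", "about the author", "2022-01-10"),
]
for (target, anchor), sites in suspicious_backlinks(backlinks).items():
    print(target, anchor, sites)
```

A match is only a hint, not proof: two sites can legitimately use the same anchor text, so you still have to open the newer site and look.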

How duplicate content on other websites may harm you

Just as you do not want your own website punished for duplicate content, you do not want the websites that link to you to be punished for the same reason, because you would lose a link.

And apart from that, there is simple courtesy, of course. Just as I would want to be told, I have notified other websites that their content was being copied so that they could take the necessary action.

Those who plagiarize content in this manner are usually new websites with zero authority. An “additional” link from them is no benefit to you either, since it has almost no value.

How to report plagiarized content

The first step, in my opinion, should be to send a note to the site that has stolen your content, telling them that you have found content taken from your website on theirs and that they have X days to delete it before you report it as plagiarized content.

The most typical outcome is that they apologize and delete it. However, do you recall that I mentioned earlier that, while writing this post, I found my web design service page plagiarized?

Well, the owner of the website replied angrily, complaining that I had assumed it was his responsibility, that it had been the fault of an external firm (of course, I don’t believe it), that I had been disrespectful by making threats, blah, blah, blah… In short, anything except apologizing.

On another occasion, the site owner did apologize, thoroughly embarrassed, but also blamed an “external” person.

Wow, it appears that all small website owners have a budget to hire external people to write their content… wink, wink 😉 Of course, they must all be hiring the same person, who devotes himself to duplicating the content of others.

But as I said, the usual outcome is that they delete the copied content. The excuses they make afterward and who they blame are another matter, and none of your concern.

If, a few days after notifying them, they show no signs of life and the plagiarized content is still there, you can use this Google form to request the removal of this content, with the option “I want to report an intellectual property problem (infringement of copyright, circumvention, etc.).”

Conclusion

Now that you know what duplicate content is and how to prevent it, I advise you to check whether you can find any on your website or outside of it.

If your website has some exposure, it would not be strange if, sooner or later, you discovered your content duplicated by other websites, but now you know what to do to try to fix it.

If you have been a victim of plagiarism, I invite you to tell us about your case in the Facebook group, outlining how you dealt with the problem and how you resolved it!

FAQ

Are translations considered duplicate content?

If you translate your website from French to English, then there are two different texts, which are not duplicate content, not even if you do a complete word-for-word translation or use Google Translate.

But suppose you have, e.g., a .com domain for American users, a .co.uk domain for British users, and a .com.au domain for Australian users. If the text on the websites is otherwise the same, then this is duplicate content. You shouldn’t have any problems if you split it up into the respective countries’ domain extensions (ccTLD). But to be on the safe side, you can use hreflang links across the pages.

If you have the language division on the same domain, e.g., .com/uk/ and .com/us/ and .com/au/, then it is very important to use hreflang links. Otherwise, you will have duplicate content problems.
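To give you an idea of what hreflang links look like, here is a small Python sketch that prints the tags for the three English versions mentioned above. The domain www.example.com and the function name hreflang_tags are placeholders of mine. Note that the hreflang code for the United Kingdom is en-gb, even if your folder is called /uk/.

```python
from html import escape


def hreflang_tags(versions):
    """Build one <link rel="alternate"> tag per language/country version."""
    return [
        f'<link rel="alternate" hreflang="{escape(code, quote=True)}" '
        f'href="{escape(url, quote=True)}" />'
        for code, url in versions.items()
    ]


# Hypothetical URLs for the same page in each country folder.
versions = {
    "en-us": "https://www.example.com/us/",
    "en-gb": "https://www.example.com/uk/",
    "en-au": "https://www.example.com/au/",
}
for tag in hreflang_tags(versions):
    print(tag)
```

The same full set of tags goes in the head section of every version of the page, including a tag that points to the page itself.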

Can images generate duplicate content?

No. You may use the same images across your website or on different websites. It is not considered duplicate content. Duplicate content is only a text problem.

Does Google penalize duplicate content?

They do, by downgrading the quality assessment they apply to the entire site. If there is a lot of duplicate content on your site, you can run into serious problems.

Is it duplicate content if I reuse phrases on the site?

In 99.9% of all cases: No! It only becomes a problem if a large part of a page’s content is identical to the content of other pages.
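If you are curious how much two of your pages actually overlap, you can estimate it yourself. The Python sketch below compares the three-word sequences ("shingles") of two texts and returns the share they have in common. This is a common textbook technique that I am using for illustration, not the way Google or any of the tools above measure it, and the function names and sample texts are my own.

```python
import re


def shingles(text: str, size: int = 3):
    """Return the set of consecutive word sequences of the given size."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(text_a: str, text_b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Two made-up product texts that share one reused sentence.
page_a = "We ship every order within two days. Our red kettle boils water in ninety seconds flat."
page_b = "We ship every order within two days. This blue toaster browns four slices of bread at once."

print(f"Overlap: {overlap(page_a, page_b):.0%}")
```

A reused sentence gives a low score like in this example, while a copied page gives a score close to 100%. Where exactly the line sits is a judgment call, so treat the number as a hint rather than a verdict.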