How to check if your website has duplicate content

SEO & Marketing
5th February 2020
Abigail

Technically speaking, you won’t be directly penalised for having duplicate content on your website. However, it will certainly affect how your site is indexed by search engines. As 93% of online experiences begin with a search engine, it’s imperative that you aren’t restricting your website’s ability to compete for the coveted number one spot.

Duplicate content is content that appears at more than one unique web address (URL).

How does duplicate content happen?

Often website owners don’t even realise that they have duplicate content, which is understandable if they’ve never been made aware of what causes it to happen.

Varied URLs for the same page

Duplicate content is created when more than one URL leads to the same page. Google treats URLs as case sensitive, which is why the general rule of thumb is to keep all your URLs short and lower case. People often aren't aware of the content duplication aspect of SEO, and this is why they don't keep their URLs in one case format. The same applies to analytics and URL tracking. For example, if you use UTM tags to track PPC campaign traffic to a website, the different parameters, and the order in which those parameters appear in the URL, can all count as duplicates of the same page.
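As a rough illustration of how URL variants like these can be spotted, the sketch below reduces each URL to a comparison key by lower-casing it, dropping tracking parameters and sorting what remains. The function name, the example URLs and the list of tracking prefixes are assumptions made for this example, and lower-casing the path only makes sense if your server treats paths as case insensitive.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameter prefixes treated as tracking-only. This list is an assumption;
# extend it to match the tags your own campaigns use.
TRACKING_PREFIXES = ("utm_",)


def normalise_url(url: str) -> str:
    """Return a comparison key: lower-cased, tracking-free, query sorted."""
    parts = urlsplit(url)
    # Keep only the parameters that change what the page shows.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    query.sort()  # parameter order should not create a "new" page
    # Assumption: the server serves /Page and /page/ as the same content.
    path = parts.path.lower().rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


first = normalise_url(
    "https://Example.com/Blue-Widgets/?utm_source=ads&size=m&colour=red"
)
second = normalise_url(
    "https://example.com/blue-widgets?colour=red&size=m&utm_campaign=spring"
)
print(first == second)
```

If two URLs from a crawl or an analytics export produce the same key, they are likely candidates for a redirect or a canonical tag.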

HTTP vs. HTTPS protocol

Just as Google considers URLs case sensitive, it also treats the secure (HTTPS) and non-secure (HTTP) versions of a page as separate URLs, as well as testing and staging versions of your site, so long as the content is exactly the same. In some cases, for users with websites built in a CMS (content management system) such as WordPress or Craft CMS, the CMS itself can create duplicate pages.

Content scraping

In the world of ecommerce, this is usually the number one cause of duplicated content. For example, product descriptions are often copied and pasted straight from the manufacturer, which is not good from an SEO standpoint or for your business generally. Business owners often aren't aware that content such as this has been scraped from similar sites and isn't actually original.
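One simple way to check whether two product descriptions are near copies of each other is to compare the short word sequences they share. The sketch below uses word "shingles" and Jaccard similarity; this is one common technique chosen for illustration, not how any particular search engine measures duplication, and the function names and sample texts are made up for the example.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str) -> float:
    """Jaccard similarity of the shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a), shingles(text_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical descriptions: the second differs only in case and punctuation.
ours = "Soft cotton t-shirt with a crew neck"
theirs = "soft cotton T-shirt with a crew neck."
unrelated = "Hand-made oak dining table"

print(similarity(ours, theirs))
print(similarity(ours, unrelated))
```

A score close to 1.0 suggests the text has been copied more or less word for word and is worth rewriting; where to draw the line is a judgment call.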

How do you solve the issue?

301 Redirect pages

A 301 redirect permanently sends users from one page to another while passing nearly 100% of the original page's link equity, that is, its ability to rank in the SERP. This largely solves the HTTP vs HTTPS duplication issue: when a user requests the HTTP version of a page, they are forwarded to the HTTPS version without noticing.
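In practice a 301 redirect is usually set up in your web server or CMS settings rather than written by hand, but the mechanics are easy to see in a few lines. The sketch below is a minimal WSGI application that answers every request with a 301 pointing at the HTTPS version of the same address; the function name and the fallback host `example.com` are placeholders for this example.

```python
def redirect_to_https(environ, start_response):
    """Minimal WSGI app: reply to every request with a 301 to the HTTPS URL."""
    host = environ.get("HTTP_HOST", "example.com")  # placeholder fallback host
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    # Rebuild the same address, but with the secure scheme.
    location = f"https://{host}{path}" + (f"?{query}" if query else "")
    start_response(
        "301 Moved Permanently",
        [("Location", location), ("Content-Length", "0")],
    )
    return [b""]


# Simulate a request for http://example.com/shop?page=2 without a server.
captured = {}


def record(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


redirect_to_https(
    {"HTTP_HOST": "example.com", "PATH_INFO": "/shop", "QUERY_STRING": "page=2"},
    record,
)
print(captured["status"], captured["headers"]["Location"])
```

The important parts are the `301` status, which signals that the move is permanent, and the `Location` header, which carries the address visitors and crawlers should use from now on.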

Canonical URLs

A canonical URL is declared with an HTML element (the rel="canonical" link tag) that lets webmasters stop duplicate content from restricting a page's SEO capabilities. It tells Google which version of a webpage you prefer, and from the point that code is added, that is the version you are asking Google to treat as the correct one.
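To check whether a page actually declares a canonical URL, you can look for the link tag in its HTML. The sketch below does this with Python's built-in HTML parser; the class name, function name and sample page are assumptions made for the example, and in a real check the HTML would come from fetching your own pages.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Sample page for illustration only.
PAGE = """<html><head>
<link rel="canonical" href="https://example.com/blue-widgets">
</head><body>...</body></html>"""

print(find_canonical(PAGE))
```

Pages that return `None`, or that point at a different URL than you expected, are the ones to review first.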

Google Search Console

In the past, you could select your preferred domain through a setting in Google Search Console. However, this feature was removed in the summer of 2019, so you now have to look to 301 redirects and canonical URLs to solve the content duplication issue.

Final thoughts

In summary, duplicated content is one of the leading factors that can harm a website's SEO capabilities, so it's important that you get your site checked. At Welford we can offer you an SEO audit in which we comb through your entire website and give you actionable next steps for optimising your content, so get in touch today!

https://moz.com/learn/seo/duplicate-content

https://www.hobo-web.co.uk/duplicate-content-problems/
