If you’ve spent time reading about SEO, you’ll have seen the term duplicate content. But what is duplicate content, and does it have a big effect on our website’s search visibility?
The topic is shrouded in myth, so it’s worth exploring the underlying concepts and explaining why many people have the wrong idea.
Google won’t penalize your website for having content copied from another website. There is no such penalty. However, if your website is full of content you’ve copied from other websites, Google might choose to rank you on the second page or not at all, spelling disaster!
Let’s explore what duplicate content is and what to do if your website has plenty of it.
What is duplicate content?
The best way to understand duplicate content is to look at news websites. The Associated Press, one of the biggest press agencies in the world, sells stories to all of the major news outlets, which republish them in full.
Doing this creates a ton of duplicate content across the web. But Google doesn’t punish any of them. Instead, it uses other factors to rank the results.
If you read the BBC news, as I do, you’ll notice they’ve started adding an analysis section of original content to create a slightly different article that might rank higher.
Another example is creating two identical pages on your website. It could happen by accident or as the result of technical issues. Your site could be available on both HTTP and HTTPS, thus creating a complete copy.
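To see how one page can quietly become several URLs, here is a minimal Python sketch that groups URL variants pointing at the same page. The function names are made up for illustration, and treating HTTPS without "www." as the preferred version is an assumption; use whichever version your site actually serves.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Reduce common URL variants to one form (assumed preference: https, no www)."""
    parts = urlsplit(url)
    # .hostname is already lowercased and drops any port number.
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    # Treat "/page/" and "/page" as the same path; keep "/" for the homepage.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, parts.query, ""))


def group_duplicates(urls):
    """Return only the groups where several URLs collapse to the same page."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    crawled = [
        "http://example.com/blog/post",
        "https://example.com/blog/post/",
        "https://www.example.com/blog/post",
        "https://example.com/about",
    ]
    for page, variants in group_duplicates(crawled).items():
        print(page, "is reachable as", variants)
```

Any group this prints is a set of addresses that a search engine could treat as separate pages with identical content.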
How does Google define duplicate content?
As we want Google to show our website on the first results page, it’s worth listening to what they have to say. Google defines duplicate content as follows:
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”
The last sentence is highly important: most duplicate content is not deceptive. But if you’re copying and republishing large blocks of text on your site, Google may see this copied content and conclude that you have malicious intent, namely that you’re trying to deceive Google into ranking your site higher by adding more content.
It’s worth remembering:
“Google tries hard to index and show pages with distinct information.”
Hence, there needs to be some originality to what you’re publishing and not just yet another similar piece of content.
If Google finds lots of malicious intent on a website, it might take action:
“Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results”
Again, it’s important to think of something unique you can provide the reader that’s worthy of Google displaying on the first page of results and not another rehash of an overdone topic.
Does my website have duplicate content?
Thankfully, there are plenty of tools to find duplicate content. One of the best-known duplicate content checkers is CopyScape.com. Simply insert a link in the box on the homepage, and CopyScape will return several results, presented a bit like Google’s search result pages.
From there, you can look at each result and see how it compares to the original. Checking any written content is a must if you’re working with a freelance writer from Fiverr or Upwork.
Siteliner is another useful tool. It crawls your website to find internal duplicate content, meaning pages on your own site that repeat each other. It’s a very powerful tool that’s worth adding to your toolkit.
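If you’re curious how two pages can be compared for overlap, here is a rough Python sketch using word "shingles" (short runs of consecutive words) and Jaccard similarity. This is a generic textbook technique, not a description of how Copyscape or Siteliner work internally, and the function names and the three-word shingle size are assumptions chosen for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word runs of the given length."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 (no overlap) to 1.0 (identical)."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a or not set_b:
        return 0.0  # too little text to compare
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    page_one = "Our handmade oak tables are built to order in our workshop."
    page_two = "Our handmade oak tables are built to order in our studio."
    print(round(similarity(page_one, page_two), 2))
```

A score close to 1.0 suggests the two pages are near copies and worth a closer look; where you draw the line is a judgment call.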
Can you stop people copying content from your site?
Sadly, you can’t stop people from stealing your work, and the web is full of stolen content. While you’ve poured hours into your article, there’s nothing stopping anyone from scraping your page and republishing it as their own.
Short of sending the lawyers in, you do have a few options. You could ask that they attribute you as the author and link back to the original post, or that they only reference part of your article. Another option is to ask them, nicely, to take the page down.
If neither of these ideas works or appeals to you, you can ask the hosting company to take action. A final option is to fill out the Removing Content From Google form, as Google acts on valid copyright complaints under the Digital Millennium Copyright Act and can remove infringing content from its results.
How much original text does a page need to be considered “Unique”?
SEO heavyweight Kyle Roof ran an experiment to determine how much original text a page needs for it to count as unique. His case study is well worth reading, as the results were shocking.
He concluded that for a page to be considered unique, it needs to have at least 50% original text. So bear that in mind the next time you create an article!
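As a rough way to sanity-check a draft against that 50% figure, here is a minimal Python sketch that estimates what share of a page’s words sit in sentences not copied verbatim from a known source. It is not Kyle Roof’s method, which isn’t described here; the function names and the sentence-level comparison are assumptions made purely for illustration.

```python
import re


def split_sentences(text: str) -> list:
    """Split on whitespace that follows ., ! or ? (a deliberately simple rule)."""
    return [part for part in re.split(r"(?<=[.!?])\s+", text.strip()) if part]


def normalise(sentence: str) -> str:
    """Lowercase and drop punctuation so trivial differences don't count as original."""
    return " ".join(re.findall(r"[a-z0-9']+", sentence.lower()))


def original_share(page_text: str, source_text: str) -> float:
    """Fraction of the page's words found in sentences that are absent from the source."""
    copied = {normalise(s) for s in split_sentences(source_text)}
    total_words = 0
    original_words = 0
    for sentence in split_sentences(page_text):
        cleaned = normalise(sentence)
        count = len(cleaned.split())
        total_words += count
        if cleaned not in copied:
            original_words += count
    return original_words / total_words if total_words else 0.0


if __name__ == "__main__":
    source = "Cats sleep a lot. Dogs bark at night."
    page = "Cats sleep a lot. My own cat naps all day."
    share = original_share(page, source)
    print(f"{share:.0%} original")
    print("Meets the 50% guideline" if share >= 0.5 else "Needs more original text")
```

Because it only catches sentences copied word for word, treat the number as a quick indicator rather than a verdict.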
Should you panic after finding duplicate content on your site?
While you should fix the issue quickly, don’t panic. Google is a highly intelligent data processor that’s quite good at discovering and handling duplicate content.
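If it finds multiple versions of a page, it will usually show only the version it considers best, typically the original article or page. Don’t block Googlebot in your robots.txt from crawling these URLs, as this could cause more problems: if Google can’t crawl the duplicates, it can’t work out that they are copies of the same page.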
Instead of going nuclear, there are a few things you should do:
- Allow robots to crawl these URLs
- Use rel=canonical to mark the content as duplicate (more on this below)
- Use 301 redirects to send users and crawlers to the canonical URL, allowing you to delete the duplicate page(s).
One way to fight duplicate content is to use rel="canonical". By adding this piece of code to a page, you tell Google the original URL of a piece of content. We call this the canonical URL.
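In practice, the canonical URL is declared with a link tag in the page’s head section. The Python sketch below shows what such a tag looks like and reads it back out of a page, which can be handy for spot-checking your own site. The example.com URL and the helper names are hypothetical.

```python
from html.parser import HTMLParser

# A hypothetical duplicate page that points at the original post.
SAMPLE_PAGE = """
<html>
  <head>
    <title>My post (print version)</title>
    <link rel="canonical" href="https://example.com/original-post">
  </head>
  <body><p>Same article, different URL.</p></body>
</html>
"""


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values, so check each one.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there isn't one."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

If the function returns None for a page you know is a duplicate, that page is missing its canonical tag.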
Use 301 redirects
After creating a canonical URL, you should delete the duplicate page or post and use a 301 redirect to send any traffic that the deleted page was getting to the correct page.
Remember to create a 301 redirect, or your user will see a 404 error page and have a poor experience.
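Most sites set up redirects in their CMS or web server settings, but the underlying idea is simple enough to sketch in Python. The example below keeps a small map of deleted URLs and answers requests for them with a 301 status and a Location header. The paths and names are hypothetical, and the handler is a bare-bones stand-in for a real website.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical map: deleted duplicate path -> canonical path.
REDIRECTS = {
    "/old-duplicate-post": "/original-post",
    "/original-post-copy": "/original-post",
}


def resolve(path: str):
    """Return (status, location): 301 for a deleted duplicate, 200 otherwise."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, path


class RedirectHandler(BaseHTTPRequestHandler):
    """Minimal handler that sends visitors and crawlers to the canonical page."""

    def do_GET(self):
        status, location = resolve(self.path)
        if status == 301:
            # 301 means "moved permanently"; browsers and crawlers follow Location.
            self.send_response(301)
            self.send_header("Location", location)
            self.end_headers()
        else:
            # Stand-in response for every path that is not redirected.
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(b"This is the canonical page.\n")
```

To try it locally, you could serve the handler with Python’s built-in HTTPServer, for example `HTTPServer(("", 8000), RedirectHandler).serve_forever()`, and request one of the old paths.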
Be part of the solution by publishing original, authoritative content
Focus on creating original articles that are authoritative and interesting, and that match the user’s search intent. It doesn’t matter if you’re writing the articles or working with freelancers; focus on wooing your visitors with informative articles.
By romancing your visitors, you build trust while reducing friction in the sales process. Your content can act as a big magnet that pulls your target clients into your orbit while convincing them you have the solution to their problem.
Another way to boost your content efforts is to watch for thin content. Typically, these are pages with under 100 words. If you can, keep building these out so they become essential resources.
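However, if these pages shouldn’t be indexed by Google (such as cart or contact pages), either block them from being crawled using robots.txt or add the noindex robots meta tag. Pick one approach per URL: a noindex tag only works if crawlers are allowed to fetch the page and read it.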
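Before relying on a robots.txt rule, it’s worth checking which URLs it actually blocks. Here is a small Python sketch using the standard library’s robots.txt parser. The rules and URLs are made-up examples, and the helper name is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of the cart pages only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/cart/checkout",
        "https://example.com/blog/duplicate-content",
    ):
        allowed = is_crawlable(ROBOTS_TXT, "Googlebot", url)
        print(url, "->", "crawlable" if allowed else "blocked")
```

A check like this makes it easier to spot a rule that accidentally blocks pages you want Google to crawl, such as duplicates that carry a canonical tag.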