Canonical Problems

Matt Cutts
Google Engineer
Thursday, Feb. 12, 2009

These URLs are all different:

  • www.essentialmarketer.com
  • essentialmarketer.com
  • www.essentialmarketer.com/
  • essentialmarketer.com/
  • www.essentialmarketer.com/index.html
  • essentialmarketer.com/index.html
  • www.essentialmarketer.com/Home.aspx
  • essentialmarketer.com/Home.aspx

How to fix duplicate content issues?

  • Change your Content Management System (CMS) to generate only the urls you want ("normalize" your urls)
  • Pick one "canonical" url and ensure you link consistently within your site
  • Make all the non-canonical urls do a permanent (301) HTTP redirect to the canonical/preferred url
  • Google's Webmaster Tools: specify www vs. non-www
  • Break ties in Google by submitting your preferred url in a sitemaps file
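
As a rough illustration of the first three bullets, the sketch below maps each homepage variant from the earlier list to one preferred url and reports when a permanent (301) redirect would be needed. The choice of the www host as canonical, the set of default-document names, and the function names are all assumptions made for this example, not something the slides prescribe.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: the preferred host and the list of
# default documents are hypothetical choices, not taken from the slides.
BARE_HOST = "essentialmarketer.com"
CANONICAL_HOST = "www.essentialmarketer.com"
DEFAULT_DOCUMENTS = {"/index.html", "/home.aspx"}


def with_scheme(url: str) -> str:
    """The slide lists urls without a scheme; assume http for those."""
    return url if "://" in url else "http://" + url


def canonical_url(url: str) -> str:
    """Map any homepage variant to the single preferred url."""
    parts = urlsplit(with_scheme(url))
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = CANONICAL_HOST
    path = parts.path or "/"
    if path.lower() in DEFAULT_DOCUMENTS:
        path = "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_for(url: str):
    """Return (301, target) for a non-canonical url, or None if no redirect is needed."""
    requested = with_scheme(url)
    target = canonical_url(requested)
    return None if requested == target else (301, target)
```

In a real deployment this logic would normally live in the web server or CMS configuration; the Python form is only meant to make the mapping explicit.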

Tough Duplicate Content Issues

  • Sometimes can't generate permanent/301 redirects
  • Can't help how people link to you
  • Uppercase/lowercase paths
  • Session IDs
  • Tracking codes, analytics, and landing pages
  • Sorting by ascending vs. descending
  • Breadcrumbs (the user's previous web page)
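
Several of these cases (session IDs, tracking codes, parameter ordering) come down to query-string noise. A minimal sketch of stripping that noise is shown below; the parameter names in IGNORED_PARAMS and the function name strip_noise are hypothetical, and a real site would substitute its own list. Paths are deliberately left untouched, since lowercasing them is only safe when the server treats paths case-insensitively.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; each site would list its own.
IGNORED_PARAMS = {"sid", "sessionid", "utm_source", "utm_medium", "utm_campaign", "ref"}


def strip_noise(url: str) -> str:
    """Drop session/tracking parameters and put the rest in a stable order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # the same parameters in a different order give the same url
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )
```

The output of such a function is a reasonable candidate for the url a page declares as canonical, or for the target of a redirect where one is possible.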

New Option for Duplicate Content

Canonical Link Element at page level

On http://www.example.com/page.html?sid=asdf314159265

<head>
...
<link rel="canonical" href="http://example.com/page.html"/>
...
</head>

(Don't forget the final / at the end of the link tag.)
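
To check that a page actually declares the canonical url you intended, a small script can read the element back out of the markup. The sketch below uses Python's standard html.parser module; the class and function names are made up for this example, and it says nothing about how any search engine parses the element.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) also arrive here by default.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonical = attributes["href"]


def find_canonical(html: str):
    """Return the declared canonical url, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```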

Questions and Answers

Q: Does this work across domains?
A: No, only on the same domain

Q: Does this work across subdomains/hosts?
A: Yes. So zeta.zappos.com could suggest www.zappos.com as a canonical url
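
The two answers above can be turned into a quick sanity check on a canonical target. This is a deliberately naive sketch: it treats the last two labels of the hostname as "the domain", which is an assumption that fails for suffixes such as .co.uk, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Naive 'domain' of a url: the last two labels of its hostname.

    Assumption: wrong for multi-label suffixes such as example.co.uk.
    """
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def same_site(page_url: str, canonical_url: str) -> bool:
    """True when both urls are on the same domain, possibly on different hosts."""
    return site_of(page_url) == site_of(canonical_url)
```

With this check, zeta.zappos.com pointing at www.zappos.com passes, while example.com pointing at a different domain does not.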

Q: Can I use this to suggest http://example.com be the canonical url instead of https://example.com?
A: Yes, absolutely

Q: What's the difference between this and a 301/perm redirect?
A: They are very similar, but sometimes you don't have the easy ability to generate 301/permanent HTTP redirects

Q: Do the pages have to be bit-for-bit identical?
A: No, but they should be similar. Slight differences are okay

Q: Can I use relative or absolute urls?
A: Either works, but we highly suggest that you use absolute urls. This is a powerful tool, and absolute urls leave less room for error
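
One way to see where the room for error comes from is to resolve a few href values against the page's own url, as standard relative-url resolution does. The sketch below uses urljoin from Python's standard library; the page url with a /shop/ directory and the function name are assumptions made for the example.

```python
from urllib.parse import urljoin

# Hypothetical page url, used only to illustrate relative resolution.
PAGE = "http://www.example.com/shop/page.html?sid=asdf314159265"


def resolve_canonical(page_url: str, href: str) -> str:
    """Resolve a canonical href the way a relative link would be resolved."""
    return urljoin(page_url, href)


# An absolute url is taken as written.
absolute = resolve_canonical(PAGE, "http://example.com/page.html")

# A relative url inherits the host and directory of the current page.
relative = resolve_canonical(PAGE, "page.html")

# A forgotten scheme turns the hostname into a path segment.
mistake = resolve_canonical(PAGE, "example.com/page.html")
```

Here relative resolves to http://www.example.com/shop/page.html, and mistake resolves to http://www.example.com/shop/example.com/page.html, which is almost certainly not what was intended.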

Q: Can you follow a chain of canonicals?
A: We may, but don't count on it. Point directly to the final url

Q: What if I point to a 404? Or have an infinite loop? Or I point to an uncrawled url? Or www/non-www conflict?
A: Search engines will handle it as best they can. Don't cross the streams!
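
Site owners can catch chains and loops themselves before a crawler has to guess. The sketch below walks a mapping from each page to its declared canonical url and flags loops or overly long chains; the mapping, the hop limit, and the function name are assumptions for this example and do not describe how any search engine resolves canonicals.

```python
def final_canonical(url, canonical_of, max_hops=5):
    """Follow declared canonicals from url to the end of the chain.

    canonical_of maps a page url to the canonical url it declares.
    Returns the final url, or None for a loop or a chain longer than max_hops.
    """
    seen = {url}
    for _ in range(max_hops):
        target = canonical_of.get(url)
        if target is None or target == url:
            return url  # no declaration, or the page points at itself
        if target in seen:
            return None  # infinite loop
        seen.add(target)
        url = target
    return None  # chain too long to trust
```

Any page for which the result differs from its directly declared canonical is part of a chain and is better pointed straight at the final url.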

Thanks

  • Joachim Kupke: Google engineer who did heavy lifting
  • Yahoo and Microsoft: for support of this link element too
  • Wikia: for trying this out on their wiki pages
  • Lots of webmasters for giving us feedback on this

Resources

Blog post on Google webmaster blog:

Yahoo blog post: http://ysearchblog.com/2009/02/12/fighting-duplication-adding-more-arrows-to-your-quiver/

Microsoft: http://blogs.msdn.com/webmaster/archive/2009/02/12/partnering-to-help-solve-duplicate-content-issues.aspx

Ask: http://blog.ask.com/2009/02/ask-is-going-canonical.html

Google Help Center documentation:
http://google.com/support/webmasters/bin/answer.py?answer=139394

Joost de Valk: canonical link plugins for WordPress, Magento, and Drupal