Canonical Problems
Matt Cutts
Google Engineer
Thursday, Feb. 12, 2009
These URLs are all different to a search engine, even when they serve the same page:
- www.essentialmarketer.com
- essentialmarketer.com
- www.essentialmarketer.com/
- essentialmarketer.com/
- www.essentialmarketer.com/index.html
- essentialmarketer.com/index.html
- www.essentialmarketer.com/Home.aspx
- essentialmarketer.com/Home.aspx
How to fix duplicate content issues?
- Change your Content Management System (CMS) to generate only the URLs you want ("normalize" your URLs)
- Pick one "canonical" URL and link to it consistently within your site
- Make all the non-canonical URLs do a permanent (301) HTTP redirect to the canonical/preferred URL
- Google's Webmaster Tools: specify www vs. non-www
- Break ties in Google by submitting your preferred URL in a Sitemaps file
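The normalize-and-redirect steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the choice of the www host as the preferred one, the list of default document names, and the helper names `canonical_url` and `redirect_for` are hypothetical and not taken from the slides.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: which host is preferred and which
# paths count as "the home page" are site-specific decisions.
PREFERRED_HOST = "www.essentialmarketer.com"
HOST_ALIASES = {"essentialmarketer.com", "www.essentialmarketer.com"}
DEFAULT_DOCUMENTS = {"/index.html", "/home.aspx"}


def _with_scheme(url: str) -> str:
    """Add a scheme to bare forms such as 'essentialmarketer.com/'."""
    return url if "://" in url else "http://" + url


def canonical_url(url: str) -> str:
    """Map any known variant of a URL to the single preferred form."""
    parts = urlsplit(_with_scheme(url))
    host = (parts.hostname or "").lower()  # note: drops any explicit port
    if host in HOST_ALIASES:
        host = PREFERRED_HOST
    path = parts.path or "/"
    if path.lower() in DEFAULT_DOCUMENTS:
        path = "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_for(url: str):
    """Return (301, target) for a non-canonical URL, or None if no redirect is needed."""
    parts = urlsplit(_with_scheme(url))
    # An empty path and "/" are the same request, so treat them alike.
    requested = urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query, "")
    )
    target = canonical_url(url)
    return None if requested == target else (301, target)
```

With these assumptions, all eight variants listed earlier map to `http://www.essentialmarketer.com/`, and only requests that differ from that form receive a 301.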
Tough Duplicate Content Issues
- Sometimes can't generate permanent/301 redirects
- Can't help how people link to you
- Uppercase/lowercase paths
- Session IDs
- Tracking codes, analytics, and landing pages
- Sorting by ascending vs. descending
- Breadcrumbs (the user's previous web page)
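Several of these cases (session IDs, tracking codes) come down to query parameters that do not change the content. A minimal sketch of computing the preferred URL by dropping such parameters follows. The parameter names in `IGNORED_PARAMS` and the function name `strip_noise` are assumptions chosen for illustration; a real site would supply its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not affect page content.
IGNORED_PARAMS = {"sid", "sessionid", "utm_source", "utm_medium", "utm_campaign"}


def strip_noise(url: str) -> str:
    """Drop session/tracking parameters and put the rest in a stable order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # the same parameters in a different order yield one URL
    # Only the host is lowercased; paths can be case-sensitive on some servers.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )
```

Sorting parameters such as ascending vs. descending order are deliberately left in place here, since whether those pages count as duplicates is a judgment the site owner has to make.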
New Option for Duplicate Content
Canonical Link Element at page level
On http://www.example.com/page.html?sid=asdf314159265
...
...
(Don't forget the final / at the end of the link tag.)
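As a sketch of how a page template might emit the element, the helper below builds a `link` tag with `rel="canonical"` and a self-closing `/`. It assumes the preferred URL for the example page is the same address without its `sid` parameter; the function name `canonical_link_tag` is hypothetical.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build a canonical link element for the <head> of a page."""
    # escape() guards against quotes or ampersands in the URL breaking the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(preferred_url, quote=True))


# Assumed preferred URL for http://www.example.com/page.html?sid=asdf314159265
print(canonical_link_tag("http://www.example.com/page.html"))
```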
Questions and Answers
Q: Does this work across domains?
A: No, only on the same domain
Q: Does this work across subdomains/hosts?
A: Yes. So zeta.zappos.com could suggest www.zappos.com as a canonical url
Q: Can I use this to suggest http://example.com be the canonical url instead of https://example.com?
A: Yes, absolutely
Q: What's the difference between this and a 301/permanent redirect?
A: They are very similar, but sometimes you can't easily generate 301/permanent HTTP redirects
Q: Do the pages have to be bit-for-bit identical?
A: No, but they should be similar. Slight differences are okay
Q: Can I use relative or absolute URLs?
A: Either works, but we strongly suggest absolute URLs. This is a powerful tool, and absolute URLs leave less room for error
Q: Can you follow a chain of canonicals?
A: We may, but don't count on it. Point directly to the final url
Q: What if I point to a 404? Or have an infinite loop? Or I point to an uncrawled url? Or www/non-www conflict?
A: Search engines will handle it as best they can. Don't cross the streams!
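A site owner can check their own hints for the chain and loop problems raised above before a search engine ever sees them. The sketch below is one way to do that, not a description of how any search engine resolves canonicals. It assumes a hypothetical mapping `declared` from each page URL to the `href` of its canonical link element, and the name `resolve_canonical` and the `max_hops` limit are illustrative.

```python
from urllib.parse import urljoin


def resolve_canonical(page_url, declared, max_hops=5):
    """Follow declared canonical hints from page_url to a final URL.

    declared maps a page URL to the href of its canonical link element
    (absolute or relative). Returns the final URL, or None when the hints
    loop or the chain is longer than max_hops.
    """
    seen = {page_url}
    current = page_url
    hops = 0
    while True:
        href = declared.get(current)
        if href is None:
            return current  # no hint on this page: it is the end of the chain
        target = urljoin(current, href)  # resolves relative hrefs against the page
        if target == current:
            return current  # page names itself as canonical
        if target in seen or hops >= max_hops:
            return None  # loop, or a chain too long to rely on
        seen.add(target)
        current = target
        hops += 1
```

Any page whose result differs from the URL its own hint names is pointing into a chain rather than directly at the final URL, and a `None` result flags a loop worth fixing.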
Thanks
- Joachim Kupke: Google engineer who did heavy lifting
- Yahoo and Microsoft: for support of this link element too
- Wikia: for trying this out on their wiki pages
- Lots of webmasters for giving us feedback on this
Resources
Blog post on Google webmaster blog:
Yahoo blog post: http://ysearchblog.com/2009/02/12/fighting-duplication-adding-more-arrows-to-your-quiver/
Microsoft: http://blogs.msdn.com/webmaster/archive/2009/02/12/partnering-to-help-solve-duplicate-content-issues.aspx
Ask: http://blog.ask.com/2009/02/ask-is-going-canonical.html
Google Help Center documentation:
http://google.com/support/webmasters/bin/answer.py?answer=139394
Joost de Valk: plugins for WordPress, Magento, and Drupal