The Rel Canonical Tag is Not The Solution to Every Known URL Problem

When Google, Yahoo! and Microsoft joined forces and endorsed the rel=canonical tag back in 2009, SEOs across the land jumped for joy. Rel canonical does fulfill a very useful purpose: it buys a site owner time to correct content duplication issues. Unfortunately, many site owners and developers are beginning to use the rel canonical tag as a band-aid for just about every URL problem. This is a dangerous practice that could produce a devastating result.

First, let’s discuss what the rel canonical tag is and why it is needed. Search engines want a unique piece of content to reside on one URL on your website. That means the search engines do not want to find your wildly popular article about Kony2012 republished on your site at multiple URLs:

www.bestsocialmediaever.com/kony2012/
www.bestsocialmediaever.com/kony2012-article/
www.bestsocialmediaever.com/kony2012-social-media-impact/
www.bestsocialmediaever.com/kony-2012-video-project/

Having the exact or very similar content on multiple URLs will trigger a duplicate content penalty. The URLs above illustrate the SPAMMY way of duplicating content. Often, though, duplicate content issues are caused by a CMS that is running amok. Below is an example of a CMS creating a new version of your URL each day by appending a parameter:

www.bestsocialmediaever.com/kony2012/
www.bestsocialmediaever.com/kony2012?=wednes2/
www.bestsocialmediaever.com/kony2012?=thurs3/
www.bestsocialmediaever.com/kony2012?=fri4/
www.bestsocialmediaever.com/kony2012?=sat5/
www.bestsocialmediaever.com/kony2012?=sun6/

Notice the nasty parameter code being added every day? This is not an intentional act of the site owner trying to create multiple URLs for the Kony2012 article. Nevertheless, a duplicate content penalty will soon loom over this site. The best course of action is to slap a rel canonical tag in the <head> section of this page, essentially informing the engines, “Yes, we do have duplicate content for this page. However, please treat the specified URL in our rel canonical tag as the sole location for this content.”

For our example the rel canonical tag would appear like this in the <head> section of the source code:

<link rel="canonical" href="http://www.bestsocialmediaever.com/kony2012/"/>

Every duplicate page would have this rel canonical tag in the source code so the engines would know the intent of the site owner is to assign that content to one URL. After you have added that tag your job is not complete. Now you must FIX YOUR CMS URL WRITING PROBLEM. Think of rel canonical as a real band-aid.

You cut your arm and blood comes pouring out. You clean the wound and slap a band-aid on the gash. In a few days your body repairs the cut and you no longer need the band-aid. That is rel canonical. Site owner discovers CMS URL writing problem and uses the rel canonical tag. Then the site owner resolves the URL writing problem, gets rid of all the dupe URLs so only the original URL is present and then the site owner can remove the rel canonical tag.
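
As a minimal sketch of the band-aid step, here is how a page template might compute the tag so that every parameterized variant points at the same URL. It assumes the query string carries no meaning for the content (as with the daily parameter in the example above) and that the preferred URL ends with a trailing slash; the function name canonical_link_tag is made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url: str) -> str:
    """Build a rel=canonical <link> tag for the parameter-free form of a URL.

    Assumption: the query string and fragment never change the content.
    """
    parts = urlsplit(requested_url)
    # Normalize the path so it always ends with a trailing slash.
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    # Rebuild the URL with the query string and fragment dropped.
    canonical = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}"/>'


print(canonical_link_tag("http://www.bestsocialmediaever.com/kony2012?=thurs3/"))
```

Every variant of the article URL yields the same tag, which is the whole point: the engines are told about one location no matter which duplicate they crawl.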

Unfortunately, I am seeing the rel canonical tag “resolving” many more issues on a permanent basis. The most common case is the problem above, where the site owner adds the tag but never fixes the CMS parameter URL issue.

A deceptive and diabolical use of the tag can be found with sites that use content provider partners. Site A publishes celebrity gossip and becomes insanely popular. Site A cannot keep up with the demand for creating content. They reach out to Site B and strike a content syndication deal where Site B will allow Site A to reuse content from Site B. The same articles on Site B will also reside on Site A at the same time. I have seen contracts where Site B requires Site A add a rel canonical tag on the Site A content pointing to the original URL on Site B. This ruins any and all search value of having Site B’s content on Site A. The content on Site A will not surface on the search engine results pages. Ensure your content syndication agreements do not require the use of a rel canonical tag.

That last example is a sneaky way to manipulate the rel canonical, but the worst use is substituting rel canonical for 301 redirects. When you change your URL structure or move your entire site to a new domain you will ALWAYS NEED TO UTILIZE 301 REDIRECTS.

I recently saw a high revenue website change their URL structure and then use rel canonical to point to the new URLs. So the legacy URLs are still live and the new URLs are live. The site owner did not use 301 redirects, but slapped a rel canonical on the legacy URLs pointing to the content on the new URLs. Do not do this. Rel canonical was never intended to supplant the redirect process. What would happen if the rel canonical tags were put on the new site pointing back to the legacy site? The new site would simply not exist in the eyes of the engines.
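
For contrast with the tag, here is a minimal sketch of what a 301 does at the server level: the legacy URL stops serving content and answers with the new location instead. The paths and the LEGACY_TO_NEW table are hypothetical, invented for illustration, and a real site would usually configure this in the web server rather than in application code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping of legacy paths to their replacements.
LEGACY_TO_NEW: Dict[str, str] = {
    "/articles/kony2012.php": "/kony2012/",
    "/articles/about.php": "/about-us/",
}


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header value) for a requested path."""
    target = LEGACY_TO_NEW.get(path)
    if target is not None:
        # Permanent redirect: the legacy URL no longer serves a page of its own.
        return 301, target
    # Not a legacy URL: the page is served normally.
    return 200, None


print(respond("/articles/kony2012.php"))
print(respond("/kony2012/"))
```

With rel canonical both URLs keep returning a live page; with the redirect only the new URL does.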

Do not mess around with rel canonical, because if used improperly it can become a shriveled-up, foul-smelling band-aid. You do not let the band-aids on your cuts get that way, so do not let it happen to your website.

35 thoughts on “The Rel Canonical Tag is Not The Solution to Every Known URL Problem”

  1. “but the site owner never fixes the CMS parameter URL issue” is rightly said, and it raises a question: does fixing this really mean changing the CMS, or just making some modifications to it?

    Garth O’Brien, please tell us how we can handle a cross-domain rel=canonical tag when we don’t have FTP access to the other site, as in the example you showed above.

    Cross domain rel=canonical tag really works in SEO.

    Regards
    Sourabh Rana
    @kmadhav

  2. If you do not own the other site, then you are not going to be able to add a rel=canonical tag to that site. The rel=canonical tag is located in the <head> section of a page, which requires that you have direct access to the site’s source code. No access, no chance of adding a rel=canonical.

  3. Question, I have a business that’s changing their name. I got the new site up http://www.westhartfordmassagespot.com with all duplicate content from the original site http://www.flowofenergy.com.

    I want to do a cross-domain rel=”canonical” from the original site to the new one. What is the best way to accomplish this? Both are WordPress sites. I am currently running All in One SEO and will probably be installing the Yoast WordPress SEO plugin soon, but still, I don’t know where to start.

    Will I be placing canonical code on each page or is there some code I can place in the header which will take care of all duplicate pages site wide? Thx for any suggestions.

  4. Hi Garth!

    I have an ecommerce site on asp.net – I have asked my designer to put in site-wide 301s from non-www to www versions of all pages and from http://www.domain.com.au/default.aspx to http://www.domain.com.au, and 301s were done from all pages on the previous php version (same domain… http://www.domain.com.au/shop). We don’t appear to have parameter issues, and no duplicate content (once non-www and www is sorted out).

    I acquired another showcase website (no cart) when I purchased another bricks and mortar store… it currently shows off our products with links back to our product pages to buy. I don’t want google to think I am sneakily trying to get two bites of the serp cherry, so I was going to 301 all pages on the acquired site to appropriate pages on my site… then tell google via webmaster tools about domain name change.

    Does that sound right? I should have no need for rel canonical, should I?

    Also, since my new asp.net site went live, I have noticed I am accumulating soft 404 errors in webmaster tools relating to parameter issues from old php site…. does this mean the http://www.domain.com.au/shop site has not been deleted yet as I requested? Is the subdomain still live somewhere to be creating these errors?

    Sorry for the 20 questions! If I don’t tell my designer PRECISELY what I want done, it gets half done or not done at all!

    With thanks, Cath

    • Eeeek – sorry! I used URL domain as an example, but forgot it is a real estate search site here in aus… Please ignore/delete link….too early here on a Sunday morning, and I haven’t had my coffee!

    • Cath,

      This is a difficult one to answer without reviewing your actual websites. However, I can quickly state you are correct that you would not use rel=canonical for this task. If your original site and the one you acquired are virtual copies then I would recommend using 301 Redirects from the acquired site to your primary site. If they are selling different products then my answer might be different.

      Check and see if any of the old .php pages are indexed. In the Google search box enter site:www.domain.com and then try site:domain.com. That will list every page indexed by Google. Or you should be able to download a file within Google Webmaster Tools for that same list. If you see some .php pages in that list, try to visit one of those URLs and see what happens. Does it redirect? If not, then the 301s were either not implemented for the whole site or not implemented correctly.

      If, after doing some of that legwork, you have not resolved the issue, then you need someone to look at the site(s). Hope that helps!

  5. Yes! And I will also add that the canonical tag is NOT used to establish you as the main author of a particular page across all other URLs/domains.

    You would NOT believe how many at my previous agency (a large agency that does work for Fortune 500 companies) actually THOUGHT that the canonical tag established authorship.

    When I broke it to them that the canonical tag only helped to establish ONE authoritative version of the content on one distinct URL within the same domain, my fellow agency mates were flabbergasted. And I was exasperated from this fundamental “unknowledge”

  6. Hi Garth. Can I ask you something about the rel=canonical? Do you think it is a good practice to combine it with the robots noindex,follow tag too on these duplicated URLS?

    Thanks for your opinion.

    • Yes. That said, it is merely a band-aid and you should use those tactics to ensure the proper page gains authority when the duplication problem is unintended. If you are talking about using this combination for ad campaigns utilizing tracking parameters, then certainly.

      Blocking the parameters with the Robots.txt and Meta Robots directive will prevent the engines from indexing those URLs. However, bloggers and other site owners might backlink to your site using the parameter URL. That is why you would want to use Rel=Canonical.

  7. Hi, I have started new blogs with custom domains http://www.cinerak.co.in and http://www.cinerak.in. Webmaster Tools is not indexing my sitemaps, and Google only indexes links after 2 or 3 days (whenever I check site:www.cinerak.co.in). Why is this happening? I have seen other custom-domain sites get indexed, so why not mine? Is there any other option for submitting sitemaps, like atom.xml and rss.xml? I think we can use sitemap.xml in Webmaster Tools for Blogger. Please reply.

    • Shruthi,

      There are quite a few issues with your sites, but I will limit this response to your indexation issues.

      1. http://www.cinerak.in/
      In your robots.txt you list your sitemap.xml at: Sitemap: http://www.cinerak.in/feeds/posts/default?orderby=UPDATED. When I visit that URL I am prompted with a pop-up that wants me to select a reader. Your sitemap.xml must adhere to the sitemap.xml requirements at http://www.sitemaps.org/protocol.html.

      Your sitemap.xml should render like mine: http://www.garthobrien.com/sitemap.xml

      2. http://cinerak.co.in/
      You have the same issue with this site as well.

      3. Sitemap.xml files are a wishlist of URLs you want indexed. It does not mean those URLs will be indexed. That said, Google has indexed about 26 pages on one site and 36 on the other. To improve indexation I would implement proper sitemap.xml and I would follow best SEO practices. Like I said earlier, there are many SEO related issues with your site. Enough that you will not be able to compete in the “movie” niche with the issues you need to resolve.
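
      For reference, a minimal sitemap.xml in the sitemaps.org format can be generated with a few lines of Python. This is only a sketch: the build_sitemap name and the example URLs are placeholders, and the optional fields of the protocol (lastmod, changefreq, priority) are left out.

```python
import xml.etree.ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # <loc> is the only required child of each <url> entry.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs for illustration only.
print(build_sitemap(["http://www.example.com/", "http://www.example.com/about/"]))
```

      A feed URL that opens a reader prompt is not in this format, which is why it does not serve as a sitemap.xml here.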

      I would pick up this book and give it a good read < http://amzn.to/NO6uJy >.

  8. Okay, I really messed this up but need help. I wanted to start a site so I named it domain.info. Thought I would change it to a .com, but that name was taken, so I changed the name altogether to domain.com.
    I added quite a few pages and better information to the site and really it was like going to balls.com and also seeing cars, engines, tires, etc. if you get my drift. All relevant, but it covered more than just the name, so to speak. The problem is this:
    I use Godaddy and therefore cannot do a 301 redirect. They have a domain forward that I used. I used the domain forward from the .info site and forwarded it to the 2nd .com- Now, since I changed the name again (which I am keeping!) as it makes more sense, I was only able to do a domain forward with this as well. I did do a change of address with GWT from the .info to the 2nd .com and just changed it to the NEW .com and the 2nd .com also did a change of address to the NEW .com

    I realize I made a HUGE mistake, but this was my first website and I wanted to make it better. I’m proud of myself for what I figured out how to do, but disgusted at the fact of all the problems I know this is going to cause.

    Questions are:

    (1)HELP! what do I do?

    (2)Godaddy swears up and down that their domain forwarding is a 301 as they have a choice of 301 or 302.
    (3)The 2nd .com is exact duplicate to the NEW.com ( I just changed the name)
    (4) Do I need to add a rel=canonical
    (5) what to I add and where?

    All three domains are being redirected (forwarded) to the NEW domain, but I don’t want to get in trouble, I’m just an idiot who didn’t realize the consequences I am finding out I am facing.

    Thank you!

    • Lisa,

      Quite a bit to chew on with that comment. :)

      Here is the quick and dirty answer. Have one website and do not create multiple websites that have the same content. It seems that you create and improve upon a past website with a new version.

      Here are your steps:

      1. Lock down your domain name
      2. Use 301 redirects on all the former domains and point them to your final domain choice
      3. Remove the older sites

      Rel Canonical should only be used for pages on the same domain. Using that tag on Site A pointing to Site B is not using the tag for its intended purpose.

      This is the proper way to use 301 Redirects. Site A is live. Site B has a new domain name and will replace Site A. Site B goes live. 301 Redirects are uploaded to Site A pointing to the new URLs at Site B. All the web pages on Site A are deleted.

      Now if you add in a third Site C there is a new layer of “complication.” Site C goes live and replaces Site B. Update the redirects on Site A to now point to Site C. Create redirects for Site B pointing to Site C. Delete the pages of Site B.

      Avoid chain redirects and update the redirects on Site A.
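
      The Site A, Site B, Site C case can be sketched as a small routine that rewrites a redirect table so every old domain points straight at the final one. The function name and the example hostnames are hypothetical; this only illustrates the bookkeeping, not how any particular host stores its redirects.

```python
from typing import Dict


def flatten_redirects(redirects: Dict[str, str]) -> Dict[str, str]:
    """Point every source directly at its final destination (no chains)."""
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until we reach a URL that is not itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat


# Site A used to redirect to Site B, which now redirects to Site C.
chained = {"site-a.example": "site-b.example", "site-b.example": "site-c.example"}
print(flatten_redirects(chained))
```

      After flattening, Site A and Site B each redirect to Site C in a single hop.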

      Redirects can be implemented in a variety of ways which is usually determined by the backend way your site is hosted. Most of my sites leverage the .htaccess file for 301s. If you are unfamiliar with anything I just wrote then you really should tap a friend that knows what they are doing or hire someone to help you.

      To see if a domain is redirecting as a 301 or 302, visit this link: http://www.rexswain.com/httpview.html. Enter your domain in the text box and click submit. Under the “Receiving Header” section you will see either HTTP/1.1·301·Moved·Permanently(CR)(LF) or 302. If it says 200 then it is not being redirected and that page is live.
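
      If you would rather check from a script than from that web page, a rough equivalent using only the Python standard library might look like this. The function names are made up for this sketch and the domain in the usage comment is a placeholder; http.client does not follow redirects on its own, so the status it reports is the one the server actually sent.

```python
import http.client
from urllib.parse import urlsplit


def describe_status(status: int) -> str:
    """Translate the status codes discussed above into plain language."""
    if status == 301:
        return "301: permanent redirect"
    if status == 302:
        return "302: temporary redirect"
    if status == 200:
        return "200: live page, not redirected"
    return f"{status}: something else"


def fetch_status(url: str):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Usage (needs network access; the domain is a placeholder):
#   status, location = fetch_status("http://old-domain.example/")
#   print(describe_status(status), location)
```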

      Hope that helps. If not please keep asking questions and I will answer as best I can.

      • It is the same website though I think I am a little confused. Everything is still the same, I just deleted some pages from the very first domain, then changed it to a better one, then a better name. (all same content, except added and took away a pages as It progressed)

        GoDaddy only has a domain forwarding which they said is the same as a 301
        If I delete any pages, it will actually delete the current pages because it’s always been the same site, just a change of domain name.

          • New domain = new website. :) If the HTTP Viewer says they are 301 redirecting and the other domains are pointing to the right final domain then you are good to go.

  9. Okay, almost done…(I appreciate all this!) I will tell others about you here)

    Do I need to delete any pages then? Just wanted to make sure because if I do, then it will delete the sites pages..

    I did a change of address with GWT for domain 1, and 2 to change to domain 3. Added updated sitemap for domain 3 and deleted sitemaps 1 and 2 (I hope I did that right)

    Is there anything else I need to do? This is my first site and I have a lot to learn!
    Thank you again for all your help!

    • Great stuff, but I’m still lost on what I need to do. On my site I publish monthly maintenance bulletins (110 to date). On the html version the pictures and charts can be zoomed for better viewing, but I also link a pdf version that can be downloaded and printed.
      I’ve been noticing that google only indexes the pdf versions.
      One example is a bulletin that the html page was visited (per google analytics) 1396 times last month. My web stats say the pdf version was downloaded 7367 times.

      I’m told I need to use the rel=”canonical”, but don’t know where or how to use it. First I thought it went in the link to the pdf, but I take it I need it on the html page? And I need to put the address of that page with the statement, not just rel=”canonical”?

      The bulletins are in this section of the site.
      http://widman.biz/boletines/boletines.html

      • Interesting question. I personally would keep both the HTML URL and the PDF version crawlable for the engines. Since you have more PDF downloads than page views, I would ensure all your PDF documents have links to your site. I would link to your homepage and of course to the HTML URL version of the PDF.

        The rel=canonical tag works on webpages. I do not believe it would work on a PDF because it is a media asset and not a webpage. The tag is placed in the source code of your duplicate pages which cannot be done with a PDF.

        Once the canonical tag is on the duplicate versions of URL A, then the engines credit URL A from the authority gained by the dupe pages at URL A1, URL A2 and URL A3. The PDF is not a “webpage” in the classic sense. Hope that answered your query.

  10. Garth – So, can you use rel canonical when you are moving your site content to a new domain, in my case because of a Penguin penalty? Would you rel canonical each page, such as http://www.mycurrrentsite.com/about-us/ to http://www.mynewsite.com/about-us/, until you have moved all the pages over? Then you shut down the http://www.mycurrentsite.com and move forward only with http://www.mynewsite.com. If you pursued this approach, would you get slapped with a Panda penalty?

    Thanks,
    David

    • David,

      No. The rel canonical tag works only within a single domain. It tells the engines that Page A has dupe URLs at A1, A2 and A3. The same content on those pages is on the “parent” or original page A. Engines, please pass all authority accrued by the dupe pages, A1, A2 and A3, to page A only.

      If you are moving to another domain, then you must use 301 redirects. I have not done this for a site impacted by Penguin. My assumption is the new site will suffer because the external link profile will remain the same, just for a different domain.

      To bounce back from Penguin you are going to need to do the following:

      1. Stop trying to game linking (if you are/were)
      2. Clean up the “bad” links (Both engines have a disavow tool)
      2a. Did you get a bad links message from Google?
      2b. Be careful using the disavow link tool
      3. Promote your content through Social Media and engage (the links will come)
      4. If you have access to some of your optimized anchor text links, change them to brand, image, URL or blank links

      If the site is beyond repair because you have been link building for a decade, then you may need to launch a new site and do not use the redirects. Start over from scratch.

  11. Great article Garth! However, I’ve seen many bloggers or website owners reuse contents or same articles from others websites or blogs and at the end of the article or content they put link to the original article and sometimes an author bio too! Now do you think this is enough to avoid duplicate content issue or the rel=canonical tag still has to be placed in cases like these?

    • Rel Canonical resolves duplicate content issues on the same domain. It should not impact or transfer authority from a URL on Site A to a URL on Site B.

      • Thanks for replying Garth! So, it means that when I’m reusing or republishing someone else’s content on my site, it’s necessary to use rel canonical to avoid duplicate content issue on my site, right?

  12. Hi Garth,
    I found that your article has very nice info about 301 redirection, and I read all the questions and answers above, but the concept is still not clear to me.
    My first question: if my website opens both with and without www, what should I do as a first priority?

    I am not an advanced web developer who can create a 301 redirection file, so please suggest rel canonical code or any other suggestion.

    • If I understand your question I recommend using www. Convention and mainstream media always say, “Go to our website at www. .com.”

      We have been trained that a website always starts with www. So go with that.

  13. Hello Garth,

    I use the WordPress CMS on my site http://www.spysafari.com/ . In the middle of the home page I use a feature content block (SPY SOFTWARE FEATURES). That feature content is also used on some inner pages (like cell phone spy, iphone spy, etc.), but not on the pages that are linked from the top menu bar.
    As you know, WordPress uses a common header on the home page as well as on all inner pages. So how can I use a canonical URL on some inner pages to point to the home page?

    Please Help me. ..

    Thanks in advance..

    • Simple answer: Rewrite the content on the internal pages and make it unique to the content you populate on the homepage. That is a much better approach.

      If you slap a Canonical tag on the internal pages pointing to the homepage you have really hurt those internal pages. And vice versa. Rewrite the content.

  14. Heya! I just wanted to ask if you ever have any issues with hackers?

    My last blog (wordpress) was hacked and I ended up losing many months of
    hard work due to no backup. Do you have any methods to stop hackers?
