WARNING: this is super geeky, but you do need to know it, so here goes. Canonicalization is about which version of a URL gets treated as the official one, and it exposes a flaw in the way modern search engines index websites.  If you learn to work with that flaw, your page rank and traffic for both your website and blog can soar; if you don't, your website can flounder. This first part explores the flaw and some of its impacts.  The second part will go into the gory detail of how the wrong canonicalization can literally kill a website and how to prevent that.

What is Canonicalization (C14N)?

So, beyond being a serious point scorer in Scrabble, what exactly is it and why should you care?  According to Wikipedia, canonicalization is the process of converting data that has more than one possible representation into a "standard" canonical representation.  A more concise description, and one that relates it to the web, is Matt Cutts's explanation: canonicalization is the process of picking the best URL when there are several choices, and it usually refers to home pages.
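If it helps to see "picking the best URL" spelled out, here is a minimal Python sketch. The function name canonicalize, the prefer_www switch, and the example.com addresses are all my own placeholders for illustration; this is not how any search engine does it internally, just the idea in miniature.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url, prefer_www=True):
    """Map the www and non-www forms of a URL to one preferred form.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Strip a leading "www." so both variants start from the same bare host.
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if prefer_www else bare
    # Treat an empty path and "/" as the same home page.
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, parts.fragment))


# Three spellings of the same (made-up) home page.
variants = [
    "http://example.com/",
    "http://www.example.com/",
    "http://Example.com",
]
canonical_forms = {canonicalize(u) for u in variants}
print(canonical_forms)
```

All three spellings collapse to a single address, which is the whole point: several choices go in, one "best" URL comes out.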

C14N is an issue – it is a source of confusion to search engines

Let’s jump right into an example.  Search engines read the following URLs as if they belong to totally different websites:

http://technorati.com/

http://www.technorati.com/

You see the exact same thing when you go to these different URLs, right?

 

Now run a “site:” query against each of those in Google.  Here are my results:

site:http://technorati.com/ returned 436,000 pages.

site:http://www.technorati.com/ returned 299,000 pages.

As you can see, Google was clearly confused by the small difference between those 2 URLs.  They should have returned the same number of results, but they didn’t.  Many of the pages in the two sets of results had the exact same content, yet Google saw them as two different pages on two different sites.  This shows that what we as users know to be one website, Google believes is two.

What is the impact of inconsistent canonicalization?

Because you have 2 different sets of indexed pages for the same site, some traffic goes to the www address while other traffic goes to the non-www address.  What this shows is that subdomains matter.  You need to keep your blog and website on a common subdomain so that all pages and traffic stay in one place where Google and Alexa can index and measure them.

Duplicate content filters and canonicalization

Think about it.  If Google thinks these results come from two different sites, how do you think the duplicate content filters will respond?  Exactly: when you splinter your subdomains and serve the same content on each of them (usually inadvertently; it just happens), it can trip Google’s duplicate content filters.  Penalized for your own content on your own site.  That stings, doesn’t it?

Non-uniform URLs caused by C14N mean you have two sites, with different traffic and page rank stats, fighting one another in the SERPs

I think no one will dispute that page rank and SERP position are related to inbound and outbound links, so what happens when you have a www domain and a non-www domain?  You will get a certain number of links to one and some links to the other, but it is still the same website.  Since the links are split unevenly, you get different page ranks for the same site.

Consider the cost: a user searches for “the next big thing” and Google’s index has your page listed in two different places, as the site: query demonstrated, so you are now essentially fighting yourself for a search result.  Wouldn’t it have been better if those pages were always in one index?

BTW: as nice as it may be, your web server is not terribly smart.  It reports hits to www and non-www as two separate domains, so now you have to weed through the logs to find your true hit count.
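Here is a rough Python sketch of that log weeding. The function name merge_hits, the host names, and the hit counts are all made up for illustration; your own log format will differ, so treat this as the idea rather than a ready-made tool. Note that it only folds www and non-www together. A blog on a different subdomain still counts separately.

```python
from collections import Counter


def merge_hits(hits_by_host):
    """Add up hit counts for the www and non-www forms of the same host.

    Hypothetical helper for illustration only.
    """
    merged = Counter()
    for host, hits in hits_by_host.items():
        host = host.lower()
        # Fold "www.example.com" into "example.com"; leave other subdomains alone.
        bare = host[4:] if host.startswith("www.") else host
        merged[bare] += hits
    return dict(merged)


# Made-up numbers, purely for illustration.
log_counts = {"www.example.com": 120, "example.com": 45, "blog.example.com": 30}
true_hits = merge_hits(log_counts)
print(true_hits)
```

With these invented numbers, the 120 and 45 hits add up to 165 for the one real site, while the blog subdomain stays on its own line.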

So search engines treat subdomains differently, and in a way that can cause uneven traffic and a lack of visitors to the primary site, AKA www.  A search engine sees your one site as two different ones. So what? They are both still you, right?  Hold up: it means you have 2 different page ranks and two different sets of traffic statistics.

OK, let’s stop there…  For all of you who actually made it down to this part, I am holding up my Secret Decoder Dork Ring and saying Wonder Geeks Activate!

In the next part, I’ll go through what all this really means, how it can kill your website if you have a blog on another subdomain, and how to fix it easily.

 

12 Comments on Canonicalization Part 1: what’s killing your website

OCT
16
2006

Mary holds up her decoder and says "Wonder Geeks Activate!"  as I stay tuned...  Luckily I have read Matt Cutts's explanation a few times so I am not totally lost here....

 

4:11pm • #1
Wow, Maureen.  I am so very impressed you clearly have a decoder ring of your own!  Thanks for actually reading that.  Aren't you glad I didn't use the spell checker in Word and make this whole post about cannibalism?
4:28pm • #2

Good explanation. I discovered this fact about Google when I added some of my sites in the Google SiteMap Beta. 

It's such a critical matter and such an easy fix. 

But, no one pays any attention to me.  I'm a real estate broker.  What do I know?

What breaks me up is when I see "web design companies" with this very problem. 

Good post.  I'd give it a five but the "stars" are gone.

Lenn

Lenn Harley
4:40pm • #3
Lenn ... I think you aren't signed in so that is why the numbers to rate the entry aren't there.   
5:01pm • #4
Ok, so you ladies are so damn smart I can't stand it. Mary, I'm not sure why but I understood everything you just said AND it makes sense. I do have a question though. I always advertise my site with the WWW but how can I control whether people are putting it in without the www?  Am I better off taking the www away?
6:57pm • #5
Excellent question.  My dear, dear Bryant.  You don't have to worry that pretty little head of yours about this problem.  Your site has perfect canonicalization right now.  You don't have a single page off on some random subdomain so it won't matter if a user puts in the WWW or not; they get directed to the right place.
7:12pm • #6
Oh .. I have a pain between my ears.. I'll have to reread this after I have coffee in the morning because it whizzed right over my head this evening!
8:02pm • #7

You amaze me Mary!

It took me 2 read throughs, but I finally got this old brain to understand what you were saying.

Am I right in thinking that you are referring mainly to one reason why a site may actually lose its correct ranking due to cannibalism? Bear with me, I'm old and I read your blogs to sharpen my brain. And before everyone jumps on me, I meant Canonicalization! Have to be careful what I say lately.

Seriously though, is that what you are saying?

8:04pm • #8
Karen, yes.  You can actually lose valuable ground on your website if you aren't careful to properly canonicalize your URLs.
8:39pm • #9

I had this problem early after I bought my first domain....

I decided to just do the NON www address forever.......

It has stuck.......

It was nice to see the idea presented by a professional who knew how to say what needed to be said....Mary... you are awesome, girl!!

:-D

9:05pm • #10
OCT
17
2006

"The Lovely Mary"

I love it...He does have a pretty little head.

He now understands he has two pretty little ladies watching his back too. I will be in touch.

I am almost done with my homework. Teacher. :> BAM!

TLW "The Lovely Wife"...Kum La Ka Lakka...ROAR!

7:44am • #11
AUG
27

Mary - WOW It seems as though every time that I read one of your posts there is something else to think about. You make me think a lot and be purposeful in my blogging.

8:41pm • #12


Mary McKnight

Orlando, FL


Sacrilicious Marketing

Office Phone: (407) 572-4638


Helping Realtors learn to successfully write and promote their real estate blog. Online success is not magic, it's knowledge and most of the time, it’s free. My focus is to give Realtors the tools and knowledge to affordably succeed online through search engine optimization, search engine marketing, blogging and proper RSS implementation.

