GoogleBot - Since I've been using Sitemeter to see how people find my blog and real estate website, I've been seeing the location Mountain View, California a lot. It was making me a little concerned because it shows up many times per day, every day. The visitor would view many pages, and the visit durations were always over 3 minutes.
So, I clicked on the link to take me to the details about the "person," and the domain name was googlebot.com. Since my inquiring mind wanted to know, I googled "Googlebot" and found this:
If a webmaster wishes to restrict the information on their site available to a Googlebot, or another well-behaved spider, they can do so with the appropriate directives in a robots.txt file, or by adding the meta tag
<meta name="Googlebot" content="nofollow" /> to the webpage. Googlebot requests to Web servers are identifiable by a user-agent string containing "Googlebot" and a host address containing "googlebot.com".
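For the curious, here is a minimal Python sketch of that identification rule: a visit counts as Googlebot only if the user-agent string contains "Googlebot" and the host name is under googlebot.com. The function name `looks_like_googlebot` and the sample user-agent and host values are my own assumptions for illustration, not anything Sitemeter or Google provides, and a user-agent string alone can be faked, which is why the host is checked too.

```python
def looks_like_googlebot(user_agent: str, host: str) -> bool:
    """Return True if both the user-agent and the host name point to Googlebot.

    Hypothetical helper for illustration: it only applies the two checks
    described in the text and does not do any DNS lookups.
    """
    # Check 1: the user-agent string mentions Googlebot (case-insensitive).
    ua_ok = "googlebot" in user_agent.lower()

    # Check 2: the host is googlebot.com itself or a subdomain of it.
    # Matching on ".googlebot.com" avoids look-alikes such as "evilgooglebot.com".
    host = host.lower().rstrip(".")
    host_ok = host == "googlebot.com" or host.endswith(".googlebot.com")

    return ua_ok and host_ok


if __name__ == "__main__":
    # Sample values are assumptions made up for this example.
    ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(looks_like_googlebot(ua, "crawl-66-249-66-1.googlebot.com"))
    print(looks_like_googlebot(ua, "evilgooglebot.com"))
```

In a stats tool like the one I use, the same idea is just reading the two columns side by side: the browser line and the domain name.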
Currently Googlebot only follows HREF links and SRC links. Googlebot discovers pages by harvesting all of the links on every page it finds. It then follows these links to other web pages. New web pages must be linked to from other known pages on the web in order to be crawled and indexed.
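To picture what "harvesting links" means, here is a rough Python sketch that pulls every HREF and SRC value out of a page and turns it into a full address. This is only an illustration using Python's standard library, not how Googlebot itself is written; the names `LinkHarvester` and `harvest_links` and the example.com addresses are assumptions made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkHarvester(HTMLParser):
    """Collects the href and src attribute values found in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, value))


def harvest_links(html: str, base_url: str) -> list:
    """Return all href/src links in the page, in the order they appear."""
    parser = LinkHarvester(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    page = '<a href="/about.html">About</a><img src="pics/house.jpg">'
    for link in harvest_links(page, "http://example.com/blog/"):
        print(link)
```

A real crawler would then visit each harvested address and repeat the process, which is why a page nobody links to never gets found this way.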
A problem which webmasters have often noted with the Googlebot is that it takes up an enormous amount of bandwidth. This can cause websites to exceed their bandwidth limit and be taken down temporarily. This is especially troublesome for mirror sites which host many gigabytes of data. Google provides "Webmaster Tools" that allow website owners to throttle the crawl rate.
Now I know it's not a who, it's a spider! This is the only spider I like!