
Re: bots

__/ [ www.1-script.com ] on Tuesday 09 May 2006 22:35 \__

> Douglas Clark wrote:
> 
>> After three fairly constant years the number of hits I am getting from
>> bots has been double the usual for the past two months and there is no
>> slackening off this month. Is anything up? My site has had only tiny
>> alterations.
> 
> You are not alone. There has been a serious increase in overall bot
> activity. Each search engine has its own reason to send its bots out more
> often, though:


Ditto. You are not alone in this.


> Google messed up their index cache during the last big update and now
> needs to catch up with Y! and Ask before users notice something bad
> happened. That's my Google theory. Due to Google's secrecy the number of
> theories out there almost equals the number of people trying to crack
> that problem, googlers themselves included.


I have not heard this theory before. I haven't noticed any degradation in
terms of search results either. Suggesting that Google have fallen behind is
something that would make big headlines (same with studies that argue Google
lost a top position), so it's probably just wishful thinking. In operation,
I am sure that they take into consideration all such risks and replicate the
data as required. Even "Big Daddy" seems to have been corrected/re-aligned.


> Y! wants to beat Google to the largest index size and so it needs to crawl
> more pages. The more pages they get the more links to your pages they
> discover, which sets a flag for their bot to visit the site again.


Yes, that's true.


> Ask is re-branding itself into a mainstream search engine and is pumping
> some serious money into both infrastructure and marketing. I guess Teoma
> crawls more (since last year, actually) simply because they got more
> machines to run it from.


They should probably aim for a niche if they haven't the required capacity.


> MSN is still toying with their algorithms, and it looks as if from time to
> time they dump large chunks of data from their database and need to
> re-crawl the sites to restore it.


Yesterday I discovered that MSN put me at number 5 for 'othello'. That should
be a real embarrassment for them. I wasn't bombing that site _at all_. It
must have been their fluke, or else their algorithms remain as terrible as
ever. But then again, what else is new? *smile*


> A bunch of smaller guys are full of ambition to become the next Google,
> and so they need their own cache of your site so they can analyze it to
> death.
> 
> I hope that pretty much covers most of it. Oh yeah, and there are always
> rogue bots out there, of course, trying your site for all kinds of
> exploits, so keep your shields up!


Shields up? I am not sure about exclusions. However, it is good to keep an
eye on the logs/stats. Some ratbots can crawl an entire site within hours,
depending on its size and available bandwidth. That slows the site down for
real visitors and legitimate crawlers, and it can cost you money as well.
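
For what it's worth, here is a rough sketch of how one might keep an eye on the logs with a few lines of Python. It assumes the server writes the common "combined" access log format; the BOT_HINTS list is only an illustrative guess at user-agent substrings (not a complete or authoritative list), and the function names are made up for this example.

```python
import re
from collections import Counter

# Assumed "combined" log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; real bots may use other names or lie outright.
BOT_HINTS = ("bot", "crawl", "spider", "slurp", "teoma")


def is_bot(agent):
    """Guess from the user-agent string whether a request came from a bot."""
    agent = agent.lower()
    return any(hint in agent for hint in BOT_HINTS)


def summarize(lines):
    """Return (hits, bytes_sent) Counters keyed by bot user-agent string."""
    hits = Counter()
    bytes_sent = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent")
        if not is_bot(agent):
            continue
        hits[agent] += 1
        size = match.group("size")
        if size.isdigit():  # the size field is "-" when nothing was sent
            bytes_sent[agent] += int(size)
    return hits, bytes_sent
```

To try it, open your own access log (the path depends on your server setup), pass the file object to summarize(), and print hits.most_common(10). A crawler that suddenly tops both counters is the one worth a closer look.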

Best wishes,

Roy

-- 
Roy S. Schestowitz      | "World ends in five minutes - please log out"
http://Schestowitz.com  |    SuSE Linux     ¦     PGP-Key: 0x74572E8E
  5:20am  up 12 days 12:17,  7 users,  load average: 0.89, 0.49, 0.44
      http://iuron.com - Open Source knowledge engine project
