__/ [Frank_X_Rizzo@xxxxxxxxxxx] on Monday 16 January 2006 02:51 \__
> Ignoramus26433 wrote:
>> On Sun, 15 Jan 2006 17:48:54 +0000, Roy Schestowitz
>> <newsgroups@xxxxxxxxxxxxxxx> wrote:
>> > __/ [Ignoramus26433] on Sunday 15 January 2006 17:35 \__
>> >> Their googlebot fetches about 1 page per second from my site, thus
>> >> making me pay insane amounts for bandwidth. I want to tell someone
>> >> there to reduce frequency of their visits, it is
>> >> outrageous. Crawl-delay has no effect on them.
>> >> i
>> > You said this before in this newsgroup. You said that they had been
>> > hammering your server. Why don't you call them? Public number is +1 650
>> > 253 000 and non-public (formerly so?) is +1 650 330 0100 (Press 5, then
>> > 3 to get a person during daytime).
>> I will call them and report my results.
Some people in alt.www.webmaster are having similar problems. This tends to
happen in large Web sites where there is plenty of 'crawlable' content, e.g.
Matt's encyclopaedia. I don't count my own site among these, but 1.85
gigabytes of traffic have been snatched by crawlers since the beginning of
the month. This is /not/ normal. The growth in crawling load is worrying
because it's costing me, and I don't get more referrals in return.
> Maybe try using your robots.txt file to stop them from crawling your
> entire website, until you can get their attention.
That would give the wrong impression and may lead to pages being dropped from
the index in due time.
> Can I ask how many
> pages your website has? Also, if you have control over a firewall,
> check your logs and stop allowing the offending bot. I would bet it's a
> specific IP from google.
He has plenty of pages and I think he /does/ want to be crawled, just not to
that excessive extent.
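On the "check your logs" point, a quick way to see whether the load really
comes from one address is to count Googlebot hits per IP and look at the
busiest second. This is only a sketch: it assumes Combined Log Format with
one-second timestamps, matches on the user-agent string alone (which anyone
can fake), and the function name and sample lines are invented for the
example.

```python
import re
from collections import Counter

# Combined Log Format (assumed); captures ip, timestamp and user-agent.
LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]*)\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def hits_by_ip(lines, agent_hint="googlebot"):
    """Return (requests per IP, peak requests in any one second) for one bot."""
    per_ip = Counter()
    per_second = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # ignore lines that do not fit the assumed format
        ip, stamp, agent = match.groups()
        if agent_hint in agent.lower():
            per_ip[ip] += 1
            # Log timestamps have one-second resolution, so the raw
            # timestamp string works as a per-second bucket.
            per_second[stamp] += 1
    peak = max(per_second.values(), default=0)
    return per_ip, peak


# Made-up sample lines, using documentation-only IP addresses.
sample = [
    '192.0.2.10 - - [15/Jan/2006:17:35:00 +0000] "GET /a HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
    '192.0.2.10 - - [15/Jan/2006:17:35:00 +0000] "GET /b HTTP/1.1" 200 4096 "-" "Googlebot/2.1"',
    '192.0.2.11 - - [15/Jan/2006:17:35:01 +0000] "GET /c HTTP/1.1" 200 1024 "-" "Googlebot/2.1"',
    '198.51.100.5 - - [15/Jan/2006:17:35:00 +0000] "GET /a HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    per_ip, peak = hits_by_ip(sample)
    for ip, count in per_ip.most_common():
        print(ip, count)
    print("peak requests in one second:", peak)
```

If the hits turn out to be spread over many addresses, blocking a single IP at
the firewall won't change much, which is worth knowing before touching the
rules.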
Roy S. Schestowitz | "I think I think, therefore I think I am"
http://Schestowitz.com | SuSE Linux | PGP-Key: 0x74572E8E
7:05am up 36 days 14:16, 13 users, load average: 1.55, 1.25, 1.19
http://iuron.com - next generation of search paradigms