Re: robots.txt Variations?


Subject: Re: robots.txt Variations?
From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
Date: Sun, 28 Aug 2005 12:13:36 +0100
Newsgroups: alt.www.webmaster
Organization: schestowitz.com / Manchester University
References: <derpab$20ep$1@godfrey.mcc.ac.uk> <Xns96C029605E38Ccastleamber@130.133.1.4>
Reply-to: newsgroups@xxxxxxxxxxxxxxx
User-agent: KNode/0.7.2

__/ On Sunday 28 August 2005 10:04, [John Bokma] wrote : \__

> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
> 
>> I am hoping that someone in this group can help me out. For the past
>> few months I have been spotting errors for odd variations of the file
>> robots.txt (among others).
>> 
>> Putting mistaken bots aside, there is perhaps one error for every
>> ~100 visits, so I still look at the error logs very frequently
>> (trying to identify internal broken links), but I sometimes get
>> unexplained errors, e.g. so far this month:
>> 
>> /robots1.txt    8 times this month
>> /zzrobots.txt   4
>> 
>> The rest might be human errors:
>> 
>> /robots.tx      1
>> /robotsxx.txt   1
> 
> I'll check my error log...
> 
>> Is it possible that some crawlers 'extended' this type of protocol?
>> 
>> Even /sitemap.rdf has been requested twice even though I haven't
>> signed up with Google Site Maps. Can all of the above just be visitors
>> that tamper with the server? They seem to come from addresses that do
>> not contain numbers, but still have obscure domains.
> 
> [192.55.214.54]       zzrobots.txt
> [205.236.116.250]             robots1.txt
> 
> And several requests for sitemap.rdf

[Sun Aug 28 07:21:56 2005] [error] [client 205.236.116.250] File does not exist: /home/schestow/public_html/robots1.txt

That entry, the latest error in my log, matches your second address.
Reverse DNS on it comes up with:
master.carrefourinternet.com
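
For anyone who wants to pull the same information out of their own error log, here is a rough Python sketch. It assumes the Apache error log format shown in the line above; the function name robots_variants and the "contains 'robots' but is not robots.txt" test are my own illustrative choices, not anything standard.

```python
import re
from collections import Counter

# Matches the "[client IP] File does not exist: PATH" part of an
# Apache error log entry like the one quoted above.
LINE_RE = re.compile(
    r"\[client (?P<ip>[0-9.]+)\] File does not exist: (?P<path>\S+)"
)


def robots_variants(lines):
    """Count (client IP, file name) pairs for missing files whose
    name contains 'robots' but is not exactly robots.txt."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        # Keep only the last path component, e.g. "robots1.txt".
        name = m.group("path").rsplit("/", 1)[-1]
        if "robots" in name.lower() and name != "robots.txt":
            counts[(m.group("ip"), name)] += 1
    return counts


sample = [
    "[Sun Aug 28 07:21:56 2005] [error] [client 205.236.116.250] "
    "File does not exist: /home/schestow/public_html/robots1.txt",
]

for (ip, name), n in robots_variants(sample).most_common():
    print(ip, name, n)
```

Feeding it the whole log (for example, the lines of an open file object) gives a per-client tally that can be compared across sites.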

I have checked some of the other IPs in the past, but they appeared to
come from completely different sources. Would adding these addresses to
the IP deny list be worthwhile? The requests keep recurring, but maybe a
request for the real robots.txt is made afterwards... and if so, why?
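
One way to test the "is robots.txt requested afterwards?" idea would be to scan the access log rather than the error log. The sketch below assumes Common Log Format and chronological order; the function name followed_by_robots is hypothetical, and the sample lines are made up for illustration (documentation-range addresses, not real visitors).

```python
import re

# Common Log Format: IP, two identity fields, [timestamp], "METHOD PATH ..."
REQ_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)'
)


def followed_by_robots(lines, variants=("/robots1.txt", "/zzrobots.txt")):
    """Map each client that requested a variant to True if the same
    client requested /robots.txt later in the log, else False."""
    result = {}
    for line in lines:
        m = REQ_RE.match(line)
        if not m:
            continue
        ip, path = m.group("ip"), m.group("path")
        if path in variants:
            # First sighting of a variant from this client.
            result.setdefault(ip, False)
        elif path == "/robots.txt" and ip in result:
            result[ip] = True
    return result


# Made-up sample lines, for illustration only.
sample_access = [
    '192.0.2.1 - - [28/Aug/2005:07:21:56 +0100] "GET /robots1.txt HTTP/1.0" 404 210',
    '198.51.100.7 - - [28/Aug/2005:07:30:02 +0100] "GET /zzrobots.txt HTTP/1.0" 404 211',
    '192.0.2.1 - - [28/Aug/2005:07:22:01 +0100] "GET /robots.txt HTTP/1.0" 200 120',
]

print(followed_by_robots(sample_access))
```

If most clients come out as False, the variants are probably not a fallback scheme layered on top of the normal robots.txt fetch.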

Roy

