
Re: robots.txt Variations?

__/ On Sunday 28 August 2005 10:04, [John Bokma] wrote : \__

> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
> 
>> I am hoping that someone in this group can help me out. For the past
>> few months I have been spotting errors for odd variations of the file
>> robots.txt (among others).
>> 
>> Putting mistaken bots aside, there might be one error for every ~100
>> visits, so I still have a very frequent look at the error logs (trying
>> to identify internal broken links), but I sometimes get unexplained
>> errors, e.g. so far this month:
>> 
>> /robots1.txt    8 times this month
>> /zzrobots.txt   4
>> 
>> The rest might be human errors:
>> 
>> /robots.tx      1
>> /robotsxx.txt   1
> 
> I'll check my error log...
> 
>> Is it possible that some crawlers 'extended' this type of protocol?
>> 
>> Even /sitemap.rdf has been requested twice, even though I haven't
>> signed up with Google Sitemaps. Can all of the above just be visitors
>> that tamper with the server? They seem to come from addresses that do
>> not contain numbers, but still have obscure domains.
> 
> [192.55.214.54]       zzrobots.txt
> [205.236.116.250]             robots1.txt
> 
> And several requests for sitemap.rdf

[Sun Aug 28 07:21:56 2005] [error] [client 205.236.116.250] File does not
exist: /home/schestow/public_html/robots1.txt

which is a match (the latest such error); reverse DNS on the address comes
up with master.carrefourinternet.com
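
For what it's worth, a rough Python sketch along these lines would pull the
"File does not exist" hits for robots.txt look-alikes out of the error log
and reverse-resolve the client addresses in one go (the log path and the
filename pattern are only guesses for my setup):

import re
import socket

LOG = "/home/schestow/logs/error_log"   # assumed location of the error log
PATTERN = re.compile(
    r"\[client ([\d.]+)\] File does not exist: (\S*robots\S*|\S*sitemap\S*)")

hits = {}                               # ip -> set of requested paths
for line in open(LOG):
    m = PATTERN.search(line)
    if m:
        ip, path = m.groups()
        hits.setdefault(ip, set()).add(path)

for ip in sorted(hits):
    try:
        host = socket.gethostbyaddr(ip)[0]   # reverse DNS lookup
    except socket.error:
        host = "(no reverse DNS)"
    print("%-16s %-35s %s" % (ip, host, ", ".join(sorted(hits[ip]))))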

I have checked some of the other IPs in the past, but they appeared to
come from completely different sources. Would inclusion in the IP deny list
be worthwhile? It's a recurring theme, but maybe a request for the real
robots.txt is made afterwards... and if so, why?!
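
If I do end up blocking them, something along these lines in .htaccess
should do it (Apache-style allow/deny; the two addresses are simply the
ones from the logs quoted above):

# deny the two addresses seen requesting robots.txt look-alikes
Order allow,deny
Allow from all
Deny from 205.236.116.250
Deny from 192.55.214.54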

Roy
