
Re: ?google=nocrawl

__/ [ Borek ] on Sunday 19 March 2006 11:00 \__

> On Sun, 19 Mar 2006 06:07:21 +0100, Roy Schestowitz
> <newsgroups@xxxxxxxxxxxxxxx> wrote:
> 
>>> http://www.mattcutts.com/blog/googlebot-keep-out/
> 
>> I read that yesterday morning, but then I thought: "might as well use
>> robots.txt". The *last* thing you would want to embed in a page is a
>> vendor-specific (branding) information. AdSense bits are more than enough
>> and _even that_ raises doubt over the Openness of the Web. Also think of
>> ms-objects in Office-generated 'HTML'.
> 
> I don't think I am going to use it, however, I like the idea of a URL part
> working as a limiter (?) - it may take any form, even
> "Borek_asks_to_end_crawling_here" :). Too bad robots.txt doesn't allow
> wildcards.

I don't think I am *ever* going to use it. Either way, I loathe the idea of a
URL part b0rking as a limiter (?). It should have taken a *general* form,
such as "All_search_engines_should_not_index_further" :{. It's a good thing
that robots.txt doesn't allow wildcards. It is even better that search
engines make no attempt to support wildcards on their own, which would mean
'extending' conventions /a la/ Google Sitemaps or Microsoft 'extending' RSS.
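
To make the wildcard point concrete, here is a minimal sketch of how a
Disallow line works under the original robots.txt convention: the value is a
plain path prefix, nothing more. The function name is_disallowed and the
sample rules are made up for illustration, and this is not how any particular
crawler implements it.

```python
def is_disallowed(path, disallow_rules):
    """Return True if path starts with any non-empty Disallow prefix.

    Assumption: plain prefix matching only, as in the original robots.txt
    convention. An empty Disallow value blocks nothing.
    """
    return any(rule != "" and path.startswith(rule) for rule in disallow_rules)


# Hypothetical rules: one ordinary prefix, one attempted wildcard.
RULES = ["/private/", "/*.pdf"]

print(is_disallowed("/private/page.html", RULES))  # True: prefix matches
print(is_disallowed("/docs/report.pdf", RULES))    # False: "*" is literal
```

With prefix matching, "/*.pdf" only blocks paths that literally begin with
"/*.pdf", so it does nothing for ordinary PDF URLs.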

Best wishes,

Roy

-- 
Roy S. Schestowitz      |    "Did anyone see my lost carrier?"
http://Schestowitz.com  |    SuSE Linux    ¦     PGP-Key: 0x74572E8E
 11:40am  up 11 days  4:17,  11 users,  load average: 1.82, 1.60, 1.12
      http://iuron.com - next generation of search paradigms
