__/ [ Borek ] on Sunday 19 March 2006 11:00 \__
> On Sun, 19 Mar 2006 06:07:21 +0100, Roy Schestowitz
> <newsgroups@xxxxxxxxxxxxxxx> wrote:
>
>>> http://www.mattcutts.com/blog/googlebot-keep-out/
>
>> I read that yesterday morning, but then I thought: "might as well use
>> robots.txt". The *last* thing you would want to embed in a page is
>> vendor-specific (branding) information. AdSense bits are more than enough
>> and _even that_ raises doubt over the Openness of the Web. Also think of
>> ms-objects in Office-generated 'HTML'.
>
> I don't think I am going to use it, however, I like the idea of a URL part
> working as a limiter (?) - it may take any form, even
> "Borek_asks_to_end_crawling_here" :). Too bad robots.txt doesn't allow
> wildcards.
I don't think I am *ever* going to use it. Either way, I loathe the idea of a
URL part b0rking a crawl by acting as a limiter (?). It should have taken a
*general* form, even "All_search_engines_should_not_index_further" :{. It's a
good thing that robots.txt doesn't allow wildcards. It is even better that
search engines make no attempt to support wildcards, which would amount to
'extending' conventions /a la/ Google Sitemaps or Microsoft 'extending' RSS.
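
To illustrate what "no wildcards" means in practice, here is a minimal
sketch of plain prefix matching, which is how Disallow lines work under the
original robots.txt convention. The function name is_disallowed and the
sample rules are made up for illustration; this is not any crawler's actual
code.

```python
def is_disallowed(path, disallow_prefixes):
    """Return True if path starts with any non-empty Disallow prefix.

    Hypothetical helper: models plain prefix matching only, so a "*"
    inside a rule is treated as a literal character, not a wildcard.
    """
    return any(prefix and path.startswith(prefix) for prefix in disallow_prefixes)


# Illustrative rules, as they might appear after "Disallow:" lines.
rules = ["/private/", "/tmp"]

print(is_disallowed("/private/notes.html", rules))         # prefix matches
print(is_disallowed("/public/private/notes.html", rules))  # no match mid-path
print(is_disallowed("/docs/file.pdf", ["/*.pdf"]))         # "*" is literal here
```

A rule can therefore only cut off a whole subtree from its start; it cannot
say "everything ending in .pdf" or "anything past this marker in the URL".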
Best wishes,
Roy
--
Roy S. Schestowitz | "Did anyone see my lost carrier?"
http://Schestowitz.com | SuSE Linux ¦ PGP-Key: 0x74572E8E
11:40am up 11 days 4:17, 11 users, load average: 1.82, 1.60, 1.12
http://iuron.com - next generation of search paradigms