
Re: how to stop crawler to index web page?

  • Subject: Re: how to stop crawler to index web page?
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Sat, 23 Sep 2006 12:08:37 +0100
  • Newsgroups: alt.internet.search-engines
  • Organization: schestowitz.com / ISBE, Manchester University / ITS / Netscape / MCC
  • References: <1158977626.932492.115180@m7g2000cwm.googlegroups.com> <12h96duam3rmj64@corp.supernews.com>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ z ] on Saturday 23 September 2006 03:19 \__

> John wrote:
> 
>> Is there a way to stop web crawler from indexing or parsing my web
>> pages? I saw xanga.com has a feature that allows the user to do that.
> 
> Is your page on xanga.com?
> 
> If you have your own server just make a robots.txt file:
> 
> http://www.robotstxt.org/wc/robots.html
> 
> That doesn't make the web page private though.
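
As a rough illustration of the robots.txt suggestion above, here is a
minimal sketch using Python's standard urllib.robotparser to show how a
well-behaved crawler would read such a file. The rules, the crawler name
"AnyBot" and the example.com URLs are made up for the example. It also shows
why the page is not private: the file is only a request, and nothing
enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask every crawler to stay out of
# /private/ while leaving the rest of the site open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a polite crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/private/diary.html",
                "http://example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, "AnyBot", url))
```

A crawler that never reads the file, or ignores it, can still fetch both
URLs.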

...You could sniff the User-Agent header and deliver different content (or
none) when the client is not recognised as a valid browser, i.e. when it
looks like a bot or a grabber. I get fairly good penetration as Googlebot
2.1. *grin*
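
A minimal sketch of that kind of sniffing, assuming the server code can see
the raw User-Agent string. The marker list and the function names
looks_like_bot() and respond() are hypothetical, not taken from any real
server, and a production list would need to be much longer.

```python
from typing import Optional, Tuple

# Hypothetical substrings that suggest a bot or a grabber rather than a
# browser. Matching is case-insensitive.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp", "wget", "curl")


def looks_like_bot(user_agent: Optional[str]) -> bool:
    """Guess from the User-Agent header whether the client is a bot."""
    ua = (user_agent or "").lower()
    if not ua:
        # No User-Agent at all: treat the client as not recognised.
        return True
    return any(marker in ua for marker in BOT_MARKERS)


def respond(user_agent: Optional[str]) -> Tuple[int, str]:
    """Return (status, body): nothing for bots, the page for browsers."""
    if looks_like_bot(user_agent):
        return 403, ""
    return 200, "<html>...page content...</html>"


if __name__ == "__main__":
    print(respond("Googlebot/2.1 (+http://www.google.com/bot.html)"))
    print(respond("Mozilla/5.0 (X11; U; Linux i686; en-US) Firefox/1.5"))
```

The header is supplied by the client, so this works in both directions: a
grabber that claims to be a browser gets the page, and a browser that claims
to be Googlebot gets whatever the site serves to Googlebot.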

Best wishes,

Roy

-- 
Roy S. Schestowitz      |    "How do I set my laser printer on stun?"
http://Schestowitz.com  |    SuSE Linux     |     PGP-Key: 0x74572E8E
 12:05pm  up 65 days  0:17,  9 users,  load average: 0.81, 0.79, 0.78
      http://iuron.com - Open Source knowledge engine project
