__/ [Carol W] on Sunday 20 November 2005 01:08 \__
> On Fri, 18 Nov 2005 11:54:42 +0000, Roy Schestowitz
> <newsgroups@xxxxxxxxxxxxxxx> wrote:
>>__/ [mogga] on Friday 18 November 2005 11:39 \__
>>> On Fri, 18 Nov 2005 07:42:25 +0000, Roy Schestowitz
>>> <newsgroups@xxxxxxxxxxxxxxx> wrote:
>>>>Yesterday I found out that they had indexed my genealogical pages. No
>>>>bloody idea how. Family members searched their names and that goddamn
>>>>Google blurted out too much. All is still in cache. Needn't there be an
>>>>inbound link? With Analytics they will sooner or later be able to discover
>>>>all 'hidden' pages.
>>> Someone with a toolbar visits a page which is "hidden" and it isn't
>>> hidden anymore.
>>Yes, I am aware, but should it get indexed as a consequence?
> Does google honor the robots.txt file to not index or spider those
> pages -even if visited by someone who may have the toolbar installed?
I am not too sure. In fact, I wonder if it will ever drop pages from its
index as a result of /modified/ robots.txt. I really hope so because it
indexed many pages I did not intend for it to ever have access to.
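For what it's worth, here is a rough sketch of how a well-behaved crawler is
supposed to read robots.txt, using Python's standard urllib.robotparser. The
/genealogy/ path and the example.com URLs are made up for illustration, and
nothing here says what Google actually does with pages it learns about through
the toolbar; it only shows what the rules themselves permit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /genealogy/ path is an assumption
# chosen for illustration, not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /genealogy/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A page under the disallowed directory versus one outside it.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/genealogy/tree.html"))
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/index.html"))
```

Note that this only covers fetching. Whether a URL that is already in the index
gets dropped after robots.txt changes is a separate matter, which is exactly
what I am unsure about.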
I don't think robots.txt gives a compelling enough reason to flush things
from the cache. Google like to hold on to their cache, on the assumption that
breadth of knowledge is power. I suspect so because I once erased an entire
section, yet Google and Yahoo kept coming back for about 6 months just to
receive 404s. They should have given up earlier or 'noticed' that the section
no longer had any links to it, neither internal nor external.
Roy S. Schestowitz
http://Schestowitz.com | SuSE Linux | PGP-Key: 0x74572E8E
4:05am up 16 days 23:59, 4 users, load average: 1.00, 0.93, 0.85
http://iuron.com - next generation of search paradigms