
Re: Google ignoring noindex META Tag

  • Subject: Re: Google ignoring noindex META Tag
  • From: Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx>
  • Date: Thu, 08 Jun 2006 03:59:01 +0100
  • Newsgroups: alt.internet.search-engines
  • Organization: schestowitz.com / MCC / Manchester University
  • References: <1149717298.767710.114330@i39g2000cwa.googlegroups.com> <j8je82d1du9t9k9biqa4ksmr2ehdesu96q@4ax.com> <1149723797.119549.186970@u72g2000cwu.googlegroups.com>
  • Reply-to: newsgroups@xxxxxxxxxxxxxxx
  • User-agent: KNode/0.7.2
__/ [ Rik ] on Thursday 08 June 2006 00:43 \__

> 
> Paul wrote:
>> On 7 Jun 2006 14:54:58 -0700, "Rik" <rik@xxxxxxxxxxxx> wrote:
>>
>> >I have recently noticed some of my pages showing up in the Google cache


I will gladly take this 'problem' off your hands. Google Cache has been
problematic in recent months.


>> >even though the page contained a "noindex" META Tag. These are private
>> >pages for inter office use and are not meant for public display.
>> >
>> >Is there another META tag that will prevent Google from caching these
>> >pages?


Meta tags are not the most reliable option, as not every crawler/cacher will
honour them. The same goes for exclusions via robots.txt which, in a sense,
are even worse, as they publicly expose a listing of potentially 'sensitive'
pages.
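
To illustrate the exposure point, here is a minimal Python sketch of how
anyone who fetches a robots.txt file can list the paths it names. The function
name and the sample folder names are made up for illustration; nothing here
comes from Rik's actual site.

```python
def disallowed_paths(robots_txt):
    """Return every path named in a Disallow line, in order of appearance."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow value excludes nothing
                paths.append(value)
    return paths


# Hypothetical robots.txt content (assumed folder names).
SAMPLE = """\
User-agent: *
Disallow: /private-nav/
Disallow: /office/reports/   # internal
Disallow:
"""

print(disallowed_paths(SAMPLE))
```

Since robots.txt is served to any visitor, every path it lists is readable by
people as well as by well-behaved crawlers.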


>> >Since the pages are not meant for public view, I have just re-named the
>> >files so anyone that may click them from Google will just get my not
>> >found page.
>> >
>> >My problem is that I really have no way to keep up with the pages on
>> >which Google has ignored my noindex META. I have now included the
>> >noarchive meta in the hopes the Googlebot might understand that one.
>> >
>> >Any suggestions?


Also see the following:

http://www.i18nguy.com/markup/metatags.html


>> >Rik
>> >
>> >P.S. I posted this question in the public forum but received no
>> >suggestions so I'm hoping someone in here may have run into this
>> >problem.  Please pardon my cross post.


I notice that Paul (or you) has reduced the distribution.


>> What about password protected files ?


I would suggest the same. Too many times in the past I have had my 'hidden'
pages indexed, which was a bit embarrassing at times. The bigger issue is with
the cache, as the information is no longer under your control and cannot be
removed from the public eye immediately. If you contact Google, however, and
follow the correct route, you can request that they remove the unwanted
cached copies.


> The page that contains the links leading to our private pages is
> password protected. That navigation page resides in a folder that is
> disallowed through my robots.txt file.
> 
> The private pages in question reside in folders that contain public
> pages so I was afraid to disallow anything in those folders using the
> robots.txt file for fear of the bot ignoring the folder.
> That's why I chose to use the noindex meta on the individual pages.
> 
> Is it common for Google to ignore META tags like the noindex,noarchive
> I am currently using?  I have seen Google ignore my robots.txt file
> before but this is the first time I have seen them ignore the noindex
> command.


I think I have heard similar stories. These mechanisms should never be
trusted, and the files themselves also need careful testing, for which I know
of no tools.
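
In the absence of a ready-made tool, a rough self-check can be sketched with
Python's standard library. This is only a sketch under assumptions: the page
markup, the robots.txt rule and the example.com URLs below are invented for
illustration, and the helper names are hypothetical. It confirms what the
files say, not whether any particular crawler will honour them.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots META directives present in a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def allowed_by_robots_txt(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits fetching url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Hypothetical inputs (assumed markup, rule and URLs).
PAGE = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOARCHIVE"></head></html>'
ROBOTS_TXT = "User-agent: *\nDisallow: /docs/internal-memo.html\n"
PRIVATE_URL = "http://example.com/docs/internal-memo.html"
PUBLIC_URL = "http://example.com/docs/public.html"

print(sorted(robots_directives(PAGE)))
print(allowed_by_robots_txt(ROBOTS_TXT, "Googlebot", PRIVATE_URL))
print(allowed_by_robots_txt(ROBOTS_TXT, "Googlebot", PUBLIC_URL))
```

The sample rule disallows a single file, so the public page in the same folder
stays fetchable; whether a given bot respects either the rule or the META tag
is a separate question.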

It's the same situation with "X-No-Archive: Yes" in newsgroups. Too many
ratbots and aggregators ignore the header and, once somebody replies to a
message, all protection is stripped off. You can think of this as the
equivalent of someone scraping your 'noindex' pages and putting them in a
public space elsewhere.

Best wishes,

Roy

-- 
Roy S. Schestowitz      |    Open Source Othello: http://othellomaster.com
http://Schestowitz.com  |  SuSE GNU/Linux   ¦     PGP-Key: 0x74572E8E
  3:45am  up 41 days  9:18,  11 users,  load average: 0.30, 0.70, 0.87
      http://iuron.com - help build a non-profit search engine
