On Tue, 03 Jan 2006 07:09:01 +0000, Roy Schestowitz
>__/ [Carol W] on Tuesday 03 January 2006 06:41 \__
>> On Tue, 03 Jan 2006 04:57:44 +0000, Roy Schestowitz
>> <newsgroups@xxxxxxxxxxxxxxx> wrote:
>>>What's the benefit of permitting Alexa to crawl though? Having the site
>>>archived for someone to look back at deleted content in the future? Have
>>>we not learned the lesson yet?
>> Actually it can be helpful to have an archived copy - even if that
>> particular content or site becomes deleted at a later date. I have
>> used the web archive to help locate some information or data that had
>> been deleted or removed from the web.
>...But if the size of a site does not exceed gigabytes (particularly when
>compressed), why not make use of private storage, which is often very
>cheap? You can stack up a progressive backup for just a few quid. If
>resilience is important, you can duplicate the content periodically. The
>Web Archive is slower to access and it tends to mix objects that were
>collected at different timepoints.
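The kind of cheap, rotated, compressed backup described above can be sketched in a few lines of shell. Everything here is illustrative: the paths are temporary stand-ins (a real setup would point SITE at the document root, e.g. /var/www/site, and DEST at the backup volume), and the 30-snapshot retention is an arbitrary choice.

```shell
#!/bin/sh
# Sketch of a rotated, compressed site snapshot. All paths are demo
# placeholders created here so the script is self-contained.
SITE=$(mktemp -d)/site            # stand-in for the real document root
DEST=$(mktemp -d)                 # stand-in for the backup volume
mkdir -p "$SITE"
echo '<html>demo</html>' > "$SITE/index.html"

# One dated, gzip-compressed snapshot of the whole site directory.
STAMP=$(date +%Y%m%d)
tar -czf "$DEST/site-$STAMP.tar.gz" -C "$(dirname "$SITE")" site

# Rotation: keep only the 30 newest snapshots, delete the rest.
ls -1t "$DEST"/site-*.tar.gz | tail -n +31 | xargs -r rm -f
```

Run nightly from cron, this accumulates dated archives you can pull a deleted page out of later, with none of the Web Archive's access latency or mixed-timepoint snapshots.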
Well that is an option - and would work if people snagged copies of
things while the site still existed. However, some sites close up with
next to no warning, and people, when first visiting a site, may not
need a copy of that particular information at the time - only to find
at a later date that they want a copy of it.
You know what they say about hindsight being 20/20 though ...
>Another issue is people having access to content which was *accidentally*
>made public, or even finding the roots of a site whose 'image' has evolved.
I think the web archive folks may take some situations into consideration
when deciding whether or not to remove some contents at the request of
the site owner. I mean, even Google Groups/Deja vu offered people a way
to remove some of their own posts that they no longer cared to have in
the archive.
>Having said that, the Web Archive can be useful to the user.
*nod* Archival sites can be helpful indeed. What makes them useful is
that they will archive what is accessible (within the scope of what they
want to archive) and willingly store it for others to make use of at a
later date.
>Lastly, as the OP points out, Alexa can have a noticeable cost, whether
>that cost is latency when serving visitors and search engines (crawlers)
>or even the traffic (hosting) bill. Rarely is there something to be
>gained in return.
That can be said of any spider or bot that "goes nuts" on a site. I
seem to recall some folks complaining about spiders from [name of any
one of the Top 3 search engines inserted here] to the point that they
added snippets to their robots.txt files to turn away or try to curb
those spiders/bots for a while. However, the OP also said that this
behavior has only been displayed for the past couple of days - so it is
unusual in his observation as well. If that is the case then he could
temporarily deny Alexa's spider access to his site.
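For reference, Alexa's crawler (the same one that feeds the Internet
Archive) identifies itself as "ia_archiver", so a couple of lines in
robots.txt at the site root should turn it away; this assumes the bot
honours robots.txt, which ia_archiver historically has:

```
# robots.txt - deny Alexa / Internet Archive crawler only
User-agent: ia_archiver
Disallow: /

# everything else remains welcome
User-agent: *
Disallow:
```

The entry can simply be removed again once the spider calms down, which
makes it a reasonable temporary measure.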