
Re: Usenet Data

Using a pointed scraper and pebbles, [Toby Inkster] stuck:

> Roy Schestowitz wrote:
>> Google acquired a company that stored all of the data, which as far as I
>> know goes back to the dawn of UseNet. I can't recall the name of that
>> company, but I can check if you wish.
> DejaNews started collecting Usenet articles in 1995. In about 1998, at the
> height of the dot-com boom, it morphed into the for-profit Deja.com.
> Deja.com struggled for a business model, and eventually in late 2000
> started shutting down parts of its archive as it simply couldn't afford to
> run them.
> White knight Google came in and bought Deja.com's archive. This was in
> early 2001. For the first few months, it showed only about a six-month
> archive, but then they added in the full backlog of Deja.com's data and
> Google Groups 1.0 came out of beta.
> At this point they had archives going back only to 1995. They put out
> an appeal for anyone with older archives to get in touch.
> Back in the late eighties and early nineties, there were companies that
> burnt six months' worth of Usenet archives onto CD-ROMs to sell. For many,
> it was cheaper to buy these CDs than to pay for the bandwidth to subscribe
> to Usenet. Eventually, it became infeasible for the companies to produce
> these CDs. (Usenet had grown too much.) Many people sent Google data from
> these.
> Also, plenty of sites kept archives of just particular groups, going back
> for years and years. Many of these were sent in to Google.
> As a result, Google was able to add in (albeit somewhat patchy) archives
> going back to May 1981. Usenet itself goes back a little further, to 1979,
> but those messages have been lost in the mists of time.

How do you know all of this? Didn't UseNet start around the time you were
born? This is by no means a complaint.


Roy S. Schestowitz      | Useless fact: Digits 772-777 of Pi are 999999
http://Schestowitz.com  |    SuSE Linux    |     PGP-Key: 74572E8E
  2:35pm  up 23 days  2:49,  2 users,  load average: 0.37, 0.55, 0.55
