Philip Seccombe wrote:
I would be very interested in how much data was collected in this harvest.
I'd also be interested to know whether it was a basic harvest or whether some smart archiving was done for duplicate files, etc. For example, a couple of my customers have between 4 and 8 or so sites all pointing at the same place.
Also, as a computer tech I'll put up a download directory to grab programs from for cleaning customers' PCs (e.g. spyware utils, general apps, service packs). A quick look shows about 1 GB of data there, and we have that as .co.nz, .net.nz and as different domains, just so that if I tell a customer over the phone to download something they won't make a mistake. Funnily enough, you'll be using probably 4 GB just on my spyware apps and service packs, because somehow that's documentary heritage of New Zealand... of programs made mostly in the States :)
Ah well, in 30 years I guess someone will be interested to see what the internet looked like back in 2008. It's also probably a cheaper option than our government spending $100 million to hire people to decide what should and shouldn't be kept.
Philip
Cheaper? Only for NatLib. Those of us who host are paying the bill for this. And no, they are not doing any smart filtering for duplication. They managed to download 260 GB+ of international WHOIS and spam archives from an 80 GB disk drive here before the harvest IPs got firewalled. I'm not pleased.

PS. natlib: robots.txt is often expressly set up to prevent this type of 'accident'.

AYJ
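For anyone wanting to opt out in advance, a robots.txt along these lines is the usual approach. This is only a sketch: the "ExampleHarvester" user-agent token is a placeholder, since I don't have the exact string the NatLib crawler announces itself with; check your access logs and substitute the real one, or use * to keep all well-behaved crawlers out of the bulky paths.

    # Block the harvester (placeholder user-agent string) from the whole site
    User-agent: ExampleHarvester
    Disallow: /

    # Or keep every crawler out of the large download/archive areas only
    User-agent: *
    Disallow: /downloads/
    Disallow: /archives/

Of course this only helps if the crawler honours robots.txt in the first place, which was rather the point above.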