I'm guessing multiple domains pointing at the same data meant the 80gig
was replicated to 260gig?

At what point can you say that international standards (ie robots.txt)
were not observed thus they should reimburse you for bandwidth costs?

260gig of spam is going to be a fantastic thing for someone to look
through in the future, it really does show the true culture of New
Zealand....</sarcasm>
Philip Seccombe
-----Original Message-----
From: TreeNet Admin [mailto:admin@treenetnz.com]
Sent: Sun 10/26/2008 7:08 PM
To: Philip Seccombe
Cc: nznog@list.waikato.ac.nz
Subject: Re: [nznog] NLNZHarvester2008
Philip Seccombe wrote:
> I would be very interested in how much data was collected in this.
>
> I'd also be interested if it was a basic harvest or there was some smart
> archiving done for duplicate files etc
> Eg for a couple of customers they have between 4 and 8 or so sites all
> pointing at the same place
>
> Also as a computer tech I'll put up a download directory to grab
> programs from for cleaning customers pc (eg spyware utils, general apps,
> service packs), a quick look is showing 1 gig of data there, and we have
> that as .co.nz .net.nz and as different domains just so if I tell a
> customer over the phone to download something to fix they won't make a
> mistake.
> Funny you'll be using probably 4gig just on my spyware apps and service
> packs because somehow its document heritage to New Zealand...of programs
> made mostly in the states :)
>
> Ah well, in 30 years I guess someone will be interested to see what the
> internet looked like back in 2008. It's also probably a cheaper option
> than our government spending $100 million to hire people to decide what
> should and shouldn't be kept
>
> Philip
Cheaper? Only for NatLib. We who host are paying the bill for this.

And no, they are not doing any smart filtering for duplication. They
managed to download 260GB+ of international WHOIS and spam archives from
an 80GB disk drive here before the harvest IPs got firewalled. I'm not
pleased.

PS. natlib: robots.txt is often expressly set up to prevent this type of
'accident'.
AYJ