Regan Murphy wrote:
So essentially the argument is, we don't want to pay a small amount for it, so we'll push that (larger) cost onto NZ businesses instead? Was there even any research done into finding out what the cost would be to NZ businesses? Should a govt. thing like natlib care about that sort of thing?
Last I looked at such things, the public rate card for colo is 1000/Mbit for international capacity. Let's assume most colo customers don't really know how to negotiate that down and are paying that; most of my customers whose kit I look after are (or were before I came along :-). Domestic is what, 100/Mbit?
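To put that in perspective, here is a back-of-envelope sketch only: it assumes the 1000/Mbit and 100/Mbit figures above are per Mbit/s per month, and that billing is on average throughput over a 30-day month rather than 95th percentile or committed rate (which would make a one-off crawl cheaper still). The harvest sizes are placeholders matching the bands in the table further down.

# Back-of-envelope cost of a one-off harvest at the rate-card figures above.
# Assumes simple average-throughput billing over a 30-day month (assumption,
# not how most colo contracts actually bill).

RATE_INTL = 1000   # per Mbit/s per month, international (figure quoted above)
RATE_DOM = 100     # per Mbit/s per month, domestic (figure quoted above)
SECONDS_PER_MONTH = 30 * 24 * 3600

def harvest_cost(downloaded_mb, rate_per_mbit):
    """Cost of spreading downloaded_mb megabytes across one billing month."""
    avg_mbit_per_s = downloaded_mb * 8 / SECONDS_PER_MONTH
    return avg_mbit_per_s * rate_per_mbit

for size_mb in (100, 1_000, 10_000):   # 100 MB, 1 GB, 10 GB harvests
    intl = harvest_cost(size_mb, RATE_INTL)
    dom = harvest_cost(size_mb, RATE_DOM)
    print(f"{size_mb:>6} MB harvested: ~{intl:.2f} intl / ~{dom:.2f} domestic")

On those assumptions even a 10 GB crawl averages only ~31 kbit/s over the month, and the typical <100MB site barely registers.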
Is the data cost to site owners really such a big issue? More than 97.5% of the harvested sites had less than 100MB of data downloaded, and only 477 sites had more than 1GB downloaded. Of the larger sites, I wonder how many are paying per megabyte transferred rather than per Mbit/s of capacity.
Refer to the table from the original options paper linked at http://bit.ly/nlnzwebharvest :
Data downloaded    Number of hosts    Percent of hosts
< 1MB                      322,951               81.3%
1 to 10MB                   43,226               10.9%
10 to 100MB                 22,082                5.6%
100 to 1000MB                8,365                2.1%
1 to 10 GB                     455                0.1%
10 to 100 GB                    22               0.006%
Total                      397,101              100%
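A quick tally of that table, just re-deriving the headline figures quoted above:

# Tally of the options-paper table, backing the "more than 97.5% under 100MB"
# and "only 477 sites over 1GB" figures quoted earlier in the thread.

bands = [                      # (label, hosts) straight from the table
    ("< 1MB",        322_951),
    ("1 to 10MB",     43_226),
    ("10 to 100MB",   22_082),
    ("100 to 1000MB",  8_365),
    ("1 to 10 GB",       455),
    ("10 to 100 GB",      22),
]
total = sum(hosts for _, hosts in bands)            # 397,101
under_100mb = sum(hosts for _, hosts in bands[:3])  # first three bands
over_1gb = sum(hosts for _, hosts in bands[4:])     # last two bands

print(f"total hosts:  {total:,}")
print(f"under 100MB:  {under_100mb:,} ({under_100mb / total:.1%})")  # ~97.8%
print(f"over 1GB:     {over_1gb:,} hosts")                           # 477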
"really such a big issue?" Well, this was last time... http://treenet.co.nz/natlib.png This graph is taken on traffic to the back-end shard server *behind* a CDN buffer cloud/cluster. It is just one of those 92.1% of servers on a <10MB link. NP: for comparison, the Sept spike is a site replication. I image most web hosts have similar piles of deadweight site data that nobody but robots ever visit. I have crossed fingers that the new harvest will at least do If-Modified-Since on the old URLs with last harvests date on stuff like images? AYJ