So, simply blocking the IPs at a couple of routers makes much more sense to me.
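Before blocking anything at the router, it can help to confirm which client addresses the harvester is actually using. The following is only a rough sketch, assuming Apache-style access logs with the user agent in the last quoted field; the function name `harvester_ips`, the `agent_marker` parameter, and the third sample line (192.0.2.1, a documentation address) are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout: ip ident user [time] "request" status bytes "referer" "agent" ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def harvester_ips(lines, agent_marker="NLNZ_IAHarvester"):
    """Count requests per client IP whose user agent contains agent_marker."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and agent_marker in match.group("agent"):
            counts[match.group("ip")] += 1
    return counts

AGENT = ("Mozilla/5.0 (compatible; NLNZ_IAHarvester2013 "
         "+http://natlib.govt.nz/about-us/current-initiatives/web-harvest-2012)")

sample = [
    '207.241.226.40 - - [19/Jan/2013:15:53:04 +1300] '
    '"GET /index.html HTTP/1.0" 200 3653 "-" "%s" 1275' % AGENT,
    '207.241.226.39 - - [19/Jan/2013:19:27:40 +1300] '
    '"GET /robots.txt HTTP/1.0" 200 19 "-" "%s" 363' % AGENT,
    # Hypothetical non-harvester request, included only to show the filter.
    '192.0.2.1 - - [19/Jan/2013:19:30:00 +1300] '
    '"GET /index.html HTTP/1.0" 200 3653 "-" "ExampleBrowser/1.0" 100',
]

counts = harvester_ips(sample)
for ip in sorted(counts):
    print(ip, counts[ip])
```

In practice you would pass an open log file instead of `sample`, and the sorted addresses become the candidate block list.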
It's all coming from the following addresses for us (across lots of domains):

207.241.226.39
207.241.226.40

e.g.:

207.241.226.40 - - [19/Jan/2013:15:53:04 +1300] "GET /index.html HTTP/1.0" 200 3653 "-" "Mozilla/5.0 (compatible; NLNZ_IAHarvester2013 +http://natlib.govt.nz/about-us/current-initiatives/web-harvest-2012)" 1275
207.241.226.39 - - [19/Jan/2013:19:27:40 +1300] "GET /robots.txt HTTP/1.0" 200 19 "-" "Mozilla/5.0 (compatible; NLNZ_IAHarvester2013 +http://natlib.govt.nz/about-us/current-initiatives/web-harvest-2012)" 363

--
Jean-Francois Pirus | Technical Manager
francois(a)clearfield.com | Mob +64 21 640 779 | DDI +64 9 282 3401
Clearfield Software Ltd | Ph +64 9 358 2081 | www.clearfield.com