On 21/10/2008, at 11:38 AM, Jasper Bryant-Greene wrote:
On Tue, 21 Oct 2008 11:33:41 +1300 (NZDT), Simon Lyall wrote:
On Tue, 21 Oct 2008, Miskell, Craig wrote:
We already have, by having a robots.txt file. Shouldn't have to ask twice.
User-agent: *
Disallow: /recruitment
Which I think highlights the problem. Many people have robots.txt files because they have content they don't want archived by others; other people have load and bandwidth issues.
The National Library really has to ignore the first group but at the cost
of hitting the second group.
But surely they could obey robots.txt entries that specifically target them?
User-agent: NLNZHarvester2008
Disallow: /massive-collection-of-high-res-images
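
For illustration, here's a minimal sketch of how that would read from the crawler's side, using Python's standard urllib.robotparser (the agent name and path are just the ones from the example above; whether the Library's harvester actually checks per-agent rules this way is their call):

  from urllib.robotparser import RobotFileParser

  # Parse a robots.txt that only restricts the Library's harvester.
  rp = RobotFileParser()
  rp.parse([
      "User-agent: NLNZHarvester2008",
      "Disallow: /massive-collection-of-high-res-images",
  ])

  # The named harvester is blocked from the heavy content...
  print(rp.can_fetch("NLNZHarvester2008",
                     "/massive-collection-of-high-res-images"))   # False
  # ...but can still fetch everything else...
  print(rp.can_fetch("NLNZHarvester2008", "/recruitment"))        # True
  # ...and other agents are unaffected by the rule.
  print(rp.can_fetch("SomeOtherBot",
                     "/massive-collection-of-high-res-images"))   # True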
I believe one complaint was a lack of forward notice.

--
Nathan Ward