Congrats, Regan. That has to be the nicest way of saying "RTFA" I've
seen in a while. ;)
On Thu, Apr 15, 2010 at 11:57 PM, Regan Murphy
***** Information about the user agent is available from http://bit.ly/nlnzwebharvest and we don't yet know what IP addresses will be used. *****
Fairly sure the robots.txt was mentioned in the URL provided..... !
-----Original Message-----
From: nznog-bounces(a)list.waikato.ac.nz [mailto:nznog-bounces(a)list.waikato.ac.nz] On Behalf Of Scott Weeks
Sent: Thursday, 15 April 2010 2:01 p.m.
To: nznog(a)list.waikato.ac.nz
Subject: Re: [nznog] Fwd: Notification of web harvest & consultation report
--- Gordon.Paynter(a)natlib.govt.nz wrote: From: "Gordon Paynter"
Information about the user agent is available from http://bit.ly/nlnzwebharvest and we don't yet know what IP addresses will be used.
We have no plans to use If-Modified-Since (or ETag, or similar approaches) for comparison with the 2008 harvest.
If you have concerns about how the crawler may behave on specific websites, feel free to email us directly at web-harvest-2010(a)natlib.govt.nz or get in touch via our feedback form.
-------------------------------------------------
Does it honor the robots.txt file?
scott