There seem to be a number of issues here. To summarise the ones I can identify so far (48 posts from 20/10/08 4:31 PM to 21/10/08 1:47 PM NZDT):

International Bandwidth Usage: The harvest was initiated from international sites, and as such content owners were forced to pay for international bandwidth to return content. National Library is working with its contractors to discuss the possibility of a New Zealand based harvest site for any future use.

robots.txt: The matter of crawling sites which contain robots.txt files was a hot topic. Subtopics included the fact that robots.txt files would only be honored if the webmaster requested this; many felt such a request should not be necessary, as the files were there for a reason in the first place. It was also noted that robots.txt files are used for a variety of valid purposes.

Scan IPs: There was discussion around the IPs used to mount the scan. These appear to have been unknown to the group until logs were checked, despite the fact that they were provided by National Library on their website.

Lack of notification: There was a general feeling that more of an effort should have been made to notify industry about this harvest. The issues around international bandwidth and robots.txt were cited as reasons an extraordinary effort should have been made in this case. It was also noted that administrators were able to contact the National Library and request that their robots.txt files be honored; this only makes sense if they were aware of the harvest before it began. It was also noted that some smaller content providers 'lurk' on the NZNOG list to receive updates such as this. National Library have undertaken to increase notification on mailing lists (such as NZNOG) for any future harvests.

Missing NZ-only content: Since de-peering, a large amount of New Zealand content is only available from within New Zealand. This content will be missing from the current harvest, as it was conducted from an international source.
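For anyone unsure how robots.txt directives behave in practice: a compliant crawler checks the file before fetching each URL and skips anything disallowed for its user-agent. A minimal sketch using Python's standard-library parser (the user-agent name and site paths are made up for illustration, not taken from the actual harvest):

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (hypothetical site layout, for illustration only).
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A crawler that honors robots.txt would skip disallowed paths:
print(parser.can_fetch("ExampleHarvester", "/private/report.html"))  # False
print(parser.can_fetch("ExampleHarvester", "/public/index.html"))    # True
```

This is the behaviour administrators were asking the National Library to respect by default, rather than on request.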
Internet harvest vs real world books: Some discussion occurred around the comparison between collecting Internet content and the obligation of publishers to send copies of works to the National Library. The point was made that although publishers are required by law to deposit works with the Library, they are not required to do so at considerable personal expense (paying international traffic charges rather than local ones).

Ways to combat additional harvests: There was discussion around possible ways to avoid being harvested in the future. These centered around blocking IPs and blocking certain HTTP strings. It was mentioned that the National Library would rather people did not do this, and that contacting them to have a robots.txt file registered would be the preferable option.

Speed of Harvest: It was noted that although the majority of websites are indexed by Google on a fairly regular basis, Google takes a "slow, over time" approach to indexing, whereas the harvest took an "as fast as possible" approach. It was felt that this contributed to an unnecessary impact on some content providers' Internet links.

.nz Domain Names: A question was asked as to how the National Library was able to obtain a list of sites to harvest. The Domain Name Commissioner responded: "I can confirm that the .nz zone file has not been released to the National Library."

Please let me know if I've forgotten anything.

Regards,
Dean
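The blocking approaches discussed (refusing requests by source IP or by a string in the HTTP headers) can be sketched as a simple request filter. The IP address and user-agent substring below are placeholders for illustration, not the harvester's actual details:

```python
# Hypothetical request filter: refuse a request (e.g. with HTTP 403) if it
# comes from a blocked source IP or carries a blocked User-Agent substring.
# The values below are placeholders, not real harvester details.
BLOCKED_IPS = {"192.0.2.10"}  # 192.0.2.0/24 is a documentation-only range
BLOCKED_AGENT_SUBSTRINGS = ("exampleharvester",)

def should_block(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request should be refused."""
    if remote_ip in BLOCKED_IPS:
        return True
    agent = user_agent.lower()
    return any(s in agent for s in BLOCKED_AGENT_SUBSTRINGS)

print(should_block("192.0.2.10", "Mozilla/5.0"))             # True (IP match)
print(should_block("198.51.100.7", "ExampleHarvester/1.0"))  # True (agent match)
print(should_block("198.51.100.7", "Mozilla/5.0"))           # False
```

As the summary notes, the National Library's stated preference was that operators contact them to have a robots.txt file registered rather than block the crawl this way.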