There are enough discontinuities in IPNI ids that 1,2,3 would quickly run into the sand. I agree it's not a new problem - I just hate to think I'm making life easier for the data scrapers.
Sally
It can be a problem but I'm not sure if there is a simple solution ... and how different is the LSID crawler scenario from an http://www.ipni.org/ipni/plantsearch?id= 1,2,3,4,5 ... 9999999 scenario?
Paul
-----Original Message-----
From: tdwg-guid-bounces@mailman.nhm.ku.edu [mailto:tdwg-guid-bounces@mailman.nhm.ku.edu] On Behalf Of Sally Hinchcliffe
Sent: 15 June 2006 12:08
To: tdwg-guid@mailman.nhm.ku.edu
Subject: [Tdwg-guid] Throttling searches [ Scanned for viruses ]
Hi all, another question that has come up here.
As discussed at the meeting, we're thinking of providing a complete list of all IPNI LSIDs plus a label (name and author, probably), which will be available as an annually produced download.
Most people will play nice and just resolve one or two LSIDs as required, but by providing a complete list, we're making it very easy for someone to write a crawler that hits every LSID in turn and basically brings our server to its knees.
Anybody know of a good way of enforcing more polite behaviour? We can make the download only available under a data supply agreement that includes a clause limiting hit rates, or we could limit by IP address (but this would ultimately block out services like Rod's simple resolver). I believe Google's spell checker uses a key which has to be passed in as part of the query - obviously we can't do that with LSIDs.
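To make the "limit by IP address" option concrete, here is a minimal sketch of a per-client token bucket. It is only an illustration, not anything IPNI runs: the class name `PerClientThrottle`, the `rate_per_sec` and `burst` parameters, and the idea of keying on the caller's IP address are all assumptions.

```python
import time


class PerClientThrottle:
    """Hypothetical token-bucket limiter keyed by client (e.g. an IP address)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = float(rate_per_sec)   # tokens added per second
        self.burst = float(burst)         # maximum tokens a client can hold
        self.clock = clock                # injectable so the logic can be tested
        self._buckets = {}                # client -> (tokens, last_seen_time)

    def allow(self, client):
        """Return True if this client may resolve one more LSID right now."""
        now = self.clock()
        # A client we have not seen before starts with a full bucket.
        tokens, last = self._buckets.get(client, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[client] = (tokens - 1.0, now)
            return True
        self._buckets[client] = (tokens, now)
        return False
```

A resolver front end would call `allow(ip)` before each lookup and return a "slow down" response when it comes back False. A shared service such as a public resolver could be given its own, more generous throttle instance rather than being blocked outright, though that too is an assumption about how one might deploy it.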
Any thoughts? Anyone think this is a problem?
Sally
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe@rbgkew.org.uk
TDWG-GUID mailing list TDWG-GUID@mailman.nhm.ku.edu http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid