[Tdwg-guid] Throttling searches [ Scanned for viruses ]

Sally Hinchcliffe S.Hinchcliffe at kew.org
Thu Jun 15 13:43:45 CEST 2006


There are enough discontinuities in IPNI ids that a crawler counting 
1,2,3 would quickly run into the sand. I agree it's not a new problem. 
I just hate to think I'm making life easier for the data scrapers.
Sally


> It can be a problem but I'm not sure if there is a simple solution ... and how different is the LSID crawler scenario from an http://www.ipni.org/ipni/plantsearch?id= 1,2,3,4,5 ... 9999999 scenario?
> 
> Paul
> 
> -----Original Message-----
> From: tdwg-guid-bounces at mailman.nhm.ku.edu
> [mailto:tdwg-guid-bounces at mailman.nhm.ku.edu]On Behalf Of Sally
> Hinchcliffe
> Sent: 15 June 2006 12:08
> To: tdwg-guid at mailman.nhm.ku.edu
> Subject: [Tdwg-guid] Throttling searches [ Scanned for viruses ]
> 
> 
> Hi all
> another question that has come up here. 
> 
> As discussed at the meeting, we're thinking of providing a complete 
> download of all IPNI LSIDs plus a label (name and author, probably) 
> which will be available as an annually produced download
> 
> Most people will play nice and just resolve one or two LSIDs as 
> required, but by providing a complete list, we're making it very easy 
> for someone to write a crawler that hits every LSID in turn and 
> basically brings our server to its knees
> 
> Anybody know of a good way of enforcing more polite behaviour? We can 
> make the download only available under a data supply agreement that 
> includes a clause limiting hit rates, or we could limit by IP address 
> (but this would ultimately block out services like Rod's simple 
> resolver). I believe Google's spell checker uses a key which has to 
> be passed in as part of the query - obviously we can't do that with 
> LSIDs.
> 
> Any thoughts? Anyone think this is a problem? 
> 
> Sally
> *** Sally Hinchcliffe
> *** Computer section, Royal Botanic Gardens, Kew
> *** tel: +44 (0)20 8332 5708
> *** S.Hinchcliffe at rbgkew.org.uk
> 
> 
> _______________________________________________
> TDWG-GUID mailing list
> TDWG-GUID at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid
> 

*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk
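
For what it's worth, the "limit by IP address" option mentioned in the quoted message could be sketched as a per-client token bucket, roughly as below. This is only a sketch under assumptions: the class names (TokenBucket, PerClientThrottle) and the rate and burst values are made up for illustration and do not describe anything IPNI actually runs. Note that a shared service such as a public resolver would appear as a single IP, so it would need a higher limit or an exemption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenBucket:
    """One client's allowance: refills at `rate` tokens/second up to `capacity`."""
    rate: float
    capacity: float
    tokens: float
    updated: float

    def allow(self, now: float) -> bool:
        # Refill in proportion to the time elapsed since the last request.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        # Spend one token per request if one is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientThrottle:
    """Hypothetical helper: keeps one bucket per client IP address."""

    def __init__(self, rate: float = 1.0, capacity: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate          # illustrative: sustained requests per second
        self.capacity = capacity  # illustrative: allowed burst size
        self.clock = clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            # New clients start with a full bucket.
            bucket = TokenBucket(self.rate, self.capacity, self.capacity, now)
            self.buckets[client_ip] = bucket
        return bucket.allow(now)
```

A resolver front end would call throttle.allow(ip) once per request and turn the client away (for example with an HTTP 503) when it returns False. Occasional users never notice, while a crawler walking the full LSID list is slowed to the configured rate.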




