[Tdwg-guid] Throttling searches
Sally Hinchcliffe
S.Hinchcliffe at kew.org
Thu Jun 15 13:43:45 CEST 2006
There are enough gaps in the IPNI id sequence that a crawler simply
counting 1, 2, 3 ... would quickly run into the sand. I agree it's not
a new problem - I just hate to think I'm making life easier for the
data scrapers.
Sally
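
On the option of limiting by IP address raised in the quoted message
below: one common way to cap hit rates per client is a token bucket.
What follows is only a minimal Python sketch, not anything IPNI or an
LSID resolver actually runs. The class name TokenBucketLimiter, the
rate and capacity parameters, and the client key are all hypothetical,
chosen for illustration.

```python
import time


class TokenBucketLimiter:
    """Per-client token bucket (illustrative sketch, hypothetical names).

    Each client may burst up to `capacity` requests; tokens are then
    refilled at `rate` tokens per second.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the limiter can be tested
        self._buckets = {}          # client key -> (tokens, last_seen)

    def allow(self, client):
        """Return True if `client` may make a request right now."""
        now = self.clock()
        # A client seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(client, (self.capacity, now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[client] = (tokens - 1, now)
            return True
        self._buckets[client] = (tokens, now)
        return False


# Example: at most 2 requests in a burst, refilled at 1 request per second.
limiter = TokenBucketLimiter(rate=1.0, capacity=2)
if not limiter.allow("192.0.2.10"):
    pass  # refuse or delay the request here
```

The client key is whatever identifies the caller. Keying on IP address
has the drawback already noted in the quoted message: a shared service
such as Rod's simple resolver would appear as one heavy client, so it
would probably need a larger allowance or an exemption.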
> It can be a problem, but I'm not sure there is a simple solution ...
> and how different is the LSID crawler scenario from an
> http://www.ipni.org/ipni/plantsearch?id= 1,2,3,4,5 ... 9999999
> scenario?
>
> Paul
>
> -----Original Message-----
> From: tdwg-guid-bounces at mailman.nhm.ku.edu
> [mailto:tdwg-guid-bounces at mailman.nhm.ku.edu]On Behalf Of Sally
> Hinchcliffe
> Sent: 15 June 2006 12:08
> To: tdwg-guid at mailman.nhm.ku.edu
> Subject: [Tdwg-guid] Throttling searches
>
>
> Hi all
> another question that has come up here.
>
> As discussed at the meeting, we're thinking of providing a complete
> list of all IPNI LSIDs plus a label (name and author, probably),
> made available as a download that is regenerated annually.
>
> Most people will play nice and just resolve one or two LSIDs as
> required, but by providing a complete list, we're making it very easy
> for someone to write a crawler that hits every LSID in turn and
> basically brings our server to its knees
>
> Anybody know of a good way of enforcing more polite behaviour? We can
> make the download only available under a data supply agreement that
> includes a clause limiting hit rates, or we could limit by IP address
> (but this would ultimately block out services like Rod's simple
> resolver). I believe Google's spell checker uses a key which has to
> be passed in as part of the query - obviously we can't do that with
> LSIDs.
>
> Any thoughts? Anyone think this is a problem?
>
> Sally
> *** Sally Hinchcliffe
> *** Computer section, Royal Botanic Gardens, Kew
> *** tel: +44 (0)20 8332 5708
> *** S.Hinchcliffe at rbgkew.org.uk
>
>
> _______________________________________________
> TDWG-GUID mailing list
> TDWG-GUID at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk