<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
Is this an LSID issue? LSIDs essentially provide a binding service
between a name and one or more web services (we default to HTTP GET
bindings). It isn't really up to the LSID authority to administer any
policies regarding the web service; its job is simply to point at it. It is up
to the web service itself to handle things like throttling, authentication and
authorization.<br>
<br>
Imagine, for example, that the different services had different
policies. It may be reasonable not to restrict the getMetadata() calls
but to restrict the getData() calls.<br>
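<br>
A minimal sketch of what such per-call policies might look like on the
web service side, assuming a token-bucket limiter keyed by client and
operation. The names TokenBucket, Throttle and POLICIES, and the rates
shown, are hypothetical and not part of any LSID software:<br>
<pre wrap="">
```python
import time


class TokenBucket:
    """Allow `rate` calls per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.now = now                    # injectable clock, handy for testing
        self.tokens = float(capacity)     # start with a full bucket
        self.stamp = now()

    def allow(self):
        current = self.now()
        # Refill in proportion to elapsed time, never above capacity.
        refill = (current - self.stamp) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.stamp = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical per-operation policies: (calls per second, burst size).
# None, or no entry at all, means the operation is not restricted.
POLICIES = {
    "getMetadata": None,
    "getData": (1.0, 5),
}


class Throttle:
    """Keeps one bucket per (client, operation) pair."""

    def __init__(self, policies, now=time.monotonic):
        self.policies = policies
        self.now = now
        self.buckets = {}

    def allow(self, client, operation):
        policy = self.policies.get(operation)
        if policy is None:
            return True                   # unrestricted operation
        key = (client, operation)
        if key not in self.buckets:
            rate, burst = policy
            self.buckets[key] = TokenBucket(rate, burst, self.now)
        return self.buckets[key].allow()
```
</pre>
The service would call Throttle.allow(client, operation) before doing
any work and refuse the request when it returns False. How a client is
identified (IP address, key, login) is left open here.<br>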
<br>
The use of LSIDs does not create any new problems that weren't already
there with web page scraping, or with scraping of any other web service.<br>
<br>
Just my thoughts...<br>
<br>
Roger<br>
<br>
<br>
Ricardo Scachetti Pereira wrote:
<blockquote cite="mid44916F47.9090903@tdwg.org" type="cite">
<pre wrap=""> Sally,
You raised a really important issue that we had not really addressed
at the meeting. Thanks for that.
I would say that we should not constrain the resolution of LSIDs if
we expect our LSID infrastructure to work. LSIDs will be the basis of
our architecture, so we had better have good support for that.
However, that is surely a limiting factor. Also server efficiency will
likely vary quite a lot, depending on underlying system optimizations
and all.
So I think that the solution for this problem is in caching LSID
responses on the server LSID stack. Basically, after resolving an LSID
once, your server should be able to resolve it again and again really
quickly, until something in the metadata related to that ID changes.
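A minimal sketch of that caching idea, assuming the slow resolution step
can be wrapped in a function. ResolutionCache, its methods and the
example.org LSID are hypothetical names for illustration, not part of
the LSID software stack:

```python
class ResolutionCache:
    """Remember each LSID response until the underlying record changes."""

    def __init__(self, resolve):
        self.resolve = resolve   # hypothetical slow resolver: lsid -> response
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, lsid):
        if lsid in self.store:
            self.hits += 1
            return self.store[lsid]
        # First request for this LSID (or first since invalidation).
        self.misses += 1
        response = self.resolve(lsid)
        self.store[lsid] = response
        return response

    def invalidate(self, lsid):
        # Call this whenever the metadata behind the LSID is edited.
        self.store.pop(lsid, None)


def slow_resolve(lsid):
    # Stand-in for the real database lookup.
    return "metadata for " + lsid


cache = ResolutionCache(slow_resolve)
cache.get("urn:lsid:example.org:names:1")   # goes to the database
cache.get("urn:lsid:example.org:names:1")   # served from the cache
```

The cache only helps if whatever edits the data also calls invalidate()
for the affected ID; otherwise stale metadata would be served.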
I haven't looked at this aspect of the LSID software stack, but
maybe others can say something about it. In any case I'll do some
research on it and get back to you.
Again, thanks for bringing it up.
Cheers,
Ricardo
Sally Hinchcliffe wrote:
</pre>
<blockquote type="cite">
<pre wrap="">There are enough discontinuities in IPNI ids that 1,2,3 would quickly
run into the sand. I agree it's not a new problem - I just hate to
think I'm making life easier for the data scrapers
Sally
</pre>
<blockquote type="cite">
<pre wrap="">It can be a problem but I'm not sure if there is a simple solution ... and how different is the LSID crawler scenario from an <a class="moz-txt-link-freetext" href="http://www.ipni.org/ipni/plantsearch?id=">http://www.ipni.org/ipni/plantsearch?id=</a> 1,2,3,4,5 ... 9999999 scenario?
Paul
-----Original Message-----
From: <a class="moz-txt-link-abbreviated" href="mailto:tdwg-guid-bounces@mailman.nhm.ku.edu">tdwg-guid-bounces@mailman.nhm.ku.edu</a>
[<a class="moz-txt-link-freetext" href="mailto:tdwg-guid-bounces@mailman.nhm.ku.edu">mailto:tdwg-guid-bounces@mailman.nhm.ku.edu</a>]On Behalf Of Sally
Hinchcliffe
Sent: 15 June 2006 12:08
To: <a class="moz-txt-link-abbreviated" href="mailto:tdwg-guid@mailman.nhm.ku.edu">tdwg-guid@mailman.nhm.ku.edu</a>
Subject: [Tdwg-guid] Throttling searches [ Scanned for viruses ]
Hi all
another question that has come up here.
As discussed at the meeting, we're thinking of providing a complete
download of all IPNI LSIDs plus a label (name and author, probably)
which will be available as an annually produced download
Most people will play nice and just resolve one or two LSIDs as
required, but by providing a complete list, we're making it very easy
for someone to write a crawler that hits every LSID in turn and
basically brings our server to its knees
Anybody know of a good way of enforcing more polite behaviour? We can
make the download only available under a data supply agreement that
includes a clause limiting hit rates, or we could limit by IP address
(but this would ultimately block out services like Rod's simple
resolver). I believe Google's spell checker uses a key which has to
be passed in as part of the query - obviously we can't do that with
LSIDs
Any thoughts? Anyone think this is a problem?
Sally
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** <a class="moz-txt-link-abbreviated" href="mailto:S.Hinchcliffe@rbgkew.org.uk">S.Hinchcliffe@rbgkew.org.uk</a>
_______________________________________________
TDWG-GUID mailing list
<a class="moz-txt-link-abbreviated" href="mailto:TDWG-GUID@mailman.nhm.ku.edu">TDWG-GUID@mailman.nhm.ku.edu</a>
<a class="moz-txt-link-freetext" href="http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid">http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid</a>
</pre>
</blockquote>
<pre wrap="">*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** <a class="moz-txt-link-abbreviated" href="mailto:S.Hinchcliffe@rbgkew.org.uk">S.Hinchcliffe@rbgkew.org.uk</a>
</pre>
</blockquote>
</blockquote>
<br>
<br>
<pre class="moz-signature" cols="72">--
-------------------------------------
Roger Hyam
Technical Architect
Taxonomic Databases Working Group
-------------------------------------
<a class="moz-txt-link-freetext" href="http://www.tdwg.org">http://www.tdwg.org</a>
<a class="moz-txt-link-abbreviated" href="mailto:roger@tdwg.org">roger@tdwg.org</a>
+44 1578 722782
-------------------------------------
</pre>
</body>
</html>