Great feedback Sally. I'll interleave my answers below.
Sally Hinchcliffe wrote:
Hi Roger and everyone
Hi Everyone,
I have been doing some work on a TaxonAPI. This is a high/low definition of the kind of thing I believe people publishing taxonomic data need to produce and people consuming it need to eat based on the work I have been involved in on TCS.
I don't know if it helps but more than a year ago Donald & I worked out a set of calls that he would need for indexing ipni for ECAT. I don't think they've ever actually been used (although they are implemented within ipni, they have not been very rigorously tested) but they might be of some use to look at if you haven't already. (We had requirements like 'changed since' and 'added since', which avoid flogging the entire ipni data set each time the index is rebuilt.)
Yes this is kind of inspired by things that Donald has been pushing in my direction but I am trying to make it generic and TCS orientated.
The incorporation of DublinCore metadata in the ProviderSpecificData element will give the ability to do the "modified since" searches. It also gives the ability to put other stuff in any of the DC fields to be searched on (keywords, for example - you could put the keywords MONOCOT or GRAY in, which might be useful for people like Donald).
This still needs expanding on and I will return to it.
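As a rough illustration of the "modified since" idea, here is a minimal Python sketch. It assumes each record carries a Dublin Core style modification date as an ISO string; the record layout, the dc_modified key and the function name are all hypothetical and not part of TCS or the TaxonAPI draft.

```python
from datetime import date


def modified_since(records, cutoff):
    """Return the records whose (hypothetical) dc_modified date is on or after cutoff."""
    return [r for r in records if date.fromisoformat(r["dc_modified"]) >= cutoff]


# Made-up records, purely for illustration.
records = [
    {"id": "name-1", "dc_modified": "2005-03-14"},
    {"id": "name-2", "dc_modified": "2006-01-09"},
]

# Only records touched since the last index rebuild need to be fetched again.
recent = modified_since(records, date(2006, 1, 1))
```

An indexer would remember the date of its last harvest and pass that as the cutoff, rather than pulling the whole data set each time.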
Other brief comments (with ipni hat firmly on my head) - on your TaxonApiMatching page, some of the terms you list are ones which we can respond to, but incompletely. For example, publication year (because it's not as important with botanical names) - we could send back some data but it would be incomplete (because of the state of parsing of the underlying data), i.e. some records published in the supplied year would not actually get returned. This may be the case (or not) for other data suppliers. Or they may not be able to return anything at all for a particular parameter if their level of normalisation does not support it.
This list is provisional and still very much under development. It is unlikely that every publisher will provide data for every field, so if, for example, IPNI has no record of a name with 1954 in the PublishedIn field you wouldn't get one back, even though there may be one there that just hasn't been scored to that field yet. Even if you had scored names to years you couldn't guarantee that you had done them all, I suppose. Another approach may be to have some form of capabilities or error response that would say 'unsupported field' or something, but this begins to break what is a very simple model of "get back all that matches".
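To make the 'unsupported field' option concrete, here is a small Python sketch. It assumes a provider that knows which fields it can search on; the field names, the SUPPORTED_FIELDS set and the shape of the response are hypothetical. The sketch ignores unsupported fields and reports them alongside the matches, which is only one possible design; a provider could equally refuse the whole request.

```python
# Hypothetical set of fields this provider can actually search on.
SUPPORTED_FIELDS = {"Genus", "SpecificEpithet"}


def match(records, criteria):
    """Match on the supported fields only and report the ones that were skipped."""
    unsupported = sorted(f for f in criteria if f not in SUPPORTED_FIELDS)
    usable = {f: v for f, v in criteria.items() if f in SUPPORTED_FIELDS}
    matches = [r for r in records if all(r.get(f) == v for f, v in usable.items())]
    return {"matches": matches, "unsupported_fields": unsupported}


# Made-up records, purely for illustration.
records = [
    {"id": "name-1", "Genus": "Poa", "SpecificEpithet": "annua"},
    {"id": "name-2", "Genus": "Quercus", "SpecificEpithet": "robur"},
]

response = match(records, {"Genus": "Poa", "PublicationYear": "1954"})
```

With something like this the client can at least tell "nothing matched" apart from "this field was never searched", at the cost of a slightly less simple response.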
Also your example parameters (Genus, Species epithet) don't actually appear in the list below the text - is this because it's a work in progress or because I have totally missed the point?
I am still writing the document so this is just what I bunged in. It will be improved as the parameter list is revised.
I didn't quite understand GetReferringTNames - there is a parameter which is a list of ids but apparently no other parameter to say which relationship (basionymof, homonymof) is being referred to. Does that mean it can't be set? The client would then have to parse through the resulting TCS to see which were the basionyms, if that is what they were interested in.
My logic here was not to have to define the relationships because the size of a TaxonNames graph will always be quite small - containing a maximum of a dozen or so TaxonName objects (this is open to discussion). I thought it would be easier to implement and understand returning everything than to provide the type of relationship. If you want the names that use this name as a basionym, for example, you are also likely to want to know of the names that are a SpellingCorrectionOf this name or a name that has been ConservedAgainst it.
Do you think we should definitely be able to specify the relationship? Perhaps we should restrict to only being able to ask for these related names for a single name rather than an array of them?
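For what it is worth, the client-side filtering this implies is small. The Python sketch below assumes the returned graph has already been parsed out of the TCS into (referring id, relationship, target id) triples; that triple layout, the ids and the function name are hypothetical, and the parsing itself is not shown.

```python
def referring_names(relationships, name_id, relationship_type=None):
    """Return ids of names that refer to name_id, optionally limited to one relationship."""
    return [
        source
        for source, relationship, target in relationships
        if target == name_id
        and (relationship_type is None or relationship == relationship_type)
    ]


# Made-up triples: (referring name id, relationship, target name id).
graph = [
    ("name-2", "basionymof", "name-1"),
    ("name-3", "SpellingCorrectionOf", "name-1"),
    ("name-4", "ConservedAgainst", "name-1"),
    ("name-5", "basionymof", "name-9"),
]

everything = referring_names(graph, "name-1")
basionym_only = referring_names(graph, "name-1", "basionymof")
```

With a graph of a dozen or so names, picking out one relationship type on the client is a one-line filter, so leaving the relationship parameter out of the call costs the client very little.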
On GetTNamesById, you point out that suppliers can refuse to respond (presumably in the case of a too-long list of ids) - alternatively I suppose they could just truncate at some predetermined number of records (something to mention in a capability document?)
Yes. This is a bit protocol specific. In Tapir you could return in the diagnostics section that it has been truncated.
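A minimal Python sketch of the truncate-and-report option follows. The MAX_IDS limit, the in-memory store and the response layout are hypothetical; in particular the diagnostics list here is just a list of strings and is not meant to mimic the real Tapir diagnostics format.

```python
MAX_IDS = 3  # hypothetical per-request limit a provider might advertise


def get_names_by_id(store, ids):
    """Serve at most MAX_IDS ids and say so in the diagnostics if the list was cut."""
    served = ids[:MAX_IDS]
    records = [store[i] for i in served if i in store]
    diagnostics = []
    if len(ids) > MAX_IDS:
        diagnostics.append(f"truncated: served first {MAX_IDS} of {len(ids)} ids")
    return {"records": records, "diagnostics": diagnostics}


# Made-up store, purely for illustration.
store = {f"name-{n}": {"id": f"name-{n}"} for n in range(1, 6)}

response = get_names_by_id(store, ["name-1", "name-2", "name-3", "name-4", "name-5"])
```

The client can then re-request the remaining ids in a second call, which is gentler than an outright refusal.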
Also we have the issue of deleted (or in IPNI's case suppressed) ids. At the moment IPNI will serve up suppressed records which it must do if the ids are to be persistent, but there's nothing in the response which could be used to indicate that the records are deprecated in any way. I don't know if this is a problem or not ...
Interesting one. You could put it in the metadata but it would be better in the schema itself. This is something that would have to be thought through...
Forgive me if I've missed the point on some of these issues ... Monday afternoon and all that
Don't think you have missed the point at all - seem to be pretty sharp for a Monday.
I need to work on some documentation stuff for the next few days which is why I pushed this out a little rough round the edges. I need some feedback to start shaping it further (or throwing it out!) once I have got the other stuff done.
Thanks,
Roger
Sally
It is framed as if it were a web service type of thing but I imagine that each of the method calls mentioned could equate to a Tapir template of some kind. I imagine a template containing a parameterized filter (not sure if those exist anymore).
This is still work in progress as I need to define the list of parameters more closely and sort the greater than/less than issue for dates and a few other bits. Plus no one has seen it to comment on!
You can read more about it here:
http://www.tdwg.hyam.net/twiki/bin/view/TIP/TaxonAPI
I would be most grateful for your comments.
How does this fit with Tapir?
Roger
--
Roger Hyam Technical Architect Taxonomic Databases Working Group
http://www.tdwg.org roger@tdwg.org
+44 1578 722782
tdwg-tapir mailing list tdwg-tapir@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tapir_lists.tdwg.org
*** Sally Hinchcliffe *** Computer section, Royal Botanic Gardens, Kew *** tel: +44 (0)20 8332 5708 *** S.Hinchcliffe@rbgkew.org.uk