Great feedback Sally. I'll interleave my answers below.
Sally Hinchcliffe wrote:
Hi Roger and everyone
Hi Everyone,
I have been doing some work on a TaxonAPI. This is a high/low definition
of the kind of thing I believe people publishing taxonomic data need to
produce and people consuming it need to eat based on the work I have
been involved in on TCS.
I don't know if it helps but more than a year ago Donald & I worked
out a set of calls that he would need for indexing ipni for ECAT. I
don't think they've ever actually been used (although they are
implemented within ipni but not very rigorously tested) but they
might be of some use to look at if you haven't already.
(we had requirements like 'changed since' and 'added since' which
avoid flogging the entire ipni data set each time the index was
rebuilt)
Yes this is kind of inspired by things that Donald has been pushing in
my direction but I am trying to make it generic and TCS orientated.
The incorporation of DublinCore metadata in the ProviderSpecificData
element will give the ability to do the "modified since" searches. It
also gives the ability to put other stuff in any of the DC fields to be
searched on (keywords for example - you could put the keywords MONOCOT
or GRAY in which might be useful for people like Donald).
This still needs expanding on and I will return to it.
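
As a rough sketch of the "modified since" idea (the record layout, the
field names `modified` and `subject`, the ids and the function name are all
made up for illustration and are not part of TCS or the TaxonAPI draft),
a provider could filter on a Dublin Core style date and keyword like this:

```python
from datetime import date

# Hypothetical records: each carries a Dublin Core style "modified" date
# and a list of free-text "subject" keywords. Names and ids are placeholders.
RECORDS = [
    {"id": "urn:example:1", "name": "Aus bus", "modified": date(2005, 3, 1), "subject": ["MONOCOT"]},
    {"id": "urn:example:2", "name": "Aus cus", "modified": date(2006, 1, 15), "subject": ["MONOCOT", "GRAY"]},
    {"id": "urn:example:3", "name": "Aus dus", "modified": date(2006, 2, 20), "subject": []},
]


def modified_since(records, cutoff, keyword=None):
    """Return records changed on or after `cutoff`, optionally limited to a keyword."""
    hits = [r for r in records if r["modified"] >= cutoff]
    if keyword is not None:
        hits = [r for r in hits if keyword in r["subject"]]
    return hits


# An indexer that last harvested on 1 January 2006 asks only for what changed.
changed = modified_since(RECORDS, date(2006, 1, 1))
print([r["id"] for r in changed])
```

The point is only that the indexer never has to pull the whole data set
again; how the date and keywords are actually encoded is still open.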
Other brief comments (with ipni hat firmly on my head) - on your
TaxonApiMatching page, some of the terms you list are ones which we
can respond to but incompletely. For example publication year
(because it's not as important with botanical names) - we could send
back some data but it would be incomplete (because of the state of
parsing of the underlying data), i.e. some records published in the
supplied year would not actually get returned. This may be the case
(or not) for other data suppliers. Or they may not be able to return
anything at all for a particular parameter if their level of
normalisation does not support it.
This list is provisional and still very under development. It is
unlikely that every publisher will provide data for every field so if,
for example, IPNI have no record of a name with 1954 in the PublishedIn
field you wouldn't get one back even though there may be one there that
just hasn't been scored to that field yet. Even if you had scored names
to years you couldn't guarantee that you had done them all I suppose.
Another approach may be to have some form of capabilities or error
response that would say 'unsupported field' or something but this
begins to break what is a very simple model of "get back all that
matches".
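
To make the trade-off concrete, here is a minimal sketch of the
'unsupported field' variant. Everything in it is an assumption for the
sake of illustration: the field names, the `SUPPORTED_FIELDS` set and the
shape of the diagnostics are not taken from the TaxonAPI pages.

```python
# Hypothetical list of fields this provider can actually search on.
SUPPORTED_FIELDS = {"genus", "specific_epithet"}

# Placeholder name records; published_year is only partly scored.
NAMES = [
    {"genus": "Aus", "specific_epithet": "bus", "published_year": None},
    {"genus": "Aus", "specific_epithet": "cus", "published_year": 1954},
]


def match_names(names, **criteria):
    """Return (matches, diagnostics).

    If any criterion uses a field the provider does not support, no
    matches are returned and the diagnostics say why, so the client can
    tell "nothing matched" apart from "cannot answer".
    """
    unsupported = sorted(f for f in criteria if f not in SUPPORTED_FIELDS)
    if unsupported:
        return [], [f"unsupported field: {f}" for f in unsupported]
    matches = [n for n in names if all(n.get(f) == v for f, v in criteria.items())]
    return matches, []


matches, diagnostics = match_names(NAMES, genus="Aus", published_year=1954)
print(matches, diagnostics)
```

The cost is visible in the return type: the caller now has to look at two
things instead of one, which is exactly the loss of simplicity I am
worried about.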
Also your example parameters (Genus, Species epithet) don't actually
appear in the list below the text - is this because it's a work in
progress or because I have totally missed the point?
I am still writing the document so this is just what I bunged in. It
will be improved as the parameter list is revised.
I didn't quite understand GetReferringTNames - there is a parameter
which is a list of ids but apparently no other parameter to say which
relationship (basionymof, homonymof) is being referred to. Does that
mean that can't be set? The client would then have to parse through
the resulting TCS to see which were the basionyms if that is what
they were interested in.
My logic here was not to have to define the relationships because the
size of a TaxonNames graph will always be quite small - containing a
maximum of a dozen or so TaxonName objects (this is open to
discussion). I thought it would be easier to implement and understand
returning everything than to provide the type of relationship. If you
want the names that use this name as a basionym for example you are
also likely to want to know of the names that are a
SpellingCorrectionOf this name or a name that has been ConservedAgainst
it.
Do you think we should definitely be able to specify the relationship?
Perhaps we should restrict to only being able to ask for these related
names for a single name rather than an array of them?
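
A small sketch of the two options, assuming relationships can be held as
simple (referring id, relationship, target id) triples. The triples, the
ids and the relationship label "HasBasionym" are invented here for
illustration; only the idea of returning everything and letting the client
filter comes from the text above.

```python
# Hypothetical relationship triples: (referring name id, relationship, target name id).
RELATIONSHIPS = [
    ("name:2", "HasBasionym", "name:1"),
    ("name:3", "SpellingCorrectionOf", "name:1"),
    ("name:4", "ConservedAgainst", "name:1"),
    ("name:5", "HasBasionym", "name:9"),
]


def get_referring_names(relationships, target_ids):
    """Return every triple that refers to any of the given ids, whatever the relationship."""
    targets = set(target_ids)
    return [triple for triple in relationships if triple[2] in targets]


def filter_by_relationship(referring, relationship):
    """Client-side step: keep only the referring ids with one kind of relationship."""
    return [source for source, rel, _target in referring if rel == relationship]


everything = get_referring_names(RELATIONSHIPS, ["name:1"])
print(filter_by_relationship(everything, "HasBasionym"))
```

With a graph of a dozen or so names the client-side filter is trivial,
which is the argument for not adding a relationship parameter to the call.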
On GetTNamesById, you point out that suppliers can refuse to respond
(presumably in the case of a too-long list of ids) - alternatively I
suppose they could just truncate at some predetermined number of
records (something to mention in a capability document?)
Yes. This is a bit protocol specific. In Tapir you could return in the
diagnostics section that it has been truncated.
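
Roughly what I have in mind for truncation, independent of protocol. The
limit, the store and the wording of the diagnostic are hypothetical; this
is not how Tapir formats its diagnostics, just the general shape of the
behaviour.

```python
MAX_RECORDS = 3  # hypothetical provider limit, e.g. advertised in a capability document

# Placeholder store of ten name records keyed by id.
STORE = {f"name:{i}": {"id": f"name:{i}"} for i in range(1, 11)}


def get_names_by_id(store, ids, limit=MAX_RECORDS):
    """Return (records, diagnostics), serving at most `limit` of the requested ids."""
    requested = list(ids)
    served = requested[:limit]
    # Unknown ids are silently skipped in this sketch.
    records = [store[name_id] for name_id in served if name_id in store]
    diagnostics = []
    if len(requested) > limit:
        diagnostics.append(
            f"truncated: returned first {limit} of {len(requested)} requested ids"
        )
    return records, diagnostics


records, diagnostics = get_names_by_id(STORE, [f"name:{i}" for i in range(1, 6)])
print(len(records), diagnostics)
```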
Also we have the issue of deleted (or in IPNI's case suppressed) ids.
At the moment IPNI will serve up suppressed records which it must do
if the ids are to be persistent, but there's nothing in the response
which could be used to indicate that the records are deprecated in
any way. I don't know if this is a problem or not ...
Interesting one. You could put it in the metadata but it would be
better in the schema itself. This is something that would have to be
thought through...
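
Purely as a sketch of what "in the schema itself" might mean: each record
could carry a flag so that a suppressed id still resolves but is clearly
marked. The `suppressed` attribute and the record class below are
assumptions for illustration, not anything that exists in TCS today.

```python
from dataclasses import dataclass


@dataclass
class NameRecord:
    id: str
    full_name: str
    suppressed: bool = False  # hypothetical flag, not a real TCS element


# Placeholder store: name:2 has been suppressed but its id must stay resolvable.
STORE = {
    "name:1": NameRecord("name:1", "Aus bus"),
    "name:2": NameRecord("name:2", "Aus cus", suppressed=True),
}


def get_name(store, name_id):
    """Resolve any known id, suppressed or not, so that ids stay persistent."""
    return store.get(name_id)


def current_only(records):
    """Client-side helper: drop records the provider has marked as suppressed."""
    return [r for r in records if not r.suppressed]


print(get_name(STORE, "name:2"))
print([r.id for r in current_only(STORE.values())])
```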
Forgive me if I've missed the point on some of these issues ...
Monday afternoon and all that
Don't think you have missed the point at all - seem to be pretty sharp
for a Monday.
I need to work on some documentation stuff for the next few days which
is why I pushed this out a little rough round the edges. I need some
feedback to start shaping it further (or throwing it out!) once I have
got the other stuff done.
Thanks,
Roger
Sally
It is framed as if it were a web service type of thing but I imagine
that each of the method calls mentioned could equate to a Tapir template
of some kind. I imagine a template containing a parameterized filter
(not sure if those exist anymore).
This is still work in progress as I need to define the list of
parameters more closely and sort the greater than/less than issue for
dates and a few other bits. Plus no one has seen it to comment on!
You can read more about it here:
http://www.tdwg.hyam.net/twiki/bin/view/TIP/TaxonAPI
I would be most grateful for your comments.
How does this fit with Tapir?
Roger
--
-------------------------------------
Roger Hyam
Technical Architect
Taxonomic Databases Working Group
-------------------------------------
http://www.tdwg.org
roger@tdwg.org
+44 1578 722782
-------------------------------------
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe@rbgkew.org.uk