[tdwg-tag] [tdwg-rdf: 105] Re: Any TCS users with experiences to report?
Steve Baskauf
steve.baskauf at vanderbilt.edu
Wed Nov 28 17:27:32 CET 2012
Excerpts from what Richard Pyle wrote and responses:
> I'm not saying this ever will happen, or even should happen. But we've
> developed the data model such that it *could* happen, if it turns out to be
> a useful mechanism for mapping specimens to published taxon concepts.
>
>
>> As so often is the case, I think the problem here boils down to
>> identifiers and the metadata that we associate with them.
>>
>
> Absolutely!!! This is often not intuitive stuff, so the tricky part is
> getting people to apply identifiers and cross-link them in an appropriate
> and consistent way.
>
As a person who isn't really sure if he believes in RDF (and the
co-convener of the RDF Task Group) and as a person who finds this whole
conversation more annoying than about anything else he can think of (but
who is actually trying to walk this walk with his metadata), I'm saying
that we are there. HTTP URIs are not THE way to create identifiers but
they are A way to create identifiers that demonstrably work. RDF is not
THE way to cross link identifiers but it is A way to cross link
identifiers. History is full of examples where a way of doing things
that wasn't the best won out because it was there when needed. (I'm
thinking typewriter keyboard.)
>
>> Let's say in the real-life example above, somebody (we
>> can say GNUB) assigns a persistent identifier (perhaps a
>> URI constructed from an opaque UUID) to "Juncus diffusissimus Buckl. sec
>> L. Urbatsch 2009".
>
>> We could say with an rdf:type statement that the resource
>> identified by the URI is a TNU. We can give that resource a
>> tc:hasName property linking it to the name which is
>> represented by the string "Juncus diffusissimus Buckl.".
>> (I'm not sure what property we use to say that L. Urbatsch made the
>> assertion).
>
> I'm not sure how you would do it in RDF, but if it's any help the relevant
> DwC term is taxon:nameAccordingToID.
>
This is a major part of why I'm interested in this conversation. All of
the dwc: "ID" terms are flawed for use in RDF for technical reasons that
I've described elsewhere (see
http://code.google.com/p/tdwg-rdf/wiki/Beginners4Vocabularies#4.6._The_Darwin_Core_vocabulary_normative_definition_as_RDF
http://code.google.com/p/tdwg-rdf/wiki/DublinCore#1.3.2.4._dcterms:identifier_(AC)
and
http://code.google.com/p/tdwg-rdf/wiki/DublinCore#2.4._Possible_courses_of_action_where_users_may_be_inclined_to_u
if you care about this). I believe that tc:accordingTo would be a
fairly exact equivalent to dwc:nameAccordingToID which does not have the
subproperty problems the DwC ID terms have in RDF. So the question in
my mind (and I think a question posed explicitly earlier in this thread)
is whether enough of the technical stuff you want to do with TNUs can be
done with the TDWG Taxon Concept ontology which is based on a ratified
TDWG standard (TCS).
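As a rough illustration of the metadata being discussed, the following Python sketch writes the description of one TNU as plain subject-predicate-object tuples. It is only a sketch under stated assumptions: the example.org URIs are invented placeholders, the helper name objects is hypothetical, and the prefixed terms (tc:hasName, tc:nameString, tc:accordingTo, tc:TaxonConcept) are simply the ones named in this message, kept as plain strings rather than expanded to namespace URIs.

```python
# Placeholder identifiers: the example.org URIs below are invented for illustration.
TNU = "http://example.org/tnu/juncus-diffusissimus-sec-urbatsch-2009"
NAME = "http://example.org/name/juncus-diffusissimus-buckl"
REFERENCE = "http://example.org/reference/urbatsch-2009"

# Each statement is a (subject, predicate, object) tuple. Prefixed names such
# as tc:accordingTo are kept as plain strings rather than full namespace URIs.
triples = [
    (TNU, "rdf:type", "tc:TaxonConcept"),
    (TNU, "tc:nameString", "Juncus diffusissimus Buckl."),
    (TNU, "tc:hasName", NAME),
    (TNU, "tc:accordingTo", REFERENCE),
]


def objects(statements, subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in statements if s == subject and p == predicate]


# Who made the assertion? Follow tc:accordingTo from the TNU.
print(objects(triples, TNU, "tc:accordingTo"))
```

A real implementation would more likely use an RDF library and full URIs; the tuples here only show which statements hang off the TNU identifier.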
> It's certainly part of the metadata for the TNU itself:
>
> There is a TNU for diffusissimus within the Reference of Buckl. This is the
> original description of the epithet, so it is also the Protonym for
> diffusissimus. There is also a TNU for the genus name Juncus as used within
> the Reference of Buckl. If, in the same publication, Buckl. also
> established that genus, then the genus would also be the Protonym TNU.
> However, the genus Juncus was established by L., so there is another TNU for
> Juncus that is the protonym (Juncus L.), and the TNU for the usage of Juncus
> within Buckl. links to the TNU of Juncus L. (the Protonym).
>
> So, that's three TNUs:
> 1. Juncus L. sec. L. (Protonym for the genus Juncus)
> 2. Juncus L. sec. Buckl. (links to 1 as ProtonymID)
> 3. diffusissimus Buckl. sec. Buckl. (Protonym for the species diffusissimus,
> links to 2 as parent)
>
> There are also two more TNUs linked to the Urbatsch 2009 publication:
> 4. Juncus L. sec. Urbatsch (links to 1 as ProtonymID)
> 5. diffusissimus Buckl. sec. Urbatsch (links to 3 as ProtonymID, and to 4 as
> parent)
>
>
>> Now let's say that L. Urbatsch publishes a paper describing in detail her
>> concept of Juncus diffusissimus Buckl.
>>
>
> Do you mean the 2009 paper of "Juncus diffusissimus Buckl. sec L. Urbatsch
> 2009"? (In which case we have the TNUs as 4 & 5).... or do you mean a later
> paper (2010)? In which case we'd need two more TNUs. Is the "2009" thing a
> specimen determination, or a publication? It doesn't matter -- I just want
> to make sure I'm following your example correctly.
>
The 2009 "thing" is something that L. Urbatsch had in his head - the idea
of how he thought the name "Juncus diffusissimus Buckl." should be
applied to real organisms and the specimens that come from them. We
don't know what that idea is but presumably he had one and we could
assign a persistent identifier to it and describe it using the string
"Juncus diffusissimus Buckl. sec L. Urbatsch 2009". I was thinking
that was what you meant by TNU. Whether or not it is better to
proliferate a bunch of additional TNU instances, assign them their own
identifiers, and relate them to each other is a technical detail that as
an end user I'm happy to let you take care of within your GNUB system.
End users want a persistent identifier of some sort to link
identification instances to. They will hopefully find it through a
user-friendly interface that hides the ugly details you are describing here.
[some quotes omitted for brevity]
>
>> The point I'm trying to make is that as long as this "thing"
>> that we are variously calling "taxon name usage", "taxon concept",
>> "shallow taxonomic concept", or "deep taxonomic concept"
>> can be assigned an identifier, what really matters is the metadata
>> we associate with it, not really what we call it.
>>
>
> Absolutely! With emphasis on "really matters". It's certainly important to
> mint persistent identifiers, but it's equally (more?) important to make sure
> we understand the "thing" the identifier represents. We have a pretty
> stable & solid definition of a TNU, that has emerged from many NOMINA
> meetings over a number of years. What I don't think has been done yet, is
> any real analysis of whether the appropriate subset of TNUs accompanied by a
> robust concept definition *are* the concepts (i.e., the TNU GUID is the
> concept GUID); or whether there should be a separate GUID minted explicitly
> for the concept, which then links back to the TNU GUID as its "definition"
> or "source" (or whatever). I can see it working both ways.
>
I looked at the TDWG Taxon Concept ontology (see
http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl
) and came up with the following scenario. Let's say that I assign a
persistent identifier [URI1] to the TNU we are talking about here,
"Juncus diffusissimus Buckl. sec L. Urbatsch 2009" and that I'm right
that by TNU we mean the idea that Lowell Urbatsch had in his head about
how he meant to apply the name (sorry to drag him into this randomly - I
actually don't know him).
We can assign a tc:accordingTo property to [URI1] with the object of
that property being some kind of resource whose metadata says that
Lowell Urbatsch used the name in some sense known to him in 2009. (We
could work out the details of how one would write that metadata but
there probably is enough stuff in the Dublin Core and FOAF vocabularies
to do it. I think later in your email you said GNUB calls it a
"Reference".) Now let's say that in 2012 I call him on the phone and
say "hey Lowell, what taxonomic treatment were you thinking about in
2009 when you annotated that specimen in the NLU herbarium with barcode
LSU00000428?" and he says "Gleason and Cronquist, 1991". I have two
choices:
1. add a tc:describedBy property to the metadata for [URI1] with the
value urn:isbn:0893273651 (a persistent URI representing Gleason and
Cronquist 1991) and "promote" the resource identified by [URI1] from
generic TNU to "deeper" taxonomic concept.
2. create a different [URI2] which has a property/value pair of
tc:describedBy urn:isbn:0893273651 , then somehow relate the royal URI2
taxonomic concept to the lowly URI1 generic TNU.
If we use owl:sameAs to relate the two instances (i.e. assert [URI2]
owl:sameAs [URI1] ) then there is no advantage to option two. All
statements made about [URI1] would apply to [URI2] and vice-versa. All
we would be accomplishing is doubling the number of triples that we have
to keep track of. The question is whether this would result in "bad" or
"silly" statements (the essence of Bob Morris' objection to owl:sameAs,
I think). If so, then we need some kind of new term to relate royal
Taxon Concepts ("deep" taxonomic concepts) to lowly generic TNUs
("shallow" taxonomic concepts). I don't think such a term exists at the
moment.
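To make the owl:sameAs point concrete, here is a small Python sketch that copies every statement across owl:sameAs links, which is a simplified stand-in for what a reasoner would infer. The example.org URIs and the function names equivalents and expand are hypothetical, and only subject and object positions are handled.

```python
from itertools import product

# Placeholder identifiers: the example.org URIs are invented for illustration.
URI1 = "http://example.org/tnu/URI1"
URI2 = "http://example.org/concept/URI2"
REFERENCE = "http://example.org/reference/urbatsch-2009"
TREATMENT = "urn:isbn:0893273651"
SAME_AS = "owl:sameAs"


def equivalents(triples):
    """Map each term to the set of terms linked to it by owl:sameAs."""
    groups = {}
    for s, p, o in triples:
        if p == SAME_AS:
            merged = groups.get(s, {s}) | groups.get(o, {o})
            for term in merged:
                groups[term] = merged
    return groups


def expand(triples):
    """Restate every non-sameAs triple for all equivalent subjects and objects."""
    groups = equivalents(triples)
    result = set()
    for s, p, o in triples:
        if p == SAME_AS:
            continue
        for s2, o2 in product(groups.get(s, {s}), groups.get(o, {o})):
            result.add((s2, p, o2))
    return result


asserted = {
    (URI1, "tc:accordingTo", REFERENCE),
    (URI2, "tc:describedBy", TREATMENT),
    (URI2, SAME_AS, URI1),
}
expanded = expand(asserted)

# Two descriptive statements were asserted; after expansion each one holds
# for both URIs, so there are four.
print(len(expanded))
```

Under this reading, nothing distinguishes [URI2] from [URI1] once the sameAs link is asserted: the tc:describedBy statement made about [URI2] also holds for [URI1], and the tc:accordingTo statement made about [URI1] also holds for [URI2].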
The advantage of choice 1 is that we have off-the-shelf technology (the
TDWG Taxon Concept ontology based on a ratified TDWG standard, TCS). If
we go with choice 2 (and don't use owl:sameAs to relate the two URIs),
then we doom ourselves to another five+ years of thrashing out some new
vocabulary or ontology for relating TNUs to Taxon Concepts. So here is
where we would be if we went with choice 1:
1. The rdf:type of the "thing" is tc:Taxon or tc:TaxonConcept . The
ontology declares them to be equivalent classes.
2. The minimal requirement for a tc:Taxon is to have a persistent
identifier and a name associated with it, either through a tc:nameString
literal or better yet a tc:hasName property whose value is a persistent
URI that provides more extensive metadata than a string. uBio comes to
mind here as a giant source of name strings with assigned persistent
identifiers. A tc:Taxon instance having only name information would be
a nominal taxon - an undesirable but probably common circumstance.
3. If the tc:Taxon instance has a tc:accordingTo property, it is
elevated to TNU status because we can know who used the name and when if
we investigate the object of the tc:accordingTo property further.
Maybe we can learn more about this kind of tc:Taxon instance later, but
in most cases probably not.
4. If the tc:Taxon instance has a tc:describedBy property, then it is
elevated to full taxonomic concept status because it would be related to
the persistent identifier of a published taxonomic treatment. One could
then theoretically discover all kinds of cool relationships to other
full-blown concepts using the tools that Nico and Rich are going to
develop.
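The four points above amount to a simple rule for deciding how "deep" a tc:Taxon instance is from the properties it carries. A minimal Python sketch of that rule follows. The function name taxon_level and the returned labels are hypothetical, and two details are assumptions rather than anything stated above: tc:describedBy is checked first regardless of whether tc:accordingTo is also present, and an instance with no name at all is reported as "incomplete".

```python
def taxon_level(properties):
    """Classify a tc:Taxon instance by the set of property names it carries."""
    has_name = bool(properties & {"tc:nameString", "tc:hasName"})
    if not has_name:
        # Assumption: without a name the minimal requirement is not met.
        return "incomplete"
    if "tc:describedBy" in properties:
        # Linked to a published treatment: full taxonomic concept.
        return "taxonomic concept"
    if "tc:accordingTo" in properties:
        # We know who used the name and when: taxon name usage.
        return "TNU"
    # Only name information is available.
    return "nominal taxon"


print(taxon_level({"tc:nameString"}))
print(taxon_level({"tc:hasName", "tc:accordingTo"}))
print(taxon_level({"tc:hasName", "tc:accordingTo", "tc:describedBy"}))
```

Adding a tc:describedBy property to an existing instance moves it from "TNU" to "taxonomic concept" without minting a new identifier, which is the "promotion" described in choice 1.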
One could create a Venn diagram showing the subset relationships of
these three levels of tc:Taxon. If it made Rich feel better, he could
mint a class URI for the deep taxonomic concepts which could be used in
rdf:type statements in addition to the basic rdf:type of tc:Taxon .
I see this as a way to make rapid progress on this front by leveraging
work which has already been completed and accepted by TDWG but not
really implemented. The TDWG Taxon Concept ontology and TCS may not do
everything that people want as far as allowing one to define all of the
set relationships required to do the fancy stuff. But I don't see why
that can't be added in a TCS 2.0 that is backward compatible with TCS
1.2.
I will defend my repeated references to HTTP URIs and RDF by noting not only
that this email is being posted to the TDWG RDF group list but also that
the ratified standard for GUIDs (see
http://bioimages.vanderbilt.edu/pages/guid-applicability-final-2011-01.pdf)
says that persistent identifiers must be HTTP proxied (recommendation 2)
and that they should resolve to provide RDF/XML (recommendation 10).
Having not yet achieved the degree of cynicism attained by Rod Page
about people actually following TDWG standards, I still believe that we
should try to follow them. Or else just stop wasting time on this...
Steve
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu