So the question in my mind (and I think a question posed explicitly earlier in this thread) is whether enough of the technical stuff you want to do with TNUs can be done with the TDWG Taxon Concept ontology, which is based on a ratified TDWG standard (TCS).
Yes, I agree that this is the key question right now (speaking as TDWG Convener for the TDWG-TNC Group); unfortunately it's not one I have time to address in detail this week. But I think it's the conversation we ought to be having.
The 2009 "thing" is something that L. Urbatsch had in his head - the idea of how he thought the name "Juncus diffusissimus Buckl." should be applied to real organisms and the specimens that come from them. We don't know what that idea is, but presumably he had one, and we could assign a persistent identifier to it and describe it using the string "Juncus diffusissimus Buckl. sec L. Urbatsch 2009".
Ah! In that case, it only becomes a TNU when it is documented (i.e., when it is "used"). The data model is very liberal in terms of what constitutes "documentation" (="Reference", in the GNUB data model), but chemical signals inside a person's brain would be out of scope for "documentation". However, once he's written it down on a determination label, then it becomes a TNU. The trick then becomes how to map what was in his head to what is defined in some way. When people write determination labels on specimens, they usually don't include lots of details about the scope of the taxon concept they had in mind when assigning that name to that taxon. This is why an assertion is needed (as per Nico's emails on this thread): Someone (either Urbatsch himself, or the collection manager, or someone else) needs to make an assertion that the TNU recorded on the specimen determination label maps to some other TNU with a well-documented concept definition (e.g., a published revision), through some set-theory relationship.
I was thinking that was what you meant by TNU. Whether or not it is better to proliferate a bunch of additional TNU instances, assign them their own identifiers, and relate them to each other is a technical detail that as an end user I'm happy to let you take care of within your GNUB system. End users want a persistent identifier of some sort to link identification instances to. They will hopefully find it through a user-friendly interface that hides the ugly details you are describing here.
I have to admit that I'm a little squeamish about minting TNUs for determination labels, and I wouldn't encourage it. However, for the collection managers out there, sometimes this is the only way you'll ever translate scientific names on specimen labels into well-defined taxon concepts. The only part that gets fuzzy is the notion of dwc:Identification. The way we model it for our specimen data is that each Identification instance gets its own GUID, and basically says "this person on this date determined this specimen to fall within the implied concept of this TNU". In other words, an Identification bridges a specimen instance to a TNU instance according to some person (with other metadata, like date, etc.). In the ideal world, whenever anyone ever made a specimen determination, they would add a "sensu XXXX" to the name they applied. The unfortunate reality is that the vast, vast, vast majority of actual specimen data do not meet this ideal. Thus, we're left to guess what concept they had in mind when making the determination. In some cases, we can confidently infer that the person had a particular concept in mind (the example I gave on the expert visiting a collection, making specimen determinations, and later publishing a revision). You can also confidently build these links whenever a publication includes the "Material Examined" section. But in most cases, there really is no other choice than minting a TNU for the determination label, then hoping that TNU will eventually get cross-linked to other TNUs with more well-defined concept definitions.
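To make the Identification "bridge" concrete, here is a minimal sketch in Python. The class and field names are hypothetical illustrations, not the actual GNUB or Darwin Core schema, and the random UUID is only a stand-in for whatever GUID scheme a collection really uses.

```python
import uuid
from dataclasses import dataclass, field


def new_guid() -> str:
    """Mint a random identifier (stand-in for a real GUID scheme)."""
    return str(uuid.uuid4())


@dataclass(frozen=True)
class Identification:
    """Bridges one specimen to one TNU, according to some person."""
    specimen_id: str      # identifier of the specimen instance
    tnu_id: str           # identifier of the TNU instance
    determined_by: str    # who made the determination
    date_determined: str  # when it was made (free text here)
    guid: str = field(default_factory=new_guid)  # each instance gets its own GUID

    def statement(self) -> str:
        """Spell out what the Identification asserts."""
        return (
            f"{self.determined_by} on {self.date_determined} determined "
            f"specimen {self.specimen_id} to fall within the implied "
            f"concept of TNU {self.tnu_id}"
        )


# Hypothetical example: "URI1" is a placeholder identifier for the TNU.
ident = Identification("LSU00000428", "URI1", "L. Urbatsch", "2009")
print(ident.statement())
```

Note that the Identification carries no concept definition of its own; it only points at a TNU, so any refinement of the concept happens through mappings between TNUs.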
Let's say that I assign a persistent identifier [URI1] to the TNU we are talking about here, "Juncus diffusissimus Buckl. sec L. Urbatsch 2009", and that I'm right that by TNU we mean the idea that Lowell Urbatsch had in his head about how he meant to
OK, I'll assume there is a determination label in a jar with that species name on it, and credit to L. Urbatsch, and dated to 2009. This determination label serves as the basis for the TNU.
We can assign a tc:accordingTo property to [URI1] with the object of that property being some kind of resource whose metadata says that Lowell Urbatsch used the name in some sense known to him in 2009.
I'm not sure this is right, because it sounds like you're saying that tc:accordingTo should point to a TNU. However, tc:accordingTo should point to a "Reference" (sensu GNUB data model). As such, there would be a Reference instance in GNUB with a GUID that refers to the determination label. The "author" of that determination label would be Lowell Urbatsch, and the date would be 2009. This serves the same function as a journal article, or book, or any other more familiar form of documentation ("Reference"); but instead of being ink on paper in thousands of copies distributed in libraries around the world, it's ink on paper on a label attached to a specimen. So, tc:accordingTo would point to the GUID for the *Reference* instance, not a TNU instance.
(We could work out the details of how one would write that metadata but there probably is enough stuff in the Dublin Core and FOAF vocabularies to do it. I think later in your email you said GNUB calls it a "Reference".)
Right -- exactly. A Reference is a piece of documentation -- a journal article, a book, a field notebook, a piece of correspondence, a determination label, etc. A Reference may be associated with zero, one, or many TNUs. In the case of determination labels, this is often a case where one reference is associated with only one TNU. But that doesn't mean they're the same thing. A Reference is a Reference, and a TNU is a TNU, and tc:accordingTo should point to a Reference, not a TNU.
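The distinction can be sketched in a few lines of Python. Everything here is a hypothetical illustration (class names, field names, and the "REF1"/"URI1" identifiers are assumptions, not the GNUB schema); the only point is that the accordingTo slot accepts a Reference and nothing else, and that one Reference can be shared by zero, one, or many TNUs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Reference:
    """A piece of documentation: article, book, determination label, etc."""
    guid: str
    kind: str
    author: str
    year: int


@dataclass(frozen=True)
class TNU:
    """A usage of a name, documented in exactly one Reference."""
    guid: str
    name_string: str
    according_to: Reference  # points to a Reference, never to another TNU

    def __post_init__(self):
        # Enforce the rule at construction time.
        if not isinstance(self.according_to, Reference):
            raise TypeError("accordingTo must point to a Reference, not a TNU")


def tnus_for(reference: Reference, tnus: List[TNU]) -> List[TNU]:
    """Return the TNUs documented in the given Reference (zero, one, or many)."""
    return [t for t in tnus if t.according_to == reference]


# The determination label is a Reference in its own right.
label = Reference("REF1", "determination label", "L. Urbatsch", 2009)
tnu = TNU("URI1", "Juncus diffusissimus Buckl.", label)
print(len(tnus_for(label, [tnu])))
```

A determination label typically ends up with exactly one TNU, but the model does not treat that one-to-one case as special.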
Now let's say that in 2012 I call him on the phone and say "hey Lowell, what taxonomic treatment were you thinking about in 2009 when you annotated that specimen in the NLU herbarium with barcode LSU00000428?" and he says "Gleason and Cronquist, 1991". I have two choices:
1. add a tc:describedBy property to the metadata for [URI1] with the value urn:isbn:0893273651 (a persistent URI representing Gleason and Cronquist 1991) and "promote" the resource identified by [URI1] from generic TNU to "deeper" taxonomic concept.
2. create a different [URI2] which has a property/value pair of tc:describedBy urn:isbn:0893273651, then somehow relate the royal URI2 taxonomic concept to the lowly URI1 generic TNU.
I need to get my head around tc:describedBy before I can respond in detail; but in GNUB space, there would be a mapping of the TNU for "Juncus diffusissimus Buckl. sec L. Urbatsch 2009" (the determination label), and the TNU for "Juncus diffusissimus Buckl. Gleason and Cronquist, 1991" (the publication -- assuming that G&C 1991 used the same name/combination/spelling). In this case, the mapping would be of type "congruent".
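As a rough sketch of what such a mapping record might look like in Python: the names below are hypothetical, the relationship list is only an illustrative set of set-theory relationships (not necessarily the exact GNUB or TCS vocabulary), and the TNU identifiers and asserter are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Relationship(Enum):
    """Illustrative set-theory relationships between two concepts."""
    CONGRUENT = "congruent"
    INCLUDES = "includes"
    INCLUDED_IN = "included in"
    OVERLAPS = "overlaps"
    EXCLUDES = "excludes"


@dataclass(frozen=True)
class ConceptMapping:
    """An assertion, by some person, relating one TNU to another."""
    from_tnu: str
    to_tnu: str
    relationship: Relationship
    asserted_by: str


def congruent_targets(tnu_id: str, mappings: List[ConceptMapping]) -> List[str]:
    """TNUs asserted to be congruent with the given TNU."""
    return [
        m.to_tnu
        for m in mappings
        if m.from_tnu == tnu_id and m.relationship is Relationship.CONGRUENT
    ]


# Placeholder identifiers for the label TNU and the G&C 1991 TNU.
mapping = ConceptMapping(
    from_tnu="TNU-label-2009",
    to_tnu="TNU-GC-1991",
    relationship=Relationship.CONGRUENT,
    asserted_by="collection manager",  # hypothetical asserter
)
print(congruent_targets("TNU-label-2009", [mapping]))
```

Keeping the asserter on the mapping record matters, because the mapping is itself a claim that someone else may later dispute or refine.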
I can't comment on the OWL implications. My role here is to make sure that our "things" are consistent (a Reference being one "thing", a TNU being another "thing", and also how the "Identification" thing and possibly the taxon concept "thing" relate to these other things).
From what you describe later in your email, I guess I'm in favor of option 1.
- If the tc:Taxon instance has a tc:accordingTo property, it is elevated to TNU status because we can know who used the name and when if we investigate the object of the tc:accordingTo property further.
Again, tc:accordingTo should not refer to a TNU. It should refer to a Reference.
If it made Rich feel better, he could mint a class URI for the deep taxonomic concepts which could be used in rdf:type statements in addition to the basic rdf:type of tc:Taxon .
It actually wouldn't make me happier to do this. My preference is to use the TNU GUID as the proxy for the concept, rather than mint new GUIDs for concepts. This is how we do it on the nomenclatural act side. We don't create a new GUID for the nomenclatural act -- the GUID for the TNU carrying the nomenclatural act is a sufficiently unambiguous proxy. (However, the new draft GNUB model does parse out each nomenclatural act into one-to-many nomenclatural "events" -- but that's a separate conversation.)
I see this as a way to make rapid progress on this front by leveraging work which has already been completed and accepted by TDWG but not really implemented. The TDWG Taxon Concept ontology and TCS may not do everything that people want as far as allowing one to define all of the set relationships required to do the fancy stuff. But I don't see why that can't be added in a TCS 2.0 that is backwardly compatible with TCS 1.2.
I wholeheartedly endorse this approach! In December I'll take a closer look at the Taxon Concept ontology and I'll discuss with Rob Whitton (CC'd -- sorry, Rob!) how we might actually start implementing services that output GNUB content in conformance with tc. Rapid progress is definitely a Good Thing!
Aloha, Rich
Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
Associate Zoologist in Ichthyology
Dive Safety Officer
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef@bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html
Note: This disclaimer formally apologizes for the disclaimer below, over which I have no control.