In that section, Rich notes that
"Eventually, a third party may be able to deduce (perhaps through a suite of other, external information) a RelationshipAssertion that maps the TNU "[Juncus] diffusissimus Buckl. sec L. Urbatsch 2009" to some other, perhaps published and well-defined taxon concept (of the same or different name). Also, if there are 100 specimens in the collection that L. Urbatsch identified as "Juncus diffusissimus Buckl." in 2009, then anchoring all 100 Identification instances to the one TNU, allows all of those specimens to inherit the mapping of the one "[Juncus] diffusissimus Buckl. sec L. Urbatsch 2009" TNU instance to some other better-defined taxon concept."
From that post, I understood that a TNU (a.k.a. "assertion" in Pyle 2004 http://systbio.org/files/phyloinformatics/1.pdf) can be as vague as an idea that some determiner had in his/her head about how organism/specimen instances should be mapped to a name.
Yes, a TNU can be that simple. Basically, a TNU exists whenever someone documents a scientific name. That doesn't mean that all of these will be entered into a database. But the intention was to scope TNUs to be wide open to any form of documentation (including a specimen label), in case someone has a need to record the fact that somebody used a particular name in a particular way (useful, for example, if you want to track so-called "manuscript names" that only exist on specimen labels).
The problem is, if you have a broad scope for what a TNU can be, then there is no guarantee that the TNU is accompanied by a taxon concept definition. Certainly a specimen label is not (it applies to only one specimen). Neither is, for example, a published type catalog (which often records names, without implied concepts). And then there are things like newspaper articles, where perhaps there is a concept somewhere in there, but it's too ambiguous to pin down to any particular concept. However, some TNUs (e.g., treatments in revisionary monographs) certainly are accompanied by well-defined concept definitions. And those are the ones that we'd like to see become "anchor-points" to taxon concepts, against which the other (non-concept-bearing) TNUs can be mapped (as asserted by a third party; or in some cases by the first party).
I think from what Rich said there that there is the potential that we as metadata aggregators may at some later point be able to map how that idea in the determiner's head fits in with a more well-defined (e.g. published) taxon description which one may choose to call a taxon concept rather than a TNU.
That was certainly the intention, yes. Whether or not that will ever happen with most (many?) specimens remains to be seen. Certainly it can happen in some cases. For example, in our collection we often have visitors come through and study our holdings of a particular genus or a particular family. Often they will make determinations about the specimens they examine. Sometime later, they'll publish a revision of the group. If they include a comprehensive "material examined" section, then there's an explicit relationship between the specimen and the publication that allows direct mapping of the specimen to the well-defined concept. However, if there is not a comprehensive material examined section, but there is a determination label by the same person in the jar (so you know the person examined it and identified it as part of the work that went behind the published revision), then the collection manager could create a TNU for the determination label, then map that determination-TNU to the publication's TNU (a RelationshipAssertion by the third-party collection manager).
I'm not saying this ever will happen, or even should happen. But we've developed the data model such that it *could* happen, if it turns out to be a useful mechanism for mapping specimens to published taxon concepts.
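To make the mechanism concrete, here is a minimal Python sketch of how Identification instances anchored to one determination-label TNU could inherit a mapping made by a single RelationshipAssertion. The class fields, the identifier strings, and the "congruent" relationship label are all assumptions for illustration; they are not taken from the GNUB schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identification:
    specimen_id: str
    tnu_id: str          # the TNU this determination is anchored to


@dataclass(frozen=True)
class RelationshipAssertion:
    from_tnu: str        # typically a non-concept-bearing TNU
    to_tnu: str          # a TNU accompanied by a well-defined concept
    relationship: str    # hypothetical vocabulary, e.g. "congruent"
    asserted_by: str     # the party making the claim


# Hypothetical identifiers; real ones would be persistent GUIDs.
LABEL_TNU = "tnu:urbatsch-2009-det-label"
REVISION_TNU = "tnu:published-revision"

# 100 specimens, all anchored to the one determination-label TNU.
identifications = [
    Identification(f"specimen-{n:03d}", LABEL_TNU) for n in range(1, 101)
]

# One assertion by the collection manager covers all of them.
assertions = [
    RelationshipAssertion(LABEL_TNU, REVISION_TNU, "congruent", "collection manager")
]


def mapped_concepts(specimen_id):
    """TNUs a specimen reaches via its Identification(s) plus any assertions."""
    anchored = {i.tnu_id for i in identifications if i.specimen_id == specimen_id}
    return {a.to_tnu for a in assertions if a.from_tnu in anchored}
```

Calling mapped_concepts("specimen-042") returns the set containing REVISION_TNU. Nothing is stored on the specimen itself, so adding or retracting the one assertion changes the mapping for all 100 specimens at once.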
As so often is the case, I think the problem here boils down to identifiers and the metadata that we associate with them.
Absolutely!!! This is often not intuitive stuff, so the tricky part is getting people to apply identifiers and cross-link them in an appropriate and consistent way.
Let's say in the real-life example above, somebody (we can say GNUB) assigns a persistent identifier (perhaps a URI constructed from an opaque UUID) to "Juncus diffusissimus Buckl. sec L. Urbatsch 2009".
We could say with an rdf:type statement that the resource identified by the URI is a TNU. We can give that resource a tc:hasName property linking it to the name which is represented by the string "Juncus diffusissimus Buckl.". (I'm not sure what property we use to say that L. Urbatsch made the assertion).
I'm not sure how you would do it in RDF, but if it's any help the relevant DwC term is taxon:nameAccordingToID.
It's certainly part of the metadata for the TNU itself:
There is a TNU for diffusissimus within the Reference of Buckl. This is the original description of the epithet, so it is also the Protonym for diffusissimus. There is also a TNU for the genus name Juncus as used within the Reference of Buckl. If, in the same publication, Buckl. also established that genus, then the genus would also be the Protonym TNU. However, the genus Juncus was established by L., so there is another TNU for Juncus that is the protonym (Juncus L.), and the TNU for the usage of Juncus within Buckl. links to the TNU of Juncus L. (the Protonym).
So, that's three TNUs:
1. Juncus L. sec. L. (Protonym for the genus Juncus)
2. Juncus L. sec. Buckl. (links to 1 as ProtonymID)
3. diffusissimus Buckl. sec. Buckl. (Protonym for the species diffusissimus, links to 2 as parent)
There are also two more TNUs linked to the Urbatsch 2009 publication:
4. Juncus L. sec. Urbatsch (links to 1 as ProtonymID)
5. diffusissimus Buckl. sec. Urbatsch (links to 3 as ProtonymID, and to 4 as parent)
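The five TNUs and their cross-links can be written down as a small Python sketch. The field names (tnu_id, protonym_id, parent_id) and the integer identifiers are assumptions that mirror the numbering above; they are not the actual GNUB column names or GUIDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TNU:
    """One taxon name usage: a name as it appears within one Reference."""
    tnu_id: int                        # stand-in for a persistent GUID
    name: str                          # the name element used
    reference: str                     # the "sec." Reference
    protonym_id: Optional[int] = None  # None means this TNU is itself the Protonym
    parent_id: Optional[int] = None    # TNU of the parent name in the same Reference


TNUS = {
    t.tnu_id: t
    for t in [
        TNU(1, "Juncus", "L."),
        TNU(2, "Juncus", "Buckl.", protonym_id=1),
        TNU(3, "diffusissimus", "Buckl.", parent_id=2),
        TNU(4, "Juncus", "Urbatsch 2009", protonym_id=1),
        TNU(5, "diffusissimus", "Urbatsch 2009", protonym_id=3, parent_id=4),
    ]
}


def protonym_of(tnu_id: int) -> TNU:
    """Return the Protonym TNU for a usage (the usage itself if it is the Protonym)."""
    tnu = TNUS[tnu_id]
    return tnu if tnu.protonym_id is None else TNUS[tnu.protonym_id]


def full_name(tnu_id: int) -> str:
    """Walk the parent links to assemble the combination, genus first."""
    tnu = TNUS[tnu_id]
    if tnu.parent_id is None:
        return tnu.name
    return f"{full_name(tnu.parent_id)} {tnu.name}"
```

With this layout, full_name(5) assembles "Juncus diffusissimus" from TNUs 4 and 5, and protonym_of(5) leads back to TNU 3 in the Reference of Buckl.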
Now let's say that L. Urbatsch publishes a paper describing in detail her concept of Juncus diffusissimus Buckl.
Do you mean the 2009 paper of "Juncus diffusissimus Buckl. sec L. Urbatsch 2009"? (In which case we have the TNUs as 4 & 5).... or do you mean a later paper (2010)? In which case we'd need two more TNUs. Is the "2009" thing a specimen determination, or a publication? It doesn't matter -- I just want to make sure I'm following your example correctly.
We can now assign the resource identified by the URI a tc:accordingTo property whose value is the DOI of the paper she wrote.
That would be a property of the TNU record (#4 & #5) -- assuming the paper she wrote is the 2009 paper.
If we want, we can replace the previous rdf:type statement with a different one stating that the resource is a taxon concept rather than a TNU, or if we believe that all taxon concepts are also TNUs we can leave the rdf:type statement that we had before and just add a second one saying that the resource is also of type taxon concept.
Yes -- this space has not yet been clearly defined. We could either say the well-defined concept TNU *is* the concept (i.e., use the TNU GUID to represent the concept itself). Or, we could create another GUID for the concept, and anchor that concept to the TNU as the "original definition" (or whatever you want to call it). I've been mostly focused on using GNUB TNUs for nomenclatural things, so I haven't thought that far into it yet for concepts. The TNUs will certainly play a fundamental role; but it's not clear to me whether they should be regarded as the concept, or as a property of the concept.
The point I'm trying to make is that as long as this "thing" that we are variously calling "taxon name usage", "taxon concept", "shallow taxonomic concept", or "deep taxonomic concept" can be assigned an identifier, what really matters is the metadata we associate with it, not really what we call it.
Absolutely! With emphasis on "really matters". It's certainly important to mint persistent identifiers, but it's equally (more?) important to make sure we understand the "thing" the identifier represents. We have a pretty stable & solid definition of a TNU, that has emerged from many NOMINA meetings over a number of years. What I don't think has been done yet, is any real analysis of whether the appropriate subset of TNUs accompanied by a robust concept definition *are* the concepts (i.e., the TNU GUID is the concept GUID); or whether there should be a separate GUID minted explicitly for the concept, which then links back to the TNU GUID as its "definition" or "source" (or whatever). I can see it working both ways.
The more metadata that we can connect with it, either through datatype properties like name strings or object properties that describe how the "thing" is related to other resources, the "deeper" the concept. On the other extreme, we may know nothing more than the name string. In that case we could call it a "nominal concept", but we could still assign it an identifier and maybe with luck we could associate more metadata with it (make it "deeper") at some point in the future.
Yes!
I am going to be bold and say that we already have the minimum tools required to get started implementing TNUs/TaxonConcepts:
- URI GUIDs (which if one preferred could be UUIDs or LSIDs -- HTTP proxied to make Linked Data people happy; see the TDWG GUID Applicability Statement standard if you don't know how to do this) to identify the TNU/concepts,
- the two terms tc:hasName and tc:accordingTo (from the TDWG Taxon Concept ontology) to relate the TNUs/TaxonConcepts to names and sec. references, and
- some sources for name and publication URI GUIDs.
There are deficiencies all over the place for that last item, but they can be addressed over time by improving the scope of the relevant databases and the quality of the metadata provided. uBio has URIs for almost every name I've ever looked for. BHL has a growing collection of old literature which has been assigned identifiers by Rod Page's BioStor, new literature usually has an assigned, dereferenceable proxied DOI, and one can even make valid URIs from ISBNs of books (although they aren't resolvable). I'm not sure how one should address the situation where the "sec." reference of a TNU is a person and date since there isn't a standard database of people (as far as I know). But that could be remedied.
Ultimately, one could create the kinds of mapping tools that Nico and Rich are talking about which relate different taxon concepts/TNUs which have set theory relationships. Whether that would be done with RDF, OWL, or something completely different I don't know, but the basic anchoring of persistent identifiers for the TNU/concepts to the names and sec. references wouldn't have to wait on that. We could also get hung up about what terms to use to express the metadata describing the basic TNU/name/sec. resources, but there is nothing that says that metadata can't change or be improved over time.
It's the identifier that shouldn't change.
Am I wrong about this???
Not wrong at all, in my mind. This is the framework we're following right now as we start to build these things.
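As a rough illustration of that framework, the Python sketch below mints an identifier once, attaches the minimal metadata as subject/predicate/object triples, and later adds a second rdf:type statement without touching the identifier. Only tc:hasName and tc:accordingTo come from the discussion above. The ex: class names, the example.org name URI, and the DOI are placeholders invented for the sketch, and a urn:uuid string stands in for what would in practice be an HTTP-proxied URI.

```python
import uuid


def mint_tnu_uri():
    """Mint an opaque identifier (a real service would expose it via an HTTP proxy)."""
    return f"urn:uuid:{uuid.uuid4()}"


def describe_tnu(tnu_uri, name_uri, sec_uri):
    """Return the minimal metadata for a TNU as (subject, predicate, object) triples."""
    return [
        (tnu_uri, "rdf:type", "ex:TaxonNameUsage"),   # hypothetical class name
        (tnu_uri, "tc:hasName", name_uri),
        (tnu_uri, "tc:accordingTo", sec_uri),
    ]


tnu = mint_tnu_uri()
triples = describe_tnu(
    tnu,
    "http://example.org/name/juncus-diffusissimus",   # placeholder name URI
    "https://doi.org/10.9999/example",                # placeholder DOI
)

# Later the metadata gets "deeper": a second type statement is added,
# while the identifier minted above stays exactly the same.
triples.append((tnu, "rdf:type", "ex:TaxonConcept"))  # hypothetical class name
```

The same triples could be loaded into an RDF library or written out as Turtle; plain tuples are used here only to keep the sketch free of dependencies.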
One comment on this bit:
" I'm not sure how one should address the situation where the "sec." reference of a TNU is a person and date since there isn't a standard database of people (as far as I know)."
In the GNUB Model, Person(s) + Date = Reference. A Reference is the link to the "documentation" (i.e., the place where the TNU occurred). A Reference can be (and usually is) a publication, but it can also be a notebook, specimen label, etc., etc. All References have "Authors" and a date, so a Reference yields Person(s) + Date. Think of it like a determination label for a specimen being a micro-publication; and then everything else is easy to model.
So, regardless of whether "Urbatsch 2009" refers to a determination label or a published monograph, it's still the basis for the TNUs 4 & 5 in the examples above.
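A small Python sketch of that idea, assuming a hypothetical Reference record with authors, a year, and a free-text kind (none of these field names come from the GNUB schema, and the " & " author separator is also just an assumption):

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    authors: Tuple[str, ...]
    year: int
    kind: str  # e.g. "publication", "specimen label", "notebook"

    def sec(self) -> str:
        """Person(s) + Date, as used after "sec." in a TNU label."""
        return f"{' & '.join(self.authors)} {self.year}"


label = Reference(("Urbatsch",), 2009, "specimen label")
monograph = Reference(("Urbatsch",), 2009, "publication")
```

Both label.sec() and monograph.sec() give "Urbatsch 2009", so a TNU can point at either kind of Reference in exactly the same way.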
Aloha, Rich