Re: [tdwg-tag] [tdwg-rdf: 103] Re: Any TCS users with experiences to report?
I want to get into this topic in more detail (going back to Steve's original post), but this week is hell-week for me, so only a quick comment now.
I generally agree with everything Nico says, but I think we need to be a little clearer about what we mean by "name sec. author".
The data model we've been building towards (GNUB, which underlies ZooBank) uses as its fundamental unit something we've been calling a "Taxon Name Usage Instance" (TNU). The scope of what can be a TNU is intentionally very broad - anything from an original taxon name description, to a mention in a newspaper article, and potentially even a scribbled hand-written label or letter. The only requirement is that it be static - that is, a snapshot in time. I mention this because database records can be represented as TNUs, but only as a static snapshot of the record. If the essence of the database record changes over time (e.g., due to changing taxonomic opinion), then a new TNU is generated for the new snapshot in time.
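To make the snapshot idea concrete, here is a minimal Python sketch. It is not GNUB's actual schema: the class name, the field names, the `revise` helper, and the placeholder name "Aus bus" are all hypothetical. The only things taken from the description above are that a TNU is static and that a changed record yields a new TNU rather than an edit to the old one.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)  # frozen: a snapshot cannot be modified after creation
class TaxonNameUsage:
    """Hypothetical stand-in for a TNU: one static usage of a name in one source."""
    name: str         # the name string as used
    source: str       # where the usage occurs (publication, label, database record)
    as_of: date       # the moment in time the snapshot describes
    parent_name: str  # an illustrative piece of taxonomic opinion that may change


def revise(tnu: TaxonNameUsage, as_of: date, **changes) -> TaxonNameUsage:
    """Return a NEW snapshot with the given changes; the original is untouched."""
    return replace(tnu, as_of=as_of, **changes)


# A database record captured at two points in time gives two distinct TNUs.
first = TaxonNameUsage("Aus bus", "Example database record", date(2010, 1, 1), "Aus")
second = revise(first, date(2012, 11, 26), parent_name="Cus")
```

Both `first` and `second` remain available afterwards, so anything that cited the earlier snapshot still points at what the record said at that time.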
A very small subset of the universe of TNUs represents Code-governed Nomenclatural Acts (original descriptions of new names and other Code-governed nomenclatural actions). In the case of such TNUs involving the ICZN Code (for example), the TNUs are registered in ZooBank. But the point is, one subset of all TNUs consists of those that involve actions governed by a Code of nomenclature.
The reason I mention this is that, if I read Nico's email correctly, I think he's saying that not all TNUs de facto represent taxon concepts. Rather, analogous to the nomenclatural subset of TNUs, there is a subset of TNUs that rise to the level of representing Taxon Concept definitions. In the case of nomenclatural acts, someone must make some sort of declaration (assertion) that a specific TNU constitutes a Code-governed nomenclatural act, along with relevant metadata relating to that assertion and the nature of the Act. In the case of zoological names, ZooBank is intended to facilitate this role (i.e., when a person registers a TNU in ZooBank, there is an implied assertion that the TNU represents a nomenclatural act under the ICZN Code).
What would be nice to have (and what TDWG could play a helpful role in facilitating), is a registry of sorts (analogous to ZooBank) for those TNUs that represent taxon concepts. In other words, a mechanism for people to "register" the subset of all TNUs that represent taxon concepts. Secondarily, there would also be a mechanism to make assertions about how registered taxon concepts map to each other (via some sort of set theory relationship[s]).
In summary, my points are:
1) We should be clear when we say "name sec. author" whether we mean it sensu lato (i.e., all TNUs); or sensu stricto (i.e., only those TNUs that rise to the level of representing taxon concepts).
2) There ought to be a registry (perhaps administered by CoL?) for identifying the subset of TNUs that represent concept definitions, and it should include a mechanism for making set-theory relationship assertions among registered concept-TNUs.
3) The two things mentioned in #2 should be separate; that is, one can assert that a particular TNU represents a taxon concept separately from (potentially multiple) assertions about how that taxon concept relates to other taxon concepts.
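Points 2 and 3 could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposal for an actual registry design: every class, method, field, identifier ("tnu-1"), and relationship label here is hypothetical. What it tries to show is that "this TNU represents a concept" is one kind of assertion, and "this concept relates to that one" is a separate kind, of which there may be several per pair.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ConceptAssertion:
    """Someone asserts that a given TNU represents a taxon concept."""
    tnu_id: str
    asserted_by: str


@dataclass(frozen=True)
class RelationshipAssertion:
    """Someone asserts how one registered concept relates to another."""
    from_tnu: str
    to_tnu: str
    relation: str     # illustrative labels, e.g. "congruent", "includes"
    asserted_by: str


@dataclass
class ConceptRegistry:
    concepts: dict = field(default_factory=dict)       # tnu_id -> ConceptAssertion
    relationships: list = field(default_factory=list)  # any number per concept pair

    def register_concept(self, tnu_id, asserted_by):
        # Registering a concept never requires a relationship (point 3).
        self.concepts[tnu_id] = ConceptAssertion(tnu_id, asserted_by)

    def assert_relationship(self, from_tnu, to_tnu, relation, asserted_by):
        # Relationships are only accepted between registered concepts (point 2).
        for tnu_id in (from_tnu, to_tnu):
            if tnu_id not in self.concepts:
                raise KeyError(f"{tnu_id} is not registered as a concept")
        self.relationships.append(
            RelationshipAssertion(from_tnu, to_tnu, relation, asserted_by))

    def relationships_between(self, a, b):
        """All assertions linking the two concepts, in either direction."""
        return [r for r in self.relationships if {r.from_tnu, r.to_tnu} == {a, b}]


registry = ConceptRegistry()
registry.register_concept("tnu-1", "expert A")
registry.register_concept("tnu-2", "expert A")
# Two experts may disagree; both assertions are kept side by side.
registry.assert_relationship("tnu-1", "tnu-2", "includes", "expert A")
registry.assert_relationship("tnu-1", "tnu-2", "congruent", "expert B")
```

Keeping the two assertion types in separate collections means a concept can sit in the registry with no relationships at all until some expert is ready to map it.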
More later.
Aloha,
Rich
P.S. By my standards that WAS quick!
From: Nico Franz [mailto:nico.franz@asu.edu]
Sent: Monday, November 26, 2012 1:40 PM
To: tdwg-rdf@googlegroups.com
Cc: Roderic Page; Tony.Rees@csiro.au; pmurray@anbg.gov.au; Simon.Pigot@csiro.au; J.Kennedy@napier.ac.uk; eotuama@gbif.org; tdwg-tag@lists.tdwg.org; deepreef@bishopmuseum.org; David Patterson
Subject: Re: [tdwg-rdf: 103] Re: [tdwg-tag] Any TCS users with experiences to report?
Thank you, Steve.
This I think is quite helpful in summarizing where we are and aren't on this issue. I'd like to respond / reiterate mainly in relation to Rich's comments.
http://lists.tdwg.org/pipermail/tdwg-tag/2012-November/002526.html
Regarding the notion of what a concept "is" - Well, scientific definitions quite often can be deliberately vague and still work well enough (or work precisely because of that vagueness). I think there is acceptable middle ground here. Perhaps it is best to focus on the expected actions of concepts.
(1) Concepts are intended to provide taxonomic resolution (i.e. help us understand whether two authors talk about the same or different perceived entities in nature [natural entities for the realist, human perceptions for the constructivist - no great matter where we stand on that one]) BEYOND what can be achieved via name strings and synonymy relationships.
(2) Concepts are LABELED with a name sec. author combination (or something similar to that).
(3) Point (1) is relevantly fulfilled IF an expert feels confident enough in asserting that concept 1 and concept 2 have a single, or a combination of multiple, set theory relationship(s), or some sort of percent matching relationship, that goes beyond "somehow overlapping / not" (which type-anchored nomenclatural relationships could typically convey as well).
This means that, to a degree (and reflecting real biological practice, I'd think), whether some name sec. author instance attains the level of concept - as in "concept that does more taxonomic resolution than just names" - or not is contingent on there being sufficient context for *some* expert to make such an assertion ("congruent or more inclusive than, but not any of the other three RCC-5 relationships"). As you know, learning about context is something that experts are very good at. It doesn't always jump out from the written pages.
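As a rough illustration of the five RCC-5 relationships, the sketch below treats each concept as a set of member identifiers (specimens, say). That is a simplifying assumption: in practice an expert usually asserts the relationship from context rather than computing it from enumerated members. The function name and the English labels are illustrative, not taken from any standard vocabulary.

```python
RCC5_LABELS = {
    "congruent",
    "more inclusive than",
    "less inclusive than",
    "overlaps",
    "excludes",
}


def rcc5(a: set, b: set) -> str:
    """Return the single RCC-5 relationship of concept a to concept b."""
    if not a or not b:
        raise ValueError("both concepts need at least one member")
    if a == b:
        return "congruent"
    if a > b:   # a is a proper superset of b
        return "more inclusive than"
    if a < b:   # a is a proper subset of b
        return "less inclusive than"
    if a & b:   # some shared members, but neither contains the other
        return "overlaps"
    return "excludes"


# An expert who is not fully certain can assert a disjunction of relationships,
# which at the same time rules out the remaining ones.
assertion = {"congruent", "more inclusive than"}
ruled_out = RCC5_LABELS - assertion
```

Under this reading, an assertion is informative to the extent that it rules something out; asserting all five labels at once would say nothing beyond what the names already convey.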
I cannot quite see how this is a profound issue for modeling in a database environment. You handle the name sec. author items. The presence or absence of concept relationships will be (by and large, over time) reflective of which subsets of the name sec. author items allow deeper semantics in the sense of (1) and (3). Providers can have the option of IDENTIFYING specimens/observations TO a (set of) concept(s) if suitable concepts are already available in the environment and thus they see no need to add new ones. Most of this may not be a technical challenge but more a challenge of relearning speaker roles and making context explicit as one's own or someone else's. We tend to have this information in mind when we write systematic and non-systematic works.
I think there are lots of valuable datasets out there presenting single, snapshot concept hierarchies of incredible value. What we need, if for nothing else then for the sake of argument, are tools to import these - two at a time initially - into a concept compatible environment that would allow visualization and mapping to take place and be checked for consistency. I say "for the sake of argument" because that model by itself has no reward scheme yet for an expert to get involved (though some might do it on their own initiative).
I do believe that TDWG is a proper body to take on and promote this task, but TDWG has perhaps more readily focused on leveraging existing information as opposed to being very proactive in facilitating new taxonomic practices. IMO we cannot just focus on the data that are there, but can also build tools to generate better primary data from now on.
Best,
Nico
On Wed, Nov 21, 2012 at 8:24 AM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
In an effort to prevent the loss of information presented in the recent thread on TCS as it relates to RDF, I have created a summary at http://code.google.com/p/tdwg-rdf/wiki/TCSthread with links to the archived messages. I left out posts focused specifically on XML schemas which I consider to be outside the scope of the TDWG RDF group (that is somebody else's problem). I will continue to add to this page as additional relevant posts occur and can also post more "further information" links if anybody wants to send them to me.
Steve
participants (1)
- Richard Pyle