Thank you, Rich.
So we seem to agree on something like this:
Rich                       Nico
taxon name usage    <===>  "shallow" taxonomic concept
taxon concept       <===>  "deep" taxonomic concept
Both: labeling is via name sec. author
Both: authoring concepts/usages vs. identifying to those => slippery issue; ideally requires proper speaker awareness.
Why the latter? Well, because (again) the desirable effect of using concepts, the situation where they would have a justification that goes beyond really meticulous data management and advances to the level of facilitating better science qua more precise taxonomic semantics, only obtains if a great number of name occurrences in a wide range of shallow-ish sources are linked via identification to a presumably smaller number of occurrences where those names are well defined and where successive definitions of names are semantically linked. So there needs to be an emerging culture of minimizing concept inflation. Otherwise we obtain what we have now (mostly just names) and on top of that add new baggage (lots of really shallow concepts) that nobody can do good semantics with.
Here is where I think we disagree, perhaps just in terms of sales strategy:
You seem to suggest that making an a priori distinction between TNUs and concepts is (1) possible in a good number of cases, (2) desirable, perhaps in the form of a registry, and (3) even necessary for building and populating databases, etc.
Here I disagree, for a number of reasons. First off, I do believe that not defining certain things too soon or too narrowly is sometimes actually really good science; on the other hand, defining them early can be a show stopper if other people don't share this narrowness and find it limiting. Second, while we can perhaps readily agree that a lengthy monograph published in American Museum Novitates rises to the level of authoring new concepts whereas a label saying "Family Carabidae" on a specimen submitted as part of an insect student collection does not, there are enough in-between cases where only time will tell.
Example: USDA Plants promotes a particular perspective of groundcherry taxonomy, genus-level concept Physalis - http://plants.usda.gov/java/profile?symbol=physa - with some 29 species-level concepts recognized. ASU's herbarium curator Les Landrum is a bit of a groundcherry nerd (I say this with admiration). If you go here: http://swbiodiversity.org/seinet/index.php, then Search Collections => Select All => Next => Scientific Name = Physalis => Search, you get some 3700 pertinent specimen records. If you then switch to the Species List tab, you see 115 concepts listed overall. Switching to the USDA Plants Thesaurus will give you only 46 concepts that these 3700 specimens are mapped to. Using instead the ASU Taxonomic Thesaurus will yield 89 concepts linking variously to those specimens. This is based on Les' classification of groundcherries, which is not further documented in the SEINet environment at this moment.
Now, saying a priori whether Les' list represents a set of TNUs versus concepts would presumably require you to assert that there is nobody who is Les or very much like him who can provide a semantically very accurate mapping of the 89 name usages in the SEINet-ASU Physalis list to the much more thoroughly circumscribed USDA Plants concepts. That could seem like a daring prediction given how little Les might think of the USDA perspective. At the very moment that Les or someone very much like him DOES provide the mapping, what looked like a list of TNUs all of a sudden acquires - via the mapping - a much deeper semantic status where others can readily go from one classification to the next, even though each comes with a very different amount of information in its original appearance. Some people may prefer Les' concepts, at least for Arizonan groundcherries, and in either case the mapping puts both on an even playing field.
So this exemplifies IMO why so far the concept approach has been too abstract, the TCN has been too depauperate on the relationships/mapping side (worrying instead almost needlessly about what constitutes a concept per se), and the distinctions between identifications, name usages, and shallow and deep concepts have been too abstract as well. I believe we should spend less discussion on those issues and put more emphasis on building mapping tools that can carry a wide range of input and logically infer additional implied mappings from the initial expert-given set. The actual semantic properties of that input will emerge a posteriori and will be hard to predict in some cases. Some descriptions are lengthy but nobody understands them. Some name lists are profoundly informative if the context of their origin is well known to an expert.
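To make "logically infer additional implied mappings" a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the relationship symbols, the small composition table, and the concept labels ("A sec. X", etc.) are hypothetical placeholders, not taken from any existing tool or from the SEINet data, and the table covers only a few combinations where the implied relationship is unambiguous.

```python
from itertools import product

# Hypothetical relationship vocabulary (an assumption for this sketch):
# "==" congruent, "<" is properly included in, ">" properly includes, "!" excludes
INVERSE = {"==": "==", "<": ">", ">": "<", "!": "!"}

# Compositions that yield exactly one implied relationship.
# Any combination not listed here is treated as "cannot be inferred".
COMPOSE = {
    ("==", "=="): "==", ("==", "<"): "<", ("==", ">"): ">", ("==", "!"): "!",
    ("<", "=="): "<", (">", "=="): ">", ("!", "=="): "!",
    ("<", "<"): "<",
    (">", ">"): ">",
    ("<", "!"): "!",  # A inside B, B excludes C => A excludes C
    ("!", ">"): "!",  # A excludes B, B includes C => A excludes C
}

def infer(asserted):
    """Close {(a, b): rel} under inversion and the composition table above."""
    known = dict(asserted)
    changed = True
    while changed:
        changed = False
        # Add the inverse of every known mapping.
        for (a, b), rel in list(known.items()):
            if (b, a) not in known:
                known[(b, a)] = INVERSE[rel]
                changed = True
        # Chain mappings that share a middle concept.
        pairs = list(known.items())
        for ((a, b), r1), ((c, d), r2) in product(pairs, repeat=2):
            if b == c and a != d and (a, d) not in known:
                implied = COMPOSE.get((r1, r2))
                if implied is not None:
                    known[(a, d)] = implied
                    changed = True
    return known

# Two expert-given mappings between placeholder concepts.
expert = {
    ("A sec. X", "B sec. Y"): "<",
    ("B sec. Y", "C sec. Y"): "!",
}
result = infer(expert)
print(result[("A sec. X", "C sec. Y")])
```

The point of the sketch is that the expert supplies two mappings and the third (A sec. X excludes C sec. Y) falls out logically, regardless of how much or how little information each source carried originally.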
There will be some obvious overreaches in both directions (too many unconnected items, some items that are connected more precisely than their inherent information would seem to justify). I think these overreaches would be tolerable. What's less productive to me is a restrictive set of definitions that provides an early blockage in the way towards an environment where mapping is supposed to occur very frequently. We're not at the registry stage yet. More at the "can this work in principle" stage. As I mentioned before, the mappings ARE the concepts under a certain viewpoint. We don't want to pre-determine their fate by separating TNUs from concepts in a great number of cases.
I hope this did not misrepresent your view and also clarified mine. In the end, we both advocate some sort of balance for the same concerns, but perhaps disagree only strategically about the moment when that balance will materialize - upfront via precise definitions and registration, or later on via the presence/lack of actual mappings.
Best,
Nico
On Mon, Nov 26, 2012 at 5:18 PM, Richard Pyle <deepreef@bishopmuseum.org> wrote:
I want to get into this topic in more detail (going back to Steve’s original post), but this week is hell-week for me, so only a quick comment now.
I generally agree with everything Nico says, but I think we need to be a little clearer about what we mean by “name sec. author”.
The core unit of the data model we’ve been building towards (GNUB, which underlies ZooBank) uses as its fundamental unit something we’ve been calling a “Taxon Name Usage Instance” (TNU). The scope of what can be a TNU is intentionally very broad – anything from an original taxon name description, to a mention in a newspaper article, and potentially even a scribbled hand-written label or letter. The only requirement is that it be static – that is, a snapshot in time. I mention this because database records can be represented as TNUs, but only as a static snapshot of the record. If the essence of the database record changes over time (e.g., due to changing taxonomic opinion), then a new TNU is generated for a different snapshot in time.
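As a rough illustration of the "static snapshot" requirement, here is a minimal Python sketch. The class and field names are assumptions made up for this example; they are not the GNUB or ZooBank schema, and the record contents are placeholders.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)  # frozen: a TNU cannot be modified once created
class TNU:
    name: str      # the taxon name as used
    source: str    # where the usage occurs (publication, label, database, ...)
    as_of: date    # the moment in time the snapshot represents

def snapshot(record, as_of):
    """Freeze the current state of a (mutable) database record as a TNU."""
    return TNU(name=record["name"], source=record["source"], as_of=as_of)

# A hypothetical database record whose taxonomic opinion changes over time.
record = {"name": "Aus bus", "source": "example database"}
tnu_1 = snapshot(record, date(2012, 1, 1))

record["name"] = "Aus cus"  # the record changes ...
tnu_2 = snapshot(record, date(2012, 11, 26))  # ... so a new TNU is generated

print(tnu_1.name, "|", tnu_2.name)
```

The earlier TNU is untouched by the later change to the record; each one remains a fixed snapshot.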
A very small subset of the universe of TNUs represents Code-governed Nomenclatural Acts (original descriptions of new names and other code-governed nomenclatural actions). In the case of such TNUs involving the ICZN Code (for example), the TNUs are registered in ZooBank. But the point is, one subset of all TNUs consists of those that involve actions governed by a Code of nomenclature.
The reason I mention this is that, if I read Nico’s email correctly, I think he’s saying that not all TNUs de facto represent taxon concepts. Rather, analogous to the nomenclatural subset of TNUs, there is a subset of TNUs that rise to the level of representing Taxon Concept definitions. In the case of nomenclatural acts, someone must make some sort of declaration (assertion) that a specific TNU constitutes a Code-governed nomenclatural act, along with relevant metadata relating to that assertion and the nature of the Act. In the case of zoological names, ZooBank is intended to facilitate this role (i.e., when a person registers a TNU in ZooBank, there is an implied assertion that the TNU represents a nomenclatural act under the ICZN Code).
What would be nice to have (and what TDWG could play a helpful role in facilitating), is a registry of sorts (analogous to ZooBank) for those TNUs that represent taxon concepts. In other words, a mechanism for people to “register” the subset of all TNUs that represent taxon concepts. Secondarily, there would also be a mechanism to make assertions about how registered taxon concepts map to each other (via some sort of set theory relationship[s]).
In summary, my points are:
1) We should be clear when we say “name sec. author” whether we mean it sensu lato (i.e., all TNUs); or sensu stricto (i.e., only those TNUs that rise to the level of representing taxon concepts).
2) There ought to be a registry (perhaps administered by CoL?) for identifying the subset of TNUs that represent concept definitions, and it should include a mechanism for making set-theory relationship assertions among registered concept-TNUs.
3) The two things mentioned in #2 should be separate; that is, one can assert that a particular TNU represents a taxon concept separately from (potentially multiple) assertions about how that taxon concept relates to other taxon concepts.
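One way to picture points 2 and 3 is the following minimal Python sketch. Everything in it is an assumption for illustration only: the class, the method names, the identifiers such as "tnu:1", and the relationship terms are hypothetical and do not describe ZooBank, CoL, or any existing registry.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptRegistry:
    # Assertion type 1: "this TNU represents a taxon concept" (tnu_id -> asserted_by)
    concepts: dict = field(default_factory=dict)
    # Assertion type 2: "concept A relates to concept B" (kept separately)
    relationships: list = field(default_factory=list)

    def register_concept(self, tnu_id, asserted_by):
        """Record that someone asserts this TNU represents a taxon concept."""
        self.concepts.setdefault(tnu_id, asserted_by)

    def assert_relationship(self, tnu_a, relation, tnu_b, asserted_by):
        """Record one (of potentially many) relationship assertions."""
        for tnu_id in (tnu_a, tnu_b):
            if tnu_id not in self.concepts:
                raise KeyError(f"{tnu_id} is not a registered concept-TNU")
        self.relationships.append((tnu_a, relation, tnu_b, asserted_by))

registry = ConceptRegistry()
registry.register_concept("tnu:1", asserted_by="person A")
registry.register_concept("tnu:2", asserted_by="person B")

# Different people may assert different relationships for the same pair.
registry.assert_relationship("tnu:1", "includes", "tnu:2", asserted_by="person A")
registry.assert_relationship("tnu:1", "overlaps", "tnu:2", asserted_by="person C")
print(len(registry.relationships))
```

Registering a concept never requires a relationship, and any number of (possibly conflicting) relationship assertions can accumulate later, which is the separation described in point 3.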
More later.
Aloha,
Rich
P.S. By my standards that WAS quick!
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu