<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>Re: [tdwg-content] Producing a global taxon register (was: ITIS TSNID to uBio NamebankIDs mapping)</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2>From: Richard Pyle [<A HREF="mailto:deepreef@bishopmuseum.org">mailto:deepreef@bishopmuseum.org</A>]<BR>
Sent: Sun 5-6-2011 19:14<BR>
<BR>
> * a GTR - global taxon register - is something else entirely, at least<BR>
> if the term is taken literally. It would be indispensable if the purpose<BR>
> "to index all usages of all names in all sources" is to be realized.<BR>
<BR>
Yes, that would be nice. But as Tony indicated, that would be impractical<BR>
for the foreseeable future. Especially when you consider that "all sources"<BR>
encompasses not only "all publications" (including popular books and<BR>
magazine articles, newspaper articles, etc., etc.), but also all unpublished<BR>
sources (museum specimen labels, field notebooks, personal correspondence,<BR>
etc., etc.). The GNUB model is designed to accommodate any &amp; all of these,<BR>
but a proactive attempt to populate it to that extent would represent an<BR>
unrealistic amount of effort.<BR>
<BR>
***<BR>
I don't know about that. Certainly, if you organize this top down<BR>
it would be a prohibitively massive undertaking. However, if<BR>
organized bottom up, it may be quite different; of course, coverage<BR>
would be uneven.<BR>
* * *<BR>
<BR>
However, an enormous benefit would be achieved if a select subset of "all<BR>
usages of all names in all sources" was targeted. For example, the first<BR>
priority for populating GNUB will be:<BR>
<BR>
> a complete nomenclatural index<BR>
> (inventorying all nomenclatural acts),<BR>
<BR>
***<BR>
This would not be a first step. The first step will be a complete<BR>
Checklist; likely, this will be ready years before a complete<BR>
nomenclatural index.<BR>
* * *<BR>
<BR>
And the next step would be:<BR>
<BR>
> moving towards<BR>
> lists of currently accepted names<BR>
<BR>
That is, capturing the specific usage instances for each that reflect a<BR>
modern taxonomic landscape. Of course, there is more than one<BR>
interpretation of the "modern taxonomic landscape" (i.e., different opinions<BR>
about how to structure the HCAL). Therefore, you need a spectrum of modern<BR>
usage instances to capture all of the popular HCAL perspectives.<BR>
<BR>
> Names and taxa are quite different things and they are interconnected<BR>
> in a complex way.<BR>
<BR>
I don't think that the interconnection is all that complex. In the same way<BR>
that nomenclature and biology intersect at the type specimen, names and taxa<BR>
intersect at the Taxon Name Usage instance. The analogy is reasonably good.<BR>
A scientific name is "anchored" to the biological world through the type<BR>
specimen. Likewise, a taxon concept is anchored to a name through a taxon<BR>
name usage instance. Not all taxon name usage instances rise to the level<BR>
of an explicit or implicit taxon concept definition. However, all taxon<BR>
concept definitions exist in the form of a Taxon Name Usage instance.<BR>
<BR>
The problem, as Tony alluded to, is that TNU instances are so abundant that<BR>
it can be overwhelming to contemplate the TNU universe in its entirety.<BR>
<BR>
***<BR>
Not so sure if it is handy to refer to this as a universe, given that<BR>
the components conflict, cluster, overlap, etc.<BR>
* * *<BR>
<BR>
Dave Remsen referred to TNUs as the "individual molecules" of taxonomy. When<BR>
we look at a physical object, we don't think of it in terms of an assemblage<BR>
of individual molecules; we abstract it to the entire object. This is why<BR>
we have so many databases that focus on the HCAL -- it's much more direct to<BR>
capture the entire object (in this case, taxon concept), than to enumerate<BR>
all of the molecules that comprise it. <BR>
<BR>
***<BR>
I agree that it is more direct, but its popularity probably stems<BR>
from the appeal of shortcuts, the desire for the One Truth or the<BR>
Latest Thing, or just laziness.<BR>
* * *<BR>
<BR>
But unlike physical objects and their constituent molecules, there are<BR>
"special" TNUs that stand out from all the rest. Capturing a few of these<BR>
"special" TNUs will allow us to get most of the benefit in representing the<BR>
parts of the taxon concept we're interested in. As already noted, these<BR>
"special" TNUs include all the relevant nomenclatural acts for all of the<BR>
names that have been associated with that taxon concept, as well as the main<BR>
concept definitions (e.g., published taxonomic treatments that may or may<BR>
not carry nomenclatural acts with them). In other words, unlike trying to<BR>
describe a physical object by enumerating its individual molecules, we can<BR>
capture the majority of our interest in taxon names and concepts by<BR>
enumerating only a small fraction of the TNUs (i.e., the aforementioned<BR>
"special" ones).<BR>
<BR>
***<BR>
Yes, the Law of Diminishing Returns applies. However, there are two<BR>
issues. Firstly, where to draw the line? Secondly, the marketing<BR>
pitch. Anything that will offer, say, a 40% usability will be marketed<BR>
as the Eighth Wonder of the World, and will cause further damage.<BR>
<BR>
Paul<BR>
<BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>