Re: [tdwg-content] Taxon and Name [SEC=UNCLASSIFIED]

2 Nov 2010

Hi Steve,
...
I read through Rich's various 
emails in the thread, Paul's email (below), and looked at the 
Darwin Core Taxon and Identification class terms and came up with:
http://bioimages.vanderbilt.edu/pages/taxon-diagram1.gif
Please note that this diagram does NOT represent an opinion 
on my part but  rather an attempt to summarize what people 
have said in a graphical way.
That's pretty close.  The way we deal with the "taxonName" box, however, is
as a subset of taxonNameUsage which we've been calling a "Protonym".
Roughly equivalent to a botanical basionym, but more general.  Basically, a
protonym is "the taxonNameUsage instance that established the name".  The
word "established" is the one that needs to be carefully defined; for
scientific names, it refers to the particular taxonNameUsage instance that
established the name in accordance (or attempted accordance) with the
relevant Code(s) ["Code(s)" because some names fall under more than one Code
-- see Ambiregnal].

Anyway, the point is, I would represent the diagram with the taxonName box
represented as a subset of taxonNameUsage, or as a recursive relationship
(not sure how best to do that using your symbols).
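
To make the "recursive relationship" reading concrete, here is a minimal Python sketch. The class name, field names, and IDs (TaxonNameUsage, protonym_id, "tnu1", ...) are assumptions made up for illustration; they are not Darwin Core terms, and this is only one way the idea could be modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TaxonNameUsage:
    """One documented usage of a name (hypothetical structure)."""
    tnu_id: str
    name: str
    reference: str
    # Recursive link to the usage that established the name.
    # In this sketch a protonym simply points at itself.
    protonym_id: Optional[str] = None

    def is_protonym(self) -> bool:
        return self.protonym_id == self.tnu_id


# Illustrative records only; the IDs are invented.
usages: Dict[str, TaxonNameUsage] = {
    u.tnu_id: u
    for u in [
        TaxonNameUsage("tnu1", "Aus bus", "Linnaeus 1758", protonym_id="tnu1"),
        TaxonNameUsage("tnu2", "Aus bus", "Jones 1980", protonym_id="tnu1"),
    ]
}

# The "taxonName" box is then just the subset of usages that are protonyms.
protonyms = [u.tnu_id for u in usages.values() if u.is_protonym()]
print(protonyms)
```

Under these assumptions no separate name table is needed: a "name" is whichever usage the other usages point back to.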

A more detailed explanation starts on p. 18 (after "Taxa" heading) of this
article:
http://systbio.org/files/phyloinformatics/1.pdf
...where the term "Assertion" is equivalent to "taxonNameUsage".
...
1. There are taxon concepts, which I guess represents a 
particular circumscription of individuals.  The taxon concept 
is the result of some kind of rule that allows one to decide 
whether  particular individuals should be included in that 
taxon or not.  The set of all biological individuals that are 
included are the actual concept (or maybe not?).
I think that's probably about right, but I generally prefer Nico's wording
in his reply.  Only thing I'm a little uncomfortable with in his #1 (second
paragraph) is the notion that a Taxon Concept relies on the existence of a
scientific name.  I think that a taxon concept exists independently of the
name(s) that have been used to label it, to include circumscribed sets of
individuals that have not yet been assigned a scientific name.  I also think
they can exist independently of a publication.  I also tend to think of the
concept as the "implied" set of organisms (living, dead, yet-to-be-born),
which is more or less what I think Nico means, but perhaps worded slightly
differently.

I agree that what I think of as "taxonNameUsage" is somewhat close to what
Nico defines as "Taxon Concept", but a TNU doesn't always come with an
implied concept -- sometimes it's just a raw name-usage without an implied
concept (e.g., in a catalog of type specimens at a Museum).  However, I
would say that *all* taxon concepts are anchored to at least one TNU
instance, keeping in mind that the "N" part doesn't have to be a scientific
name, and the "U" doesn't have to occur within the scope of a publication.
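
A rough Python sketch of those two constraints, under stated assumptions: the class and field names (Usage, Concept, anchors) are hypothetical, and the check in __post_init__ is just one possible way to express "anchored to at least one TNU".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Usage:
    label: str   # the "N": need not be a scientific name
    source: str  # the "U": need not be a publication
    # None means a raw name-usage with no implied concept.
    concept_id: Optional[str] = None


@dataclass
class Concept:
    concept_id: str
    anchors: List[Usage]

    def __post_init__(self) -> None:
        # Every concept must be anchored to at least one usage.
        if not self.anchors:
            raise ValueError("a concept needs at least one anchoring usage")


# A usage with no concept is allowed (e.g. an entry in a type catalog).
catalog_entry = Usage("Aus bus", "museum type catalog")

# A concept without any usage is rejected.
try:
    Concept("c1", [])
    rejected = False
except ValueError:
    rejected = True
```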
...
3. There are taxon name usages, which are a sort of node that 
connects a name with a concept.  If I'm getting Rich right, 
this is the resource to which dwc:Identifications should be 
tied.  Rich also suggested that taxon name usages might be 
instances of the dwc:Taxon class.
Although these three types of resources aren't all defined as 
"classes" in Darwin Core, it seems to me that they are classes in the 
"RDF sense" (i.e. that their instances can be typed to them).
As mentioned, I think of taxon concepts as additional contextual information
that exist in the form of TNUs.  Just like Nomenclatural Acts (which are not
concepts).  Just like a lot of other information that's not really part of a
defined taxon concept (but often constitutes information that defines the
boundaries of the concept circumscription).

TNUs are really just a core index of documented usages of names for
organisms, where "documented" is usually a publication, but can include
other forms of static documentation.  But the thing is, basically all
nomenclatural acts, and all concept definitions, and a very, very large
proportion of information about biodiversity (at least the ones that
provide context via scientific names) exist as part of a TNU.
...
In the diagram, I used triangles to indicate 1:many 
relationships similar to the way Rich did in his original 
diagram.  In the DwC term index, 
(http://rs.tdwg.org/dwc/terms/index.htm), there seem to be 
other entities represented that are like the "acceptedName" (i.e.
originalName, parentName) but I've left them off the diagram 
for simplicity and because I don't fully understand them.
acceptedName[ID] refers to the particular usage instance of a name that the
data provider believes "got it right" (i.e., right spelling, right parent,
right rank, right set of heterotypic synonyms).  It's a way of saying, "When
we use the name Aus bus, we mean it in the sense of Jones, 1980."  It's the
"sec" reference that Nico was referring to.

originalName[ID] is a pointer to the Protonym TNU for the terminal name. For
example, if Aus bus was originally described by Linnaeus in 1758, there
would be a TNU for the epithet "bus" linked to the reference of Linnaeus
1758, and that particular usage instance for "bus" would be the Protonym.
If the data provider wanted to represent the name as "Xus bus (Linnaeus)
Smith" (botanical standard for saying the species epithet "bus" established
by Linnaeus, first combined with the genus "Xus" by Smith), then the
originalName[ID] would refer to the (protonym) usage instance of "Aus bus"
by Linnaeus.

parentName[ID] is a link to the parent taxon of the particular usage.  For
example, if the given record (taxonID) referred to the protonym instance of
Aus bus in Linnaeus 1758, then the parentName[ID] would point to another TNU
representing the treatment of the genus "Aus" in Linnaeus 1758.  If the
taxonID record is for the usage of Xus bus (L.) by Smith, then the
parentName[ID] would refer to the usage instance of the genus name "Xus" by
Smith.

All three are recursive links back to other taxonNameUsage instances.
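
The Aus bus / Xus bus example can be sketched in Python as follows. This is only an illustration: the record IDs ("u1" to "u4"), the Python field names, and the follow() helper are invented, and treating Smith's usage as the accepted one is an assumption made purely so the accepted link has something to point at.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Usage:
    taxon_id: str
    name: str
    according_to: str
    accepted_name_id: Optional[str] = None
    original_name_id: Optional[str] = None
    parent_name_id: Optional[str] = None


# Invented IDs; the "accepted" choices are an assumption for illustration.
records: Dict[str, Usage] = {
    u.taxon_id: u
    for u in [
        Usage("u1", "Aus", "Linnaeus 1758"),
        Usage("u2", "Aus bus", "Linnaeus 1758",
              accepted_name_id="u4", original_name_id="u2", parent_name_id="u1"),
        Usage("u3", "Xus", "Smith"),
        Usage("u4", "Xus bus", "Smith",
              accepted_name_id="u4", original_name_id="u2", parent_name_id="u3"),
    ]
}


def follow(record_id: str, link: str) -> Optional[Usage]:
    """Resolve one of the three links to another usage in the same table."""
    target_id = getattr(records[record_id], link)
    return records[target_id] if target_id is not None else None


# The combination by Smith points back to the protonym by Linnaeus,
# and up to Smith's own treatment of the genus.
print(follow("u4", "original_name_id").name)
print(follow("u4", "parent_name_id").according_to)
```

All three links resolve within the same table of usages, which is the sense in which they are recursive.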
...
I've indicated my guess about some of the key properties of 
each of the four "classes" (including also Identifications) 
by arrows pointing away from the boxes.  In a lot of cases 
there are both xyz and xyzID terms, where xyz would be for a 
string literal and xyzID would be for a GUID (e.g. URI).  But 
they would behave the same way, so I only showed the xyzID 
version in the diagram.
Most of these seem about right to me, but I haven't examined them in detail
yet.
...
Here is a use case for me.  I refer to the Gleason and 
Cronquist key and the Golden Guide to the Trees to identify a 
tree that I've documented in an Occurrence.  identifiedBy 
would be me.  nameAccordingTo would be a reference to the 
Gleason and Cronquist treatment.  namePublishedIn would be 
the original publication for the species name.
Agreed.
...
identificationReferences would be the Golden Guide and the 
Gleason and Cronquist key.
Not sure -- I guess this presumes that the taxon concept represented in G&C
is congruent with the taxon concept represented in the Golden Guide.
...
If somebody like Pete had created 
a URI to represent the concept, I could refer to that using 
taxonConceptID.
Right...and presumably this taxonConceptID would include the fact that the
G&C TNU and the Golden Guide TNU represent congruent Concepts.
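
As a rough sketch of how that use case might be written down, the Python below splits the terms between an Identification record and a Taxon record. The term names come from the discussion above; the values are placeholders, the original publication and the concept URI are left empty because they are not given here, and the helper function is hypothetical.

```python
# Identification-side terms (placeholder values).
identification = {
    "identifiedBy": "Steve",
    "identificationReferences": [
        "Gleason and Cronquist key",
        "Golden Guide to the Trees",
    ],
}

# Taxon-side terms (placeholder values).
taxon = {
    "nameAccordingTo": "Gleason and Cronquist",
    "namePublishedIn": None,  # original publication, not given here
    "taxonConceptID": None,   # would hold a concept URI if one existed
}


def presumed_congruent(identification: dict, taxon: dict) -> list:
    """List the references other than the nameAccordingTo source.

    Citing these alongside the nameAccordingTo treatment presumes that
    their taxon concepts are congruent with it.
    """
    source = taxon["nameAccordingTo"]
    return [
        ref
        for ref in identification["identificationReferences"]
        if not ref.startswith(source)
    ]


print(presumed_congruent(identification, taxon))
```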
...
So is this anything close to reality?
I would say very close...but I didn't study your email or diagram in detail.

Aloha,
Rich

Richard Pyle