GUIDs for Taxon Names and Taxon Concepts

Fri Nov 11 13:26:24 CET 2005

Hi Rich,

I think you are wrong in your conclusion that we do not need GUIDs for
TaxonNames and TaxonObjects, but I need to draw diagrams to illustrate the
case, so I have created a wiki page here:

http://wiki.gbif.org/guidwiki/wikka.php?wakka=SeparateNamesAndConcepts

I also intersperse some comments below.

Richard Pyle wrote:
> Hi Roger,
>
>
>> Rich - yes. I think you sum up the difference between new combinations
>> in ICZN and ICBN well. But... just because the ICZN does not consider
>> the usage of a name in a different genus as a nomenclatural act does not
>> stop us creating a data object (TaxonName object) to represent what that
>> name looks like in the new genus - perhaps with its new ending and
>> possibly different author string (ICZN Recommendation 51G).
>>
>
> I agree that nothing is stopping us, but I foresee headaches down the road as
> a result of doing so.  In your next message, you wrote:
>
> "1. We have two kinds of GUID (one for TaxonNames and one for
> TaxonConcepts)."
>
> From this I interpret that TaxonName GUIDs are different from TaxonConcept
> GUIDs -- is that correct?  If so, the problem is that a botanist would
> assign a new TaxonName GUID to a new combination, and a zoologist would
> assign a TaxonConcept GUID to the same entity, because "genus combination"
> is a property of a name in botany, and a property of a usage (~concept) in
> zoology.
>
> Yes, you could certainly force-treat zoological names as though they were
> botanical names (treating new combinations as "new names"), just as you
> could easily force-treat botanical names as though they were zoological
> names (assigning TaxonName GUIDs only to basionyms, and representing new
> combinations via Usage/TaxonConcept GUIDs).  I just believe that we will
> come to regret it if we leave the distinction "fuzzy".
>
>
I am not suggesting we leave it fuzzy; I am suggesting we leave it up to
the nomenclators, and that it isn't a GUID issue. I don't leave the
maintenance of the brakes on my car fuzzy by not fixing them myself - I
delegate it to a mechanic. Just because we are not solving the problem
here does not mean that we are not going to get the problem solved.
> My point has always been, and continues to be, that *IF* you have separate
> "kinds of GUID" for TaxonNames and TaxonConcepts, the line between the two
> should be unambiguous (and ideally be consistent for both botany and
> zoology).  After thinking about it some more, in the context of what has
> been written on this thread, I find myself coming back to that first "IF".
> Given that there seems to be a need and a desire to leave the definition of
> a "name unit" (to which a GUID is assigned) loose and flexible, then perhaps
> it would be premature to establish a GUID system for Name-Objects at all.
> Instead, I think we can both simplify *and* disambiguate taxonomic objects
> if there is only *ONE* GUID system -- which represents a Name-Usage
> instance.
>
Then don't refer to the TaxonName GUIDs issued by nomenclators when you
issue your TaxonConcepts. Just ignore the nomenclator bit. If you are
correct then everyone will follow your lead. A two-GUID system will just
degrade to a one-GUID system if the names thing is wrong.
> In the case of datasets consisting of only namestrings, with no specific
> implied concept objects, the names can be interpreted as "NameString SEC
> Nobody" (=Nominal TaxonConcept in TCS). Nomenclators could use whichever
> subset of these Name-usage GUIDs that they wish to refer to their own
> version of a name-object. For example, ZooBank can concern itself only with
> those GUIDs attached to original basionym usage instances, and IPNI can
> manage both basionym usage instances and new-combination usage instances.
> ConceptBank could expand the scope of GUIDs to all those usage instances
> that represent defined concepts. Name indexers could use the broadest set of
> GUIDs (effectively all name-usage instances).
>
I think you are mixing issues here. The name string probably represents
a TaxonConcept sec the dataset, depending on your intent. If you have
had people scoring observations to a checklist for an area for a number
of years, then it would be convenient to treat the names as concepts so
that you can reason about how they may be related to other, better
defined concepts. They are concepts sec your list. If they are just
words you found in some books, then they are just words until you intend
to do something with them.

A nomenclator should not be publishing data other than nomenclatural
data (what counts as nomenclatural is debatable, I know, but no
circumscription data anyway).

ConceptBank sounds like any indexer (like GBIF) that crawls the people
who are defining concepts of taxa and tries to provide sensible query
expansion on the basis of the GUIDs or however else it links the objects
together.

> So, in summary, my feeling is that if we are to think of "two kinds of GUID"
> for taxon objects, then we need an unambiguous (and cross-Code) distinction
> between the two.  If such a distinction is too cumbersome to draw (as it
> seems to be, based on the current thread), then we should only go with one
> kind of GUID -- and by default it should be the one kind that is most
> flexible (=usage instance).
>
>
See the wiki page for why I think this is an erroneous conclusion.

Here is a definition of the two types of GUID:

   1. A TaxonName GUID resolves to a data object that only contains
      information about nomenclature. The provider does not intend
      anyone to be able to identify a specimen to this GUID.
   2. A TaxonConcept GUID resolves to a data object that contains
      information about the delimitation/circumscription/relationships
      of a taxon. The provider intends people to be able to identify
      specimens to this GUID or otherwise relate data to it.
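
As a rough sketch of that split, here is how the two kinds of object
might look in Python. The class names, field names, and the placeholder
taxon "Aus bus" are all assumptions made up for illustration; none of
this is taken from TCS or from any agreed schema. The only point is that
the name object carries nomenclature alone, the concept object carries
the circumscription source, and the link from concept to name is
optional.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


def new_guid() -> str:
    """Mint an opaque identifier (a random UUID URN, purely illustrative)."""
    return uuid.uuid4().urn


@dataclass(frozen=True)
class TaxonName:
    """Nomenclatural data only. Not intended as an identification target."""
    guid: str
    name_string: str
    authorship: str
    basionym_guid: Optional[str] = None  # link to another TaxonName, if any


@dataclass(frozen=True)
class TaxonConcept:
    """Circumscription data. Intended as a target for identifications."""
    guid: str
    sec: str                         # the "according to" source
    label: str                       # name string as used by that source
    name_guid: Optional[str] = None  # reference to a nomenclator's TaxonName


def can_identify_to(obj: object) -> bool:
    """The provider's intent decides: only concepts accept identifications."""
    return isinstance(obj, TaxonConcept)


# A nomenclator publishes a name (hypothetical placeholder taxon).
name = TaxonName(guid=new_guid(), name_string="Aus bus", authorship="Smith, 1900")

# A checklist author publishes a concept that points at that name.
concept = TaxonConcept(
    guid=new_guid(), sec="Example Checklist 2005",
    label="Aus bus", name_guid=name.guid,
)

# Someone who prefers a single kind of GUID simply leaves name_guid empty.
nameless = TaxonConcept(guid=new_guid(), sec="Example Checklist 2005", label="Aus bus")
```

The last object shows the fallback mentioned earlier in this message: a
provider who does not want to use nomenclator GUIDs just omits the
reference, and nothing else in the model has to change.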

I think the notion of the intention of the provider is very important as
it answers many questions. "Should I attach a GUID to object X?" depends
on what you want other people to do with it. I don't think this is a very
cumbersome definition. There is no 'natural' distinction between these
two things. It is a pragmatic split so we produce a series of 'rooted'
directed graphs of

All the best,

Roger

--

-------------------------------------
 Roger Hyam
 Technical Architect
 Taxonomic Databases Working Group
-------------------------------------
 http://www.tdwg.org
 roger at tdwg.org
 +44 1578 722782
-------------------------------------
