[ Another topic for comments.  Please keep the Topic number in responses. ]

 

Topic 3: GUIDs for Taxon Names and Taxon Concepts

 

Another key area in which TDWG has recognised the need for globally unique identifiers is in connection with taxon names and the various concepts associated with them.  This issue actually also intersects with that of identifiers for taxonomic publications. 

 

Definitions

 

In the following discussion, a “taxon name” is a scientific name string which simply identifies a name assigned in the taxonomic literature.  In many cases such a name may have been applied in different ways by the original author and subsequent taxonomists.  Each such application of a taxon name by a taxonomist to a set of organisms is here referred to as a “taxon concept”.  An understanding of the taxon concept adopted by a researcher is frequently essential if data are to be interpreted correctly.  In its most basic form a “taxon concept” can be considered to be the use of a given “taxon name” in a given “taxonomic publication”, in other words something that could be represented as, “Agenus aspecies Author1 Year1 sec. Author2 Year2”.    One possible approach to assigning identifiers to taxon concepts would therefore be to assign identifiers to taxon names and to taxonomic publications and to use a combination these identifiers to identify each taxon concept.

 

Note that a taxon concept may be defined at least in part by a set of assertions about the relationship between the present concept and the concepts adopted by earlier taxonomists.  In addition it is possible for other researchers to make their own assertions about the relationships between the concepts published by different taxonomists.  Much of the interest and value to be gained from modeling taxonomy relates to the interpretation of these asserted relationships.

 

Although the distinction between taxon names and taxon concepts may seem (over-)subtle, it is important that we should know whether we are referring simply to a nomenclaturally valid name, quite independently of any set of organisms to which it may be applied, or to a taxon concept which somehow applies such a name to such a set of organisms.  Without this distinction, we will be restricted in our ability to develop biodiversity informatics, although of course there will be many cases in which all we can say is that a data set refers to some unspecified taxon concept associated with a given taxon name.  

 

Identifiers

 

Clearly there are many situations in which a taxon name can itself be treated as a unique identifier without any apparent ambiguity about which name is being referenced (e.g. Turdus merula; Poa annua), but the existence of homonyms prevents this from being generally true.  Even when taxon names include citations of the original publications (e.g. Turdus merula Linnaeus, 1758; Poa annua L.), they can be very difficult to compare since the form of the citations may vary greatly.  Even where there is no ambiguity about which name is being referenced, such a name does not by itself serve to identify which associated concept is being referenced.

 

There are many different systems in place for associating other identifiers with either taxon names or taxon concepts.  ITIS (http://www.itis.usda.gov/, http://www.cbif.gc.ca/pls/itisca, http://siit.conabio.gob.mx/) assigns Taxonomic Serial Numbers (TSNs) to each name in its system.  Other species databases have their own identifiers for taxon concepts.  Recording schemes often have their own identifiers for taxa (e.g. Bradley and Fletcher numbers for Lepidoptera in the UK, various systems of four-letter codes for North American bird species).  These are often used to provide some stability and clarity in the taxonomy used by a given project.

 

Questions

 

I would like therefore to ask the following questions of any of you who use scientific names in your databases (either taxonomic databases recording a list of taxa, or databases recording information about taxa, specimens, observations, etc.):

 

  1. Is your data organised using taxon names or to taxon concepts?
  2. Do you assign any reusable identifiers to taxon names or concepts (i.e. identifiers used in more than one database)?
  3. If so, what is the process in assigning new identifiers for additional taxa and for accommodating taxonomic change?
  4. Where are these identifiers used (other organizations, databases, data exchange, recording forms, etc.)?
  5. Do you use identifiers from any external classification within your database?
  6. Would there be any social or technical roadblocks to replacing these identifiers with a single identifier that was guaranteed to be unique?

 

As before I am looking for information on existing practices and any requirements that would need to be accommodated within any general system of identifiers.

 

Thanks,

 

Donald
 
---------------------------------------------------------------
Donald Hobern (dhobern@gbif.org)
Programme Officer for Data Access and Database Interoperability
Global Biodiversity Information Facility Secretariat
Universitetsparken 15, DK-2100 Copenhagen, Denmark
Tel: +45-35321483   Mobile: +45-28751483   Fax: +45-35321480
---------------------------------------------------------------