[tdwg-content] canonicalScientificName

Donald Hobern (GBIF) dhobern at gbif.org
Wed Mar 14 18:36:24 CET 2012


Hi Rod.

 

Given the two alternatives you offer, I fully agree.  However (and I may be
very wrong on this), I don’t actually believe that what you describe is what
is happening in most cases.  I believe that the scientificName field should
always be the “give us the cleanest representation of the scientific name
you have” and we could certainly provide a more useful set of priorities for
how that is defined.  Regardless of what the DwC guidelines say, I think
many data providers simply map their most closely aligned database column
into the DwC view and we get whatever that happens to be, with whatever
authorship it may contain.

 

I’d agree that the clean name is much closer to what a web-enabled
linked-data world needs and would happily endorse a move to make that the
recommended form.  I just honestly believe that many providers will always
give us something different.  Adding a new field will just shift the
problem.

 

At least that’s my perspective.


 

All the best,

 

Donald

 

 

----------------------------------------------------------------------

Donald Hobern - GBIF Director - dhobern at gbif.org

Global Biodiversity Information Facility http://www.gbif.org/

GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark

Tel: +45 3532 1471  Mob: +45 2875 1471  Fax: +45 2875 1480

----------------------------------------------------------------------

 

From: tdwg-content-bounces at lists.tdwg.org
[mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Roderic Page
Sent: Wednesday, March 14, 2012 5:50 PM
To: TDWG Content Mailing List
Subject: Re: [tdwg-content] canonicalScientificName

 

Dear Donald,

 

I couldn't disagree more!

 

It seems to me that this is one case where needs of consumers and providers
align pretty well.

 

If I'm publishing data I want to avoid hassle, and one hassle is finding the
taxonomic authorities for names. Then there is the issue of how to write the
authority. There are so many variables: do I include diacritic characters?
is the person's name abbreviated? what is the correct date? should I use
parentheses? should I use commas? If I can just publish the canonical name
life is simpler.

 

As a consumer I can't trust people to get the authority right. Publishers
get the taxonomic names wrong, and they will certainly make a mess of the
authority.

 

So, if we mandate clean names we are saying to providers "give me this"

 

<taxonomic name>

[some scope for crap]

 

Instead, we've mandated "give me this"

 

<taxonomic name>      +    <authority>

[some scope for crap]  +  [huge scope for crap]

 

Why? Why would we do this to ourselves? Why do we think it's OK to have
databases full of duplicates such as these (from the ION database)?:

 

Pseudopaludicola Miranda Ribeiro 1926

Pseudopaludicola Mir. Ribeiro 1926

Pseudopaludicola Miranda-Ribeiro 1926
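
To make the duplication concrete, here is a minimal sketch (not part of the original message) of how the three strings above collapse to a single canonical name once the authority is dropped. The helper naive_canonical is hypothetical and deliberately naive: it assumes the name starts with a capitalised genus, that any epithets are all-lowercase words, and that the authority begins at the first token that is anything else. It is not a real name parser.

```python
import re

# Assumption: an epithet is an all-lowercase word (hyphens allowed).
# Anything else (capitalised word, year, parenthesis) is treated as
# the start of the authority string.
_EPITHET = re.compile(r"^[a-z][a-z-]*$")


def naive_canonical(name_string: str) -> str:
    """Return the leading genus plus any lowercase epithets (hypothetical helper)."""
    tokens = name_string.split()
    if not tokens:
        return ""
    canonical = [tokens[0]]  # genus (or other uninomial)
    for token in tokens[1:]:
        if not _EPITHET.match(token):
            break  # first non-epithet token: assume the authority starts here
        canonical.append(token)
    return " ".join(canonical)


variants = [
    "Pseudopaludicola Miranda Ribeiro 1926",
    "Pseudopaludicola Mir. Ribeiro 1926",
    "Pseudopaludicola Miranda-Ribeiro 1926",
]

# All three authority spellings reduce to the same canonical name.
distinct = {naive_canonical(v) for v in variants}
print(distinct)  # {'Pseudopaludicola'}
```

The heuristic breaks on authorities that begin with a lowercase particle (for example "de" or "von"), which is one more illustration of how much scope for error the authority string carries.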

 

One consequence of this is that we have projects like http://globalnames.org,
which is essentially collecting endless variations on authority
strings. In other words, we are trying to clean up a mess essentially of our
own making.

 

By all means have a field for taxonomic authority, but keep that separate
from the canonical taxonomic name. In the real world, the canonical name is
what people use. If we want people to make data available, make it simple.
If we want people to use data, make it simple.

 

Regards

 

Rod

 

 

 

On 14 Mar 2012, at 11:16, Donald Hobern wrote:





Hi Peter.

 

I certainly sympathise with the desire for a readily-consumed naked
scientific name field.  However, unless the canonicalScientificName element
is enforced as a mandatory field (which would in itself impact some data
publishers and may prevent them validly sharing their data without extra
work to provide clean scientific names), it will be yet another element
which data consumers must check.  If canonicalScientificName is supplied,
consumers will still need to handle cases where it is malformed.  If it is not
supplied, they will need to ignore the record or else do precisely what they
do today with the scientificName field.  

 

I therefore worry that adding this field could in fact make the task more
complex, rather than simpler, for data consumers.
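
The consumer-side branching described above can be sketched as follows (an illustration added for clarity, not part of the original message). It assumes a record is a plain dict keyed by term name, and that a "well-formed" canonical name is a capitalised genus followed by at most two lowercase epithets. Both the function name_for_matching and that well-formedness rule are hypothetical choices for this example.

```python
import re
from typing import Optional

# Assumption: a well-formed canonical name is a capitalised genus
# followed by zero to two all-lowercase epithets, with no authority.
_CANONICAL = re.compile(r"^[A-Z][a-z]+( [a-z][a-z-]+){0,2}$")


def name_for_matching(record: dict) -> Optional[str]:
    """Pick the name string a consumer would work with (hypothetical helper)."""
    canonical = (record.get("canonicalScientificName") or "").strip()
    if canonical and _CANONICAL.match(canonical):
        # Supplied and well-formed: use it directly.
        return canonical
    # Missing or malformed: do what consumers do today with scientificName.
    fallback = (record.get("scientificName") or "").strip()
    # None means the consumer has to ignore the record.
    return fallback or None


clean = {"canonicalScientificName": "Pseudopaludicola",
         "scientificName": "Pseudopaludicola Miranda-Ribeiro 1926"}
malformed = {"canonicalScientificName": "Pseudopaludicola Mir. Ribeiro 1926",
             "scientificName": "Pseudopaludicola Mir. Ribeiro 1926"}
missing = {"scientificName": "Pseudopaludicola Miranda Ribeiro 1926"}
empty = {}

for rec in (clean, malformed, missing, empty):
    print(name_for_matching(rec))
```

Only the first record avoids the existing scientificName handling; the other three still need the code path that consumers maintain today, which is the extra check the paragraph above is concerned about.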

 

Thanks,

 

Donald

 


_______________________________________________
tdwg-content mailing list
tdwg-content at lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content

 

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792

Skype: rdmpage
AIM: rodpage1962 at aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html

 
