[tdwg-tag] [tdwg-content] canonicalScientificName

Peter Desmet peter.desmet at umontreal.ca
Wed Mar 14 19:03:44 CET 2012


Hi Donald,

I just don't understand why an additional field canonicalScientificName
(much like minimumElevationInMeters or countryCode) would create or shift
the problem. It has a clear and easy-to-understand definition [1]. As a
data provider I can assess whether it's too much of a bother to populate the
field in addition to the more lenient scientificName (just like
countryCode, etc.), and as a data user I can complain about or ignore the
data if I see the provider didn't follow the definition.
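
To make the data-user side concrete, here is a minimal Python sketch of
that choice. It assumes a record is a plain dict keyed by term name;
pick_name and crude_check are hypothetical names, and
canonicalScientificName is the proposed term, not an existing Darwin Core
term. The check is deliberately crude and only illustrative.

```python
def crude_check(name):
    # Illustrative assumption: a clean name has 1-3 words and no digits
    # or authorship punctuation. Not an official rule.
    words = name.split()
    has_debris = any(ch.isdigit() or ch in "(),.&" for ch in name)
    return 1 <= len(words) <= 3 and not has_debris


def pick_name(record, is_valid=crude_check):
    """Return (name, is_clean) for one record.

    Use the proposed canonicalScientificName when it is present and passes
    the check; otherwise fall back to scientificName, as consumers do today.
    """
    canonical = (record.get("canonicalScientificName") or "").strip()
    if canonical and is_valid(canonical):
        return canonical, True
    return (record.get("scientificName") or "").strip(), False
```

A consumer that prefers to ignore non-conforming records could simply
drop every record where the second value is False.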

This discussion resurfaces every time: clearly users would like to have
this. So why do we make it difficult for those users and the data providers
who can provide it?

Peter

[1] http://code.google.com/p/darwincore/issues/detail?id=150
I could even add "[...] it will contain one, two or three words" (see
Jessie Kennedy's email).
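
As an illustration of how cheaply that "one, two or three words" wording
could be checked, a minimal Python sketch follows. The function name
is_canonical and the regular expression are my own assumptions, not part
of the proposal: it only tests the shape (a capitalised first word
followed by at most two lowercase epithets, ASCII letters and hyphens
only) and ignores hybrid markers, rank markers and subgenera.

```python
import re

# Assumed shape: capitalised first word, then up to two lowercase epithets.
_CANONICAL_SHAPE = re.compile(r"[A-Z][a-z-]+( [a-z-]+){0,2}")


def is_canonical(name):
    """True if the whole string matches the assumed 1-3 word shape."""
    return _CANONICAL_SHAPE.fullmatch(name) is not None
```

With this check, "Puma concolor" passes, while a string that carries an
authorship such as "Pseudopaludicola Miranda Ribeiro 1926" does not.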

PS: Yes, some people probably interpret scientificName as "give us the
cleanest representation of the scientific name you have", but that is
definitely not the Darwin Core definition. It's more "give us the most
verbose representation of the scientific name you have", which is what I
use and what I advise all data publishers in my network to use. The
difference doesn't matter though; the important thing is that a data user
cannot expect this term to be a canonical representation of the name.

On Wed, Mar 14, 2012 at 13:36, Donald Hobern (GBIF) <dhobern at gbif.org> wrote:

> Hi Rod.
>
> Given the two alternatives you offer, I fully agree.  However (and I may
> be very wrong on this), I don’t actually believe that what you describe is
> what is happening in most cases.  I believe that the scientificName field
> should always be the “give us the cleanest representation of the scientific
> name you have” and we could certainly provide a more useful set of
> priorities for how that is defined.  Regardless of what the DwC guidelines
> say, I think many data providers simply map their most closely aligned
> database column into the DwC view and we get whatever that happens to be,
> with whatever authorship it may contain.
>
> I’d agree that the clean name is much closer to what a web-enabled
> linked-data world needs and would happily endorse a move to make that the
> recommended form.  I just honestly believe that many providers will always
> give us something different.  Adding a new field will just shift the
> problem.
>
> At least that’s my perspective…
>
> All the best,
>
> Donald
>
> ----------------------------------------------------------------------
> Donald Hobern - GBIF Director - dhobern at gbif.org
> Global Biodiversity Information Facility http://www.gbif.org/
> GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
> Tel: +45 3532 1471  Mob: +45 2875 1471  Fax: +45 2875 1480
> ----------------------------------------------------------------------
>
> From: tdwg-content-bounces at lists.tdwg.org [mailto:
> tdwg-content-bounces at lists.tdwg.org] On Behalf Of Roderic Page
> Sent: Wednesday, March 14, 2012 5:50 PM
> To: TDWG Content Mailing List
> Subject: Re: [tdwg-content] canonicalScientificName
>
> Dear Donald,
>
> I couldn't disagree more!
>
> It seems to me that this is one case where needs of consumers and
> providers align pretty well.
>
> If I'm publishing data I want to avoid hassle, and one hassle is finding
> the taxonomic authorities for names. Then there is the issue of how to
> write the authority. There are so many variables: do I include diacritic
> characters? is the person's name abbreviated? what is the correct date?
> should I use parentheses? should I use commas? If I can just publish the
> canonical name life is simpler.
>
> As a consumer I can't trust people to get the authority right. Publishers
> get the taxonomic names wrong, and they will certainly make a mess of the
> authority.
>
> So, if we mandate clean names we are saying to providers "give me this"
>
> <taxonomic name>
> [some scope for crap]
>
> Instead, we've mandated "give me this"
>
> <taxonomic name>      +    <authority>
> [some scope for crap]  +  [huge scope for crap]
>
> Why? Why would we do this to ourselves? Why do we think it's OK to have
> databases full of duplicates such as these (from the ION database)?:
>
> Pseudopaludicola Miranda Ribeiro 1926
> Pseudopaludicola Mir. Ribeiro 1926
> Pseudopaludicola Miranda-Ribeiro 1926
>
> One consequence of this is that we have projects like
> http://globalnames.org, which is essentially collecting endless
> variations on authority strings. In other words, trying to clean up a mess
> essentially of our own making.
>
> By all means have a field for taxonomic authority, but keep that separate
> from the canonical taxonomic name. In the real world, the canonical name is
> what people use. If we want people to make data available, make it simple.
> If we want people to use data, make it simple.
>
> Regards
>
> Rod
>
> On 14 Mar 2012, at 11:16, Donald Hobern wrote:
>
> Hi Peter.
>
> I certainly sympathise with the desire for a readily-consumed naked
> scientific name field.  However, unless the canonicalScientificName element
> is enforced as a mandatory field (which would in itself impact some data
> publishers and may prevent them validly sharing their data without extra
> work to provide clean scientific names), it will be yet another element
> which data consumers must check.  If canonicalScientificName is supplied,
> consumers will still need to handle cases where it is malformed.  If it is
> not supplied, they will need to ignore the record or else do precisely what
> they do today with the scientificName field.
>
> I therefore worry that adding this field could in fact make the task more
> complex, rather than simpler, for data consumers.
>
> Thanks,
>
> Donald
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
> ---------------------------------------------------------
> Roderic Page
> Professor of Taxonomy
> Institute of Biodiversity, Animal Health and Comparative Medicine
> College of Medical, Veterinary and Life Sciences
> Graham Kerr Building
> University of Glasgow
> Glasgow G12 8QQ, UK
>
> Email: r.page at bio.gla.ac.uk
> Tel: +44 141 330 4778
> Fax: +44 141 330 2792
>
> Skype: rdmpage
> AIM: rodpage1962 at aim.com
> Facebook: http://www.facebook.com/profile.php?id=1112517192
> Twitter: http://twitter.com/rdmpage
> Blog: http://iphylo.blogspot.com
> Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>
>
>


-- 
Peter Desmet
Biodiversity Informatics Manager
Canadensys - www.canadensys.net

Université de Montréal Biodiversity Centre
4101 rue Sherbrooke est
Montreal, QC, H1X2B2
Canada

Phone: 514-343-6111 #82354
Fax: 514-343-2288
Email: peter.desmet at umontreal.ca / peter.desmet.cubc at gmail.com
Skype: anderhalv
Public profile: http://www.linkedin.com/in/peterdesmet