[tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad?

David Remsen (GBIF) dremsen at gbif.org
Wed Nov 24 09:31:19 CET 2010


Several name-parsing services already exist to provide this functionality:

http://tools.gbif.org/nameparser/
http://gni.globalnames.org/parsers/new

My personal philosophy regarding data sharing is that,  whenever  
practical, the burden should be placed on the enabling infrastructure  
and not the person wanting to share their data.   Put enough  
impediments in the way and the pipes will stay empty.

I prefer a loose definition of scientificName, one that recommends
parsing authority information into the authorship element but does not
require it.  The parsing services could be made more easily
available and even incorporated into publishing tools.  I think we
will be dealing with this issue whether canonicalName is available or
not.
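
To make the idea concrete, here is a rough sketch of the kind of split
such a service performs.  It is not how the GBIF or GNI parsers work;
the pattern and the function name split_name are my own assumptions for
illustration, and it simply treats a capitalised genus plus one or two
lower-case epithets as the name and whatever follows as the authorship.

```python
import re

# Hypothetical pattern (an assumption, not taken from any real parser):
# a capitalised genus, then one or two lower-case epithets; anything
# after that is treated as the authorship string.
NAME_RE = re.compile(r"^\s*([A-Z][a-z]+(?:\s+[a-z][a-z-]+){1,2})\s*(.*?)\s*$")


def split_name(verbatim):
    """Return (canonical_name, authorship); authorship may be ''."""
    m = NAME_RE.match(verbatim)
    if m is None:
        # Not recognisable as a binomial or trinomial: pass it through.
        return verbatim.strip(), ""
    canonical = " ".join(m.group(1).split())  # collapse repeated spaces
    return canonical, m.group(2)


# Emit tab-separated name and authorship, one record per line.
for raw in ["Puma concolor (Linnaeus 1771)", "Puma concolor"]:
    print("\t".join(split_name(raw)))
```

A pattern this naive will trip over hybrids, rank markers such as
"subsp.", and authors whose names start in lower case, which is exactly
why this work is better left to a shared, maintained service than to
each data publisher.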

For GBIF at least, if authorship issues like this were the biggest
data processing problem we faced, we would gladly take it.  If you
want to see what we are dealing with at the moment, look for example
at the values that specimen and observation data supply in the
dwc:Family element.  It's a bit of an eye opener, and I have a file!

DR

On Nov 24, 2010, at 4:46 AM, Peter DeVries wrote:

> My vote would be to clarify the use of scientific name to not  
> include authorship as Rich suggests.
>
> Perhaps a partial solution would be for the GNI or GBIF to provide
> some web service that end users could use to clean and parse their
> names into their dwc:scientificName and authorship parts. (They
> probably have something close to this already.)
>
> For ease of use the system could output something like this
>
> Puma concolor <tab>  (Linnaeus 1771)
>
> In the process they could flag potentially incorrect uses of
> parentheses etc.
>
> Puma concolor <tab>  Linnaeus 1771 <tab> Note Potentially incorrect
> authorship - parentheses missing
>
> or
>
> Felis concolor <tab>  Linnaeus 1771 <tab> Note Do you mean "Puma
> concolor  (Linnaeus 1771)"?
>
> A beneficial side effect would be that everyone has a more  
> normalized and accurate species list.
>
> Respectfully,
>
> - Pete
>
> On Tue, Nov 23, 2010 at 6:40 PM, <Tony.Rees at csiro.au> wrote:
> Rich,
>
> No need to apologise... Actually it affects the aggregators in two  
> respects, one is the larger vs. more compact data representation,  
> the other is the present inconsistency about what is actually  
> expected/supplied in practice by real world data providers in the  
> present "scientificName" element. If it was clearer that this was  
> for sciname + author, and the sciname without author had its own  
> dedicated element, the incoming data would (might) be potentially a  
> lot more consistent.
>
> Basically it is the present "scientificNameAuthor" element which is  
> clouding the issue - people see this and then think they do not need  
> to add the author in to "scientificName" as well, although as  
> previously stated by Markus this is technically incorrect according  
> to the DwC spec (and I can see the argument for keeping it that way,  
> so as to capture as much info as possible in that field).
>
> Cheers - Tony
>
>
> > -----Original Message-----
> > From: Richard Pyle [mailto:deepreef at bishopmuseum.org]
> > Sent: Wednesday, 24 November 2010 11:27 AM
> > To: Rees, Tony (CMAR, Hobart); Chuck.Miller at mobot.org; dremsen at gbif.org
> > Cc: tdwg-content at lists.tdwg.org; dmozzherin at eol.org
> > Subject: RE: [tdwg-content] [tdwg-tag] Inclusion of authorship in
> > DwCscientificName: good or bad?
> >
> >
> > OK, understood.
> >
> > But I guess my next question would be: is this really "bloat"?
> > Isn't the cost of the bloat much less than the value of providing
> > fully parsed content?
> >
> > I now understand what I think is a large part of the basis for our
> > (perhaps non-existent?) disagreement: I'm thinking of dwc terms in
> > the abstract sense, whereas you are thinking in terms of more
> > practical issues such as the MB size of your DwCA files.  This also
> > clarifies for me why you keep saying that it's really a question
> > for the big aggregators (which I now understand and agree with).
> >
> > Sorry if I was misunderstanding where you are coming from on this!
> >
> > Aloha,
> > Rich
> >
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
>
>
> -- 
> ---------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
