My vote would be to clarify that the scientific name should not include authorship, as Rich suggests.

Perhaps a partial solution would be for GNI or GBIF to provide a web service that end users could use to clean and parse their names into their dwc:scientificName and authorship parts. (They probably have something close to this already.)

For ease of use, the system could output something like this:

Puma concolor <tab>  (Linnaeus 1771) 

In the process it could flag potentially incorrect uses of parentheses, etc.

Puma concolor <tab>  Linnaeus 1771 <tab> Note Potentially incorrect authorship - parentheses missing

or

Felis concolor <tab>  Linnaeus 1771 <tab> Note Do you mean "Puma concolor (Linnaeus 1771)"?

A beneficial side effect would be that everyone ends up with a more normalized and accurate species list.
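
To make the idea a bit more concrete, here is a rough Python sketch of what such a cleaning step might look like. It is only an illustration under some loud assumptions: the REFERENCE and SYNONYMS tables, the split_name and clean functions, and the note wording are all made up for this example, and a real service would look names up in GNI/GBIF data rather than a two-entry table. The regular expression is also naive (it would mis-split authors whose names begin with a lowercase particle), so please read it as a sketch and not as how any existing parser works.

```python
import re

# Hypothetical reference data: accepted name -> expected authorship string.
REFERENCE = {
    "Puma concolor": "(Linnaeus 1771)",
}

# Hypothetical map from an older combination to the name currently in use.
SYNONYMS = {
    "Felis concolor": "Puma concolor",
}

# Capitalised genus, then one or two lowercase epithets, then whatever is left.
NAME_RE = re.compile(r"^\s*([A-Z][a-z]+(?:\s+[a-z][a-z-]+){1,2})\s*(.*?)\s*$")


def split_name(raw):
    """Split a raw name string into (scientific name, authorship)."""
    m = NAME_RE.match(raw)
    if m is None:
        # Not recognisable as a binomial/trinomial: return it unparsed.
        return raw.strip(), ""
    canonical = " ".join(m.group(1).split())  # collapse repeated whitespace
    return canonical, m.group(2)


def clean(raw):
    """Return a tab-separated line: name, authorship, and an optional note."""
    name, author = split_name(raw)
    fields = [name, author]
    if name in SYNONYMS:
        current = SYNONYMS[name]
        fields.append('Note Do you mean "%s %s"' % (current, REFERENCE[current]))
    elif name in REFERENCE and author and author != REFERENCE[name]:
        expected = REFERENCE[name]
        if expected.startswith("(") and expected.strip("()") == author:
            fields.append("Note Potentially incorrect authorship - parentheses missing")
        else:
            fields.append("Note Authorship differs from reference: %s" % expected)
    return "\t".join(fields)


if __name__ == "__main__":
    for raw in ("Puma concolor (Linnaeus 1771)",
                "Puma concolor Linnaeus 1771",
                "Felis concolor Linnaeus 1771"):
        print(clean(raw))
```

Run on the three examples above, this prints one tab-separated line per name, with a third column only when something looks off. The point of keeping the name and authorship in separate columns is that providers could paste them straight into separate fields.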

Respectfully,

- Pete

On Tue, Nov 23, 2010 at 6:40 PM, <Tony.Rees@csiro.au> wrote:
Rich,

No need to apologise... Actually it affects the aggregators in two respects: one is the larger vs. more compact data representation, the other is the present inconsistency about what is actually expected/supplied in practice by real world data providers in the present "scientificName" element. If it were clearer that this was for sciname + author, and the sciname without author had its own dedicated element, the incoming data would (or at least might) be a lot more consistent.

Basically it is the present "scientificNameAuthorship" element which is clouding the issue - people see this and then think they do not need to add the author to "scientificName" as well, although as previously stated by Markus this is technically incorrect according to the DwC spec (and I can see the argument for keeping it that way, so as to capture as much info as possible in that field).

Cheers - Tony


> -----Original Message-----
> From: Richard Pyle [mailto:deepreef@bishopmuseum.org]
> Sent: Wednesday, 24 November 2010 11:27 AM
> To: Rees, Tony (CMAR, Hobart); Chuck.Miller@mobot.org; dremsen@gbif.org
> Cc: tdwg-content@lists.tdwg.org; dmozzherin@eol.org
> Subject: RE: [tdwg-content] [tdwg-tag] Inclusion of authorship in
> DwC scientificName: good or bad?
>
>
> OK, understood.
>
> But I guess my next question would be: is this really "bloat"?  Isn't the
> cost of the bloat much less than the value of providing fully parsed
> content?
>
> I now understand what I think is a large part of the basis for our
> (perhaps non-existent?) disagreement: I'm thinking of dwc terms in the
> abstract sense, whereas you are thinking in terms of more practical
> issues such as the MB size of your DwCA files.  This also clarifies for
> me why you keep saying that it's really a question for the big
> aggregators (which I now understand and agree with).
>
> Sorry if I was misunderstanding where you are coming from on this!
>
> Aloha,
> Rich
>


--
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
------------------------------------------------------------