I guess I need to do some more DwC homework.
Is the Genus element in current DwC to be used for the generic epithet of a binomial/trinomial, or for the Genus classification/rank of the species? If the Genus classification/rank and the generic epithet are different, does current DwC have a way to distinguish them?
Isn't an infragenericNameElement just another uninomialNameElement, like Tribe, Subtribe, Subfamily, and all the rest (i.e., a name at the rank of genus and above)?
A generic way to handle this would be firstEpithet, secondEpithet, thirdEpithet, etc. By using a string like infraspecificEpithet, an attribute is added to the position of the epithet. Or by using the string uninomialNameElement, a different attribute is added to the firstEpithet (which is the only epithet in that case). I'm probably overthinking it.
Chuck
-----Original Message-----
From: Markus Döring [mailto:m.doering@mac.com]
Sent: Wednesday, November 24, 2010 3:33 PM
To: Richard Pyle
Cc: Tony.Rees@csiro.au; Chuck Miller; tdwg-content@lists.tdwg.org; dmozzherin@eol.org
Subject: Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad?
I think we are hitting the problem now quite well. A quick response to your suggestion, Rich. Mostly agree with your conclusions:
verbatimScientificName As I suggested in an earlier post, this would be "the complete set of textual elements useful for recognizing a unique scientific name", exactly as they appear in the original source.
Yes for the definition, but I'm not sure that removing scientificName from the DwC terms is a real option. It's the best-known term of all...
uninomialNameElement Used for all names at the rank of genus and above; would also replace "genus" in DwC.
Genus will still be needed to represent the denormalised classification, but not for the parsed bits.
infragenericNameElement Better term for "subgenus".
The same is probably true for subgenus.
specificEpithet As in existing DwC.
infraspecificEpithet As in existing DwC.
scientificNameAuthorship As in existing DwC.
I don't really agree with Tony on the "clutter" argument for introducing a single "canonicalName" term to replace the parsed uninomialNameElement [aka "genus"], infragenericNameElement [aka "subgenus"], specificEpithet, and infraspecificEpithet. (Side question to Tony -- would canonicalName include "var.", "f." etc., hence obviating the need for TaxonRank as well?) After all, "Aus bus xus" requires exactly the same number of bytes as "Aus,bus,xus" in a DwCA file. Of course, if verbatimScientificName [aka scientificName] is required, we'd have redundancy and hence doubling of bytes. However, if defined as verbatimScientificName as above, it would not really be redundant information if the parsed bits were defined as representing the Code-corrected version of the name, and due to the fact that the verbatimScientificName will often be different from a canonical concatenation of the parsed bits according to some standard format/formula.
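The point about a "canonical concatenation of the parsed bits" can be illustrated with a minimal sketch. The function name, the parenthesised subgenus, and the optional rank marker are assumptions made for illustration only; none of them is a formula defined by DwC or agreed in this thread.

```python
from typing import Optional


def canonical_name(
    uninomial: str,
    infrageneric: Optional[str] = None,
    specific: Optional[str] = None,
    infraspecific: Optional[str] = None,
    rank_marker: Optional[str] = None,
) -> str:
    """Join parsed name parts into one string (hypothetical formatting rule)."""
    parts = [uninomial]
    if infrageneric:
        # Assumption: the infrageneric element is shown in parentheses.
        parts.append(f"({infrageneric})")
    if specific:
        parts.append(specific)
    if infraspecific:
        if rank_marker:
            # e.g. "var." or "f."; left out when no marker is supplied.
            parts.append(rank_marker)
        parts.append(infraspecific)
    return " ".join(parts)


# A verbatim string from a source need not match the rebuilt form.
verbatim = "Aus bus v. xus Smith"
rebuilt = canonical_name("Aus", specific="bus", infraspecific="xus", rank_marker="var.")
print(rebuilt)
print(verbatim == rebuilt)
```

Under these assumptions the rebuilt string differs from the verbatim one, which is why carrying both would not be strictly redundant.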
So, to me, the main questions to answer are:
1. How does the existing DwC/DwCA structure fail to meet the needs of providers and/or users, in terms of loss of information, potential for misrepresentation of information, or inefficient or ineffective transfer of information (i.e. overburdening either the provider or the client).
2. What are the most effective and least disruptive ways to correct the failures identified in #1 above, in terms of re-defining existing terms, vs. introducing new (and potentially redundant) terms, vs. a complete new set of terms that may be semantically less confusing to taxonomists (as above)?
Aloha, Rich
-----Original Message-----
From: Markus Döring [mailto:m.doering@mac.com]
Sent: Wednesday, November 24, 2010 2:30 AM
To: Richard Pyle
Cc: Tony.Rees@csiro.au; Chuck.Miller@mobot.org; tdwg-content@lists.tdwg.org; dmozzherin@eol.org
Subject: Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwCscientificName: good or bad?
I just had a quick look at the first few thousand data records coming into OBIS for my region (Australia). Just about every supplier who includes authority as dwc:scientificNameAuthor has used dwc:scientificName "incorrectly" i.e., for the canonical name not the canonical name + author. This data then flows into GBIF, ALA, etc. and circulates in this form. So "users" are already ignoring the definition of dwc:scientificName in practice, it would seem, with no apparent ill effects (?) - not sure whether this is good or bad, hence the title of my original question which prompted this thread...
OK, so here's the question:
Is it more disruptive to re-define dwc:scientificName to explicitly exclude authorship?
That's definitely something I'd like to avoid! We really need one place to keep the most explicit form of the name.
From seeing real data coming in, I would coin the definition for scientificName so that it should *contain the most complete, verbatim name string*. If you happen to have only a canonical, use the canonical. If you happen to have canonical + authorship parsed, join them if you can (it's usually not a simple concatenation, beware).
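One case where joining is not a plain concatenation is the botanical autonym, where by convention the authorship follows the species epithet rather than the end of the string. The sketch below is only an illustration: the function name and parameters are hypothetical, it covers this single case, and real data has many more.

```python
def full_name(
    genus: str,
    specific: str,
    authorship: str = "",
    rank_marker: str = "",
    infraspecific: str = "",
) -> str:
    """Join parsed parts and authorship into one name string (illustrative only)."""
    if not infraspecific:
        # Plain binomial: authorship simply goes at the end.
        return " ".join(p for p in (genus, specific, authorship) if p)
    if rank_marker and infraspecific == specific:
        # Assumed autonym case: authorship stays with the species epithet.
        head = " ".join(p for p in (genus, specific, authorship) if p)
        return f"{head} {rank_marker} {infraspecific}"
    # Otherwise the authorship belongs to the infraspecific name and goes last.
    parts = (genus, specific, rank_marker, infraspecific, authorship)
    return " ".join(p for p in parts if p)


print(full_name("Aus", "bus", "L.", "var.", "bus"))
print(full_name("Aus", "bus", "Smith", "var.", "xus"))
```

With only a canonical name available, the same function returns it unchanged, which matches the "use the canonical" fallback described above.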
Markus
Or, is it more disruptive to leave the existing (loose) definition of scientificName intact, and create more term(s) with more precise meanings, which we feel can help facilitate sharing of information?
Rich