I feel somewhat confused by this. The actual situation looks
eminently simple to me:
1) In a database there sits a text string (in any of a bewildering
number of shapes). The straightforward thing would be to call
this something like
NameInDatabase: Dalbergia baroni Baker
or
VerbatimNameInDatabase: Dalbergia baroni Baker
2) There also is the scientific name, as governed by the relevant
nomenclatural Code. The straightforward thing would be to call this
ScientificName: Dalbergia baronii
3) Finally, there is whatever intermediate result the algorithms that
are being used are converting the harvested text string into. I suppose
I would call this something like
ParserResult: Dalbergia baroni
or
ParserResultForScientificName: Dalbergia baroni
Obviously, it would be nice if algorithms did exist which could convert
a text string into a scientific name, but this still lies in the future.
In the meantime, I do not see why all this complexity needs to come in.
Paul van Rijckevorsel
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org on behalf of David Remsen (GBIF)
Sent: Wed 8-12-2010 17:09
To: tdwg-content@lists.tdwg.org List
Subject: [tdwg-content] proposed term: dwc:verbatimScientificName
Markus and I wanted to try to consolidate the issues related to the
current use and definition of scientificName that have been the focus
of last week's discussion in as simple a way as we can and leave it
with a simple proposal which we will add to the issue tracking on the
project site.
1. We propose that a new term, dwc:verbatimScientificName, carry the
existing definition for dwc:scientificName, and
2. that dwc:scientificName follow the more accepted convention that is
better represented by the earlier proposed definition for Canonical Name.
The intention is to enable data publishers to distinguish unparsed,
complex scientific names from more cleanly separated scientific name
data. This will relieve consumers of these data from testing each
instance of a name for one of these two conditions.
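To illustrate the kind of per-name test a consumer currently has to
perform, here is a minimal sketch in Python. The pattern and the function
name looks_canonical are assumptions made for this illustration only. It
is a crude shape check, not a name parser, and it will misjudge hybrids,
rank markers, subgenera and similar cases.

```python
import re

# Hypothetical heuristic: a capitalised genus followed by at most two
# lowercase epithets. Anything else is treated as "not canonical".
CANONICAL_SHAPE = re.compile(r"^[A-Z][a-z]+(?: [a-z][a-z-]*){0,2}$")


def looks_canonical(name: str) -> bool:
    """Return True if the string has the shape of a bare canonical name."""
    return bool(CANONICAL_SHAPE.match(name.strip()))


if __name__ == "__main__":
    for value in ("Dalbergia baroni", "Dalbergia baroni Baker"):
        print(value, "->", looks_canonical(value))
```

With two separate terms, a consumer would not need a guess like this for
every record.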
Here are the definitions for the two existing terms that have been
part of the discussion:
dwc:scientificName - The full scientific name, with authorship and
date information if known. When forming part of an Identification,
this should be the name in lowest level taxonomic rank that can be
determined. This term should not contain identification
qualifications, which should instead be supplied in the
IdentificationQualifier term.
dwc:scientificNameAuthorship - The authorship information for the
scientificName formatted according to the conventions of the
applicable nomenclaturalCode.
Here are terms and definitions used in the following 5 source data
configurations we came up with. They don't have to be exact for this
purpose.
canonical name - The nomenclatural components of a scientific name
without authorship information.
authorship - the authorship information that follows a scientific name
verbatim name - the verbatim text stored in a source database when it
differs from or combines the two definitions above. This is a bit
broader than the definition for scientificName.
We identified the following configurations in a source database and
how they would be mapped to the existing terms. In cases 4 and 5 we
also propose how we would map these were there a 3rd available term
(called 'mapping b:')
When a source database contains:
1. canonical names only
Mapping: canonical name -> dwc:scientificName
2. canonical name and authorship in two fields
Mapping: canonical name -> dwc:scientificName /
authorship -> dwc:scientificNameAuthorship
3. verbatim name only
Mapping: verbatim name -> dwc:scientificName
4. all three: canonical name, authorship, and verbatim name in three
different columns
Mapping a: verbatim name -> dwc:scientificName /
authorship -> dwc:scientificNameAuthorship
Mapping b: canonical name -> dwc:scientificName /
authorship -> dwc:scientificNameAuthorship /
verbatim name -> dwc:verbatimScientificName
5. a mix of canonical and verbatim names in a single column
Mapping a: verbatim name + canonical names -> dwc:scientificName
Mapping b: verbatim name + canonical names ->
dwc:verbatimScientificName
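The configurations above can be sketched as a small mapping function in
Python. This is only an illustration under assumptions: the function
names, the keyword arguments and the use_proposed_term switch are
hypothetical, and dwc:verbatimScientificName is the proposed term, not an
existing one. Case 3 follows the mapping exactly as listed above.

```python
from typing import Dict, Optional


def map_record(canonical: Optional[str] = None,
               authorship: Optional[str] = None,
               verbatim: Optional[str] = None,
               use_proposed_term: bool = False) -> Dict[str, str]:
    """Map one source record to Darwin Core terms (cases 1 to 4)."""
    out: Dict[str, str] = {}
    if canonical and verbatim:
        if use_proposed_term:
            # Case 4, mapping b: both names kept in separate terms.
            out["dwc:scientificName"] = canonical
            out["dwc:verbatimScientificName"] = verbatim
        else:
            # Case 4, mapping a: only the verbatim name is published.
            out["dwc:scientificName"] = verbatim
    elif canonical:
        # Cases 1 and 2.
        out["dwc:scientificName"] = canonical
    elif verbatim:
        # Case 3.
        out["dwc:scientificName"] = verbatim
    if authorship:
        out["dwc:scientificNameAuthorship"] = authorship
    return out


def map_mixed_column(value: str, use_proposed_term: bool = False) -> Dict[str, str]:
    """Case 5: a single column mixing canonical and verbatim names."""
    term = "dwc:verbatimScientificName" if use_proposed_term else "dwc:scientificName"
    return {term: value}


if __name__ == "__main__":
    print(map_record(canonical="Dalbergia baroni", authorship="Baker",
                     verbatim="Dalbergia baroni Baker",
                     use_proposed_term=True))
```

In this sketch the difference between mapping a and mapping b shows up
only in cases 4 and 5, which are the two cases where the proposed term
changes what a consumer receives.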
Summary - with the current two terms we are left with no choice but to
support both canonical and verbatim names in a single term, which
makes consuming these data difficult.
We propose that a new term, dwc:verbatimScientificName, carry the
existing definition for dwc:scientificName and that dwc:scientificName
follow the more accepted convention that is better represented by the
definition for Canonical Name.
Best,
David Remsen / Markus Döring