[tdwg-content] Darwin Core vernacularName field

Bob Morris morris.bob at gmail.com
Thu Jul 21 18:13:06 CEST 2011


There's a general issue with repeated attributes in a metadata record
of any kind.  Depending on the representation language, when an
attribute occurs more than once in the record, it can be difficult to
specify any linkages between the occurrences when they are
semantically related.

One general solution is to have multiple metadata records for the same
resource. This can be costly if there is a powerful reason that every
such record should carry the complete set of attributes except for the
repeated ones, but in the case you put on the table, I think the only
powerful reason would take the form "There are a lot of stupid DwC
applications out there that might discover a record that has nothing
in it but, say, the French vernacular name and a resourceID, and stop
there without ever looking for/at another record with the same
resourceID and more comprehensive metadata, and integrating the
results at the application level."

A response might be "But the point of simple DwC is to support simple
applications." But "simple application" is not the same thing as
"simple minded application", and my guess is that addressing the issue
of multiple metadata records at the application side is, for many
applications, less programming effort than other workarounds.
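
To make the application-side integration concrete, here is a minimal
sketch in Python. It assumes flat records that share a resourceID, and
it assumes a "language" field that ties each vernacular name to its
language; that field name, the example identifier, and the integrate()
helper are all made up for illustration and are not anything simple
DwC prescribes.

```python
# Fields that together form one vernacular-name assertion and must stay linked.
# "language" is an assumed field name, used here only for illustration.
NAME_FIELDS = ("vernacularName", "language")


def integrate(records, key="resourceID"):
    """Group flat records by `key`, keeping each name tied to its language."""
    merged = {}
    for rec in records:
        target = merged.setdefault(rec[key], {"vernacularNames": {}})
        name = rec.get("vernacularName")
        if name is not None:
            # "und" is the standard language code for "undetermined".
            lang = rec.get("language", "und")
            target["vernacularNames"].setdefault(lang, []).append(name)
        # Copy the remaining attributes; the first value seen for a field wins.
        for field, value in rec.items():
            if field != key and field not in NAME_FIELDS:
                target.setdefault(field, value)
    return merged


# Hypothetical records: one with general metadata, the others name-only.
records = [
    {"resourceID": "urn:example:specimen:1", "institutionCode": "UNB"},
    {"resourceID": "urn:example:specimen:1",
     "vernacularName": "Chives", "language": "en"},
    {"resourceID": "urn:example:specimen:1",
     "vernacularName": "Ciboulette", "language": "fr"},
    {"resourceID": "urn:example:specimen:1",
     "vernacularName": "brulotte", "language": "fr"},
]

specimen = integrate(records)["urn:example:specimen:1"]
english_only = specimen["vernacularNames"].get("en", [])
```

A consumer that wants only the English names, as in the case described
in the quoted message, can read the "en" entry and ignore the rest
without having to pick apart a concatenated string.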


Bob Morris


On Thu, Jul 21, 2011 at 11:23 AM, Geoffrey Allen <gsallen at unb.ca> wrote:
> Greetings,
> I have recently begun the process of digitising the 60,000 specimen vouchers
> from the UNB herbarium. The textual data for 40,000+ of those has already
> been entered into a database, and I am now trying to map those values to DwC
> so that we may share the data with other collections.
> I have some concern over the fact that simple DwC does not allow the
> repetition or extension of certain fields. The vernacularName field is a
> particular problem. New Brunswick is Canada's only officially bilingual
> province; as such, our specimens are all identified with both their English
> and French common names in the database. It would be very useful if we could
> extend DwC, creating something along the lines of <vernacularName lang=en>,
> or allow nesting of elements, perhaps in the form:
> <vernacularName>
> <English>Chives</English>
> <French>Ciboulette, brulotte</French>
> </vernacularName>
> The other option, as I see it, is that we store the English and French
> common names in our own fields, and then concatenate the two to create the
> DwC:vernacularName field. I see this option as less than ideal since it may
> hinder search/browsability. It may also cause a host of other problems from
> interpreting to storing the data. The herbarium with whom we first intend to
> share the data has already expressed a concern that their system cannot
> handle the diacritics found in many of the French names (!). They would like
> the Eng. common names, but not the French. This is more difficult to achieve
> if we concat the values.
> One additional thought is that the herbarium's imprint, _Flora of New
> Brunswick_, also includes common names in Maliseet and Mi'kmaq wherever
> possible. Although these two aboriginal languages do not currently exist in
> the dataset we are using, there is the potential that they may be added at
> some point in the future.
> It seems to me that the repetition of fields may be necessary in other
> instances too. I am having some difficulty figuring out how to record all
> the location data we have for the specimens, which are indicated using
> verbal descriptions, Lat/Long, UTM, and NTS coordinates - in many cases
> using all 4 for a single sample, but I will save the details for another
> posting.
> I will watch for the group's thoughts on this problem.
> Many thanks,
> Geoffrey
> --------------------------------------------
> Geoffrey Allen
> Digital Projects Librarian
> Electronic Text Centre
> Harriet Irving Library
> University of New Brunswick
> Fredericton, NB  E3B 5H5
> Tel: (506) 447-3250
> Fax: (506) 453-4595
> gsallen at unb.ca
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
>



-- 
Robert A. Morris

Emeritus Professor of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
IT Staff
Filtered Push Project
Department of Organismal and Evolutionary Biology
Harvard University


email: morris.bob at gmail.com
web: http://efg.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1) 857 222 7992 (mobile)

