[tdwg-content] Delimiters for Darwin Core list-type terms
John Wieczorek
tuco at berkeley.edu
Mon Oct 7 15:47:14 CEST 2013
I second the motion. Steve, is there any documentation on the logic
behind the choice of delimiter for AC that will help people here?
On Mon, Oct 7, 2013 at 3:45 PM, Steve Baskauf
<steve.baskauf at vanderbilt.edu> wrote:
> I don't have an opinion about what the recommended delimiter should be, but
> I think it would be beneficial for there to be consistency between Darwin
> Core and Audubon Core. You can see what the recommendation is for Audubon
> Core at
> http://terms.gbif.org/wiki/Audubon_Core_%281.0_normative%29#Lists_of_plain_text_values
> - it's the pipe "|". Either Darwin Core should go with this, or if there is
> a consensus reached here that is different, then AC should be changed before
> it is ratified, which potentially could happen in a matter of weeks. It is
> highly likely that there will be records that are a mixture of AC and DwC,
> so it would not be a good thing for the recommendations to differ.
>
> Steve
>
> Markus Döring wrote:
>
> Hi John et al.,
>
> I would like to see a single recommended default delimiter, preferably the
> semicolon, as it is natural and rarely used in values.
> For dwc archives there is a multiValueDelimiter attribute for every term
> mapping that allows other delimiters to be declared if needed.
>
> Currently it is hardly possible to detect multiple values in a field. You
> can test for some commonly used delimiters, but even then you never know
> whether they were meant to be delimiters.
> Having a single default delimiter helps to get the idea of multiple values
> across and makes it a bit more accessible, I believe.
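
A minimal Python sketch of the idea described above: split on one default delimiter unless a term declares its own. The names DEFAULT_DELIMITER, TERM_DELIMITERS and split_values are hypothetical, and the override shown for associatedMedia is only an assumption for illustration, not a recommendation from any specification.

```python
# Hypothetical defaults for illustration; not taken from any Darwin Core document.
DEFAULT_DELIMITER = ";"
TERM_DELIMITERS = {"associatedMedia": "|"}  # assumed per-term override


def split_values(term, raw, overrides=TERM_DELIMITERS):
    """Split a raw field into a list of trimmed, non-empty values.

    The delimiter declared for the term is used if there is one;
    otherwise the single default delimiter applies.
    """
    delimiter = overrides.get(term, DEFAULT_DELIMITER)
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


people = split_values("recordedBy", "Oliver P. Pearson; Anita K. Pearson")
media = split_values(
    "associatedMedia",
    "http://example.org/1.jpg | http://example.org/2.jpg;v=2",
)
```

With a declared delimiter the consumer no longer has to guess whether a semicolon inside a value was meant as a separator.
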
>
> dwc:vernacularName I would personally prefer to see as a single value term
> as it is mostly useful in combination with a locale and rarely is shared on
> its own.
> Seeing dwc:typeStatus as a multi-value term also feels wrong, as the name
> is singular while the others already carry the multi-value nature in their
> names.
>
>
> Markus
>
>
>
> On 07.10.2013, at 12:28, John Wieczorek wrote:
>
>
>
> Dear all,
>
> On the list of pending Darwin Core issues is a topic of general
> concern about terms that could or do recommend the concatenation and
> delimiting of a list of values. The specific issue was submitted on
> the Darwin Core Project site at
> https://code.google.com/p/darwincore/issues/detail?id=168. Right now
> there is variation in the recommendations of distinct terms.
>
> The Darwin Core terms that could be used to hold lists include the
> following (use the index at
> http://rs.tdwg.org/dwc/terms/index.htm#theterms to find and see the
> details of each of these):
>
> informationWithheld
> dataGeneralizations
> dynamicProperties
> recordedBy
> preparations
> otherCatalogNumbers
> previousIdentifications
> associatedMedia
> associatedReferences
> associatedOccurrences
> associatedSequences
> associatedTaxa
> higherGeography
> georeferenceSources
> typeStatus
> higherClassification
> vernacularName
>
> There are some issues. Many terms do not show examples. Most of those
> that do show examples recommend the semi-colon (';'):
> associatedOccurrences, recordedBy, preparations, otherCatalogNumbers,
> previousIdentifications, higherGeography, georeferenceSources, and
> higherClassification. The example for higherClassification does not
> have spaces after the semi-colon, while all the others do.
>
> Terms that could hold a list of URLs would require a delimiter that
> would be an invalid part of a URL unless it was escaped. This
> precludes comma (','), semi-colon (';'), and colon (':'), among
> others. One possibility here might be the vertical bar or "pipe"
> ('|').
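
To illustrate the point about URL lists, here is a small Python sketch that joins and splits URLs on the pipe. The function names and example URLs are made up for illustration. It relies on the pipe not being allowed unescaped in a URI under RFC 3986, whereas comma, semi-colon and colon may appear literally.

```python
from urllib.parse import quote


def join_urls(urls, delimiter="|"):
    """Join URLs into one string, percent-encoding any stray delimiter."""
    encoded = quote(delimiter, safe="")  # "|" becomes "%7C"
    return delimiter.join(url.replace(delimiter, encoded) for url in urls)


def split_urls(value, delimiter="|"):
    """Split a delimited string back into trimmed, non-empty URLs."""
    return [part.strip() for part in value.split(delimiter) if part.strip()]


urls = [
    "http://example.org/media/1.jpg;size=large",
    "http://example.org/search?q=a,b:c",
]
joined = join_urls(urls)
round_trip = split_urls(joined)
```

The semi-colon, comma and colon inside the example URLs survive the round trip untouched, which would not be the case if any of them were the list delimiter.
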
>
> The term dynamicProperties is meant to take key-value pairs. The
> examples suggest the format key=value, with any list delimited by a
> semi-colon, for example, "tragusLengthInMeters=0.014;
> weightInGrams=120". The example for associatedTaxa also shows a
> key-value pair ("host: Quercus alba"), but it is formatted differently
> from the examples for dynamicProperties. There are other terms, such
> as vernacularName, which could potentially also take a key-value pair,
> though it is not currently recommended to be a list.
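
As a rough sketch of how the key-value format in the dynamicProperties example could be read, the Python function below parses a delimited list of pairs into a dictionary. The function name and its parameters are hypothetical. The defaults follow the "key=value" example with a semi-colon between pairs, and values are kept as plain strings.

```python
def parse_key_values(raw, item_delimiter=";", pair_separator="="):
    """Parse a string such as 'key=value; key=value' into a dict of strings."""
    pairs = {}
    for item in raw.split(item_delimiter):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing delimiter
        key, sep, value = item.partition(pair_separator)
        if not sep:
            raise ValueError(f"not a key-value pair: {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


props = parse_key_values("tragusLengthInMeters=0.014; weightInGrams=120")
taxa = parse_key_values("host: Quercus alba", pair_separator=":")
```

The second call shows that the associatedTaxa example needs a different pair separator (':'), which is the formatting difference noted above.
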
>
> Please ignore the question of whether list-type terms are a good idea or
> not - that is not the issue we're trying to resolve here.
> Instead, the issue is whether a consistent recommendation can be made
> for how to delimit the values in a list. And if not a consistent
> recommendation, can we make specific recommendations for distinct
> terms? If specific recommendations can be made for a term, should that
> be reflected in examples within the term definitions, or should such
> recommendations reside only in Type 3 supplementary documentation such
> as that which can be found on the Darwin Core Project site at, for
> example,
> https://code.google.com/p/darwincore/wiki/Occurrence#associatedSequences?
> Should some of these terms have specific recommendations to contain
> only single values (e.g., vernacularName), in which case they are not
> really viable in Simple Darwin Core?
>
> Cheers,
>
> John
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
> --
> Steven J. Baskauf, Ph.D., Senior Lecturer
> Vanderbilt University Dept. of Biological Sciences
>
> postal mail address:
> PMB 351634
> Nashville, TN 37235-1634, U.S.A.
>
> delivery address:
> 2125 Stevenson Center
> 1161 21st Ave., S.
> Nashville, TN 37235
>
> office: 2128 Stevenson Center
> phone: (615) 343-4582, fax: (615) 322-4942
> If you fax, please phone or email so that I will know to look for it.
> http://bioimages.vanderbilt.edu