[tdwg-content] Delimiters for Darwin Core list-type terms

Steve Baskauf steve.baskauf at vanderbilt.edu
Mon Oct 7 15:45:28 CEST 2013


I don't have an opinion about what the recommended delimiter should be, 
but I think it would be beneficial for there to be consistency between 
Darwin Core and Audubon Core.  You can see what the recommendation is 
for Audubon Core at 
http://terms.gbif.org/wiki/Audubon_Core_%281.0_normative%29#Lists_of_plain_text_values 
- it's the pipe "|".  Either Darwin Core should go with this, or if 
there is a consensus reached here that is different, then AC should be 
changed before it is ratified, which potentially could happen in a 
matter of weeks.  It is highly likely that there will be records that 
are a mixture of AC and DwC, so it would not be a good thing for the 
recommendations to differ.

Steve

Markus Döring wrote:
> Hi John et al.,
>
> I would like to see a single recommended default delimiter, preferably the semicolon, as it is natural and hardly ever used in values.
> For dwc archives there is a multiValueDelimiter attribute for every term mapping that allows other delimiters to be declared if needed.
>
> Currently it is hardly possible to detect multiple values in a field. You can test for some commonly used delimiters, but even then you never know whether they were meant as delimiters.
> Having a single default delimiter helps to get the idea of multiple values across and makes it a bit more accessible, I believe.
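
A minimal Python sketch of the idea above: one default delimiter, with an optional per-term override. The names DEFAULT_DELIMITER, TERM_DELIMITERS, and split_term are hypothetical, and the override table is only a stand-in for a per-term declaration such as the multiValueDelimiter attribute mentioned above; it does not read a real archive descriptor.

```python
# Assumption: semicolon is the recommended default, as proposed in this thread.
DEFAULT_DELIMITER = ";"

# Hypothetical per-term overrides, standing in for a declared delimiter.
TERM_DELIMITERS = {
    "associatedMedia": "|",
}


def split_term(term, raw):
    """Split a list-type value using the term's delimiter, or the default."""
    delimiter = TERM_DELIMITERS.get(term, DEFAULT_DELIMITER)
    # Strip whitespace so "a; b" and "a;b" give the same result,
    # and drop empty items left by trailing delimiters.
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


print(split_term("recordedBy", "A. Smith; B. Jones"))
print(split_term("higherClassification", "Animalia;Chordata;Mammalia"))
```

With a single default, a consumer only needs the override table for the terms that deviate from it.
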
>
> I would personally prefer to see dwc:vernacularName as a single-value term, as it is mostly useful in combination with a locale and is rarely shared on its own.
> Having dwc:typeStatus as a multi-value term also feels wrong, as its name is singular while the others already carry their multi-value nature in the name.
>
>
> Markus
>
>
>
> On 07.10.2013, at 12:28, John Wieczorek wrote:
>
>   
>> Dear all,
>>
>> On the list of pending Darwin Core issues is a topic of general
>> concern about terms that could or do recommend the concatenation and
>> delimiting of a list of values. The specific issue was submitted on
>> the Darwin Core Project site at
>> https://code.google.com/p/darwincore/issues/detail?id=168. Right now
>> there is variation in the recommendations of distinct terms.
>>
>> The Darwin Core terms that could be used to hold lists include the
>> following (use the index at
>> http://rs.tdwg.org/dwc/terms/index.htm#theterms to find and see the
>> details of each of these):
>>
>> informationWithheld
>> dataGeneralizations
>> dynamicProperties
>> recordedBy
>> preparations
>> otherCatalogNumbers
>> previousIdentifications
>> associatedMedia
>> associatedReferences
>> associatedOccurrences
>> associatedSequences
>> associatedTaxa
>> higherGeography
>> georeferenceSources
>> typeStatus
>> higherClassification
>> vernacularName
>>
>> There are some issues. Many terms do not show examples. Most of those
>> that do show examples recommend semi-colon (';') -
>> associatedOccurrences, recordedBy, preparations, otherCatalogNumbers,
>> previousIdentifications, higherGeography, georeferenceSources, and
>> higherClassification. The example for higherClassification does not
>> have spaces after the semi-colon while all others do.
>>
>> Terms that could hold a list of URLs would require a delimiter that
>> would be an invalid part of a URL unless it was escaped. This
>> precludes comma (','), semi-colon (';'), and colon (':'), among
>> others. One possibility here might be the vertical bar or "pipe"
>> ('|').
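
The point about URL lists can be illustrated with a short Python sketch. The example URLs and the helper split_urls are made up for illustration; the sketch only assumes that the pipe does not occur unescaped in the URLs being joined.

```python
# Hypothetical URLs that legitimately contain ';' and ','.
urls = [
    "http://example.org/media/1.jpg;size=large",
    "http://example.org/media/2.jpg?tags=a,b",
]


def split_urls(raw):
    """Split a pipe-delimited list of URLs, ignoring surrounding spaces."""
    return [url.strip() for url in raw.split("|") if url.strip()]


# Joining with the pipe round-trips cleanly.
joined = " | ".join(urls)
recovered = split_urls(joined)

# Joining with the semicolon does not: the first URL is cut in two.
naive = ";".join(urls).split(";")

print(recovered == urls)
print(len(urls), len(naive))
```

The semicolon version yields more items than were put in, because the delimiter is also a legal character inside one of the values.
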
>>
>> The term dynamicProperties is meant to take key-value pairs. The
>> examples suggest the format key=value, with any list delimited by a
>> semi-colon, for example, "tragusLengthInMeters=0.014;
>> weightInGrams=120". The example for associatedTaxa also shows a
>> key-value pair ("host: Quercus alba"), but it is formatted differently
>> from the examples for dynamicProperties. There are other terms, such
>> as vernacularName, which could potentially also take a key-value pair,
>> though it is not currently recommended to be a list.
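
As a rough sketch of how a consumer might read such key-value lists, the following Python function parses the dynamicProperties example quoted above. The function name and its parameters are hypothetical; the pair_separator argument is there only because the associatedTaxa example uses a colon instead of an equals sign.

```python
def parse_key_values(raw, delimiter=";", pair_separator="="):
    """Parse a delimited list of key/value pairs into a dict of strings."""
    pairs = {}
    for item in raw.split(delimiter):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing delimiter
        key, found, value = item.partition(pair_separator)
        if not found:
            raise ValueError(f"no {pair_separator!r} in item: {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


print(parse_key_values("tragusLengthInMeters=0.014; weightInGrams=120"))
print(parse_key_values("host: Quercus alba", pair_separator=":"))
```

Values are kept as strings here; interpreting units or numbers is left to the caller. The two calls show why a consistent pair format across terms would let one parser serve both.
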
>>
>> Please ignore the issue of whether the idea of list-type terms is a
>> good idea or not - that is not the issue we're trying to resolve here.
>> Instead, the issue is whether a consistent recommendation can be made
>> for how to delimit the values in a list. And if not a consistent
>> recommendation, can we make specific recommendations for distinct
>> terms? If specific recommendations can be made for a term, should that
>> be reflected in examples within the term definitions, or should such
>> recommendations reside only in Type 3 supplementary documentation such
>> as that which can be found on the Darwin Core Project site at, for
>> example, https://code.google.com/p/darwincore/wiki/Occurrence#associatedSequences?
>> Should some of these terms have specific recommendations to contain
>> only single values (e.g., vernacularName), in which case they are not
>> really viable in Simple Darwin Core?
>>
>> Cheers,
>>
>> John
>> _______________________________________________
>> tdwg-content mailing list
>> tdwg-content at lists.tdwg.org
>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>     
>

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu
