<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><blockquote type="cite"><div>I kind of expected it was futile to make the plea "Please ignore the<br>issue of whether the idea of list-type terms is a<br>good idea or not - that is not the issue we're trying to resolve<br>here." I had to try.<br></div></blockquote><div><div><br></div><div>Sorry John, I fell into your trap.</div><div><br></div></div><div>Surely though, talking about solutions to problems rather than the problem itself is the _worst_ option available to us?</div><div><br></div><div>Consider the likes of this list term <a href="http://rs.tdwg.org/dwc/terms/#typeStatus">http://rs.tdwg.org/dwc/terms/#typeStatus</a></div><div>The description suggests a separated and concatenated list, but the example (unless I misunderstand) shows only one list item, which is a triplet of "type + author + pub" in a human-readable form. This one field actually suggests a structure of repeatable triplets, so we would need two delimiters (one between triplets and one within each triplet) if machines are to extract the scientific name for the typification. Perhaps these terms are really just verbatim text blocks intended for human consumption (which is fine with me, and then we don't need to define delimiters)? Or perhaps we should be discussing terms to atomize them further (e.g. introduce dwc:typeName and dwc:typePublication)?</div><div><br></div><div>If we are heading this way though, can I also suggest we consider declaring the expected ordering of lists where it is omitted? The likes of <a href="http://rs.tdwg.org/dwc/terms/#higherGeography">http://rs.tdwg.org/dwc/terms/#higherGeography</a> doesn't have one whereas <a href="http://rs.tdwg.org/dwc/terms/#higherGeography">http://rs.tdwg.org/dwc/terms/#higherGeography</a> does.</div><div><br></div><br><blockquote type="cite"><div>There are definitely more rigorous ways to share the information than<br>in concatenated lists. 
The "list" terms are just as Tim describes, an<br>attempt to share in a flat data structure data that do not fit well in<br>a flat structure, but are nevertheless of common interest. There<br>probably shouldn't be an expectation that one could process the<br>content of such fields and derive individual values, as we can't even<br>get simple content under control yet (see<br><a href="http://soyouthinkyoucandigitize.wordpress.com/2013/07/18/data-diversity-of-the-week-sex/">http://soyouthinkyoucandigitize.wordpress.com/2013/07/18/data-diversity-of-the-week-sex/</a>).<br><br>Nevertheless, these terms do exist, and they expect lists, and people<br>are using them in distinct ways that make them a challenge to process.<br>It would be nice to give guidance. I have no problem if that guidance<br>stays out of the term definitions, but we have a legacy problem of<br>definitions that tell us that the content should consist of a<br>delimited list.<br></div></blockquote><div><br></div><div>Other than deprecating and redefining as new concepts (terms), I don't see any robust way I am afraid. 
Some things are just not meant to be denormalized.</div><div><br></div><div>Cheers,</div><div>Tim</div><div><br></div><div><br></div><div><br></div><br><blockquote type="cite"><div><br>On Mon, Oct 7, 2013 at 4:02 PM, Tim Robertson [GBIF]<br>&lt;<a href="mailto:trobertson@gbif.org">trobertson@gbif.org</a>&gt; wrote:<br><blockquote type="cite">I suspect any attempt to find a universal delimiter will be flakey at best -<br></blockquote><blockquote type="cite">see Unicode character 1 as an example [1].<br></blockquote><blockquote type="cite">I would urge DwC to stop at only defining the concept of each term and leave<br></blockquote><blockquote type="cite">it to the serialization formats, schema definitions, data models etc (e.g.<br></blockquote><blockquote type="cite">DwC-A, XML, RDF, JSON, HTML, excel templates etc) to define those kinds of<br></blockquote><blockquote type="cite">things.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">If you were to design an XML schema you would use things like:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">&lt;tim:identifications&gt;<br></blockquote><blockquote type="cite"> &lt;dwc:scientificName&gt;A&lt;/dwc:scientificName&gt;<br></blockquote><blockquote type="cite"> &lt;dwc:scientificName&gt;B&lt;/dwc:scientificName&gt;<br></blockquote><blockquote type="cite"> &lt;dwc:scientificName&gt;C&lt;/dwc:scientificName&gt;<br></blockquote><blockquote type="cite">&lt;/tim:identifications&gt;<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">and not:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">&lt;tim:identifications&gt;<br></blockquote><blockquote type="cite"> &lt;dwc:scientificName&gt;A|B|C&lt;/dwc:scientificName&gt;<br></blockquote><blockquote type="cite">&lt;/tim:identifications&gt;<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I don't think it wise for the DwC standard to suggest anyone should.<br></blockquote><blockquote 
type="cite"><br></blockquote><blockquote type="cite">I suspect this request stems from those working with denormalized data<br></blockquote><blockquote type="cite">structures, and trying to shoe-horn all data into flat structures (e.g.<br></blockquote><blockquote type="cite">DwC-A). I think that is a dangerous path to go down, and makes things more<br></blockquote><blockquote type="cite">difficult for both producers and consumers. Very quickly you will get into<br></blockquote><blockquote type="cite">the situation where you will want to also suggest "well the element at index<br></blockquote><blockquote type="cite">[0] of field X should be interpreted as the index [0] for field Y" (e.g.<br></blockquote><blockquote type="cite">identifications and identification dates).<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Cheers,<br></blockquote><blockquote type="cite">Tim<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">[1] <a href="http://www.fileformat.info/info/unicode/char/1f/index.htm">http://www.fileformat.info/info/unicode/char/1f/index.htm</a><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">On Oct 7, 2013, at 3:45 PM, Steve Baskauf wrote:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I don't have an opinion about what the recommended delimiter should be, but<br></blockquote><blockquote type="cite">I think it would be beneficial for there to be consistency between Darwin<br></blockquote><blockquote type="cite">Core and Audubon Core. 
You can see what the recommendation is for Audubon<br></blockquote><blockquote type="cite">Core at<br></blockquote><blockquote type="cite"><a href="http://terms.gbif.org/wiki/Audubon_Core_%281.0_normative%29#Lists_of_plain_text_values">http://terms.gbif.org/wiki/Audubon_Core_%281.0_normative%29#Lists_of_plain_text_values</a><br></blockquote><blockquote type="cite">- it's the pipe "|". Either Darwin Core should go with this, or if there is<br></blockquote><blockquote type="cite">a consensus reached here that is different, then AC should be changed before<br></blockquote><blockquote type="cite">it is ratified, which potentially could happen in a matter of weeks. It is<br></blockquote><blockquote type="cite">highly likely that there will be records that are a mixture of AC and DwC,<br></blockquote><blockquote type="cite">so it would not be a good thing for the recommendations to differ.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Steve<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Markus Döring wrote:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Hi John et al.,<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">I would like to see a single recommended default delimiter, preferably the<br></blockquote><blockquote type="cite">semicolon as it's natural and hardly used in values.<br></blockquote><blockquote type="cite">For dwc archives there is a multiValueDelimiter attribute for every term<br></blockquote><blockquote type="cite">mapping that allows other delimiters to be declared if needed.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Currently it is hardly possible to detect multi values in a field and you<br></blockquote><blockquote type="cite">can just test for some often used ones but even then you never know if they<br></blockquote><blockquote type="cite">were meant to be 
delimiters.<br></blockquote><blockquote type="cite">Having a single default value helps to get the idea of multi values across<br></blockquote><blockquote type="cite">and makes it a bit more accessible I believe.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">dwc:vernacularName I would personally prefer to see as a single value term<br></blockquote><blockquote type="cite">as it is mostly useful in combination with a locale and rarely is shared on<br></blockquote><blockquote type="cite">its own.<br></blockquote><blockquote type="cite">Seeing dwc:typeStatus being a multi value term also feels wrong as the name<br></blockquote><blockquote type="cite">is in singular while the others carry the multi value nature in the name<br></blockquote><blockquote type="cite">already.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Markus<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">On 07.10.2013, at 12:28, John Wieczorek wrote:<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Dear all,<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">On the list of pending Darwin Core issues is a topic of general<br></blockquote><blockquote type="cite">concern about terms that could or do recommend the concatenation and<br></blockquote><blockquote type="cite">delimiting of a list of values. The specific issue was submitted on<br></blockquote><blockquote type="cite">the Darwin Core Project site at<br></blockquote><blockquote type="cite"><a href="https://code.google.com/p/darwincore/issues/detail?id=168">https://code.google.com/p/darwincore/issues/detail?id=168</a>. 
Right now<br></blockquote><blockquote type="cite">there is variation in the recommendations of distinct terms.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">The Darwin Core terms that could be used to hold lists include the<br></blockquote><blockquote type="cite">following (use the index at<br></blockquote><blockquote type="cite"><a href="http://rs.tdwg.org/dwc/terms/index.htm#theterms">http://rs.tdwg.org/dwc/terms/index.htm#theterms</a> to find and see the<br></blockquote><blockquote type="cite">details of each of these):<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">informationWithheld<br></blockquote><blockquote type="cite">dataGeneralizations<br></blockquote><blockquote type="cite">dynamicProperties<br></blockquote><blockquote type="cite">recordedBy<br></blockquote><blockquote type="cite">preparations<br></blockquote><blockquote type="cite">otherCatalogNumbers<br></blockquote><blockquote type="cite">previousIdentifications<br></blockquote><blockquote type="cite">associatedMedia<br></blockquote><blockquote type="cite">associatedReferences<br></blockquote><blockquote type="cite">associatedOccurrences<br></blockquote><blockquote type="cite">associatedSequences<br></blockquote><blockquote type="cite">associatedTaxa<br></blockquote><blockquote type="cite">higherGeography<br></blockquote><blockquote type="cite">georeferenceSources<br></blockquote><blockquote type="cite">typeStatus<br></blockquote><blockquote type="cite">higherClassification<br></blockquote><blockquote type="cite">vernacularName<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">There are some issues. Many terms do not show examples. 
Most of those<br></blockquote><blockquote type="cite">that do show examples recommend a semi-colon (';') -<br></blockquote><blockquote type="cite">associatedOccurrences, recordedBy, preparations, otherCatalogNumbers,<br></blockquote><blockquote type="cite">previousIdentifications, higherGeography, georeferenceSources, and<br></blockquote><blockquote type="cite">higherClassification. The example for higherClassification does not<br></blockquote><blockquote type="cite">have spaces after the semi-colon while all others do.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Terms that could hold a list of URLs would require a delimiter that<br></blockquote><blockquote type="cite">would be an invalid part of a URL unless it was escaped. This<br></blockquote><blockquote type="cite">precludes comma (','), semi-colon (';'), and colon (':'), among<br></blockquote><blockquote type="cite">others. One possibility here might be the vertical bar or "pipe"<br></blockquote><blockquote type="cite">('|').<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">The term dynamicProperties is meant to take key-value pairs. The<br></blockquote><blockquote type="cite">examples suggest the format key=value, with any list delimited by a<br></blockquote><blockquote type="cite">semi-colon, for example, "tragusLengthInMeters=0.014;<br></blockquote><blockquote type="cite">weightInGrams=120". The example for associatedTaxa also shows a<br></blockquote><blockquote type="cite">key-value pair ("host: Quercus alba"), but it is formatted differently<br></blockquote><blockquote type="cite">from the examples for dynamicProperties. 
There are other terms, such<br></blockquote><blockquote type="cite">as vernacularName, which could potentially also take a key-value pair,<br></blockquote><blockquote type="cite">though it is not currently recommended to be a list.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Please ignore the issue of whether the idea of list-type terms is a<br></blockquote><blockquote type="cite">good idea or not - that is not the issue we're trying to resolve here.<br></blockquote><blockquote type="cite">Instead, the issue is whether a consistent recommendation can be made<br></blockquote><blockquote type="cite">for how to delimit the values in a list. And if not a consistent<br></blockquote><blockquote type="cite">recommendation, can we make specific recommendations for distinct<br></blockquote><blockquote type="cite">terms? If specific recommendations can be made for a term, should that<br></blockquote><blockquote type="cite">be reflected in examples within the term definitions, or should such<br></blockquote><blockquote type="cite">recommendations reside only in Type 3 supplementary documentation such<br></blockquote><blockquote type="cite">as that which can be found on the Darwin Core Project site at, for<br></blockquote><blockquote type="cite">example,<br></blockquote><blockquote type="cite"><a href="https://code.google.com/p/darwincore/wiki/Occurrence#associatedSequences?">https://code.google.com/p/darwincore/wiki/Occurrence#associatedSequences?</a><br></blockquote><blockquote type="cite">Should some of these terms have specific recommendations to contain<br></blockquote><blockquote type="cite">only single values (e.g., vernacularName), in which case they are not<br></blockquote><blockquote type="cite">really viable in Simple Darwin Core?<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">Cheers,<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">John<br></blockquote><blockquote 
type="cite">_______________________________________________<br></blockquote><blockquote type="cite">tdwg-content mailing list<br></blockquote><blockquote type="cite"><a href="mailto:tdwg-content@lists.tdwg.org">tdwg-content@lists.tdwg.org</a><br></blockquote><blockquote type="cite"><a href="http://lists.tdwg.org/mailman/listinfo/tdwg-content">http://lists.tdwg.org/mailman/listinfo/tdwg-content</a><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">--<br></blockquote><blockquote type="cite">Steven J. Baskauf, Ph.D., Senior Lecturer<br></blockquote><blockquote type="cite">Vanderbilt University Dept. 
of Biological Sciences<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">postal mail address:<br></blockquote><blockquote type="cite">PMB 351634<br></blockquote><blockquote type="cite">Nashville, TN 37235-1634, U.S.A.<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">delivery address:<br></blockquote><blockquote type="cite">2125 Stevenson Center<br></blockquote><blockquote type="cite">1161 21st Ave., S.<br></blockquote><blockquote type="cite">Nashville, TN 37235<br></blockquote><blockquote type="cite"><br></blockquote><blockquote type="cite">office: 2128 Stevenson Center<br></blockquote><blockquote type="cite">phone: (615) 343-4582, fax: (615) 322-4942<br></blockquote><blockquote type="cite">If you fax, please phone or email so that I will know to look for it.<br></blockquote><blockquote type="cite"><a href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a><br></blockquote><blockquote type="cite"><br></blockquote><br></div></blockquote></div><br></body></html>