<div dir="ltr">I'd be very happy to work with you on that. </div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Aug 24, 2015 at 10:35 AM, Tim Robertson <span dir="ltr"><<a href="mailto:trobertson@gbif.org" target="_blank">trobertson@gbif.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I’d suggest TDWG hold back on this until the W3C CSV on the web group finish (Feb. 2016).<br>
I submitted DwC-A as a use case which was accepted (<a href="http://w3c.github.io/csvw/use-cases-and-requirements/" rel="noreferrer" target="_blank">http://w3c.github.io/csvw/use-cases-and-requirements/</a>) and have been following the progress.<br>
<br>
As far as I can tell the recommendations from that group will provide one possible future evolution of DwC-A covering tabular formats, encoding, micro syntax, JSON and RDF serialisation and deserialisation, controlled terms, generic models (i.e. not star-schema) etc. It is because of this group that I have held back on updating the DwC text guidelines to address the issues we all know about, as I believe they will be covered there.<br>
<br>
By adopting W3C recommendations / standards, it will allow TDWG to focus on biodiversity specific issues - namely vocabularies and classes / models - and less on serialisation formats.<br>
<br>
I aim to write up / present a proposal on the future of DwC-A built around the recommendations to coincide with the conclusion of the W3C group. It should be a fairly logical progression from where we are today, and backwards compatibility looks doable. I’d be very happy to work with others on that.<br>
<br>
Thanks,<br>
Tim<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On 21 Aug 2015, at 19:02, Alex Thompson <<a href="mailto:godfoder@acis.ufl.edu">godfoder@acis.ufl.edu</a>> wrote:<br>
<br>
> As someone who is normally a big proponent of JSON as a general<br>
> information representation format, I'd have to say that you're likely to<br>
> run into a myriad of issues with this. Chief among those would be that<br>
> JSON doesn't tend to play well with progressive decoding - most JSON<br>
> libraries force you to parse and decode the entire file, often in<br>
> memory, before you have access to any of the information. This works<br>
> fine for things like APIs where the information is generally quite<br>
> small, but for something like a DwC-A, where each item can have 40-50<br>
> properties, and there can be hundreds of thousands of items, it quickly<br>
> gets unmanageable. There are progressive JSON decoding libraries in<br>
> most languages, but it is a hurdle to effective usage, since they<br>
> normally aren't part of the standard library packages.<br>
><br>
> The two major strategies around this are either using some kind of<br>
> hybrid JSON-delimited (normally new line or null byte) format, or<br>
> writing hundreds of thousands of individual JSON files into a zip or tar<br>
> archive directly without first writing them to disk. I've tried both,<br>
> and I don't really like either of them for anything more than a quick hack.<br>
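A minimal sketch of the newline-delimited strategy described above, assuming one JSON object per line. The function name `iter_records` and the sample field names are illustrative only and are not part of any DwC-A specification. Each record is decoded on its own, so the whole file never has to sit in memory:<br>

```python
import io
import json


def iter_records(stream):
    """Yield one decoded record per non-empty line of a newline-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Hypothetical sample data: two occurrence-like records, one per line.
sample = io.StringIO(
    '{"id": "ABC123", "scientificName": "Puma concolor"}\n'
    '{"id": "ABC124", "scientificName": "Lynx rufus"}\n'
)
ids = [record["id"] for record in iter_records(sample)]
```

This only needs the standard `json` module, at the cost that the file as a whole is no longer a single valid JSON document.<br>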
><br>
> In terms of advanced serializations for DwC-A type data, I'd much rather<br>
> see something like DwC-SQLite, or DwC-HDF5 that would start to give us<br>
> some real tools to work with something other than a star schema.<br>
><br>
> - Alex<br>
><br>
> P.S.<br>
> You could always do this:<br>
><br>
> meta.xml:<br>
> <core encoding="utf-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"<br>
> fieldsEnclosedBy="" ignoreHeaderLines="1"<br>
> rowType="<a href="http://rs.tdwg.org/dwc/terms/Occurrence" rel="noreferrer" target="_blank">http://rs.tdwg.org/dwc/terms/Occurrence</a>"><br>
><br>
> occurrence.txt:<br>
> id <tab> dynamicProperties<br>
> ABC123 <tab> {<all of your actual data>}<br>
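A rough sketch of how a consumer might read such a file, assuming a tab-delimited occurrence.txt with one header line and a JSON object in the second column, as in the meta.xml fragment above. The function name `read_occurrences` and the sample property names are hypothetical:<br>

```python
import csv
import io
import json


def read_occurrences(stream):
    """Yield (id, properties) pairs from a tab-delimited stream whose second column holds JSON."""
    # QUOTE_NONE matches fieldsEnclosedBy="": double quotes inside the JSON are ordinary characters.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the single header line (ignoreHeaderLines="1")
    for row in reader:
        if len(row) >= 2:
            yield row[0], json.loads(row[1])


# Hypothetical sample standing in for occurrence.txt.
sample = io.StringIO(
    'id\tdynamicProperties\n'
    'ABC123\t{"scientificName": "Puma concolor", "year": 2015}\n'
)
rows = list(read_occurrences(sample))
```

Note that this assumes the serialised JSON contains no literal tab or newline characters; a standard JSON encoder escapes those inside strings, but pretty-printed output would break the row structure.<br>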
><br>
> On 08/21/2015 12:43 PM, Bob Morris wrote:<br>
>> Is there or should there be a form of DwC-A serialized with JSON? If<br>
>> no, should Interest Group X ( X= ???) initiate some discussion or<br>
>> Task. If IG X is already at work, where is its discussion?<br>
>><br>
>> Alternatively, should my question be something like "What is the JSON<br>
>> alternative to DwC-A and where is it, or should it be, discussed?"<br>
>><br>
>> Thanks<br>
>> Bob<br>
>><br>
><br>
> _______________________________________________<br>
> tdwg-tag mailing list<br>
> <a href="mailto:tdwg-tag@lists.tdwg.org">tdwg-tag@lists.tdwg.org</a><br>
> <a href="http://lists.tdwg.org/mailman/listinfo/tdwg-tag" rel="noreferrer" target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-tag</a><br>
><br>
<br>
</div></div></blockquote></div><br></div>