[tdwg-tag] DwC-A and JSON

John Wieczorek tuco at berkeley.edu
Mon Aug 24 10:40:58 CEST 2015


I'd be very happy to work with you on that.

On Mon, Aug 24, 2015 at 10:35 AM, Tim Robertson <trobertson at gbif.org> wrote:

> I’d suggest TDWG hold back on this until the W3C CSV on the Web Working
> Group finishes (Feb. 2016).
> I submitted DwC-A as a use case which was accepted (
> http://w3c.github.io/csvw/use-cases-and-requirements/) and have been
> following the progress.
>
> As far as I can tell the recommendations from that group will provide one
> possible future evolution of DwC-A covering tabular formats, encoding,
> microsyntax, JSON and RDF serialisation and deserialisation, controlled
> terms, generic models (i.e. not star-schema) etc.  It is because of this
> group that I have held back on updating the DwC text guidelines to address
> the issues we all know about, as I believe they will be covered there.
>
> By adopting W3C recommendations / standards, it will allow TDWG to focus
> on biodiversity specific issues - namely vocabularies and classes / models
> - and less on serialisation formats.
>
> I aim to write up / present a proposal on the future of DwC-A built around
> the recommendations to coincide with the conclusion of the W3C group.  It
> should be a fairly logical progression from where we are today, and
> backwards compatibility looks doable.   I’d be very happy to work with
> others on that.
>
> Thanks,
> Tim
>
>
> On 21 Aug 2015, at 19:02, Alex Thompson <godfoder at acis.ufl.edu> wrote:
>
> > As someone who is normally a big proponent of JSON as a general
> > information representation format, I'd have to say that you're likely to
> > run into a myriad of issues with this. Chief among those would be that
> > JSON doesn't tend to play well with progressive decoding - most JSON
> > libraries force you to parse and decode the entire file, often in
> > memory, before you have access to any of the information. This works
> > fine for things like APIs where the information is generally quite
> > small, but for something like a DwC-A, where each item can have 40-50
> > properties, and there can be hundreds of thousands of items, it quickly
> > gets unmanageable. There are progressive JSON decoding libraries in
> > most languages, but it is a hurdle to effective usage, since they
> > normally aren't part of the standard library packages.
> >
> > The two major strategies around this are either using some kind of
> > hybrid JSON-delimited (normally new line or null byte) format, or
> > writing hundreds of thousands of individual JSON files into a zip or tar
> > archive directly without first writing them to disk. I've tried both,
> > and I don't really like either of them for anything more than a quick
> > hack.
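
The two strategies Alex describes can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not anything specified for DwC-A: the function names, the `id` key, the member naming scheme and the sample records are all assumptions made up for the example.

```python
import io
import json
import zipfile


def write_ndjson(records, stream):
    """Strategy 1: write one JSON object per line (newline-delimited JSON)."""
    for record in records:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        stream.write(json.dumps(record) + "\n")


def read_ndjson(stream):
    """Yield records one at a time instead of decoding the whole file at once."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_json_zip(records, fileobj, id_key="id"):
    """Strategy 2: write each record as its own zip member, never touching disk."""
    with zipfile.ZipFile(fileobj, "w", zipfile.ZIP_DEFLATED) as archive:
        for record in records:
            # Hypothetical naming scheme: <id>.json
            archive.writestr(f"{record[id_key]}.json", json.dumps(record))


# Made-up sample records, for illustration only.
records = [
    {"id": "ABC123", "scientificName": "Puma concolor"},
    {"id": "ABC124", "scientificName": "Lynx rufus"},
]

buffer = io.StringIO()
write_ndjson(records, buffer)
buffer.seek(0)
decoded = list(read_ndjson(buffer))

zip_buffer = io.BytesIO()
write_json_zip(records, zip_buffer)
with zipfile.ZipFile(zip_buffer) as archive:
    member_names = archive.namelist()
```

Either way the reader only ever holds one record in memory, which is the property a single large JSON document lacks with most stock parsers.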
> >
> > In terms of advanced serializations for DwC-A type data, I'd much rather
> > see something like DwC-SQLite, or DwC-HDF5 that would start to give us
> > some real tools to work with something other than a star schema.
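
To make the "something other than a star schema" point concrete, here is a rough sketch of what a SQLite-based layout might look like. No DwC-SQLite format is defined anywhere in this thread, so the table layout is purely hypothetical; the column names borrow existing Darwin Core and Dublin Core term names, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: identification_reference hangs off identification,
# not off the core table, which a star schema cannot express.
conn.executescript("""
CREATE TABLE occurrence (
    occurrenceID   TEXT PRIMARY KEY,
    scientificName TEXT
);
CREATE TABLE identification (
    identificationID TEXT PRIMARY KEY,
    occurrenceID     TEXT REFERENCES occurrence(occurrenceID),
    identifiedBy     TEXT
);
CREATE TABLE identification_reference (
    identificationID      TEXT REFERENCES identification(identificationID),
    bibliographicCitation TEXT
);
""")

# Made-up sample rows, for illustration only.
conn.execute("INSERT INTO occurrence VALUES (?, ?)", ("ABC123", "Puma concolor"))
conn.execute("INSERT INTO identification VALUES (?, ?, ?)",
             ("ID1", "ABC123", "A. Example"))
conn.execute("INSERT INTO identification_reference VALUES (?, ?)",
             ("ID1", "Example 2015"))

# Walk two levels away from the core record with ordinary joins.
rows = conn.execute("""
SELECT o.occurrenceID, i.identifiedBy, r.bibliographicCitation
FROM occurrence o
JOIN identification i ON i.occurrenceID = o.occurrenceID
JOIN identification_reference r ON r.identificationID = i.identificationID
""").fetchall()
```

The point of the sketch is only that a relational container gives indexed, partial access to nested relationships without parsing the whole dataset.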
> >
> > - Alex
> >
> > P.S.
> > You could always do this:
> >
> > meta.xml:
> > <core encoding="utf-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"
> > fieldsEnclosedBy="" ignoreHeaderLines="1"
> > rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
> >
> > occurrence.txt:
> > id <tab> dynamicProperties
> > ABC123 <tab> {<all of your actual data>}
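
Reading the workaround from the P.S. back out could look something like the sketch below. It assumes the tab-delimited, one-header-line layout declared in the meta.xml fragment and that dynamicProperties holds a single-line JSON object; the sample payload and the `read_occurrences` helper are invented for illustration.

```python
import csv
import io
import json

# Made-up file content standing in for occurrence.txt.
occurrence_txt = (
    "id\tdynamicProperties\n"
    'ABC123\t{"scientificName": "Puma concolor", '
    '"measurements": [{"type": "weight", "value": 52}]}\n'
)


def read_occurrences(stream):
    """Yield (id, decoded dynamicProperties) pairs one row at a time."""
    # QUOTE_NONE mirrors fieldsEnclosedBy="": the double quotes inside the
    # JSON are passed through untouched rather than treated as field quoting.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the header, as ignoreHeaderLines="1" declares
    for row_id, dynamic_properties in reader:
        yield row_id, json.loads(dynamic_properties)


rows = list(read_occurrences(io.StringIO(occurrence_txt)))
```

This only works if the JSON is serialized without literal tabs or newlines, which standard encoders guarantee by escaping them inside strings.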
> >
> > On 08/21/2015 12:43 PM, Bob Morris wrote:
> >> Is there or should there be a form of DwC-A serialized with JSON? If
> >> no, should Interest Group X ( X= ???) initiate some discussion or
> >> Task. If IG X is already at work, where is its discussion?
> >>
> >> Alternatively, should my question be something like "What is the JSON
> >> alternative to DwC-A and where is it, or should it be, discussed?"
> >>
> >> Thanks
> >> Bob
> >>
> >
> > _______________________________________________
> > tdwg-tag mailing list
> > tdwg-tag at lists.tdwg.org
> > http://lists.tdwg.org/mailman/listinfo/tdwg-tag
> >
>