<div dir="ltr">Hello,<div>Resending the same message due to a subscription problem.</div><div class="gmail_extra"><br></div><div class="gmail_extra">--<br clear="all"><div><div class="gmail_signature"><div dir="ltr">Menashè<br><div><br></div></div></div></div>
<br><div class="gmail_quote">2015-10-29 12:15 GMT+01:00 Menashe' Eliezer <span dir="ltr"><<a href="mailto:menashe.eliezer@gmail.com" target="_blank">menashe.eliezer@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello,<div>Please see my updated suggestion at <a href="https://github.com/gbif/ipt/issues/1165" target="_blank">https://github.com/gbif/ipt/issues/1165</a></div><div>IMHO Open Refine is not the right tool. One can simply use org.apache.poi in a Java application to read all the information from the different files inside the DwC-A, and create an ODS file with the combined matrix, which also takes any parentEventID into account. I'm sorry I don't have time to do it myself.</div><div class="gmail_extra">I hope it's clear.</div><div class="gmail_extra">--<br clear="all"><div><div><div dir="ltr">Menashè<br><div><br></div></div></div></div>
<br><div class="gmail_quote"><div><div class="h5">2015-10-28 18:57 GMT+01:00 Shorthouse, David <span dir="ltr"><<a href="mailto:david.shorthouse@umontreal.ca" target="_blank">david.shorthouse@umontreal.ca</a>></span>:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div dir="ltr">All,<div><br></div><div>Is part of the issue being expressed here that the raw ecological data sets we're discussing are small-ish matrices rather than occurrences, with site codes as columns, taxa as rows and measures of density/abundance as cells (and similar for environmental variables)? Such structures are often used as input for software that executes e.g. ordinations, classification & regression trees, species richness estimates. The shortcoming of such a structure is the inherently idiosyncratic nature of "site codes", with variable numbers of them, i.e. an arbitrary number of columns. I doubt it was ever designed for ease of dataset integration, but rather for ease of computation. Representing this structure as an Event core requires significant transposition, with potential for error if done manually. Open Refine is one tool that could permit bi-directional transpositions (DwC -> matrix and then matrix -> DwC), but it is still clunky and accommodation of extensions is virtually non-existent. But perhaps Open Refine recipes and guides get us one step closer to finding a balance between the need for standardized representation & efficient transport (DwC) vs. end-users who want matrices for ease of computation.</div><div><br></div><div>David P. 
Shorthouse</div><div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Oct 27, 2015 at 7:36 AM, David Valentim Dias <span dir="ltr"><<a href="mailto:dvdias@sibbr.gov.br" target="_blank">dvdias@sibbr.gov.br</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div>Hi again,<br><br></div>I think the problem concerns both. DwC, because it solves one problem while creating another for researchers less "skilled" in table manipulation. Ecological data with occurrences results in three tables, and manipulating them gets harder as the number of cores or extensions grows.<br></div>Two possible solutions come to mind: create a new term describing the original layout of the columns (so we can use csvjoin, as Menashe suggests), or give the IPT an option to store the original table associated with the resource.<br></div>We can always use external links in the EML and save the file somewhere, but this means creating another service and managing more logins (i.e. resource costs and new problems).<br><br></div>I think any solution will need IPT changes.<br></div><div><div><div><div><div class="gmail_extra"><br><div class="gmail_quote">2015-10-27 9:08 GMT-02:00 Menashe' Eliezer <span dir="ltr"><<a href="mailto:menashe.eliezer@gmail.com" target="_blank">menashe.eliezer@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Tim,<div>I believe that the IPT feature I requested long ago could be helpful for David: <a href="https://github.com/gbif/ipt/issues/1165" target="_blank">https://github.com/gbif/ipt/issues/1165</a></div><div>Consumers and data providers don't have a DwC-A viewer, and they need to join the separate CSV files to get one table in a worksheet.</div><div>Web applications like the one on the OBIS website do let end users download one big table.</div><div 
class="gmail_extra"><br></div><div class="gmail_extra">Best regards,<br clear="all"><div><div><div dir="ltr">Menashè<br><div><br></div></div></div></div>
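<br>The join described above (separate CSV files combined into one worksheet table) could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the sample data, the column sets and the function name join_wide are hypothetical, and it does not handle parentEventID or any extension other than measurementOrFact.<br>

```python
import csv
import io

# Hypothetical sample data standing in for the event core file and the
# measurementOrFact extension file of a DwC-A (column sets are assumptions).
EVENT_CSV = """eventID,eventDate,locality
e1,2015-10-01,Plot A
e2,2015-10-02,Plot B
"""
MOF_CSV = """eventID,measurementType,measurementValue
e1,temperature,21.5
e1,salinity,35
e2,temperature,19.0
"""


def join_wide(event_text, mof_text):
    """Return (header, rows): one row per event, one column per measurementType."""
    events = list(csv.DictReader(io.StringIO(event_text)))
    if not events:
        return [], []
    measures = {}  # maps eventID to {measurementType: measurementValue}
    types = []     # measurement types in first-seen order
    for m in csv.DictReader(io.StringIO(mof_text)):
        measures.setdefault(m["eventID"], {})[m["measurementType"]] = m["measurementValue"]
        if m["measurementType"] not in types:
            types.append(m["measurementType"])
    core_cols = list(events[0].keys())
    header = core_cols + types
    rows = []
    for ev in events:
        vals = measures.get(ev["eventID"], {})
        # Missing measurements become empty cells.
        rows.append([ev[c] for c in core_cols] + [vals.get(t, "") for t in types])
    return header, rows


header, rows = join_wide(EVENT_CSV, MOF_CSV)
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(rows)
print(out.getvalue())
```

The result is a single CSV that a spreadsheet can open directly; an occurrence extension would need an analogous second join keyed on eventID.<br>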
<br><div class="gmail_quote"><div><div>2015-10-27 9:53 GMT+01:00 Tim Robertson <span dir="ltr"><<a href="mailto:trobertson@gbif.org" target="_blank">trobertson@gbif.org</a>></span>:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><div style="word-wrap:break-word"><div>Hi David</div><div>(CC’ing the IPT list as this might be an IPT-specific thread - <a href="http://lists.gbif.org/mailman/listinfo/ipt" target="_blank">http://lists.gbif.org/mailman/listinfo/ipt</a>)</div><div><br></div><div>For clarification - is your question specific to the DwC-A standard (where this is possible, as Alex says), or is it specific to the IPT tool?</div><div><br></div><div>Do you imagine a scenario where you’d effectively map the same extension twice - once to interpreted and once to verbatim - or do you envisage a different data schema for each?</div><div><br></div><div>Thanks,</div><div>Tim</div><div><br><div><br></div><div><br><div><blockquote type="cite"><div>On 23 Oct 2015, at 16:00, Alex Thompson <<a href="mailto:godfoder@acis.ufl.edu" target="_blank">godfoder@acis.ufl.edu</a>> wrote:</div><br><div>
<div bgcolor="#FFFFFF" text="#000000">
David,<br>
<br>
It's certainly possible, within the context of a Darwin Core
Archive, to include other files within the ZIP file that lie outside
the schema of the archive. Both GBIF and iDigBio do this when
generating downloads for various reasons (RIGHTS & LICENSE
files, additional EML metadata, etc). However, I do not believe it
is possible to do this within IPT. You might submit an issue on the
IPT issue tracker (<a href="https://github.com/gbif/ipt/issues" target="_blank">https://github.com/gbif/ipt/issues</a>) for potential
inclusion of this feature in a future version of IPT.<br>
<br>
There are workarounds you can use to include additional data in
Darwin Core archives, but none of them will exactly match your old
format. For instance, you could include an additional Occurrence file
with the values as JSON in dynamicProperties, or in some other verbatim
format in the occurrenceRemarks field. Both of those would at least
give some method of single-row access (vs joining multiple
measurementOrFacts to a single event id) if that is the primary
concern, even if they would require additional parsing steps to be
useful.<br>
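<br>
As a rough illustration of the dynamicProperties workaround, here is a minimal Python sketch. It assumes measurementOrFact-style rows keyed by eventID; the function name pack_measurements and the sample values are hypothetical, not part of any GBIF or IPT tooling.<br>

```python
import json
from collections import defaultdict


def pack_measurements(mof_rows):
    """Group measurementOrFact-style rows by eventID and serialize each
    group as one JSON string that could go in a dynamicProperties column."""
    grouped = defaultdict(dict)
    for row in mof_rows:
        grouped[row["eventID"]][row["measurementType"]] = row["measurementValue"]
    # sort_keys keeps the serialized form stable between runs.
    return {eid: json.dumps(vals, sort_keys=True) for eid, vals in grouped.items()}


# Hypothetical sample rows (values invented for illustration only).
rows = [
    {"eventID": "site-1", "measurementType": "temperature", "measurementValue": "21.5"},
    {"eventID": "site-1", "measurementType": "pH", "measurementValue": "7.2"},
    {"eventID": "site-2", "measurementType": "temperature", "measurementValue": "19.0"},
]
packed = pack_measurements(rows)
print(packed["site-1"])
```

A consumer would call json.loads on the field to get the values back, which is the additional parsing step mentioned above.<br>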
<br>
Alex Thompson<br>
iDigBio Infrastructure<div><div><br>
<br>
<div>On 10/23/2015 09:40 AM, David Valentim
Dias wrote:<br>
</div>
</div></div><blockquote type="cite"><div><div>
<div dir="ltr">
<div>
<div>
<div>Dear colleagues,<br>
<br>
</div>
Here at SiBBr we're using the new eventCore and
measurementOrFacts. After standardizing to DwC and publishing,
we think some users/researchers will want the "original"
table format, for multiple reasons.<br>
<br>
</div>
Is it possible to have a verbatimTable or some place where we can
store the original table/column format?<br>
<br>
</div>
Regards</div></div></div></blockquote></div></div></blockquote></div></div></div></div></div></div></blockquote></div></div></div></blockquote></div></div></div></div></div></div></blockquote></div></div></div></div>
<br></div></div></blockquote></div><br></div></div>
</blockquote></div><br></div></div>