<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Some time ago I tried to write an inventory of all the different ways occurrences are published with DwC archives. It includes the Occurrence, Taxon, Event and MaterialSample cores with various extensions:<div><a href="https://github.com/mdoering/dwca-examples/blob/master/README.md">https://github.com/mdoering/dwca-examples/blob/master/README.md</a></div><div><br></div><div>It might need a little update now, but hopefully it is still useful to get a complete overview.</div><div><br></div><div>Markus</div><div><br></div><div><br></div><div><br><div><br><div><div>On 20 Aug 2014, at 23:46, Markus Döring <<a href="mailto:mdoering@gbif.org">mdoering@gbif.org</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><meta http-equiv="Content-Type" content="text/html charset=windows-1252"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Rob,<div><br></div><div>this proposal is really for monitoring surveys, not to be confused with material samples like environmental or tissue samples, which have a distinct new DwC class, MaterialSample. </div><div><br></div><div>We tend to overload the term sampling a lot, and it helps to treat material samples differently from pure observational "sampling". That is why the existing Event class was used as the core and classic Occurrence records as extensions. A classic example is a vegetation survey where each plot represents an Event record and each recorded species in that plot will be an Occurrence extension record with a given quantity. Darwin Core already offers individualCount to specify quantity, but it is a very specific way of measuring "abundance" restricted to only some use cases. Abiotic measurements about the plot (e.g. 
soil type, pH, temperature) can be published using the measurements or facts extension linked to the Event core. </div><div><br></div><div>Markus</div><div><br></div><div><br></div><div><br><div><div>On 20 Aug 2014, at 20:08, Robert Guralnick <<a href="mailto:Robert.Guralnick@colorado.edu">Robert.Guralnick@colorado.edu</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr"><br><div> Anne -- I don't know the answers! These are questions for Eamonn. I would presume that a sample could be a jumble of species or even just water or soil samples, and biomass would refer to that sample - but maybe that isn't a use case being considered? The examples given in the longer document all link an event_id to species name and some measure of quantity for that species (to the species, not an individual specimen), so I assume that is the prevailing (or only) case? </div>
<div>Best, Rob</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Aug 20, 2014 at 11:56 AM, Anne Thessen <span dir="ltr"><<a href="mailto:annethessen@gmail.com" target="_blank">annethessen@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hi Rob<br>
I would like to respond to your item number 2.<br>
From my perspective, I deal with lots of published descriptions of
taxa. The text might say something like "I saw species A in the
Chesapeake Bay, the Adriatic Sea and the Indian Ocean and the
biomass is 5 - 9 grams". The biomass range obviously corresponds to
at least three different occurrences, but how to divide the biomass
data? I would love to be able to have an *event* to attach it all
    to. There are almost two different levels of events - a sampling
event and a "study event". The "study event" would correspond to the
type of event I would like to use in the above example. It may not
be ideal, but for the old literature that might be the best we can
do. <br>
I have to admit that I don't know enough about trawl data to
understand why an event core would be a problem. It seems that the
trawl would be an event and each biomass measure (of each fish)
would be attached to a separate occurrence which is attached to that
event. Am I understanding this wrong?<br>
btw - I found a workaround for the example I gave, so it's not
impossible to model with the current structure....<span class="HOEnZb"><font color="#888888"><br>
Anne</font></span><div><div class="h5"><br>
<br>
<div>On 8/20/2014 1:16 PM, Robert Guralnick
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr"><br>
<div>Éamonn et al. --- Thanks for the clarifications. I think
these help a ton but it raises a couple more questions for me.
</div>
<div><br>
</div>
              <div>1) I am surprised that you plan to use the
                MeasurementOrFact extension in relation to the Event core,
                which seems like a novel (or perhaps awkward or unintended?)
                mechanism for capturing environmental data, while the same
                extension was not seen as relevant for describing samples.
                Can you explain more about the thinking there?</div>
<div><br>
</div>
              <div>2) There may be a subtle issue here in extending "Event" to
                be more what you call a "Sampling Event Core". My read of
                this is that Darwin Core serves as a way to deal with point
                occurrences and Event reflects the context of a single capture
                event (whether a single observation, or a bulk sample
                capture). The changes recommended seem to dramatically extend
                and change that meaning? It's simply a question that I don't
                have an answer to, but is Darwin Core the right vehicle to start
                capturing repeated measures of biomass values from trawls? I
                don't have an answer but man, terms like quantityType (as a
                property of occurrence?) give me pause. </div>
<div><br>
</div>
<div>3) Is Sampling Unit a controlled vocabulary? For another
project, I have looked through - and captured scope, effort
and completeness measures from - a large number of published
                biotic area inventories. The vast majority of these are
measured in units like bucket hours, or trap nights. Is a
"bucket" part of SamplingGeometry or Sampling Unit? I'd be
happy to send along all the many examples of how biotic
inventories of an area are completed and perhaps it might be
good to see how those might be represented using the terms you
are proposing? </div>
<div><br>
</div>
<div>Best, Rob</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
<div> </div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">
On Wed, Aug 20, 2014 at 10:16 AM, Richard Pyle <span dir="ltr"><<a href="mailto:deepreef@bishopmuseum.org" target="_blank">deepreef@bishopmuseum.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="white" link="blue" vlink="purple" lang="EN-US">
<div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Same
here – Events are central to the work that we do.</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Aloha,</span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Rich</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><br class="webkit-block-placeholder"></div>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in"><p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif";color:windowtext">
<a href="mailto:tdwg-content-bounces@lists.tdwg.org" target="_blank">tdwg-content-bounces@lists.tdwg.org</a>
[mailto:<a href="mailto:tdwg-content-bounces@lists.tdwg.org" target="_blank">tdwg-content-bounces@lists.tdwg.org</a>]
<b>On Behalf Of </b>Anne Thessen<br>
<b>Sent:</b> Wednesday, August 20, 2014 2:59
AM<br>
<b>To:</b> <a href="mailto:tdwg-content@lists.tdwg.org" target="_blank">tdwg-content@lists.tdwg.org</a></span></p>
<div>
<div><br>
<b>Subject:</b> Re: [tdwg-content] Darwin
Core: proposed news terms for expressing
sample data</div>
</div>
</div>
</div>
<div>
<div><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal" style="margin-bottom:12.0pt">Hello<br>
I would just like to comment on *event core*.<br>
I've been doing a lot of work translating
published data into Darwin Core. During that
process I've wished several times that I could
use Event as core. I am happy to hear about
that proposed change. It will make it easier
to model the data I am working with.<br>
Anne</p>
<div><p class="MsoNormal">On 8/20/2014 7:04 AM,
Éamonn Ó Tuama [GBIF] wrote:</p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Hi
Rob,</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Thank
you for the feedback. I have tried to
address the two main issues you raise
below. At the outset, I would like to
emphasise that much of this work is taking
place in the context of the EU BON project
which includes a task on
developing/enhancing tools and standards
for data sharing with a particular focus
on the IPT for publishing sample-based
data. So, we were constrained by the need
to publish sample-based data sets in the
Darwin Core Archive format and to
demonstrate practical application using a
working prototype. When the discussion on
the TDWG list faded out, we took it to our
EU BON partners whose requirements were
essential input to further development. We
recognise that these discussions took
place away from TDWG (although the TDWG/EU
BON contributors overlapped) and this is
the reason we are presenting the outcomes
here for further consideration. </span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">*<b>Event
core</b>*</span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">As
the SIGS report indicated, sample data can
be modelled in Darwin Core Archives using
either Occurrence or Event as core. This
was the starting point for our evaluation
but as things progressed the data
wrangling pushed the model back towards
the Event core. We actually went through
the exercise of mapping multiple test
datasets in an iterative process spanning
several months' work. In the end, we found
that using an Event core better matched
the typical sample data we were dealing
with, allowing use of a
measurement-or-fact extension to be
included for the efficient expression of
environmental information associated with
the event. The choice comes down to an
Occurrence core or an Event core +
Occurrence extension. In both cases, the
true observation records are Occurrences.
The big difference is what type the core
has and therefore to which kind of records
you can attach further facts and extra
information with DwC-A extensions. Many
sampling datasets have very rich
information about the site and event, so
it is very natural to hang facts from an
Event core. When picking the Occurrence
core those facts would have to be repeated
for each and every occurrence record.
Moreover, our approach doesn’t stop anyone
from using the Occurrence core if they so
wish. This just provides a different
option for datasets that better fit an
Event core model.</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">I
want to stress that we are not building a
“specific IPT version” to support an Event
core but, rather, we adapted the IPT so
that it can be configured to support any
generic “core + extension” format to
enable its use for exploration of more
data formats. This is part of the core
codebase and there were no custom forks of
the IPT for this work. Our view at GBIF
is that if there are significant numbers
of data publishers who are keen to adopt,
promote and use a (any) format, and the
tools can be configured to do so, then we
should support it, and, if necessary, use
a custom namespace.</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">*<b>New
terms around abundance</b>*</span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Yes,
the discussion on TDWG did fade out but it
was clear that the term “abundance” as
recommended by the SIGS report (along with
abundanceAsPercent) was confusing many
when we were looking for term(s) that
reported quantitative measures of
organisms in a sample. It also became
clear we would need to be able to state
the type of quantity being measured. An
alternative suggestion for using the
MeasurementsOrFact class was immediately
shot down.</span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">As
some of our main use cases were coming
from the EU BON project, discussion
shifted to that forum and consensus formed
about the currently proposed terms. It was
within this group that the additional
terms (samplingGeometry, samplingUnit,
eventSeriesID) were proposed and where we
began testing with sample data sets.</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> </span><br class="webkit-block-placeholder"></div><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Best
regards,</span></p><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Éamonn</span></p><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><br class="webkit-block-placeholder"></div><div><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><br class="webkit-block-placeholder"></div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in"><p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">
<a href="mailto:robgur@gmail.com" target="_blank">robgur@gmail.com</a> [<a href="mailto:robgur@gmail.com" target="_blank">mailto:robgur@gmail.com</a>]
<b>On Behalf Of </b>Robert Guralnick<br>
<b>Sent:</b> 19 August 2014 16:56<br>
<b>To:</b> Éamonn Ó Tuama [GBIF]<br>
<b>Cc:</b> TDWG Content Mailing List<br>
<b>Subject:</b> Re: [tdwg-content]
Darwin Core: proposed news terms for
expressing sample data</span></p>
</div><div> <br class="webkit-block-placeholder"></div>
<div><div> <br class="webkit-block-placeholder"></div>
<div><p class="MsoNormal"> Hi Éamonn --- I am
curious about the outcomes presented in
the SIGS paper, in particular, this
portion of the paper: </p>
</div>
<div><div> <br class="webkit-block-placeholder"></div>
</div>
<div><p class="MsoNormal">"Solutions without
introducing an event core in Darwin Core
Archives: During the review of the
solutions for the uses cases, it became
apparent that either model could be
applied to every use case. The core and
extensions bore a complementary
relationship and between them could
express all the required information.
The core simply provided the central
anchor in the star schema from which to
join the additional information.
Therefore, using the Occurrence core,
well established in the GBIF network
through uptake of the IPT, seemed more
appropriate than inventing
CollectingEvent as an additional core
type."</p>
</div>
<div><div> <br class="webkit-block-placeholder"></div>
</div>
<div><p class="MsoNormal"> That SIGS paper
has John Wieczorek and you both as
authors, including many luminaries
across the biodiversity standards
                                      spectrum.  Given the above, it's curious
to see the EventCore come back again,
along with a specific IPT version to
support it. </p>
</div>
<div><div> <br class="webkit-block-placeholder"></div>
</div>
<div><p class="MsoNormal"> So I see two
issues, conflated, in this post you just
made. One is the need for an EventCore
at all, and the nature of relating Event
and Occurrence/Material Sample. The
second is the introduction of new terms,
which seemingly have arrived after
debate on similar terms - but framed
around abundance - stalled a year ago.
To my mind, these both require some
further discussion, because I don't
(necessarily) see TDWG community
coherence around either issue?</p>
</div>
<div><div> <br class="webkit-block-placeholder"></div>
</div>
<div><p class="MsoNormal">Best, Rob</p>
</div>
<div><div> <br class="webkit-block-placeholder"></div>
</div>
</div>
<div><div style="margin-bottom: 12pt;">
<br class="webkit-block-placeholder"></div>
<div><p class="MsoNormal">On Tue, Aug 19, 2014
at 6:11 AM, Éamonn Ó Tuama [GBIF] <<a href="mailto:eotuama@gbif.org" target="_blank">eotuama@gbif.org</a>>
wrote:</p>
<div>
<div><p class="MsoNormal">
Dear All,</p><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal">GBIF is committed
to exploring ways in which the IPT
and Darwin Core Archive format can
be extended for publishing
sample-based data sets. In
association with the EU BON project
[1], a customised version of the IPT
[2] has been deployed to test this
using a special type of Darwin Core
Archive in which the core is an
“Event” with associated taxon
occurrences in an “Occurrence”
extension.</p><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal">The Darwin Core
vocabulary already provides a rich
set of terms with many relevant for
describing sample-based data.
Synthesising several sources of
input (GBIF organised workshop on
sample data, May 2013 [3],
discussions on the TDWG mailing list
in late 2013; internal discussion
among EU BON project partners), five
new terms relating to sample data
were identified as essential. The
complete model including these new
                                          terms is fully described with
examples in the online document
“Publishing sample data using the
GBIF IPT” [4].</p><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal">As a first step
towards ratification, we would like
to register the new terms in the DwC
Google Code tracker [5] if there are
no major objections on this list.
The five terms are:</p><div> <br class="webkit-block-placeholder"></div><p style="margin-bottom:0in;margin-bottom:.0001pt">1.<span style="font-size:7.0pt"> </span><b>quantity</b>:
the number or enumeration value of
the quantityType (e.g., individuals,
biomass, biovolume,
BraunBlanquetScale) per samplingUnit
or a percentage measure recorded for
the sample.</p><div> <br class="webkit-block-placeholder"></div><p style="margin-bottom:0in;margin-bottom:.0001pt">2.<span style="font-size:7.0pt"> </span><b>quantityType</b>:
                                          the entity being referred to by
quantity, e.g., individuals,
biomass, %species, scale type.</p><div> <br class="webkit-block-placeholder"></div><p style="margin-bottom:0in;margin-bottom:.0001pt">3.<span style="font-size:7.0pt"> </span><b>samplingGeometry</b>:
an indication of what kind of space
was sampled; select from point,
line, area or volume.</p><div> <br class="webkit-block-placeholder"></div><p style="margin-bottom:0in;margin-bottom:.0001pt">4.<span style="font-size:7.0pt"> </span><b>samplingUnit</b>:
the unit of measurement used for
reporting the quantity in the
sample, e.g., minute, hour, day,
metre, metre^2, metre^3. It is
combined with quantity and
quantityType to provide the complete
measurement, e.g., 9 individuals per
day, 4 biomass-gm per metre^2.</p><div> <br class="webkit-block-placeholder"></div><p style="margin-bottom:0in;margin-bottom:.0001pt">5.<span style="font-size:7.0pt"> </span><b>eventSeriesID</b>:
an identifier for a set of events
that are associated in some way,
e.g., a monitoring series; may be a
global unique identifier or an
identifier specific to the series.</p><div> <br class="webkit-block-placeholder"></div><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Best regards,</p><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal">Éamonn</p><div>
<br class="webkit-block-placeholder"></div><p class="MsoNormal">[1] <a href="http://eubon.eu/" target="_blank">http://eubon.eu</a>
</p><p class="MsoNormal">[2] <a href="http://eubon-ipt.gbif.org/" target="_blank">http://eubon-ipt.gbif.org</a>
</p><p class="MsoNormal">[3] <a href="http://www.standardsingenomics.org/index.php/sigen/article/view/sigs.4898640" target="_blank">http://www.standardsingenomics.org/index.php/sigen/article/view/sigs.4898640</a></p><p class="MsoNormal">[4] <span style="color:#1f497d"><a href="http://links.gbif.org/sample_data_model" target="_blank"><span style="color:purple">http://links.gbif.org/sample_data_model</span></a></span></p><p class="MsoNormal"><span style="color:#1f497d">[5] <a href="https://code.google.com/p/darwincore/issues/list" target="_blank">https://code.google.com/p/darwincore/issues/list</a>
</span></p><div>
<br class="webkit-block-placeholder"></div><div> <br class="webkit-block-placeholder"></div><div> <br class="webkit-block-placeholder"></div><p class="MsoNormal">____________________________________________________</p><p class="MsoNormal"><i>Éamonn Ó
Tuama, M.Sc., Ph.D. (<a href="mailto:eotuama@gbif.org" target="_blank">eotuama@gbif.org</a>),
</i></p><p class="MsoNormal"><i>Senior
Programme Officer for
Interoperability, </i></p><p class="MsoNormal"><i>Global
Biodiversity Information
Facility Secretariat, </i></p><p class="MsoNormal"><i>Universitetsparken
15, DK-2100, Copenhagen Ø,
DENMARK</i></p><p class="MsoNormal"><i><span style="font-size:9.0pt">Phone:
<a href="tel:%2B45%203532%201494" target="_blank">+45 3532 1494</a>;
Fax: <a href="tel:%2B45%203532%201480" target="_blank">+45 3532 1480</a></span></i></p><div> <br class="webkit-block-placeholder"></div>
</div>
</div><p class="MsoNormal" style="margin-bottom:12.0pt"><br>
_______________________________________________<br>
tdwg-content mailing list<br>
<a href="mailto:tdwg-content@lists.tdwg.org" target="_blank">tdwg-content@lists.tdwg.org</a><br>
<a href="http://lists.tdwg.org/mailman/listinfo/tdwg-content" target="_blank">http://lists.tdwg.org/mailman/listinfo/tdwg-content</a></p>
</div><div> <br class="webkit-block-placeholder"></div>
</div><p class="MsoNormal">
<br>
<br>
<br>
</p>
</blockquote><p class="MsoNormal"><br>
<br>
</p>
<pre>-- </pre>
<pre>Anne E. Thessen, Ph.D.</pre>
<pre>The Data Detektiv, Owner and Founder</pre>
<pre>Ronin Institute, Research Scholar</pre>
<pre><a href="tel:443.225.9185" value="+14432259185" target="_blank">443.225.9185</a></pre>
</div>
</div>
</div>
</div>
</div>
<br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
<pre cols="72">--
Anne E. Thessen, Ph.D.
The Data Detektiv, Owner and Founder
Ronin Institute, Research Scholar
<a href="tel:443.225.9185" value="+14432259185" target="_blank">443.225.9185</a></pre>
</div></div></div>
</blockquote></div><br></div>
_______________________________________________<br>tdwg-content mailing list<br><a href="mailto:tdwg-content@lists.tdwg.org">tdwg-content@lists.tdwg.org</a><br><a href="http://lists.tdwg.org/mailman/listinfo/tdwg-content">http://lists.tdwg.org/mailman/listinfo/tdwg-content</a><br></blockquote></div><br></div></div></blockquote></div><br></div></div></body></html>