[tdwg-content] terms for sample data in DwC

Éamonn Ó Tuama [GBIF] eotuama at gbif.org
Fri Nov 7 15:21:31 CET 2014


In compliance with the directions for requesting changes to the Darwin Core
(DwC) vocabulary, we hereby submit to the TDWG Content list a proposal for
five new terms to express information on sample-based data. 

 

Sample-based data is a type of data available from thousands of
environmental, ecological, and natural resource investigations. These can be
one-off studies or monitoring programmes. Such data are usually
quantitative, calibrated, and follow certain protocols so that changes and
trends of populations can be detected. This is in contrast to opportunistic
observation and collection data which today form a significant proportion of
openly accessible biodiversity data.  Sample-based data are often not shared
because the underlying protocols have been hard to encode in a standardised
way. For further background information and examples, please consult the
sample data primer document [1].

 

We cannot emphasise enough that our intention here is not to establish how
data should be captured or modelled but rather to demonstrate one way data
can be exposed to maximise discoverability and reuse. In particular, GBIF, in
association with EU BON [2] project partners, is exploring how the IPT and
Darwin Core Archives can enable the flow of sample-based data in support of
GEO BON Essential Biodiversity Variables (EBVs) [3].

 

While DwC already provides many terms that are relevant for describing
sample-based data, based on several inputs (GBIF-organised workshop on
sample data, May 2013; previous discussions on the TDWG mailing list;
discussions on the EU BON mailing list), we have identified a need for five
new terms. These are:

 

sampleSize

https://github.com/tdwg/dwc/issues/10

#Justification

Required for sharing organism abundance data from controlled sampling and
monitoring surveys (i.e., sampling events). For an introduction to the
sampling proposal please see http://links.gbif.org/ipt-sample-data-primer.

#Definition

A numeric value for the time duration, length, area or volume involved in
the sampling event.

#Comment

The terms sampleSize and sampleSizeUnit are required to be used as a pair.
The value of sampleSize is a number. Example: “5” for the value part of 5m.

#Term group

Event

 

sampleSizeUnit

https://github.com/tdwg/dwc/issues/11

#Justification

Required for sharing organism abundance data from controlled sampling and
monitoring surveys (i.e., sampling events). For an introduction to the
sampling proposal please see http://links.gbif.org/ipt-sample-data-primer.

#Definition

The unit of measurement used in the sampling event.

#Comment

The terms sampleSize and sampleSizeUnit are required to be used as a pair,
e.g., “5 metre”. Example values of sampleSizeUnit include “minute”, “hour”,
“day”, “metre”, “square metre”, “cubic metre”. Recommended best practice is
to use a controlled vocabulary such as the Ontology of Units of Measure
(http://www.wurvoc.org/vocabularies/om-1.8/) of SI units, derived units or
other non-SI units accepted for use within the SI (e.g., minute, hour, day,
litre). Example: “metre” for the unit part of 5m.

#Term group

Event
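
The pairing rule for sampleSize and sampleSizeUnit could be checked along
the following lines. This is only a sketch: the function name
check_sample_size, the dict-based record layout and the short unit list
(copied from the example values above) are assumptions made for
illustration and are not part of the proposal.

```python
# Illustrative unit list taken from the example values in the proposal;
# a real controlled vocabulary would be larger.
EXAMPLE_UNITS = {"minute", "hour", "day", "metre", "square metre", "cubic metre"}


def check_sample_size(record: dict) -> list:
    """Return a list of problems found in the sampleSize/sampleSizeUnit pair."""
    problems = []
    size = record.get("sampleSize")
    unit = record.get("sampleSizeUnit")
    # The two terms must be used as a pair: both present or both absent.
    if (size is None) != (unit is None):
        problems.append("sampleSize and sampleSizeUnit must be used as a pair")
    # The value of sampleSize must be a number.
    if size is not None:
        try:
            float(size)
        except (TypeError, ValueError):
            problems.append(f"sampleSize is not numeric: {size!r}")
    # Assumption: units outside the example list are flagged, not rejected.
    if unit is not None and unit not in EXAMPLE_UNITS:
        problems.append(f"sampleSizeUnit not in example vocabulary: {unit!r}")
    return problems
```

A record such as {"sampleSize": "5", "sampleSizeUnit": "metre"} would pass
with no problems reported, while a record carrying only sampleSize would be
flagged as an incomplete pair.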

 

quantity

https://github.com/tdwg/dwc/issues/12

#Justification

Required for sharing organism abundance data from controlled sampling and
monitoring surveys (i.e., sampling events). For an introduction to the
sampling proposal please see http://links.gbif.org/ipt-sample-data-primer.

#Definition

The quantity, per sampling event, given as a number or enumeration value for
the quantityType.

#Comment

The terms quantity and quantityType are required to be used as a pair. The
value of quantity is a number or enumeration. Examples: “12.5”, “r”.

#Term group

Occurrence

 

quantityType

https://github.com/tdwg/dwc/issues/13

#Justification

Required for sharing organism abundance data from controlled sampling and
monitoring surveys (i.e., sampling events). For an introduction to the
sampling proposal please see http://links.gbif.org/ipt-sample-data-primer.

#Definition

The entity to which the number or enumeration reported in quantity refers.

#Comment

The terms quantity and quantityType are required to be used as a pair, e.g.,
“14 individuals”. The value of quantityType (i.e., the entity being
measured) is expected to be drawn from a small controlled vocabulary.
Examples: “Individuals”, “% Biomass”, “% Biovolume”, “% species”, “%
coverage”, “BraunBlanquetScale”, “DominScale”.

#Term group

Occurrence
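
As a rough illustration of how quantity and quantityType work as a pair, a
consumer might check that the value fits the declared type. The sketch
below is an assumption-laden example, not part of the proposal: the
function name check_quantity is hypothetical, the numeric types are copied
from the examples above, and the Braun-Blanquet code list is the commonly
cited cover-abundance scale (the proposal itself only gives “r”).

```python
# quantityType values copied from the examples in the proposal.
NUMERIC_TYPES = {"Individuals", "% Biomass", "% Biovolume", "% species", "% coverage"}

# Assumption: commonly cited Braun-Blanquet cover-abundance codes; the
# proposal does not define this list.
ENUMERATED_TYPES = {"BraunBlanquetScale": {"r", "+", "1", "2", "3", "4", "5"}}


def check_quantity(quantity: str, quantity_type: str) -> bool:
    """Return True if quantity is plausible for the given quantityType."""
    if quantity_type in ENUMERATED_TYPES:
        # Enumeration: the value must be one of the scale's codes.
        return quantity in ENUMERATED_TYPES[quantity_type]
    if quantity_type in NUMERIC_TYPES:
        # Number: the value must parse as a float.
        try:
            float(quantity)
        except ValueError:
            return False
        return True
    # Unknown quantityType: cannot be interpreted in this sketch.
    return False
```

With these assumptions, “12.5” paired with “% Biomass” and “r” paired with
“BraunBlanquetScale” are both accepted, whereas “r” paired with
“Individuals” is rejected.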

 

parentEventID

https://github.com/tdwg/dwc/issues/9

#Justification

Allows arbitrary linking of sub-sampling events, e.g., for nested sampling
plots. This was requested by several people during the discussion at the
TDWG sample-based data session. For an introduction to the sampling proposal
please see http://links.gbif.org/ipt-sample-data-primer.

#Definition

An event identifier for the super event which is composed of one or more
sub-sampling events.

#Comment

The value must refer to an existing eventID. If the identifier is local it
must exist within the given dataset. Example: “A1” identifying the main
Whittaker Plot in nested samples, each with their own eventID (e.g., “A1:1”,
“A1:2”).

#Term group

Event
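
The requirement that a local parentEventID must exist within the dataset
could be verified roughly as follows. This is a minimal sketch assuming
events are held as a list of dicts keyed by term name; the function name
missing_parents and the “B1” records are hypothetical and only the “A1”
identifiers come from the example above.

```python
def missing_parents(events: list) -> list:
    """Return eventIDs whose parentEventID matches no eventID in the dataset."""
    known_ids = {e["eventID"] for e in events}
    return [
        e["eventID"]
        for e in events
        if e.get("parentEventID") and e["parentEventID"] not in known_ids
    ]


events = [
    {"eventID": "A1"},                           # main Whittaker plot
    {"eventID": "A1:1", "parentEventID": "A1"},  # nested sub-plot
    {"eventID": "A1:2", "parentEventID": "A1"},  # nested sub-plot
    {"eventID": "B1:1", "parentEventID": "B1"},  # hypothetical broken link
]

unresolved = missing_parents(events)
```

Here unresolved would contain only “B1:1”, since no event with eventID “B1”
exists in the list.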

 

None of the existing DwC terms is sufficient for any of the proposed terms,
although two are somewhat related: sampleEffort ("The amount of effort
expended during an Event") and individualCount ("The number of individuals
represented present at the time of the Occurrence"). Neither is adequate:
sampleEffort provides only a free-text statement, so it is hard to parse in
comparison to sampleSize and sampleSizeUnit, and individualCount refers
exclusively to individuals and not to percentages or entities like biomass,
biovolume, etc.

 

[1] http://links.gbif.org/ipt-sample-data-primer

[2] http://eubon.eu

[3] http://www.earthobservations.org/documents/cop/bi_geobon/ebvs/201301_ebv_paper_pereira_et_al.pdf

 

____________________________________________________

Éamonn Ó Tuama, Markus Döring, GBIF Secretariat

 
