Hi all,

In compliance with the directions for requesting changes to the Darwin Core (DwC) vocabulary, I hereby submit to the TDWG Content list a proposal for four new terms to more faithfully express sample event data.

The need for these four new terms is based on several inputs: the EU Nodes workshop on sample event data in April 2016[1]; discussions in the Sample event data publishing interest group[2]; and experience publishing the first round of sample event datasets[3].

While DwC already provides many terms for describing sample event data, these four new terms will allow a larger number of underlying protocols to be encoded in a standardised way. Notably, the term samplingTaxaRange will also allow accurate presence/absence data to be recorded.

These four terms are:

layer
https://github.com/tdwg/dwc/issues/125 
#Justification
Required for describing the different layers within a location. Especially useful for describing the location of sampling when the sampling protocol uses stratification. For example, vegetation surveys (relevés) commonly stratify a plot into vegetation layers and describe each plant species’ abundance within each layer.  
#Definition
The layer within a Location. Recommended best practice is to use a controlled vocabulary such as the GBIF vegetation layer vocabulary[4]. The elevation or depth of the layer should be well defined in the vocabulary. Otherwise, describe the elevation range using minimumElevationInMeters and maximumElevationInMeters, or the depth range using minimumDepthInMeters and maximumDepthInMeters.
#Comment
Examples: “tree”, “shrub”.
#Term group
Location
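
To illustrate the fallback rule in the definition above, here is a minimal Python sketch. It assumes records are plain dictionaries keyed by term name; KNOWN_LAYERS is a hypothetical stand-in for a controlled vocabulary (it contains only the two example values from this proposal, not the actual contents of the GBIF vocabulary), and layer_is_well_described is a name invented for illustration.

```python
# Assumption: a stand-in for a controlled vocabulary whose entries already
# define the elevation or depth of each layer. Only the two example values
# from the proposal are listed here.
KNOWN_LAYERS = {"tree", "shrub"}


def layer_is_well_described(record):
    """Check a record against the proposed best practice for `layer`.

    A layer taken from the vocabulary is accepted as is. Any other layer
    value must be accompanied by an elevation range or a depth range.
    """
    layer = record.get("layer")
    if not layer:
        return False
    if layer in KNOWN_LAYERS:
        return True
    has_elevation = (
        "minimumElevationInMeters" in record
        and "maximumElevationInMeters" in record
    )
    has_depth = (
        "minimumDepthInMeters" in record
        and "maximumDepthInMeters" in record
    )
    return has_elevation or has_depth
```

A record such as {"layer": "tree"} passes on the vocabulary alone, whereas a free-text layer passes only when one of the two ranges is supplied.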

siteID
https://github.com/tdwg/dwc/issues/126 
#Justification
Required for identifying a specific sampling site at or within a location. This allows distinction from dwc:locationID, which is an identifier for the sampling location itself. Especially useful for describing permanent sampling sites that haven’t been georeferenced. This also enables sampling sites to be referenced across datasets.
#Definition
An identifier for the sampling site at or within the Location.
#Comment
May be a globally unique identifier or an identifier specific to the dataset. Example: “tr011-s1” identifies a sampling site within a location (locationID = “tr011”).
#Term group
Location
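
As a rough illustration of how siteID would sit alongside locationID, the Python sketch below groups sampling sites under their location. The record layout (dictionaries keyed by term name), the eventID values, the second site “tr011-s2” and the function name sites_by_location are all assumptions made for the example; only “tr011” and “tr011-s1” come from the proposal.

```python
from collections import defaultdict


def sites_by_location(events):
    """Map each locationID to the sorted list of siteIDs sampled within it."""
    grouped = defaultdict(set)
    for event in events:
        site_id = event.get("siteID")
        if site_id:  # events without a siteID are simply skipped
            grouped[event["locationID"]].add(site_id)
    return {location: sorted(sites) for location, sites in grouped.items()}


# Hypothetical events: two sites within one location, one site sampled twice.
events = [
    {"eventID": "e1", "locationID": "tr011", "siteID": "tr011-s1"},
    {"eventID": "e2", "locationID": "tr011", "siteID": "tr011-s2"},
    {"eventID": "e3", "locationID": "tr011", "siteID": "tr011-s1"},
]

print(sites_by_location(events))
```

Because the site identifier is independent of any coordinates, repeated visits to a permanent site can be linked even when the site has not been georeferenced.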

samplingTaxaRange
https://github.com/tdwg/dwc/issues/127 
#Justification
Required for describing the range of taxa that could be observed at the time and place of sampling given a) the sampling protocol or b) the taxonomic coverage of the study and the expertise of the personnel carrying out the identification. This would allow for accurate presence/absence data to be recorded.
#Definition
A short list (concatenated and separated) of taxon names, or the URI of a timestamped checklist, describing the range of taxa being sampled during a sampling event. For taxonomic and biogeographical/ecological reasons, however, this checklist should exist solely within the context of the sample event dataset.
#Comment
The recommended best practice varies depending on whether the list of taxa is short or long. For a short list, the recommended best practice is to separate the values with a vertical bar (' | '). Examples: “Lepidoptera | Coleoptera”, “Lepidurus arcticus (Pallas, 1793)”. For a long list, the recommended best practice is to use the URI of a timestamped checklist. Example: “http://danbif.au.dk/ipt/resource?r=checklist-bees-denmark” - checklist for the taxa of bees occurring in Denmark.
#Term group
Event
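
To show how a consumer might use samplingTaxaRange to derive absences, here is a minimal Python sketch. The function names are invented for illustration, and the absence logic rests on a simplifying assumption: observed taxa are reported under exactly the same names as those in the range. Matching a species against a higher taxon such as “Lepidoptera” would need a taxonomic lookup, and a checklist URI would need to be resolved first; neither is attempted here.

```python
def parse_sampling_taxa_range(value):
    """Classify a samplingTaxaRange value.

    Returns ("checklist", uri) for a checklist URI, or ("taxa", [names])
    for a short list separated by vertical bars.
    """
    value = value.strip()
    if value.startswith(("http://", "https://")):
        return ("checklist", value)
    # Short list: split on the vertical bar and trim the surrounding spaces.
    names = [name.strip() for name in value.split("|") if name.strip()]
    return ("taxa", names)


def infer_absences(sampled_taxa, observed_taxa):
    """Taxa that were in scope for the event but were not recorded.

    Assumption: names are compared as exact strings.
    """
    observed = set(observed_taxa)
    return [taxon for taxon in sampled_taxa if taxon not in observed]


kind, taxa = parse_sampling_taxa_range("Lepidoptera | Coleoptera")
print(kind, taxa)
print(infer_absences(taxa, ["Coleoptera"]))
```

Without the term, a missing occurrence record is ambiguous: the taxon may have been absent, or it may never have been looked for.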

siteTreatment
https://github.com/tdwg/dwc/issues/128 
#Justification
Required for describing the treatments that were intentionally applied to a sampling site before sampling occurred. Especially useful for describing sample events where the underlying protocol tests a control group versus a treatment group. For example, an investigation looking at the effects of grazing on vegetation will have a control plot and a plot that has been permanently grazed. Note that this differs from dwc:preparations, which applies only to a specimen.
#Definition
A list (concatenated and separated) of treatments for a sampling site.
#Comment
The recommended best practice is to separate the values with a vertical bar (' | '). Examples: “permanently grazed”, “Imazapyr | Triclopyr” - two herbicides that have been applied.
#Term group
Event
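
For the control-versus-treatment case described in the justification, a sketch along the following lines could separate the two groups. It assumes events are dictionaries keyed by term name and, importantly, that an empty or missing siteTreatment marks a control site; the proposal does not define that convention, so treat it as an assumption. The function names and eventID values are hypothetical.

```python
def parse_site_treatment(value):
    """Split a siteTreatment value on vertical bars into a list of treatments."""
    if not value:
        return []
    return [item.strip() for item in value.split("|") if item.strip()]


def split_control_and_treated(events):
    """Return (control_event_ids, treated_event_ids).

    Assumption: an event with no siteTreatment value is a control.
    """
    control, treated = [], []
    for event in events:
        if parse_site_treatment(event.get("siteTreatment")):
            treated.append(event["eventID"])
        else:
            control.append(event["eventID"])
    return control, treated


# Hypothetical events for a grazing study and a herbicide study.
events = [
    {"eventID": "plot-a"},
    {"eventID": "plot-b", "siteTreatment": "permanently grazed"},
    {"eventID": "plot-c", "siteTreatment": "Imazapyr | Triclopyr"},
]

print(split_control_and_treated(events))
```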

None of the existing DwC terms is sufficient for any of the proposed terms, although two are somewhat related: locationID (“An identifier for the set of location information”) and preparations (“A list of preparations and preservation methods for a specimen.”). Neither is adequate: locationID identifies the sampling location itself but not specific sampling sites within it, and preparations refers exclusively to specimens, not to a sampling site.

Thank you for your consideration of my proposal,

Kyle Braak
GBIF Secretariat

[1] http://www.gbif.pt/EuropeanNodesMeeting/workshop 
[2] http://community.gbif.org/pg/groups/47949/sample-event-data-publishing-interest-group
[3] http://www.gbif.org/dataset/search?q=&type=SAMPLING_EVENT 
[4] http://rs.gbif.org/vocabulary/gbif/vegetation_layer.xml