I see what you are proposing, but I think there are some things that have to be thought through carefully. First, why would a new core type be needed? In the Darwin Core Workshop we concluded that we could cast almost every use case with an Occurrence Core in a star schema. For the abundance case, we could construct records with a normal Occurrence Core and a MeasurementOrFact extension, or even with an abundance extension whose abundance terms were defined outside of Darwin Core.
If we had a MeasurementOrFact extension specifically for abundance, what are the technical implications when somebody sees how you've done that and says, "Hey, I could do that for body mass too?" Now we would have another extension based on MeasurementOrFact. If you try to flatten the results, which would be reasonable since the relationships between the Core and each extension are logically one-to-one, you'll end up with duplication of column names again.
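To make the flattening concern concrete, here is a minimal Python sketch. It assumes two hypothetical extensions (one for abundance, one for body mass) that both reuse the MeasurementOrFact terms. The identifiers, values, and function names are invented for illustration and are not part of Darwin Core or any existing tool.

```python
# Hypothetical rows: the term names are Darwin Core terms, but the
# extensions and values are invented for illustration only.
core = {"occurrenceID": "occ-1", "scientificName": "Puma concolor"}
abundance_ext = {"measurementType": "count", "measurementValue": "12"}
body_mass_ext = {"measurementType": "body mass", "measurementValue": "53.1"}


def flatten(core_row, *extension_rows):
    """Join a core row with one-to-one extension rows into (column, value) pairs."""
    columns = list(core_row.items())
    for ext in extension_rows:
        columns.extend(ext.items())
    return columns


def duplicated_columns(columns):
    """Return the sorted column names that occur more than once."""
    names = [name for name, _ in columns]
    return sorted({n for n in names if names.count(n) > 1})


flat = flatten(core, abundance_ext, body_mass_ext)
print(duplicated_columns(flat))
```

With two MeasurementOrFact-based extensions, the flattened record carries measurementType and measurementValue twice, which is the same name collision a single flat table cannot represent.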
So, whereas I see that you could do this, I'm not convinced it is a good idea. I'm willing to be convinced though, because it could be brilliant and I am just incapable of seeing it.
On Thursday, October 24, 2013, Éamonn Ó Tuama [GBIF] wrote:
I think my use of the term "Simple Darwin Core" was misleading. I was not suggesting putting a MeasurementOrFact in an Occurrence core. Rather, we would create a new core as discussed in the DwC sample workshop - let's call it samplingEvent for now. Its properties would be drawn from Simple DwC (mainly the Event and Location classes). We would then create an abundance extension consisting of occurrence records, each of which has a single MeasurementOrFact along the lines I indicated. The extension name would indicate the nature of the MeasurementOrFact to a consuming application.
-----Original Message-----
From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On Behalf Of John Wieczorek
Sent: 24 October 2013 14:26
To: Éamonn Ó Tuama [GBIF]
Cc: TDWG Content Mailing List
Subject: Re: [tdwg-content] Proposed new Darwin Core terms - abundance, abundanceAsPercent
I see several problems with using MeasurementOrFact in this way.
What happens when someone wants to add the next MeasurementOrFact to "Simple" Darwin Core? They won't be able to. "Every field in the Simple Darwin Core may appear either once or not at all in a single record - otherwise how could you distinguish one scientificName field from another one? Think of a database table. It will not allow you to have the same name for two different fields. Because of this design restriction (lack of flexibility for the sake of simplicity), the auxiliary fields from the ResourceRelationship and MeasurementOrFact classes are of somewhat limited utility here - you could only share one MeasurementOrFact and one ResourceRelationship per record." [2].
Why should abundance take precedence over any other potential MeasurementOrFact?
Unlike all other terms in Simple Darwin Core, the labels for the MeasurementOrFact terms will not explicitly state what is in them. There will be no definition to look up in Darwin Core to understand that the content is supposed to be an abundance, or what an abundance is.
With MeasurementOrFact in the Darwin Core Occurrence Core Type [3], data publishers using Darwin Core Archives would be tempted to use the terms to capture whichever measurement or fact from their data set is of most importance to them, and there would be no guidance to tell them not to do so. A consumer would have to look into the content of the measurementType field to understand what to do with the contents of the other two fields. I'd urge us to keep such a mess out of the realm of Simple Darwin Core.
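As a rough illustration of that burden on consumers, the Python sketch below shows the kind of branching a consuming application would need. The function name `interpret` and the returned labels are hypothetical. The values "count" and "percentage" are the ones proposed for abundances in the message quoted below.

```python
def interpret(record):
    """Decide what measurementValue means by inspecting measurementType.

    Nothing in the column names says the value is an abundance, so the
    consumer has to branch on the free-text content of measurementType.
    """
    mtype = str(record.get("measurementType") or "").strip().lower()
    value = record.get("measurementValue")
    if mtype in ("count", "percentage"):  # values proposed for abundances
        return ("abundance", float(value))
    # Any other measurementType is opaque without further conventions.
    return ("unknown", value)


print(interpret({"measurementType": "count", "measurementValue": "45"}))
print(interpret({"measurementType": "tail length", "measurementValue": "14.5"}))
```

Every publisher who puts a different kind of measurement into the same three fields adds another branch that consumers would have to know about.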
On Thu, Oct 24, 2013 at 1:30 PM, Éamonn Ó Tuama [GBIF] eotuama@gbif.org wrote:
Following all the excellent contributions on this list relating to expressing abundance data in Darwin Core, I think we may have uncovered a way that does not require any changes to the DwC standard. As has been pointed out, modelling sampling and abundance can be quite complex and challenging and is best done as an ontology (as in the work of the BCO).
From the GBIF perspective, our goal is to make it as simple as possible for data providers to deliver abundance data (in Darwin Core Archive format) that can support the science for processes such as CBD and IPBES. Here, I am using the term abundance (specifically population abundance) in the sense defined by GEO BON: "Quantity of individuals or biomass of a given taxon or functional group at a given location" [1].
The challenge is to come up with a way for Simple Darwin Core to express abundance data. It looks like we can probably work within the restrictions imposed by Simple Darwin Core [2] and use properties from MeasurementOrFact because, for the use case in question here, we only need to share one MeasurementOrFact per record.
The model: A sample, associated with a sampling protocol, has one or more taxon occurrence records, each of which has an abundance measure expressed using MeasurementOrFact.
At a minimum, we would need the following three MeasurementOrFact properties (with definitions and examples):
measurementType: The nature of the measurement, fact, characteristic, or assertion. Examples: "tail length", "temperature", "trap line length", "survey area", "trap type".
For abundances we would use: count, percentage
measurementValue: The value of the measurement, fact, characteristic, or assertion. Examples: "45", "20", "1", "14.5", "UV-light".
For abundances we would use any numeric value.
measurementMethod: A description of or reference to (publication, URI) the method or protocol used to determine the measurement, fact, characteristic, or assertion. Examples: "minimum convex polygon around burrow entrances" for a home range area, "barometric altimeter" for an elevation.
For abundances we would use: Count/sq metre, count/litre, % of biovolume, % of species, ...
To enable automated processing, we would need to develop a controlled list for measurementMethod and, ideally, there would also be a URI for the sampling protocol used which would be recorded in the dwc:samplingProtocol field.
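A minimal Python sketch of what such automated processing might look like, assuming a controlled list built only from the example methods mentioned above. No such vocabulary exists yet, and the record values, protocol URI, and function name are hypothetical.

```python
ABUNDANCE_TYPES = {"count", "percentage"}
# Hypothetical controlled list, taken from the examples in this message;
# a real vocabulary would still need to be developed and agreed upon.
ABUNDANCE_METHODS = {"count/sq metre", "count/litre", "% of biovolume", "% of species"}


def validate_abundance(record):
    """Return a list of problems with the record's abundance fields (empty if none)."""
    problems = []
    if record.get("measurementType") not in ABUNDANCE_TYPES:
        problems.append("measurementType must be 'count' or 'percentage'")
    try:
        float(record.get("measurementValue"))
    except (TypeError, ValueError):
        problems.append("measurementValue must be numeric")
    method = str(record.get("measurementMethod") or "").lower()
    if method not in ABUNDANCE_METHODS:
        problems.append("measurementMethod is not in the controlled list")
    return problems


# One taxon occurrence from a sample, carrying a single abundance measure.
record = {
    "samplingProtocol": "http://example.org/protocol/quadrat",  # placeholder URI
    "scientificName": "Calluna vulgaris",
    "measurementType": "count",
    "measurementValue": "45",
    "measurementMethod": "Count/sq metre",
}
print(validate_abundance(record))
```

A record whose measurementValue is a non-numeric example such as "UV-light" would be reported as a problem rather than silently accepted as an abundance.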
[1]