[tdwg-content] Proposed new Darwin Core terms - abundance, abundanceAsPercent

John Wieczorek tuco at berkeley.edu
Sat Oct 26 17:50:48 CEST 2013


I see what you are proposing, but I think there are some things that have to be
thought through carefully. First, why would a new core type be needed? In
the Darwin Core Workshop we concluded that we could cast almost every use
case with an Occurrence Core in a star schema. For the abundance case, we
could construct records with a normal Occurrence Core and MeasurementOrFact
extension. Or even with an abundance extension, where the abundance terms
were defined outside of Darwin Core.
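
A minimal sketch of that arrangement, for illustration only: the record
values are invented, the helper names are hypothetical, and the "coreid"
link column is assumed here to follow the Darwin Core Archive convention of
pointing each extension row back at its core record.

```python
# Illustrative only: an Occurrence core plus a MeasurementOrFact extension.
# All record values are invented; "coreid" is the assumed link column.
occurrences = [
    {"occurrenceID": "occ-1", "scientificName": "Puma concolor"},
    {"occurrenceID": "occ-2", "scientificName": "Lynx rufus"},
]

measurements = [
    {"coreid": "occ-1", "measurementType": "count", "measurementValue": "3"},
    {"coreid": "occ-1", "measurementType": "body mass", "measurementValue": "62"},
    {"coreid": "occ-2", "measurementType": "count", "measurementValue": "1"},
]


def measurements_for(occurrence_id, rows):
    """Return every extension row attached to one core record."""
    return [row for row in rows if row["coreid"] == occurrence_id]


def abundance_of(occurrence_id, rows):
    """Return the first 'count' measurement as a number, or None if absent."""
    for row in measurements_for(occurrence_id, rows):
        if row["measurementType"] == "count":
            return float(row["measurementValue"])
    return None
```

Note that the consumer still has to inspect measurementType to find the
abundance row; nothing in the column names says which row is which.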

If we had a MeasurementOrFact extension specifically for abundance, what
are the technical implications when somebody sees how you've done that and
says, "Hey, I could do that for body mass too?" Now we would have another
extension based on MeasurementOrFact. If you try to flatten the results,
which would be reasonable since the relationships between the Core and each
extension are logically one-to-one, you'll end up with duplication of
column names again.
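
To make the duplication concrete, here is a rough sketch under the
assumption that both the abundance extension and a second (body mass)
extension reuse the same three MeasurementOrFact terms. The function names
are hypothetical and the core column list is abbreviated.

```python
from collections import Counter

# Abbreviated core columns, plus the MeasurementOrFact terms that each
# hypothetical extension (abundance, body mass) would reuse.
CORE_COLUMNS = ["occurrenceID", "scientificName"]
MOF_COLUMNS = ["measurementType", "measurementValue", "measurementMethod"]


def flatten_columns(core, *extensions):
    """Concatenate core and extension column names into one flat header."""
    columns = list(core)
    for extension in extensions:
        columns.extend(extension)
    return columns


def duplicated(columns):
    """Return the column names that occur more than once, sorted."""
    counts = Counter(columns)
    return sorted(name for name, n in counts.items() if n > 1)


flat = flatten_columns(CORE_COLUMNS, MOF_COLUMNS, MOF_COLUMNS)
print(duplicated(flat))
# ['measurementMethod', 'measurementType', 'measurementValue']
```

With a single MeasurementOrFact-based extension the flat header has no
clashes; the problem appears as soon as a second one is added.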

So, whereas I see that you could do this, I'm not convinced it is a good
idea. I'm willing to be convinced though, because it could be brilliant and
I am just incapable of seeing it.

On Thursday, October 24, 2013, Éamonn Ó Tuama [GBIF] wrote:

> I think my use of the term "Simple Darwin Core" was misleading. I was not
> suggesting putting a MeasurementOrFact in an Occurrence core. Rather we
> would create a new core as discussed in the DwC sample workshop - let's
> call it samplingEvent for now. Its properties would be drawn from Simple DwC
> (mainly Event, Location classes). We would then create an abundance
> extension consisting of occurrence records each of which has a single
> MeasurementOrFact along the lines I indicated. The extension name would
> indicate the nature of the MeasurementOrFact to a consuming application.
>
> -----Original Message-----
> From: gtuco.btuco at gmail.com [mailto:gtuco.btuco at gmail.com] On Behalf Of
> John Wieczorek
> Sent: 24 October 2013 14:26
> To: Éamonn Ó Tuama [GBIF]
> Cc: TDWG Content Mailing List
> Subject: Re: [tdwg-content] Proposed new Darwin Core terms - abundance,
> abundanceAsPercent
>
> I see several problems with using MeasurementOrFact in this way.
>
> What happens when someone wants to add the next MeasurementOrFact to
> "Simple" Darwin Core? They won't be able to. "Every field in the Simple
> Darwin Core may appear either once or not at all in a single record -
> otherwise how could you distinguish one scientificName field from another
> one? Think of a database table. It will not allow you to have the same name
> for two different fields. Because of this design restriction (lack of
> flexibility for the sake of simplicity), the auxiliary fields from the
> ResourceRelationship and MeasurementOrFact classes are of somewhat limited
> utility here - you could only share one MeasurementOrFact and one
> ResourceRelationship per record." [2].
>
> Why should abundance take precedence over any other potential
> MeasurementOrFact?
>
> Unlike all other terms in Simple Darwin Core, the labels for the
> MeasurementOrFact terms will not explicitly state what is in them.
> There will be no definition to look up in Darwin Core to understand that
> the content is supposed to be an abundance, or what an abundance is.
>
> With MeasurementOrFact in the Darwin Core Occurrence Core Type [3], data
> publishers using Darwin Core Archives would be tempted to use the terms to
> capture the measurement or fact of most importance to them from their
> data set, and there would be no guidance to tell them not to do so. A
> consumer would have to look into the content of the measurementType field
> to understand what to do with the contents of the other two fields. I'd
> urge us to keep such a mess out of the realm of Simple Darwin Core.
>
> On Thu, Oct 24, 2013 at 1:30 PM, Éamonn Ó Tuama [GBIF] <eotuama at gbif.org>
> wrote:
> > Following all the excellent contributions on this list relating to
> > expressing abundance data in Darwin Core, I think we may have
> > uncovered a way that does not require any changes to the DwC standard.
> > As has been pointed out, modelling sampling and abundance can be quite
> > complex and challenging and is best done as an ontology (as in the work
> > of the BCO).
> > From the GBIF perspective, our goal is to make it as simple as possible
> > for data providers to deliver abundance data (in Darwin Core Archive
> > format) that can support the science for processes such as CBD and
> > IPBES. Here, I am using the term abundance (specifically population
> > abundance) in the sense defined by GEO BON: "Quantity of individuals
> > or biomass of a given taxon or functional group at a given location" [1].
> >
> > The challenge is to come up with a way for Simple Darwin Core to
> > express abundance data. It looks like we can probably work within the
> > restrictions imposed by Simple Darwin Core [2] and use properties from
> > MeasurementOrFact because, for the use case in question here, we only
> > need to share one MeasurementOrFact per record.
> >
> > The model: A sample, associated with a sampling protocol, has one or
> > more taxon occurrence records, each of which has an abundance measure
> > which is expressed using MeasurementOrFact.
> >
> > At a minimum, we would need the three MeasurementOrFact properties
> > listed below (with definitions and examples):
> >
> > measurementType
> > The nature of the measurement, fact, characteristic, or assertion.
> > Examples: "tail length", "temperature", "trap line length", "survey
> > area", "trap type".
> >
> > For abundances we would use: count, percentage
> >
> > measurementValue
> > The value of the measurement, fact, characteristic, or assertion.
> > Examples: "45", "20", "1", "14.5", "UV-light".
> >
> > For abundances we would use any numeric value.
> >
> > measurementMethod
> > A description of or reference to (publication, URI) the method or
> > protocol used to determine the measurement, fact, characteristic, or
> > assertion.
> > Examples: "minimum convex polygon around burrow entrances" for a home
> > range area, "barometric altimeter" for an elevation.
> >
> > For abundances we would use: Count/sq metre, count/litre, % of
> > biovolume, % of species, ...
> >
> > To enable automated processing, we would need to develop a controlled
> > list for measurementMethod and, ideally, there would also be a URI for
> > the sampling protocol used which would be recorded in the
> > dwc:samplingProtocol field.
> >
> [1]
>