Donald, I completely agree. We need to have some data as numbers as we move to a future of more application of biodiversity informatics data to predictive modeling, conservation, environmental engineering, etc. as discussed at the Horizons 2013 conference. Having a basic, un-nuanced abundanceAsNumber would be very useful to get started.
Chuck
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Donald Hobern [GBIF]
Sent: Friday, September 27, 2013 8:52 AM
To: tuco@berkeley.edu; 'Robert Guralnick'
Cc: Chuck Miller; 'TDWG Content Mailing List'
Subject: Re: [tdwg-content] Proposed new Darwin Core terms - abundance, abundanceAsPercent
Thanks for considering my points. I can see that I really have two issues and I need to keep them separate. One is a specific issue in relation to abundance/abundanceAsPercent. The other is a much bigger concern about how we are handling basisOfRecord - I will try to write something separately on that. For now, and recognising I am rather repeating myself, but wanting to be as clear as I can:
On abundance/abundanceAsPercent and "abundanceAsNumber"
The two concepts as expressed in the current proposal both have value (abundance as a simple presentation for human eyes, abundanceAsPercent as an attempt to give quickly comparable measures).
I have no problem with properties with these definitions being offered and promoted for use. However I think it is essential that a simple, generic property is provided that is guaranteed to include WHATEVER number (nothing else) is most appropriate for comparing abundance within a sample. Personally I would prefer to call this property "abundance" and give the name "abundanceAsText" to the property defined as "abundance" in the proposal. However let me here call it "abundanceAsNumber".
It must be made as easy as possible for data publishers to specify relative abundance if there is any relevant aspect of their data that can be presented this way. In other words, they should not have to start by computing totals so they can normalise values to percentages. Equally it must be as easy as possible for aggregators and users to process this information. This means it should not require text processing to remove additional wording. It should be generic enough to accommodate both what can be placed in individualCount and whatever else might be the appropriate measure in other contexts. Hence "abundanceAsNumber".
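To illustrate the text-processing burden described above, here is a minimal Python sketch of what a consumer would have to do to recover a number from free-text abundance values such as the examples in the proposal ("4 per square meter", "0.32 per cubic meter", "24%"). The helper name and the regular expression are illustrative assumptions, not part of any Darwin Core tooling.

```python
import re

# Hypothetical helper: pull the leading number out of a free-text abundance value.
_NUMBER = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)")

def leading_number(abundance_text):
    """Return the leading numeric value as a float, or None if there is none."""
    match = _NUMBER.match(abundance_text)
    return float(match.group(1)) if match else None

examples = ["4 per square meter", "0.32 per cubic meter", "24%", "common"]
parsed = [leading_number(text) for text in examples]
print(parsed)
```

Even when the number is recovered, the unit and meaning are lost unless the rest of the string is parsed too, and values with no leading number yield nothing. A purely numeric field avoids this step.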
It may be objected that none of these numbers are meaningful without adequate information on sampling protocol, level of effort, etc. I fully agree that we should not oversell what "abundanceAsNumber" offers. However consider the following example.
1. Several entomologists define and follow a standardised protocol for malaise trapping insects and for databasing all Diptera recorded in each sample.
2. Another entomologist defines a protocol focussed on searching decaying tree stumps for larvae of a number of species of Diptera: Syrphidae.
3. In each case they share their data in a denormalised form and include the following properties (and whatever others they choose - I'm not taking the time to check I use real names for these): recordingProtocolURI, sampleId, eventDate, decimalLatitude, decimalLongitude, coordinateUncertaintyInMeters, scientificName, abundanceAsNumber
Aggregating this denormalised data allows us to do many things. We can immediately answer the following questions:
1. What was the most abundant syrphid recorded in each sample (regardless of protocol)?
2. For all protocols for which any Syrphidae were recorded, which samples included at least one of syrphid A, syrphid B and syrphid C and what was the relative abundance of these three in the sample (opportunity for pie charts on maps)?
3. Within all samples for a given protocol, how does the suite of taxa recorded, and their relative abundance, vary with geography and season?
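As a rough sketch of questions 1 and 2, the following Python works over denormalised records using only recordingProtocolURI, sampleId, scientificName and abundanceAsNumber. The records, the example.org protocol URIs and the taxon names are made up for illustration, and every record is assumed to be a syrphid already (a real data set would need a family-level filter first).

```python
from collections import defaultdict

# Made-up denormalised records; property names follow the placeholders in the message.
records = [
    {"recordingProtocolURI": "http://example.org/protocol/malaise", "sampleId": "M1",
     "scientificName": "Syrphid A", "abundanceAsNumber": 12},
    {"recordingProtocolURI": "http://example.org/protocol/malaise", "sampleId": "M1",
     "scientificName": "Syrphid B", "abundanceAsNumber": 3},
    {"recordingProtocolURI": "http://example.org/protocol/stumps", "sampleId": "S1",
     "scientificName": "Syrphid B", "abundanceAsNumber": 5},
    {"recordingProtocolURI": "http://example.org/protocol/stumps", "sampleId": "S1",
     "scientificName": "Syrphid C", "abundanceAsNumber": 15},
]

def most_abundant_per_sample(rows):
    """Map (protocol, sampleId) to the record with the highest abundanceAsNumber."""
    best = {}
    for row in rows:
        key = (row["recordingProtocolURI"], row["sampleId"])
        if key not in best or row["abundanceAsNumber"] > best[key]["abundanceAsNumber"]:
            best[key] = row
    return best

def relative_abundance(rows, taxa):
    """For each sample containing any of `taxa`, return each taxon's share among them."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if row["scientificName"] in taxa:
            key = (row["recordingProtocolURI"], row["sampleId"])
            totals[key][row["scientificName"]] += row["abundanceAsNumber"]
    shares = {}
    for key, counts in totals.items():
        total = sum(counts.values())
        if total > 0:
            shares[key] = {name: value / total for name, value in counts.items()}
    return shares

top = most_abundant_per_sample(records)
shares = relative_abundance(records, {"Syrphid A", "Syrphid B", "Syrphid C"})
```

The shares are only computed within a sample, so numbers from different protocols are never compared with each other directly.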
My contention is that "abundanceAsNumber" is all that we need to support many such use cases and visualisations. GBIF could offer general discovery and processing services for data including "recordingProtocolURI"+"sampleId"+"abundanceAsNumber" - whether the associated data represent bird counts, transect counts, pollen counts, trawl catches, ecogenomic data, whatever. Anything more complex will reduce the volume of relevant data shared and/or make it harder to process.
Publishers should also be encouraged to provide richer data elements as well (including documentation for the protocol and other representations of the basic abundance information, e.g. "abundanceAsText"). These will enable users to interpret the data in more detail, having perhaps first used the generic abundanceAsNumber facilities to locate data likely to be of value.
I therefore rather support the early suggestion of Aaike, when he asked, "Is there a specific reason for not adopting: the abundance field as a field to store only the value, and a field abundanceUnit/abundanceType to specify whether the value is in % of species, % of biovolume, % of biomass, individuals/l, ind./m^2, ind/m^3, ind./sampling effort, ..." The suggestion may have been based on a confusion in the way that abundanceAsPercent was documented, but I'd find his suggestion very workable.
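Aaike's value-plus-unit suggestion could be sketched in Python roughly as follows. The class, its field names and the unit strings are assumptions for illustration only; no such terms have been ratified, and whether the unit would be free text or a controlled vocabulary is left open.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class AbundanceRecord:
    scientific_name: str
    abundance: float       # bare number only, no embedded wording
    abundance_unit: str    # free text here; a controlled vocabulary is not assumed

def group_by_unit(records):
    """Group records so that only values sharing a unit are compared."""
    groups = defaultdict(list)
    for record in records:
        groups[record.abundance_unit].append(record)
    return dict(groups)

records = [
    AbundanceRecord("Taxon A", 4.0, "ind./m^2"),
    AbundanceRecord("Taxon B", 24.0, "% of biomass"),
    AbundanceRecord("Taxon C", 9.5, "ind./m^2"),
]
groups = group_by_unit(records)
```

Keeping the number and the unit in separate fields means the value can be sorted or summed without parsing, while the unit still says what kind of comparison is sensible.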
Thanks,
Donald
----------------------------------------------------------------------
Donald Hobern - GBIF Director - dhobern@gbif.org
Global Biodiversity Information Facility http://www.gbif.org/
GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
Tel: +45 3532 1471 Mob: +45 2875 1471 Fax: +45 2875 1480
----------------------------------------------------------------------
-----Original Message-----
From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On Behalf Of John Wieczorek
Sent: Thursday, September 26, 2013 5:38 PM
To: Robert Guralnick
Cc: Chuck Miller; Donald Hobern [GBIF]; TDWG Content Mailing List
Subject: Re: [tdwg-content] Proposed new Darwin Core terms - abundance, abundanceAsPercent
Well, if it is agreed that abundance would be a property of a MaterialSample (in the non-rigorous sense, because Darwin Core does not ever declare domains for terms), then we can propose that it have the attribute organizedInClass equal to the proposed MaterialSample for now and continue evaluating the term with the understanding that it will be defined thus IF MaterialSample is ratified.
If MaterialSample is not ratified for some reason, then we can either block the adoption of an abundance term or define it without an organizedInClass attribute (it would appear as a record-level term), or we could give it the attribute organizedInClass equal to Event as the next less offensive option.
On Thu, Sep 26, 2013 at 5:22 PM, Robert Guralnick <Robert.Guralnick@colorado.edu> wrote:
@Tuco, if one agrees that abundance is a property of a materialSample
and not an occurrence, yes. Donald's examples included "malaise
traps, transects, expression of ITS or CO1 from environmental samples"
--- to me those all represent sampling and samples. Yes we have event
properties that link events to occurrences. But the point is that if
it were just counts, fine. But it's not. This is about counts over a
specified area. To me this is a very important point and this pushes
us out of "occurrence". Feel pretty strongly about it, actually.
@Donald, I resonate with Donald's perspective that what is needed is
an ontology where we can represent relationships between classes and
properties more effectively for graph traversal, and of course, there
are efforts currently being developed to do just that (e.g. the
Biocollections Ontology, where this would be a trivial use case).
Best, Rob
On Thu, Sep 26, 2013 at 9:10 AM, John Wieczorek <tuco@berkeley.edu> wrote:
@Rob,
Do you think the ratification of a MaterialSample class (and the
associated property materialSampleID) would have any effect on the
viability or definition of an abundance term in Darwin Core?
On Thu, Sep 26, 2013 at 5:04 PM, John Wieczorek <tuco@berkeley.edu> wrote:
Not that it is proper justification, but abundance was recommended
to be organized within Occurrence (no actual semantic link will be
put into existence with this proposal) simply because
individualCount, whose failings inspired the abundance term, was
organized in Occurrence.
On Thu, Sep 26, 2013 at 4:57 PM, Chuck Miller
<Chuck.Miller@mobot.org>
wrote:
Although abundance is not "evidence of an occurrence in nature" it
is "information pertaining to evidence", isn't it?
Chuck
From: tdwg-content-bounces@lists.tdwg.org
[mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Robert
Guralnick
Sent: Thursday, September 26, 2013 9:37 AM
To: Donald Hobern [GBIF]
Cc: TDWG Content Mailing List
Subject: Re: [tdwg-content] Proposed new Darwin Core terms -
abundance, abundanceAsPercent
I agree with Donald here regarding the need for Abundance, but
am, to be honest, not quite sure I understand (or agree with) the logic
of the proposal.
Abundance is listed as a property of an occurrence, and I wonder
if that makes sense given the class definition "The category of
information pertaining to evidence of an occurrence in nature, in
a collection, or in a dataset (specimen, observation, etc.)" Is
abundance "evidence of an occurrence in nature"? To me, abundance
is a property of a survey and its associated methodology and is
based on multiple occurrences that come from a sample and a
definition of extent.
It seems to me to be a bad fit to scrunch abundance into the
occurrence class. I recognize that it might not quite fit
anywhere in DwC yet.
Wouldn't it be better to wait to see if materialSample is ratified
as a class within the Darwin Core?
Best, Rob
On Thu, Sep 26, 2013 at 8:01 AM, Donald Hobern [GBIF]
<dhobern@gbif.org>
wrote:
Thanks, John.
You are correct. I think though that abundance is such a commonly
needed property that it would be a mistake not to make it work
easily even in Simple Darwin Core.
Donald
-----Original Message-----
From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On
Behalf Of John Wieczorek
Sent: Thursday, September 26, 2013 3:48 PM
To: Donald Hobern [GBIF]
Cc: aaike.dewever@naturalsciences.be; TDWG Content Mailing List
Subject: Re: [tdwg-content] Proposed new Darwin Core terms -
abundance, abundanceAsPercent
Could every concept of abundance be captured in a combination of
abundance, abundanceUnit, abundanceMethod?
If so, is there justification for creating new terms at all if the
concepts can be captured in MeasurementOrFacts
(http://rs.tdwg.org/dwc/terms/index.htm#measureindex), which have
the following properties?
measurementType
measurementValue
measurementAccuracy
measurementUnit
measurementDeterminedDate
measurementDeterminedBy
measurementMethod
measurementRemarks
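As a rough illustration of how one abundance value might be carried by these properties, here is a small Python sketch. The function name and the values used for measurementType and measurementMethod are assumptions for illustration; the message does not prescribe any vocabulary for them.

```python
def abundance_as_measurement(value, unit, method=""):
    """Build one MeasurementOrFact-style row from a bare abundance number."""
    return {
        "measurementType": "abundance",   # assumed label, not a ratified vocabulary value
        "measurementValue": str(value),
        "measurementUnit": unit,
        "measurementMethod": method,
    }

row = abundance_as_measurement(4, "per square meter", "hypothetical quadrat protocol")
print(row)
```

Each such row would sit alongside the record it describes rather than inside it, which is the structural difference from a flat abundance column.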
The only drawback I can see is that with MeasurementOrFacts you
could not share the abundance information in Simple Darwin Core. To
understand why, see
On Thu, Sep 26, 2013 at 10:50 AM, Donald Hobern [GBIF]
<dhobern@gbif.org>
wrote:
Thanks - I think I too have missed something. If we want to make
these terms usable, there needs to be a simple way to get numbers
out of records that can be compared with one another where
sampling methods allow such comparisons. The suggested plain
text examples for Abundance don't make this possible. Forcing
normalisation into percentages seems an unnecessary hurdle and
risks encouraging the impression that number of ducks on a
reservoir is somehow comparable with percentage dry mass,
proportional expression of CO1 for a particular species in an ecogenomics sample, or whatever.
I would much rather we ensured we had a standard, preferred field
which the data publisher can populate directly with whatever
number is the most appropriate expression of the relative
abundance in the sample. That gives consumers a clear
expectation of how to interpret and
handle it.
Donald
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org
[mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Aaike
De Wever
Sent: Thursday, September 26, 2013 8:44 AM
To: tuco@berkeley.edu; TDWG Content Mailing List
Subject: Re: [tdwg-content] Proposed new Darwin Core terms -
abundance, abundanceAsPercent
Dear all,
As somewhat of an outsider I have another question with regards
to the proposed terms abundance and abundanceAsPercent.
Is there a specific reason for not adopting:
- the abundance field as a field to store only the value and
- a field abundanceUnit/abundanceType to specify whether the
value is in % of species, % of biovolume, % of biomass,
individuals/l, ind./m^2, ind/m^3, ind./sampling
effort,...(instead of having a field
specific for %)?
Maybe this has been discussed during the hackathon and I missed
it in the report?
Thanks for considering this question.
With best regards,
Aaike
John Wieczorek wrote:
Dear all,
GBIF has just published "Meeting Report: GBIF hackathon-workshop
on Darwin Core and sample data (22-24 May 2013)" at
http://www.gbif.org/orc/?doc_id=5424. Now that this document is
available for public reference, I would like to formally open
the minimum 30-day comment period on two related new terms
proposed during the workshop and defined in the referenced document.
The formal proposal would add the following new terms:
Term Name: abundance
Identifier: http://rs.tdwg.org/dwc/terms/abundance
Namespace: http://rs.tdwg.org/dwc/terms/
Label: Abundance
Definition: The number of individuals of a taxon found in a sample.
This is typically expressed as number per unit of area or
volume. In the case of vegetation and colonial/encrusting
species, percent cover can be used.
Comment: Examples: "4 per square meter", "0.32 per cubic meter",
"24%". For discussion see
http://code.google.com/p/darwincore/wiki/Occurrence (there will
be no further documentation here until the term is ratified)
Type of Term:
Refines:
Status: proposed
Date Issued: 2012-03-01
Date Modified: 2013-09-25
Has Domain:
Has Range:
Refines:
Version: abundance-2013-09-25
Replaces:
IsReplaceBy:
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
Term Name: abundanceAsPercent
Identifier: http://rs.tdwg.org/dwc/terms/abundanceAsPercent
Namespace: http://rs.tdwg.org/dwc/terms/
Label: Abundance as Percent
Definition: 100 times the number of individuals of a taxon found
in a sample divided by the total number of individuals of all
taxa in the sample.
Comment: Examples: "2.4%". For discussion see
http://code.google.com/p/darwincore/wiki/Occurrence (there will
be no further documentation here until the term is ratified)
Type of Term:
Refines:
Status: proposed
Date Issued: 2012-08-01
Date Modified: 2013-09-25
Has Domain:
Has Range:
Refines:
Version: abundanceAsPercent-2013-09-25
Replaces:
IsReplaceBy:
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
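The abundanceAsPercent definition above (100 times the count for a taxon divided by the total count for all taxa in the sample) can be sketched in Python as follows. The function name and the counts are made up for illustration; the counts were chosen so that one taxon comes out at the "2.4%" given as an example.

```python
def abundance_as_percent(counts):
    """Return each taxon's count as a percentage of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no individuals, percentage is undefined")
    return {taxon: 100 * n / total for taxon, n in counts.items()}

# Made-up sample: 6 individuals of one taxon out of 250 in total.
sample_counts = {"Taxon A": 6, "Taxon B": 244}
percents = abundance_as_percent(sample_counts)
print(percents)
```

The publisher needs the total for every taxon in the sample before any single percentage can be computed, so the value cannot be filled in record by record.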
The related issues in the Darwin Core issue tracker are
and
If there are any objections to the changes proposed for these
new terms, or comments about their definitions, please respond
to this message. If there are no objections or if consensus can
be reached on any amendments put forward, the proposal will go
before the Executive Committee for authorization to put these
additions into effect after the public commentary period.
Cheers,
John
tdwg-content mailing list
tdwg-content@lists.tdwg.org
--
Aaike De Wever
BioFresh Science Officer
Freshwater Laboratory, Royal Belgian Institute of Natural
Sciences Vautierstraat 29, 1000 Brussels Belgium
tel.: +32(0)2 627 43 90
mobile.: +32(0)486 28 05 93
email: <aaike.dewever@naturalsciences.be>
skype: aaikew
LinkedIn: http://be.linkedin.com/in/aaikedewever
BioFresh: http://www.freshwaterbiodiversity.eu/ and
Belgian Biodiversity Platform: http://www.biodiversity.be