Thanks for considering my points.  I can see that I really have two issues and I need to keep them separate.  One is a specific issue relating to abundance/abundanceAsPercent.  The other is a much bigger concern about how we are handling basisOfRecord; I will try to write something separately on that.  For now, recognising that I am rather repeating myself but wanting to be as clear as I can:

 

On abundance/abundanceAsPercent and “abundanceAsNumber”

 

The two concepts as expressed in the current proposal both have value (abundance as a simple presentation for human eyes, abundanceAsPercent as an attempt to give quickly comparable measures).

 

I have no problem with properties with these definitions being offered and promoted for use.  However I think it is essential that a simple, generic property is provided that is guaranteed to include WHATEVER number (nothing else) is most appropriate for comparing abundance within a sample.  Personally I would prefer to call this property “abundance” and give the name “abundanceAsText” to the property defined as “abundance” in the proposal.  However let me here call it “abundanceAsNumber”.

 

It must be made as easy as possible for data publishers to specify relative abundance if there is any relevant aspect of their data that can be presented this way.  In other words, they should not have to start by computing totals so they can normalise values to percentages.  Equally it must be as easy as possible for aggregators and users to process this information.  This means it should not require text processing to remove additional wording.  It should be generic enough to accommodate both what can be placed in individualCount and whatever else might be the appropriate measure in other contexts.  Hence “abundanceAsNumber”.

 

It may be objected that none of these numbers are meaningful without adequate information on sampling protocol, level of effort, etc.  I fully agree that we should not oversell what “abundanceAsNumber” offers.  However consider the following example.

 

1.       Several entomologists define and follow a standardised protocol for malaise trapping insects and for databasing all Diptera recorded in each sample. 

2.       Another entomologist defines a protocol focussed on searching decaying tree stumps for larvae of a number of species of Diptera: Syrphidae.

3.       In each case they share their data in a denormalised form and include the following properties (and whatever others they choose – I’m not taking the time to check I use real names for these):  recordingProtocolURI, sampleId, eventDate, decimalLatitude, decimalLongitude, coordinateUncertaintyInMeters, scientificName, abundanceAsNumber

 

Aggregating this denormalised data allows us to do many things.  We can immediately answer the following questions:

 

1.       What was the most abundant syrphid recorded in each sample (regardless of protocol)?

2.       For all protocols for which any Syrphidae were recorded, which samples included at least one of syrphid A, syrphid B and syrphid C and what was the relative abundance of these three in the sample (opportunity for pie charts on maps)?

3.       Within all samples for a given protocol, how does the suite of taxa recorded, and their relative abundance, vary with geography and season?
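To make the first two questions concrete, here is a minimal Python sketch.  It assumes the records have already been filtered to Syrphidae and that a sampleId is unique only within its protocol.  The protocol URIs, the sample values and the function names are all hypothetical, and the property names are the illustrative ones from the list above, not ratified terms.

```python
from collections import defaultdict

# Hypothetical protocol URIs and sample data, for illustration only.
MALAISE = "http://example.org/protocol/malaise"
STUMPS = "http://example.org/protocol/stump-search"

records = [
    {"recordingProtocolURI": MALAISE, "sampleId": "M1", "scientificName": "syrphid A", "abundanceAsNumber": 12},
    {"recordingProtocolURI": MALAISE, "sampleId": "M1", "scientificName": "syrphid B", "abundanceAsNumber": 30},
    {"recordingProtocolURI": MALAISE, "sampleId": "M1", "scientificName": "syrphid C", "abundanceAsNumber": 6},
    {"recordingProtocolURI": MALAISE, "sampleId": "M2", "scientificName": "syrphid A", "abundanceAsNumber": 3},
    {"recordingProtocolURI": STUMPS, "sampleId": "S1", "scientificName": "syrphid C", "abundanceAsNumber": 4},
    {"recordingProtocolURI": STUMPS, "sampleId": "S1", "scientificName": "syrphid A", "abundanceAsNumber": 1},
]


def sample_key(record):
    # Assumption: a sampleId is only unique within its protocol.
    return (record["recordingProtocolURI"], record["sampleId"])


def most_abundant_per_sample(records):
    """Return {sample key: name of the taxon with the largest abundanceAsNumber}."""
    best = {}
    for r in records:
        key = sample_key(r)
        if key not in best or r["abundanceAsNumber"] > best[key]["abundanceAsNumber"]:
            best[key] = r
    return {key: r["scientificName"] for key, r in best.items()}


def relative_abundance(records, taxa):
    """Return {sample key: {taxon: share}} for the chosen taxa within each sample."""
    per_sample = defaultdict(lambda: defaultdict(float))
    for r in records:
        if r["scientificName"] in taxa:
            per_sample[sample_key(r)][r["scientificName"]] += r["abundanceAsNumber"]
    shares = {}
    for key, counts in per_sample.items():
        total = sum(counts.values())
        if total > 0:
            shares[key] = {name: n / total for name, n in counts.items()}
    return shares


top = most_abundant_per_sample(records)
shares = relative_abundance(records, {"syrphid A", "syrphid B", "syrphid C"})
```

Note that the shares are only ever computed within a single sample; nothing here compares numbers across protocols.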

 

My contention is that “abundanceAsNumber” is all that we need to support many such use cases and visualisations.  GBIF could offer general discovery and processing services for data including “recordingProtocolURI”+“sampleId”+“abundanceAsNumber”, whether the associated data represent bird counts, transect counts, pollen counts, trawl catches, ecogenomic data, or whatever.  Anything more complex will reduce the volume of relevant data shared and/or make it harder to process.

 

Publishers should also be encouraged to provide richer data elements (including documentation for the protocol and other representations of the basic abundance information, e.g. “abundanceAsText”).  These will enable users to interpret the data in more detail, having perhaps first used the generic abundanceAsNumber facilities to locate data likely to be of value.

 

I therefore rather support the early suggestion of Aaike, when he asked, “Is there a specific reason for not adopting: the abundance field as a field to store only the value, and a field abundanceUnit/abundanceType to specify whether the value is in % of species, % of biovolume, % of biomass, individuals/l, ind./m^2, ind/m^3, ind./sampling effort, ...”  The suggestion may have been based on a confusion in the way that abundanceAsPercent was documented, but I’d find his suggestion very workable.
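As a rough illustration of why a value-only field plus a separate unit field is easy to process, here is a small Python sketch.  The record keys "abundance" and "abundanceUnit" follow Aaike's wording; the Abundance class and the two helper functions are hypothetical names of my own, and this is only a sketch of the idea, not a proposed implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Abundance:
    value: float
    unit: str  # e.g. "ind./m^2"; free text here, a controlled vocabulary would be better


def is_plain_number(text):
    """True if the text parses as a number with no additional wording."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def to_abundance(record):
    """Build an Abundance from a flat record of strings.

    Raises ValueError if the value field holds anything other than a number,
    e.g. "4 per square meter" or "24%".
    """
    value = float(record["abundance"].strip())
    return Abundance(value, record["abundanceUnit"].strip())


measure = to_abundance({"abundance": "0.32", "abundanceUnit": "ind./m^3"})
```

With the text examples from the proposal ("4 per square meter", "24%") the same one-line conversion fails, which is the text processing burden I would like aggregators to be spared.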

 

Thanks,

 

Donald

----------------------------------------------------------------------

Donald Hobern - GBIF Director - dhobern@gbif.org

Global Biodiversity Information Facility http://www.gbif.org/

GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark

Tel: +45 3532 1471  Mob: +45 2875 1471  Fax: +45 2875 1480

----------------------------------------------------------------------

 

-----Original Message-----
From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On Behalf Of John Wieczorek
Sent: Thursday, September 26, 2013 5:38 PM
To: Robert Guralnick
Cc: Chuck Miller; Donald Hobern [GBIF]; TDWG Content Mailing List
Subject: Re: [tdwg-content] Proposed new Darwin Core terms - abundance, abundanceAsPercent

 

Well, if it is agreed that abundance would be a property of a MaterialSample (in the non-rigorous sense, because Darwin Core does not ever declare domains for terms), then we can propose that it have the attribute organizedInClass equal to the proposed MaterialSample for now and continue evaluating the term with the understanding that it will be defined thus IF MaterialSample is ratified.

 

If MaterialSample is not ratified for some reason, then we can either block the adoption of an abundance term or define it without an organizedInClass attribute (it would appear as a record-level term), or we could give it the attribute organizedInClass equal to Event as the next less offensive option.

 

On Thu, Sep 26, 2013 at 5:22 PM, Robert Guralnick <Robert.Guralnick@colorado.edu> wrote:

> @Tuco,  if one agrees that abundance is a property of a materialSample

> and not an occurrence, yes.  Donald's examples included "malaise

> traps, transects, expression of ITS or CO1 from environmental samples"

> --- to me those all represent sampling and samples.  Yes we have event

> properties that link events to occurrences.  But the point is that if

> it were just counts, fine.  But it's not.  This is about counts over a

> specified area.  To me this is a very important point and this pushes

> us out of "occurrence".  Feel pretty strongly about it, actually.

> 

> @Donald, I resonate with Donald's perspective that what is needed is

> an ontology where we can represent relationships between classes and

> properties more effectively for graph traversal, and of course, there

> are efforts currently being developed to do just that (e.g. the

> Biocollections Ontology, where this would be a trivial use case).

> 

> Best, Rob

> 

> 

> On Thu, Sep 26, 2013 at 9:10 AM, John Wieczorek <tuco@berkeley.edu> wrote:

>> 

>> @Rob,

>> 

>> Do you think the ratification of a MaterialSample class (and the

>> associated property materialSampleID) would have any effect on the

>> viability or definition of an abundance term in Darwin Core?

>> 

>> 

>> On Thu, Sep 26, 2013 at 5:04 PM, John Wieczorek <tuco@berkeley.edu> wrote:

>> > Not that it is proper justification, but abundance was recommended

>> > to be organized within Occurrence (no actual semantic link will be

>> > put into existence with this proposal) simply because

>> > individualCount, whose failings inspired the abundance term, was

>> > organized in Occurrence.

>> >

>> > On Thu, Sep 26, 2013 at 4:57 PM, Chuck Miller

>> > <Chuck.Miller@mobot.org>

>> > wrote:

>> >> Although abundance is not “evidence of an occurrence in nature” it

>> >> is “information pertaining to evidence”, isn’t it?

>> >>

>> >>

>> >>

>> >> Chuck

>> >>

>> >>

>> >>

>> >> From: tdwg-content-bounces@lists.tdwg.org

>> >> [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Robert

>> >> Guralnick

>> >> Sent: Thursday, September 26, 2013 9:37 AM

>> >> To: Donald Hobern [GBIF]

>> >> Cc: TDWG Content Mailing List

>> >>

>> >>

>> >> Subject: Re: [tdwg-content] Proposed new Darwin Core terms -

>> >> abundance, abundanceAsPercent

>> >>

>> >>

>> >>

>> >>

>> >>

>> >>   I agree with Donald here regarding the need for Abundance, but

>> >> am, to be honest, not sure I understand (or agree with) the logic

>> >> of the proposal.

>> >> Abundance is listed as a property of an occurrence, and I wonder

>> >> if that makes sense given the class definition "The category of

>> >> information pertaining to evidence of an occurrence in nature, in

>> >> a collection, or in a dataset (specimen, observation, etc.)"  Is

>> >> abundance "evidence of an occurrence in nature"?  To me, abundance

>> >> is a property of a survey and its associated methodology and is

>> >> based on multiple occurrences that come from a sample and a

>> >> definition of extent.

>> >>

>> >>

>> >>

>> >>   It seems to me to be a bad fit to scrunch abundance into the

>> >> occurrence class.  I recognize that it might not quite fit

>> >> anywhere in DwC yet.

>> >> Wouldn't it be better to wait to see if materialSample is ratified

>> >> as a class within the Darwin Core?

>> >>

>> >>

>> >>

>> >> Best, Rob

>> >>

>> >>

>> >>

>> >>

>> >>

>> >>

>> >>

>> >>

>> >>

>> >> On Thu, Sep 26, 2013 at 8:01 AM, Donald Hobern [GBIF]

>> >> <dhobern@gbif.org>

>> >> wrote:

>> >>

>> >> Thanks, John.

>> >>

>> >> You are correct.  I think though that abundance is such a commonly

>> >> needed property that it would be a mistake not to make it work

>> >> easily even in Simple Darwin Core.

>> >>

>> >>

>> >> Donald

>> >>


>> >>

>> >>

>> >> -----Original Message-----

>> >>

>> >> From: gtuco.btuco@gmail.com [mailto:gtuco.btuco@gmail.com] On

>> >> Behalf Of John Wieczorek

>> >> Sent: Thursday, September 26, 2013 3:48 PM

>> >> To: Donald Hobern [GBIF]

>> >> Cc: aaike.dewever@naturalsciences.be; TDWG Content Mailing List

>> >> Subject: Re: [tdwg-content] Proposed new Darwin Core terms -

>> >> abundance, abundanceAsPercent

>> >>

>> >> Could every concept of abundance be captured in a combination of

>> >> abundance, abundanceUnit, abundanceMethod?

>> >>

>> >> If so, is there justification for creating new terms at all if the

>> >> concepts can be captured in MeasurementOrFacts

>> >> (http://rs.tdwg.org/dwc/terms/index.htm#measureindex), which have

>> >> the following properties?

>> >>

>> >> measurementType

>> >> measurementValue

>> >> measurementAccuracy

>> >> measurementUnit

>> >> measurementDeterminedDate

>> >> measurementDeterminedBy

>> >> measurementMethod

>> >> measurementRemarks

>> >>

>> >> The only drawback I can see is that with MeasurementOrFacts you

>> >> could not share the abundance information in Simple Darwin Core. To

>> >> understand why, see

>> >> http://rs.tdwg.org/dwc/terms/simple/index.htm#rules.

>> >>

>> >>

>> >> On Thu, Sep 26, 2013 at 10:50 AM, Donald Hobern [GBIF]

>> >> <dhobern@gbif.org>

>> >> wrote:

>> >>> Thanks - I think I too have missed something.  If we want to make

>> >>> these terms usable, there needs to be a simple way to get numbers

>> >>> out of records that can be compared with one another where

>> >>> sampling methods allow such comparisons.  The suggested plain

>> >>> text examples for Abundance don't make this possible.  Forcing

>> >>> normalisation into percentages seems an unnecessary hurdle and

>> >>> risks encouraging the impression that number of ducks on a

>> >>> reservoir is somehow comparable with percentage dry mass,

>> >>> proportional expression of CO1 for a particular species in an ecogenomics sample, or whatever.

>> >>>

>> >>> I would much rather we ensured we had a standard, preferred field

>> >>> which the data publisher can populate directly with whatever

>> >>> number is the most appropriate expression of the relative

>> >>> abundance in the sample.  That gives consumers a clear

>> >>> expectation of how to interpret and handle it.

>> >>>

>> >>> Donald

>> >>>


>> >>>

>> >>>

>> >>> -----Original Message-----

>> >>> From: tdwg-content-bounces@lists.tdwg.org

>> >>> [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Aaike

>> >>> De Wever

>> >>> Sent: Thursday, September 26, 2013 8:44 AM

>> >>> To: tuco@berkeley.edu; TDWG Content Mailing List

>> >>> Subject: Re: [tdwg-content] Proposed new Darwin Core terms -

>> >>> abundance, abundanceAsPercent

>> >>>

>> >>> Dear all,

>> >>>

>> >>> As somewhat of an outsider I have another question with regards

>> >>> to the proposed terms abundance and abundanceAsPercent.

>> >>>

>> >>> Is there a specific reason for not adopting:

>> >>> * the abundance field as a field to store only the value and

>> >>> * a field abundanceUnit/abundanceType to specify whether the

>> >>> value is in % of species, % of biovolume, % of biomass,

>> >>> individuals/l, ind./m^2, ind/m^3, ind./sampling

>> >>> effort, ... (instead of having a field specific for %)?

>> >>>

>> >>> Maybe this has been discussed during the hackathon and I missed

>> >>> it in the report?

>> >>>

>> >>> Thanks for considering this question.

>> >>>

>> >>> With best regards,

>> >>> Aaike

>> >>>

>> >>> John Wieczorek wrote:

>> >>>> Dear all,

>> >>>>

>> >>>> GBIF has just published "Meeting Report: GBIF hackathon-workshop

>> >>>> on Darwin Core and sample data (22-24 May 2013)" at

>> >>>> http://www.gbif.org/orc/?doc_id=5424. Now that this document is

>> >>>> available for public reference, I would like to formally open

>> >>>> the minimum 30-day comment period on two related new terms

>> >>>> proposed during the workshop and defined in the referenced document.

>> >>>>

>> >>>> The formal proposal would add the following new terms:

>> >>>>

>> >>>> Term Name: abundance

>> >>>> Identifier: http://rs.tdwg.org/dwc/terms/abundance

>> >>>> Namespace: http://rs.tdwg.org/dwc/terms/

>> >>>> Label: Abundance

>> >>>> Definition: The number of individuals of a taxon found in a sample.

>> >>>> This is typically expressed as number per unit of area or

>> >>>> volume. In the case of vegetation and colonial/encrusting

>> >>>> species, percent cover can be used.

>> >>>> Comment: Examples: "4 per square meter", "0.32 per cubic meter",

>> >>>> "24%". For discussion see

>> >>>> http://code.google.com/p/darwincore/wiki/Occurrence (there will

>> >>>> be no further documentation here until the term is ratified)

>> >>>> Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property

>> >>>> Refines:

>> >>>> Status: proposed

>> >>>> Date Issued: 2012-03-01

>> >>>> Date Modified: 2013-09-25

>> >>>> Has Domain:

>> >>>> Has Range:

>> >>>> Refines:

>> >>>> Version: abundance-2013-09-25

>> >>>> Replaces:

>> >>>> IsReplacedBy:

>> >>>> Class: http://rs.tdwg.org/dwc/terms/Occurrence

>> >>>> ABCD 2.0.6: not in ABCD (someone please confirm or deny this)

>> >>>>

>> >>>> Term Name: abundanceAsPercent

>> >>>> Identifier: http://rs.tdwg.org/dwc/terms/abundanceAsPercent

>> >>>> Namespace: http://rs.tdwg.org/dwc/terms/

>> >>>> Label: Abundance as Percent

>> >>>> Definition: 100 times the number of individuals of a taxon found

>> >>>> in a sample divided by the total number of individuals of all

>> >>>> taxa in the sample.

>> >>>> Comment: Examples: "2.4%". For discussion see

>> >>>> http://code.google.com/p/darwincore/wiki/Occurrence (there will

>> >>>> be no further documentation here until the term is ratified)

>> >>>> Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property

>> >>>> Refines:

>> >>>> Status: proposed

>> >>>> Date Issued: 2012-08-01

>> >>>> Date Modified: 2013-09-25

>> >>>> Has Domain:

>> >>>> Has Range:

>> >>>> Refines:

>> >>>> Version: abundanceAsPercent-2013-09-25

>> >>>> Replaces:

>> >>>> IsReplacedBy:

>> >>>> Class: http://rs.tdwg.org/dwc/terms/Occurrence

>> >>>> ABCD 2.0.6: not in ABCD (someone please confirm or deny this)

>> >>>>

>> >>>> The related issues in the Darwin Core issue tracker are

>> >>>> https://code.google.com/p/darwincore/issues/detail?id=142

>> >>>> and

>> >>>> https://code.google.com/p/darwincore/issues/detail?id=187

>> >>>>

>> >>>> If there are any objections to the changes proposed for these

>> >>>> new terms, or comments about their definitions, please respond

>> >>>> to this message. If there are no objections or if consensus can

>> >>>> be reached on any amendments put forward, the proposal will go

>> >>>> before the Executive Committee for authorization to put these

>> >>>> additions into effect after the public commentary period.

>> >>>>

>> >>>> Cheers,

>> >>>>

>> >>>> John

>> >>>> _______________________________________________

>> >>>> tdwg-content mailing list

>> >>>> tdwg-content@lists.tdwg.org

>> >>>> http://lists.tdwg.org/mailman/listinfo/tdwg-content

>> >>>

>> >>> --

>> >>> Aaike De Wever

>> >>> BioFresh Science Officer

>> >>> Freshwater Laboratory, Royal Belgian Institute of Natural

>> >>> Sciences Vautierstraat 29, 1000 Brussels Belgium

>> >>> tel.: +32(0)2 627 43 90

>> >>> mobile.: +32(0)486 28 05 93

>> >>> email: <aaike.dewever@naturalsciences.be>

>> >>> skype: aaikew

>> >>> LinkedIn: <http://be.linkedin.com/in/aaikedewever>

>> >>> BioFresh: <http://www.freshwaterbiodiversity.eu/> and

>> >>> <http://data.freshwaterbiodiversity.eu/>

>> >>> Belgian Biodiversity Platform: <http://www.biodiversity.be>


>> >>>

>> >>>

>> >>

>> >>


>> >>

> 

>