Sorry to come to this discussion late. In case this is relevant, in OBIS we have the following - see
http://www.iobis.org/node/304 and http://rs.tdwg.org/dwc/terms/history/ . I would think "ObservedIndividualCount" might be close to what is being discussed here, for some cases anyway (i.e. unit-independent ones). Note that ObservedWeight is common in fisheries science for catch compositions by weight.

The number of individuals present in the lot or container. Not an estimate of abundance or density at the collecting locality.

Highly Recommended
Samplesize: the size of the sample from which the collection/observation was drawn. It can be a volume (e.g. for a phytoplankton sample), a linear distance (e.g. for a visual transect or net haul), a surface area (e.g. for a benthic core), etc. This field must also include the units, e.g. "200 m" for a transect, or "0.25 m^2" for a benthic grab (use ^ to denote a superscript). Note that when multiple collections/observations are reported from the same physical sample, a code identifying the sample can be placed in the Field_Number field to allow all collections/observations from a single sample to be connected.

Highly Recommended
The number of individuals (abundance) found in a collection/record event.

The total biomass found in a collection/record event. Expressed as kg.
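
To make the Samplesize convention above concrete, here is a minimal sketch of how a consumer might split such a value into a number and a unit. The names `SampleSize` and `parse_sample_size` are my own and not part of OBIS or Darwin Core, and the pattern only covers simple values like the two examples in the definition.

```python
import re
from typing import NamedTuple


class SampleSize(NamedTuple):
    """Hypothetical container for a parsed Samplesize value."""
    value: float
    unit: str


# A plain decimal number, optional whitespace, then a unit that does not
# start with a digit (so a bare "200" is rejected rather than mis-split).
_SAMPLE_SIZE = re.compile(r"\s*(\d*\.?\d+)\s*([^\d\s].*?)\s*")


def parse_sample_size(text: str) -> SampleSize:
    """Split a string such as '0.25 m^2' into (0.25, 'm^2')."""
    match = _SAMPLE_SIZE.fullmatch(text)
    if match is None:
        raise ValueError(f"no number followed by a unit in {text!r}")
    return SampleSize(float(match.group(1)), match.group(2))


transect = parse_sample_size("200 m")
grab = parse_sample_size("0.25 m^2")
```

A value without units raises `ValueError`, which matches the requirement that the field must include them.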

We may also need other properties in parallel to enable full expression of all detail.  However I think that this might be achievable through careful planning of the “abundanceAsNumber” field and some kind of “abundanceUnit” field (which could be the anchor for much more semantic machinery) – associated fields on methodology and level of effort will also be important.

However the pure number will be useful as it stands.  Given this discussion, I think the main issue is that the name “abundance” may cause too much ambiguity and confusion.  I think we are really talking about something closer to “quantity”.
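
As a rough sketch of the "number plus unit" idea, assuming nothing beyond what is said above: the class name `Quantity` and the `method` and `effort` attributes are hypothetical placeholders for the associated fields, not proposed terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quantity:
    """Hypothetical record pairing a pure number with its unit."""
    value: float                  # plays the role of "abundanceAsNumber"
    unit: str                     # plays the role of "abundanceUnit"
    method: Optional[str] = None  # assumed field: how the number was obtained
    effort: Optional[str] = None  # assumed field: level of effort, free text

    def same_kind(self, other: "Quantity") -> bool:
        """True when two quantities share a unit, so the numbers can be compared."""
        return self.unit == other.unit


geese = Quantity(12, "individuals", method="point count")
catch = Quantity(3.5, "kg", method="trawl")
more_geese = Quantity(40, "individuals")
```

The number stays usable on its own, while the unit is what tells a program that `geese` and `catch` should not be added together.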


The various terms to describe abundance, frequency, occurrence, etc. appear to be so different for different taxonomic groups and different purposes: flocks of geese, bulks of timber, kilos of fish, ramets/genets/cover for creeping plants, abundance as numbers of stems > 10 cm dbh for others, etc.

Are not attempts to capture this in a single vocabulary as doomed as attempts to develop a single ontology to capture all characters/character states across all taxa?

Some thoughts about usage of some of the terms under discussion and the ultimate goals

I often use the term "abundance" as a refinement of the term "occurrence".

I say "Species X occurs or does not occur at Location L." This kind of statement is refined with a statement such as "Females are rare in March" or "Juveniles are common or very abundant in July".  The details of the life stage, timing and numbers of individuals or groups are more specific but still categorical.

I think of such statements as being derived from "expert opinion" or being summary statements based on enumerations from a series of samples taken during several field trips. I think these kinds of generalities are useful for policy making.

In terms of data, the simplest category is "presence only" data. Sometimes this is referred to as occurrence data - some number of individuals are seen at a particular time and location. Many museum records fall into this category as there is no other information associated with the specimen.  Such records are equivalent to an incidental sighting by a field ecologist.

Today many biodiversity data are collected following a particular protocol. Here sampling effort is known: a specific number of individuals is seen in a "given area" and during a known time interval. From this information one can compute population densities.
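
For illustration only, the density calculation amounts to dividing a count by the area sampled. The function name and the choice of square metres are my assumptions, and the numbers are made up.

```python
def population_density(count: int, area_m2: float) -> float:
    """Individuals per square metre for one sample with known area."""
    if area_m2 <= 0:
        raise ValueError("sampled area must be positive")
    return count / area_m2


# Made-up example: 12 individuals counted in a 0.25 m^2 sample.
density = population_density(12, 0.25)
```

A presence-only record has no sampled area, so no such figure can be derived from it.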

As several others have commented, the results are dependent on the particular protocols and it can be difficult to reconcile data from two different protocols.  How are data from pitfall traps equivalent to data from baits? This is an important challenge because we know that usually one and often two protocols are insufficient to characterize the presence of all members of a taxon.

If I am not mistaken, the goal is to establish terms that will let humans efficiently prepare data and write programs so that machines can reason with the data. If so, the key challenges seem to be:

1) Distinguishing presence-only data from data collected using known protocols
2) Finding ways to reconcile data from different protocols
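
A minimal sketch of the first challenge, assuming records arrive as dictionaries in which `protocol` and `effort` are hypothetical keys: anything lacking either one is treated as presence-only, and the rest are grouped by protocol so that like is only compared with like. It does not attempt the second challenge, reconciling the protocols themselves.

```python
from collections import defaultdict


def split_records(records):
    """Separate presence-only records from those with a known protocol and effort."""
    presence_only = []
    by_protocol = defaultdict(list)
    for record in records:
        protocol = record.get("protocol")
        if protocol and record.get("effort") is not None:
            by_protocol[protocol].append(record)
        else:
            presence_only.append(record)
    return presence_only, dict(by_protocol)


# Made-up records for illustration.
records = [
    {"taxon": "Species X", "count": 7, "protocol": "pitfall trap", "effort": 24.0},
    {"taxon": "Species X", "count": 3, "protocol": "bait", "effort": 2.0},
    {"taxon": "Species X"},  # museum specimen, no other information
]
presence_only, by_protocol = split_records(records)
```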

