[tdwg-content] Indicating which Occurrence resources are useful for documenting distributions

Steve Baskauf steve.baskauf at vanderbilt.edu
Tue Feb 16 05:29:16 CET 2010


I have had several off-list email exchanges about this topic and have 
been encouraged to bring the topic to this list for comment.  Along with 
addressing this topic specifically, I'd like to include clarification of 
how I view Occurrences, based partly on emails written during the 
October/November tdwg-content list discussion regarding the conflict 
between Darwin Core and Dublin Core usage of dcterms:type, partly on 
some off-list email discussions about various DwC terms and their 
appropriate application, and partly on my own view of the relationship 
between Occurrences and the proposed Darwin Core class Individual.  If I 
have misrepresented anything, please reply to the list with your 
perspective on the situation.   (The following namespaces are used for 
brevity: dcterms="http://purl.org/dc/terms/",  
dcmitype="http://purl.org/dc/dcmitype/", 
dwc="http://rs.tdwg.org/dwc/terms/", and 
dwctype="http://rs.tdwg.org/dwc/dwctype/".)
 

A. CHARACTERISTICS OF MEMBERS OF THE CLASS dwc:Occurrence

1. Instances of the DwC class Occurrence represent evidence that a 
particular organism existed at some point in time.  Thus a photo of a 
certain bird would qualify as an Occurrence, while a painting that was 
made by looking at a number of stuffed and live birds (but no particular 
one bird) would not be an Occurrence. 

2. Categorization of a resource as an Occurrence does not imply fitness 
of use for any particular purpose.  An Occurrence may or may not be 
useful for any of the following purposes:  serve to document the 
presence of an organism in nature at a particular place and time (i.e. 
to help model the natural distribution of a species), serve as a 
teaching tool, facilitate an identification key, illustrate a character 
state, provide an image for a trademark of the museum, etc. 

3. An Occurrence may be typed using dcterms:type (having values such as 
dcmitype:PhysicalObject, dcmitype:Sound, or dcmitype:StillImage) and 
dwc:basisOfRecord (having values such as dwctype:PreservedSpecimen, 
dwctype:LivingSpecimen, and dwctype:HumanObservation).  However, the 
typing of an Occurrence again does not imply fitness of use for any 
particular purpose.  An Occurrence of a particular type may be used for 
any or all purposes for which it is useful. 

 

B. HOW DO WE KNOW WHICH OCCURRENCES ARE USEFUL FOR DOCUMENTING 
DISTRIBUTIONS?

Although an Occurrence has no implied fitness for any particular use, 
one of the most common uses of DwC in the past has been to provide 
the metadata required to document the presence of organisms for the 
purpose of describing species' distributions.  I don't know if there 
is an official term for 
this use, but I'm going to refer to it as the "distribution 
documentation" use.  In the current Darwin Core standard, this use is 
facilitated by the term dwc:establishmentMeans 
(http://rs.tdwg.org/dwc/terms/index.htm#establishmentMeans), which is 
defined as "the process by which the biological individual(s) 
represented in the Occurrence became established at the location."  
dwc:establishmentMeans helps a metadata user to assess the fitness of a 
record for distribution documentation by differentiating between 
individual organisms that occur naturally and those that do not.  
Because dwc:establishmentMeans allows for values that span the range of 
native to adventive/naturalized to cultivated, a metadata user can 
select a level of stringency when searching.  dwc:establishmentMeans 
assumes that any type of Occurrence can be used to document the 
individual.  However, not all Occurrences are suitable for distribution 
documentation.  So an important question is: under what circumstances do 
Occurrences provide useful information for documenting a species 
distribution?  Another important question which should be addressed is: 
which Occurrence records should have dwc:establishmentMeans as a 
property? (Or should dwc:establishmentMeans be a property of 
something other than Occurrences?) 
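
As a rough illustration of what "selecting a level of stringency" could 
look like in a search, the Python sketch below ranks the three 
dwc:establishmentMeans values mentioned above and keeps only records at 
or above a chosen level.  The ranking, the record layout, and the 
function name are assumptions made for illustration; Darwin Core does 
not define an ordering for these values.

```python
# Hypothetical ranking from most to least stringent; Darwin Core itself
# does not define an ordering for establishmentMeans values.
STRINGENCY = ["native", "adventive/naturalized", "cultivated"]


def filter_by_stringency(records, loosest_accepted):
    """Keep records whose establishmentMeans is at least as strict as loosest_accepted."""
    cutoff = STRINGENCY.index(loosest_accepted)
    kept = []
    for record in records:
        value = record.get("establishmentMeans")
        # Records with a missing or unrecognized value are excluded here.
        if value in STRINGENCY and STRINGENCY.index(value) <= cutoff:
            kept.append(record)
    return kept


# Invented example records, for illustration only.
records = [
    {"occurrenceID": "occ-1", "establishmentMeans": "native"},
    {"occurrenceID": "occ-2", "establishmentMeans": "adventive/naturalized"},
    {"occurrenceID": "occ-3", "establishmentMeans": "cultivated"},
    {"occurrenceID": "occ-4"},  # no value recorded
]

strict = filter_by_stringency(records, "native")
relaxed = filter_by_stringency(records, "adventive/naturalized")
```

Excluding records that lack a value is only one possible policy; a less 
conservative search might keep them and flag them instead.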

 

C. EXAMPLES

In discussing these questions, I will use some examples which range from 
a relatively simple circumstance (preserved specimens in a museum or 
herbarium collection) to more complex networks of various kinds of 
resources.

Example 1: 
http://people.vanderbilt.edu/~steve.baskauf/conceptual-scheme-herbarium.gif

Example 2: 
http://people.vanderbilt.edu/~steve.baskauf/conceptual-scheme-insect.gif

Example 3: 
http://people.vanderbilt.edu/~steve.baskauf/conceptual-scheme-botanical.gif

 

The starting point in all of these example diagrams is an Individual 
organism found in the environment.  By Individual I mean the object of 
the term dwc:individualID, i.e. the entity that a dwc:Occurrence 
documents, which is described by the class proposed for DwC at:  
http://code.google.com/p/darwincore/issues/detail?id=69.  (As a 
practical matter an "Individual" may also represent a small population 
of organisms of the same species.)  This Individual is the entity that 
dwc:establishmentMeans describes. 

 

The Occurrence resources that are created are represented by the 
rectangles in the diagram.  In these examples they are primarily 
specimens and images but they could also be other types of resources.  
Each Occurrence is associated with a particular dcmitype:Event in which 
the Occurrence resource was created (an "Occurrence resource creation 
event" described by the date/time and dcterms:Location of the event and 
represented by arrows in the diagram).  Because of the one-to-one 
relationship between an Occurrence resource and the Event in which it is 
created, I have chosen to think of the metadata for both things as a 
single unit.  One could also divide them conceptually into two separate 
entities, i.e. different resources having different identifiers.  
Whether this approach is desirable or not should probably be the subject 
of a separate thread.

 

D. WHICH OCCURRENCES CONTAIN LOCATION INFORMATION ABOUT INDIVIDUALS?

Since each Occurrence resource has location metadata associated with its 
resource creation event, each one has the potential for providing 
information to document a distribution.  However, it is clear from these 
diagrams that the location metadata for many of these Occurrences has no 
use for distribution documentation because the Occurrences were created 
in locations that have no relationship to the Individual (e.g. the 
location in a lab where a DNA sample was extracted or in a museum where 
an image was created from a specimen).  Examination of all three 
examples shows that the resource creation events for Occurrences that 
are derived directly from the Individual (highlighted in gray) provide 
useful information about where the Individual was at a particular time 
(i.e. distribution documentation), while the resource creation events 
for all of the other Occurrences do not.  (Note that whether or not an 
Occurrence documents the distribution of the Individual's species is not 
related to the type [dcterms:type or dwc:basisOfRecord] of the 
resource.)  It seems to me that there is a need for a Darwin Core term 
that indicates whether the location metadata for a particular Occurrence 
resource creation event is useful for distribution documentation - I 
can't see any terms currently in the standard that can serve this 
function.  Perhaps a term could be created, such as "documentsDistribution", 
having values of "true" for those derived directly from the Individual 
and "false" for those that are not. 
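
The idea can be sketched in Python as follows.  This is only a minimal 
illustration under stated assumptions: the class layout, the field 
names, and the "derived_from" link are hypothetical, the example 
records are invented, and "documentsDistribution" is the proposed term 
rather than an existing Darwin Core term.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Occurrence:
    occurrence_id: str
    dcterms_type: str
    event_location: str                 # where the resource was created
    derived_from: Optional[str] = None  # parent Occurrence ID; None means
                                        # derived directly from the Individual

    @property
    def documents_distribution(self) -> bool:
        # Hypothetical "documentsDistribution" value: true only for resources
        # derived directly from the Individual.
        return self.derived_from is None


# Invented records: a collected specimen, two resources derived from it,
# and a photograph taken of the live organism in the field.
occurrences = [
    Occurrence("specimen-1", "PhysicalObject", "field site"),
    Occurrence("image-1", "StillImage", "museum imaging room", derived_from="specimen-1"),
    Occurrence("dna-1", "PhysicalObject", "laboratory", derived_from="specimen-1"),
    Occurrence("photo-1", "StillImage", "field site"),
]

useful = [o.occurrence_id for o in occurrences if o.documents_distribution]
```

Here both a PhysicalObject and a StillImage end up on each side of the 
filter, which mirrors the point that usefulness for distribution 
documentation does not follow from the type of the resource.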

 

E. TO WHAT DOES dwc:establishmentMeans APPLY?

Currently dwc:establishmentMeans is assigned to the class Occurrence.  
So theoretically, a value for the dwc:establishmentMeans of the 
Individual could be assigned to any Occurrence resource/resource 
creation event derived from the Individual regardless of how far the 
Occurrence is removed from the Individual through multiple resource 
creation events.  I believe that this would probably be a bad idea, 
since the person creating the metadata for more distantly derived 
resources (i.e. a technician doing a DNA extraction or photographing a 
leg on a pinned specimen in example 2) may not have a clue about the 
status of the original Individual.  At best, that person would just copy 
the value of dwc:establishmentMeans from the metadata for some other 
Occurrence resource derived more directly from the Individual, which 
would be rather pointless. 

 

Alternatively, dwc:establishmentMeans could be assigned only to 
Occurrence resources/resource creation events for which 
"documentsDistribution"="true".  One could argue that each 
collector/observer (i.e. the object of dwc:recordedBy) of an Occurrence 
derived directly from the Individual could theoretically make an 
independent assessment of the correct value of dwc:establishmentMeans 
for the Individual.  It could also be argued that dwc:establishmentMeans 
should be associated with particular Occurrences because if an 
Individual's dwc:establishmentMeans status changed over time, then 
multiple Occurrences recorded from that Individual over time could have 
different values of dwc:establishmentMeans.  However, it is difficult 
for me to imagine circumstances under which this would happen.  An 
Individual that was "cultivated" is not likely to become "native" or 
vice-versa.  The one situation I can imagine is if a "native" (or 
"adventive/naturalized") individual were collected and moved to a zoo or 
botanical garden.  However, in that circumstance, upon collection the 
organism would cease to be in its natural environment and would 
conceptually become an Occurrence of type dwctype:LivingSpecimen rather 
than an Individual.  In such a circumstance, the value of 
dwc:establishmentMeans for the LivingSpecimen metadata should be 
"native" (or "adventive/naturalized") because establishmentMeans refers 
to "the biological individual(s) represented in the Occurrence", not to 
the Occurrence itself, and at the time of its collection, the 
Individual's establishmentMeans was "native" (or "adventive/naturalized"). 

 

To me, the most logical thing would be to re-assign 
dwc:establishmentMeans from the class dwc:Occurrence to the proposed 
Darwin Core class Individual.  That makes sense to me because after all, 
dwc:establishmentMeans is supposed to tell us something about an 
Individual.  However, there could be a couple of problems with this.  One 
is that it is likely that many collectors of preserved specimens (e.g. 
who have simple resource creation circumstances such as in example 1) 
are not going to see any reason to bother with the concept of 
Individuals.  If they do not create records for Individuals, then in 
what context will they assign a value of dwc:establishmentMeans?  The 
other problem that I foresee arises if two different collectors/observers 
recorded Occurrences from the same Individual but assigned the 
Individual different values of dwc:establishmentMeans.  When those two 
Individual records are merged into one, there would either have to be 
some kind of rule for determining which value of dwc:establishmentMeans 
to use, or two values of dwc:establishmentMeans would have to be allowed 
for a single Individual.  However, it seems to me that this circumstance 
would probably occur only very rarely.
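
The two options for merged Individual records (apply a rule, or allow 
more than one value) can be sketched in Python.  Everything here is 
hypothetical: the function name, the use of recorder precedence as the 
rule, and the example assertions are assumptions made for illustration, 
and nothing in Darwin Core prescribes such a rule.

```python
def merge_establishment_means(assertions, precedence=None):
    """Combine establishmentMeans values asserted for one Individual.

    assertions: list of (recordedBy, value) pairs.
    precedence: optional list of recorder names, most trusted first
                (a hypothetical rule, not defined by Darwin Core).
    Returns a set holding one value if the values agree or the rule
    resolves the conflict, otherwise all distinct values.
    """
    values = {value for _, value in assertions}
    if len(values) <= 1 or precedence is None:
        return values
    for recorder in precedence:
        for who, value in assertions:
            if who == recorder:
                return {value}
    # No listed recorder made an assertion, so keep every value.
    return values


# Invented example: two observers disagree about the same Individual.
assertions = [
    ("Observer A", "native"),
    ("Observer B", "adventive/naturalized"),
]

both_values = merge_establishment_means(assertions)
by_rule = merge_establishment_means(assertions, precedence=["Observer B"])
```

Without a rule the merged record carries both values; with a rule it 
carries one, at the cost of discarding the other observer's assessment.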

 

F. SUMMARY

It seems to me that two things are needed here to meet the needs of 
users who create complex networks of Occurrence resources such as in 
Examples 2 and 3. 

1. A term such as "documentsDistribution" is needed to unambiguously 
indicate whether the dcterms:Location and dcmitype:Event (i.e. resource 
creation event) metadata for a particular Occurrence provide useful 
information about the Individual that the Occurrence represents.  

2. Clarification of whether the term dwc:establishmentMeans should be a 
part of the metadata for individual Occurrence resources (and if so, 
which ones) or whether it should be moved to the proposed class Individual.

 Comments???

Steve Baskauf

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu
