[tdwg-content] Plea for competency questions. Was Re: New terms need resolution: "Evidence"

Steve Baskauf steve.baskauf at vanderbilt.edu
Tue Jul 26 20:17:11 CEST 2011

John Wieczorek has made an additional proposal 
(http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002574.html) to 
resolve the issue of "Evidence" by creating a class called 
"CollectionObject".  I believe that his intended meaning for the term is 
exactly what I have intended in the past when I used the term "token".  
I believe that the proposal for the CollectionObject class is intimately 
related to the question of the definition of Individual/BiologicalEntity 
because I think that the competency questions Rich wants to address 
through the Individual/BiologicalEntity class are a subset of what I 
would consider to be the competency questions for the CollectionObject 
class.

Because of TDWG's historical roots in the collections community, 
"occurrences" have been subconsciously or even explicitly linked with 
the evidence that documents them.  For example, PreservedSpecimens have 
been considered a subclass of Occurrence in the Darwin Core type 
vocabulary.  But in the lengthy tdwg-content discussion of 2009-10 
(summarized at 
http://code.google.com/p/darwin-sw/wiki/TdwgContentEmailSummary), there 
seemed to be a consensus that an Occurrence is a record of an Individual 
at a particular Time and Location.  That Occurrence is an independent 
entity from the evidence that serves to document it, which could include 
one (or perhaps zero) to many PreservedSpecimens, Images, text files 
recording MachineObservations, etc.  Separating the Occurrence from its 
documenting evidence makes it easier to be explicit about the number and 
types of evidence that document the Occurrence. 
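
As a rough illustration of this separation, here is a minimal Python 
sketch. The class and field names (Evidence, evidence_type, 
evidence_summary, and the example identifiers) are assumptions made up 
for this example and are not Darwin Core terms. The point is only that 
an Occurrence can carry zero to many pieces of evidence, and that the 
number and types can then be reported explicitly.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    """A piece of documenting evidence (hypothetical stand-in class)."""
    identifier: str
    evidence_type: str  # e.g. "PreservedSpecimen", "StillImage"


@dataclass
class Occurrence:
    """An Individual at a particular time and location.

    The Occurrence exists independently of its evidence, so the
    evidence list may be empty.
    """
    individual_id: str
    event_time: str
    location: str
    evidence: List[Evidence] = field(default_factory=list)

    def evidence_summary(self) -> Counter:
        """Count the documenting evidence by type."""
        return Counter(e.evidence_type for e in self.evidence)


# Made-up example data: one Occurrence documented three times.
occurrence = Occurrence("individual-1", "2011-07-26", "example locality")
occurrence.evidence.append(Evidence("specimen-1", "PreservedSpecimen"))
occurrence.evidence.append(Evidence("image-1", "StillImage"))
occurrence.evidence.append(Evidence("image-2", "StillImage"))
summary = occurrence.evidence_summary()
```

An Occurrence with an empty evidence list is still a valid record in 
this sketch, which matches the "one (or perhaps zero) to many" wording.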

So I would define competency question #1 for the proposed 
CollectionObject class as whether it can:
1. document an Occurrence (which I will refer to as the "Occurrence" use 
of evidence).

However, I would also assert that CollectionObject need not be 
restricted to documenting an Occurrence, but that it may also:
2. document the existence of an Individual/BiologicalEntity, aggregates 
that include it, and pieces that came from it (which I will refer to as 
the "Provenance" use of evidence)
3. support an Identification to a particular Taxon (the "Identification" 
use of evidence).
[note: see http://code.google.com/p/darwin-sw/wiki/ClassToken for 
further discussion of this]

If the CollectionObject is a whole dead organism serving as a museum 
specimen, then it might simultaneously address all three of these 
questions.  The time and place listed on the specimen label provide the 
information about the Occurrence, the dead organism serves as proof of 
its own existence, and determinations typed or written on the label are 
vouched for by the presence of the dead organism.  In this case the need 
to separate these three uses of CollectionObject may not be obvious. 

But consider a more complex situation of the sort that BiSciCol would 
like to be able to handle.  A wildebeest calf is digitally photographed 
in South Africa as it is being captured.  The calf is shipped to a zoo 
in England where a blood sample is taken.  Part of this tissue sample is 
stored in liquid nitrogen, but part is used for a DNA extraction.   The 
DNA is sequenced and used in a molecular phylogeny project.  Here we 
have 5 pieces of evidence (each of which could be assigned GUIDs): the 
digital StillImage, the calf itself in the zoo (a LivingSpecimen), the 
blood sample, the DNA sample, and the DNA sequence in digital text 
form.  All five of these pieces of evidence could potentially reside as 
CollectionObjects in different physical or electronic repositories. 

*Competency question 1 (Occurrence)*
StillImage: timestamp and embedded GPS metadata associated with the 
image document the time and place where the calf was located at the time 
of capture.
LivingSpecimen: the collection record that the zoo keeps for the calf 
documents the time and place where the calf was located at the time of 
(the other three pieces of evidence do not provide information about any 
time and place where the calf was located)

*Competency question 2 (Provenance)*
StillImage: a foaf:depiction of the calf (Individual/BiologicalEntity)
LivingSpecimen: owl:sameAs the calf (Individual/BiologicalEntity)
blood sample: dcterms:isPartOf the calf (Individual/BiologicalEntity)
DNA sample: dcterms:isPartOf the blood sample which dcterms:isPartOf the 
calf (Individual/BiologicalEntity)
DNA sequence: [sequencedFrom] the DNA sample which dcterms:isPartOf the 
blood sample which dcterms:isPartOf the calf (Individual/BiologicalEntity)
(all five pieces of evidence support the existence of the calf, and the 
provenance of each piece of evidence can be traced back to the calf)
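
The provenance chains above can be sketched as a small set of triples 
plus a function that walks from a piece of evidence back to the calf. 
This is only an illustration in plain Python, not an RDF implementation. 
The identifiers (still_image, blood_sample, etc.) are made up, 
"sequencedFrom" is the placeholder property from the list above, and 
foaf:depicts (the inverse of foaf:depiction) is used so that every link 
points from the evidence toward the calf. The sketch also assumes each 
piece of evidence has exactly one outgoing link.

```python
# (subject, predicate, object) statements mirroring the list above.
TRIPLES = [
    ("still_image", "foaf:depicts", "calf"),
    ("living_specimen", "owl:sameAs", "calf"),
    ("blood_sample", "dcterms:isPartOf", "calf"),
    ("dna_sample", "dcterms:isPartOf", "blood_sample"),
    ("dna_sequence", "sequencedFrom", "dna_sample"),
]


def trace_to_source(start, triples):
    """Follow links from a piece of evidence until none remain.

    Returns the list of identifiers visited, beginning with `start`.
    Assumes at most one outgoing link per subject.
    """
    links = {subject: obj for subject, _predicate, obj in triples}
    path = [start]
    seen = {start}
    while path[-1] in links:
        next_node = links[path[-1]]
        if next_node in seen:  # guard against accidental cycles
            break
        path.append(next_node)
        seen.add(next_node)
    return path


sequence_path = trace_to_source("dna_sequence", TRIPLES)
# All five pieces of evidence should lead back to the calf.
all_sources = {trace_to_source(s, TRIPLES)[-1] for s, _p, _o in TRIPLES}
```

Here sequence_path runs dna_sequence, dna_sample, blood_sample, calf, 
and all_sources contains only the calf.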

*Competency question 3 (Identification)*
As part of the phylogenetic analysis, the DNA sequence could serve as 
evidence for assigning the calf to a particular taxon.
A mammal expert in Australia might examine the digital StillImage via 
the web and assert that the calf is a wildebeest.
Another mammal expert in England might examine the calf at the zoo and 
assert that the calf is a wildebeest (or perhaps look at the calf in the 
zoo when it is full grown and also look at the StillImage taken at the 
time it was captured and make the assertion based on two forms of 
evidence).
(the DNA sample itself apart from the sequence probably wouldn't be used 
as evidence for an Identification; the blood sample might if the 
cytology were distinctive)

I would assert that the bottom line here is that an entity that falls 
within the proposed CollectionObject class would need to address at 
least one of these three competency questions.  It would not be 
necessary for it to address all three.  The properties of instances of 
the CollectionObject class would be that they could be connected through 
object properties to Occurrences, Individuals/BiologicalEntities, or 
Identifications; and that they could have data properties that we 
typically assign to collected items such as catalogNumber, 
collectionCode, preparation, etc.  John has suggested moving DwC terms 
for such properties from under Occurrence to the proposed 
CollectionObject class.

As John has noted, this is a significant change, but I believe that it 
is an important one if one is to accept the distinction between 
Occurrences and the evidence that documents them - an imperative in more 
complex cases such as I outlined above.
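
To make the "at least one of three" rule concrete, here is a minimal 
Python sketch of how a CollectionObject record might be checked. The 
field names for the three kinds of links (documents_occurrences, 
provenance_of, supports_identifications) and the method names are 
assumptions invented for this example; only catalogNumber, 
collectionCode, and preparation come from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CollectionObject:
    # Data properties typically assigned to collected items.
    catalog_number: Optional[str] = None
    collection_code: Optional[str] = None
    preparation: Optional[str] = None
    # Object-property links (hypothetical names), each zero to many.
    documents_occurrences: List[str] = field(default_factory=list)
    provenance_of: List[str] = field(default_factory=list)
    supports_identifications: List[str] = field(default_factory=list)

    def competency_questions(self) -> List[int]:
        """Return which of the three competency questions are addressed."""
        answered = []
        if self.documents_occurrences:
            answered.append(1)  # Occurrence use
        if self.provenance_of:
            answered.append(2)  # Provenance use
        if self.supports_identifications:
            answered.append(3)  # Identification use
        return answered

    def is_valid(self) -> bool:
        """A CollectionObject must address at least one question."""
        return bool(self.competency_questions())


# The StillImage addresses all three; the DNA sample only provenance.
still_image = CollectionObject(
    documents_occurrences=["capture-occurrence"],
    provenance_of=["calf"],
    supports_identifications=["identification-1"],
)
dna_sample = CollectionObject(provenance_of=["blood_sample"])
```

In this sketch the DNA sample is still a valid CollectionObject even 
though it says nothing about an Occurrence or an Identification.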

I want to end this email by returning to Rich's outlook on the 
"Individual" issue.  In his various posts, it seems to me that much of 
what he wants to accomplish through the "Individual" class falls within 
what I've defined here as competency question 2 (tracking provenance of 
resources, in particular physical things that are, include, or are taken 
from living organisms).  It seems like the CollectionObject class is 
fully capable of doing much of what he wants to accomplish.  Just get 
rid of the term "Individual" - as Paul Murray has noted 
(http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002615.html) it 
already has a different meaning.  Use "CollectionObject" to address 
Rich's provenance competency questions (tracking and connecting 
collected objects) and "BiologicalEntity" to address mine (joining 
zero-to-many Occurrences to zero-to-many forms of evidence to 
zero-to-many Identifications).  So then what IS an actual individual 
organism like the wildebeest calf?  It is a BiologicalEntity if it has 
been documented as an Occurrence or assigned an Identification.  It is a 
CollectionObject if it was collected for a zoo, or shot and mounted in a 
museum.  Or it can be both simultaneously if it is both documented and 
collected.  If none of these things were done, then it's neither a 
BiologicalEntity nor a CollectionObject - it's simply a wildebeest 
calf.  Define the class/type of the thing by the properties that you 
wish to assert for it (or the competency questions that you can answer 
for it). 
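
The rule in the preceding paragraph amounts to a small decision 
function. The following Python sketch uses made-up parameter names and 
is meant only to restate the rule, not to propose an implementation.

```python
def classify(has_occurrence: bool,
             has_identification: bool,
             was_collected: bool) -> set:
    """Return the classes an organism falls into, given what we assert.

    BiologicalEntity: documented as an Occurrence or given an
    Identification.  CollectionObject: collected (e.g. for a zoo or
    a museum).  An empty set means it is simply an organism.
    """
    classes = set()
    if has_occurrence or has_identification:
        classes.add("BiologicalEntity")
    if was_collected:
        classes.add("CollectionObject")
    return classes


# The wildebeest calf: photographed at capture and shipped to a zoo.
calf_classes = classify(has_occurrence=True,
                        has_identification=True,
                        was_collected=True)
# An undocumented, uncollected calf belongs to neither class.
wild_calf_classes = classify(False, False, False)
```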

With regard to the issue of taxonomically heterogeneous entities: 
tracking the provenance of taxonomically heterogeneous CollectionObjects 
and CollectionObjects that are pieces of organisms is not really a big 
deal.  A fish fin isPartOf an individual fish isPartOf a jar of mixed 
fish isPartOf a marine trawl.  However, tracking and reconciling 
Identifications of taxonomically heterogeneous collections of things and 
their subsamples (the second part of what Rich wanted to accomplish) is 
a more complex task that I cannot at this point wrap my head around (see 
the "Additional Comments" at 
http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity for more 
commentary on this).


Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
