[tdwg-content] Plea for competency questions. Was Re: New terms need resolution: "Evidence"
Steve Baskauf
steve.baskauf at vanderbilt.edu
Tue Jul 26 20:17:11 CEST 2011
John Wieczorek has made an additional proposal
(http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002574.html) to
resolve the issue of "Evidence" by creating a class called
"CollectionObject". I believe that his intended meaning for the term is
exactly what I have intended in the past when I used the term "token".
I believe that the proposal for the CollectionObject class is intimately
related to the question of the definition of Individual/BiologicalEntity
because I think that the competency questions Rich wants to address
through the Individual/BiologicalEntity class are a subset of what I
would consider to be the competency questions for the CollectionObject
class.
Because of TDWG's historical roots in the collections community,
"occurrences" have been subconsciously or even explicitly linked with
the evidence that documents them. For example, PreservedSpecimens have
been considered a subclass of Occurrence in the Darwin Core type
vocabulary. But in the lengthy tdwg-content discussion of 2009-10
(summarized at
http://code.google.com/p/darwin-sw/wiki/TdwgContentEmailSummary), there
seemed to be a consensus that an Occurrence is a record of an Individual
at a particular Time and Location. That Occurrence is an independent
entity from the evidence that serves to document it, which could include
one (or perhaps zero) to many PreservedSpecimens, Images, text files
recording MachineObservations, etc. Separating the Occurrence from its
documenting evidence makes it easier to be explicit about the number and
types of evidence that document the Occurrence.
So I would define competency question #1 for the proposed
CollectionObject class as whether it can:
1. document an Occurrence (which I will refer to as the "Occurrence" use
of evidence).
However, I would also assert that CollectionObject need not be
restricted to documenting an Occurrence, but that it may also:
2. document the existence of an Individual/BiologicalEntity, aggregates
that include it, and pieces that came from it (which I will refer to as
the "Provenance" use of evidence)
and
3. support an Identification to a particular Taxon (the "Identification"
use of evidence).
[note: see http://code.google.com/p/darwin-sw/wiki/ClassToken for
further discussion of this]
If the CollectionObject is a whole dead organism serving as a museum
specimen, then it might simultaneously address all three of these
questions. The time and place listed on the specimen label provide the
information about the Occurrence, the dead organism serves as proof of
its own existence, and determinations typed or written on the label are
vouched for by the presence of the dead organism. In this case the need
to separate these three uses of CollectionObject may not be obvious.
But consider a more complex situation of the sort that BiSciCol would
like to be able to handle. A wildebeest calf is digitally photographed
in South Africa as it is being captured. The calf is shipped to a zoo
in England where a blood sample is taken. Part of this tissue sample is
stored in liquid nitrogen, but part is used for a DNA extraction. The
DNA is sequenced and used in a molecular phylogeny project. Here we
have five pieces of evidence (each of which could be assigned a GUID): the
digital StillImage, the calf itself in the zoo (a LivingSpecimen), the
blood sample, the DNA sample, and the DNA sequence in digital text
form. All five of these pieces of evidence could potentially reside as
CollectionObjects in different physical or electronic repositories.
*Competency question 1 (Occurrence)*
StillImage: timestamp and embedded GPS metadata associated with the
image document the time and place where the calf was located at the time
of capture.
LivingSpecimen: the collection record that the zoo keeps for the calf
documents the time and place where the calf was located at the time of
capture.
(the other three pieces of evidence do not provide information about any
time and place where the calf was located)
*Competency question 2 (Provenance)*
StillImage: a foaf:depiction of the calf (Individual/BiologicalEntity)
LivingSpecimen: owl:sameAs the calf (Individual/BiologicalEntity)
blood sample: dcterms:isPartOf the calf (Individual/BiologicalEntity)
DNA sample: dcterms:isPartOf the blood sample which dcterms:isPartOf the
calf (Individual/BiologicalEntity)
DNA sequence: [sequencedFrom] the DNA sample which dcterms:isPartOf the
blood sample which dcterms:isPartOf the calf (Individual/BiologicalEntity)
(all five pieces of evidence support the existence of the calf, and the
provenance of each piece of evidence can be traced back to the calf)
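The provenance chains listed above can be sketched as a simple lookup
that is followed link by link until it reaches the calf. This is only
an illustration: the PROVENANCE table, the trace() function, and the
plain-string identifiers are hypothetical stand-ins for GUIDs and RDF
triples, and "sequencedFrom" is the placeholder predicate from the list
above, not an existing term. The image link is written as foaf:depicts
because that is the FOAF property pointing from an image to the thing
shown (foaf:depiction is its inverse).

```python
# Hypothetical sketch: plain strings stand in for GUIDs, and this table
# stands in for RDF triples. Each piece of evidence maps to
# (relation, target), following the Provenance list above.
PROVENANCE = {
    "StillImage": ("foaf:depicts", "calf"),
    "LivingSpecimen": ("owl:sameAs", "calf"),
    "blood sample": ("dcterms:isPartOf", "calf"),
    "DNA sample": ("dcterms:isPartOf", "blood sample"),
    "DNA sequence": ("sequencedFrom", "DNA sample"),  # placeholder predicate
}


def trace(evidence, links=PROVENANCE):
    """Return the chain of resources from a piece of evidence to its source."""
    path = [evidence]
    seen = {evidence}
    while path[-1] in links:
        _relation, target = links[path[-1]]
        if target in seen:
            # Guard against accidental loops in the link table.
            raise ValueError(f"provenance cycle at {target!r}")
        path.append(target)
        seen.add(target)
    return path


if __name__ == "__main__":
    for item in PROVENANCE:
        print(" -> ".join(trace(item)))
```

Every chain ends at "calf", which is the sense in which all five pieces
of evidence can be traced back to the same Individual/BiologicalEntity.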
*Competency question 3 (Identification)*
As part of the phylogenetic analysis, the DNA sequence could serve as
evidence for assigning the calf to a particular taxon.
A mammal expert in Australia might examine the digital StillImage via
the web and assert that the calf is a wildebeest.
Another mammal expert in England might examine the calf at the zoo and
assert that the calf is a wildebeest (or perhaps look at the calf in the
zoo when it is full grown and also look at the StillImage taken at the
time it was captured and make the assertion based on two forms of
Evidence).
(the DNA sample itself apart from the sequence probably wouldn't be used
as evidence for an Identification; the blood sample might if the
cytology were distinctive)
-----------
I would assert that the bottom line here is that an entity that falls
within the proposed CollectionObject class would need to address at
least one of these three competency questions. It would not be
necessary for it to address all three. The properties of instances of
the CollectionObject class would be that they could be connected through
object properties to Occurrences, Individuals/BiologicalEntities, or
Identifications; and that they could have data properties that we
typically assign to collected items such as catalogNumber,
collectionCode, preparation, etc. John has suggested moving DwC terms
for such properties from under Occurrence to the proposed
CollectionObject class
(http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002574.html).
As John has noted, this is a significant change, but I believe that it
is an important one if one is to accept the distinction between
Occurrences and the evidence that documents them - an imperative in more
complex cases such as I outlined above.
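As a rough sketch of the requirement described above, a
CollectionObject could be modeled as a record that carries the usual
collection data properties plus optional links to Occurrences,
Individuals/BiologicalEntities, and Identifications, and that counts as
a CollectionObject only if at least one kind of link is present. The
Python class, its field names, and the example values are assumptions
made for illustration; only catalogNumber, collectionCode, and
preparation come from the discussion above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CollectionObject:
    # Data properties typically assigned to collected items.
    catalog_number: str
    collection_code: str
    preparation: str = ""
    # Object properties: links (by identifier) to other resources.
    occurrences: List[str] = field(default_factory=list)
    biological_entities: List[str] = field(default_factory=list)
    identifications: List[str] = field(default_factory=list)

    def uses(self) -> List[str]:
        """List which of the three uses of evidence this object serves."""
        found = []
        if self.occurrences:
            found.append("Occurrence")
        if self.biological_entities:
            found.append("Provenance")
        if self.identifications:
            found.append("Identification")
        return found

    def qualifies(self) -> bool:
        """True if at least one competency question is addressed."""
        return bool(self.uses())


if __name__ == "__main__":
    # Hypothetical record for the DNA sequence in the wildebeest example.
    sequence = CollectionObject(
        catalog_number="SEQ-0001",
        collection_code="EXAMPLE",
        biological_entities=["calf"],
        identifications=["identification-1"],
    )
    print(sequence.uses(), sequence.qualifies())
```

In this sketch the DNA sequence serves the Provenance and
Identification uses but not the Occurrence use, which matches the
breakdown given for the wildebeest example.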
----------
I want to end this email by returning to Rich's outlook on the
"Individual" issue. In his various posts, it seems to me that much of
what he wants to accomplish through the "Individual" class falls within
what I've defined here as competency question 2 (tracking provenance of
resources, in particular physical things that are, include, or are taken
from living organisms). It seems like the CollectionObject class is
fully capable of doing much of what he wants to accomplish. Just get
rid of the term "Individual" - as Paul Murray has noted
(http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002615.html) it
already has a different meaning. Use "CollectionObject" to address
Rich's provenance competency questions (tracking and connecting
collected objects) and "BiologicalEntity" to address mine (joining
zero-to-many Occurrences to zero-to-many forms of evidence to
zero-to-many Identifications). So then what IS an actual individual
organism like the wildebeest calf? It is a BiologicalEntity if it has
been documented as an Occurrence or assigned an Identification. It is a
CollectionObject if it was collected for a zoo, or shot and mounted in a
museum. Or it can be both simultaneously if it is both documented and
collected. If none of these things were done, then it's neither a
BiologicalEntity nor a CollectionObject - it's simply a wildebeest
calf. Define the class/type of the thing by the properties that you
wish to assert for it (or the competency questions that you can answer
for it).
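The typing rule in the paragraph above can be written out as a small
decision function. This is a minimal sketch under the assumption that
the three conditions can be reduced to yes/no flags; the function name
classify() and its parameter names are hypothetical.

```python
def classify(documented_as_occurrence, identified, collected):
    """Return the set of classes an organism falls into.

    BiologicalEntity: documented as an Occurrence or given an Identification.
    CollectionObject: collected (e.g. kept in a zoo or mounted in a museum).
    An empty set means it is simply an organism with no class asserted.
    """
    classes = set()
    if documented_as_occurrence or identified:
        classes.add("BiologicalEntity")
    if collected:
        classes.add("CollectionObject")
    return classes


if __name__ == "__main__":
    # The calf photographed at capture and then kept in a zoo is both.
    print(sorted(classify(True, False, True)))
    # An undocumented, uncollected calf is neither.
    print(sorted(classify(False, False, False)))
```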
With regard to the issue of taxonomically heterogeneous entities:
tracking the provenance of taxonomically heterogeneous CollectionObjects
and CollectionObjects that are pieces of organisms is not really a big
deal. A fish fin isPartOf an individual fish isPartOf a jar of mixed
fish isPartOf a marine trawl. However, tracking and reconciling
Identifications of taxonomically heterogeneous collections of things and
their subsamples (the second part of what Rich wanted to accomplish) is
a more complex task that I cannot at this point wrap my head around (see
the "Additional Comments" at
http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity for more
commentary on this).
Steve
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu