John Wieczorek has made an additional proposal (http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002574.html) to resolve the issue of "Evidence" by creating a class called "CollectionObject".  I believe that his intended meaning for the term is exactly what I have intended in the past when I used the term "token".  I believe that the proposal for the CollectionObject class is intimately related to the question of the definition of Individual/BiologicalEntity because I think that the competency questions Rich wants to address through the Individual/BiologicalEntity class are a subset of what I would consider to be the competency questions for the CollectionObject class.

Because of TDWG's historical roots in the collections community, "occurrences" have been subconsciously or even explicitly linked with the evidence that documents them.  For example, PreservedSpecimens have been considered a subclass of Occurrence in the Darwin Core type vocabulary.  But in the lengthy tdwg-content discussion of 2009-10 (summarized at http://code.google.com/p/darwin-sw/wiki/TdwgContentEmailSummary), there seemed to be a consensus that an Occurrence is a record of an Individual at a particular Time and Location.  That Occurrence is an independent entity from the evidence that serves to document it, which could include one (or perhaps zero) to many PreservedSpecimens, Images, text files recording MachineObservations, etc.  Separating the Occurrence from its documenting evidence makes it easier to be explicit about the number and types of evidence that document the Occurrence. 

So I would define competency question #1 for the proposed CollectionObject class as whether it can:
1. document an Occurrence (which I will refer to as the "Occurrence" use of evidence).

However, I would also assert that CollectionObject need not be restricted to documenting an Occurrence, but that it may also:
2. document the existence of an Individual/BiologicalEntity, aggregates that include it, and pieces that came from it (which I will refer to as the "Provenance" use of evidence)
and
3. support an Identification to a particular Taxon (the "Identification" use of evidence).
[note: see http://code.google.com/p/darwin-sw/wiki/ClassToken for further discussion of this]

If the CollectionObject is a whole dead organism serving as a museum specimen, then it might simultaneously address all three of these questions.  The time and place listed on the specimen label provide the information about the Occurrence, the dead organism serves as proof of its own existence, and determinations typed or written on the label are vouched for by the presence of the dead organism.  In this case the need to separate these three uses of CollectionObject may not be obvious. 

But consider a more complex situation of the sort that BiSciCol would like to be able to handle.  A wildebeest calf is digitally photographed in South Africa as it is being captured.  The calf is shipped to a zoo in England where a blood sample is taken.  Part of this tissue sample is stored in liquid nitrogen, but part is used for a DNA extraction.   The DNA is sequenced and used in a molecular phylogeny project.  Here we have 5 pieces of evidence (each of which could be assigned GUIDs): the digital StillImage, the calf itself in the zoo (a LivingSpecimen), the blood sample, the DNA sample, and the DNA sequence in digital text form.  All five of these pieces of evidence could potentially reside as CollectionObjects in different physical or electronic repositories. 

Competency question 1 (Occurrence)
StillImage: timestamp and embedded GPS metadata associated with the image document the time and place where the calf was located at the time of capture.
LivingSpecimen: the collection record that the zoo keeps for the calf documents the time and place where the calf was located at the time of capture.
(the other three pieces of evidence do not provide information about any time and place where the calf was located)

Competency question 2 (Provenance)
StillImage: a foaf:depiction of the calf (Individual/BiologicalEntity)
LivingSpecimen: owl:sameAs the calf (Individual/BiologicalEntity)
blood sample: dcterms:isPartOf the calf (Individual/BiologicalEntity)
DNA sample: dcterms:isPartOf the blood sample which dcterms:isPartOf the calf (Individual/BiologicalEntity)
DNA sequence: [sequencedFrom] the DNA sample which dcterms:isPartOf the blood sample which dcterms:isPartOf the calf (Individual/BiologicalEntity)
(all five pieces of evidence support the existence of the calf, and the provenance of each piece of evidence can be traced back to the calf)
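
To make the tracing concrete, here is a minimal Python sketch of the provenance chain described above.  It is only an illustration under stated assumptions: the identifiers ("stillImage", "calf", etc.) are made up, "ex:sequencedFrom" is a placeholder for the undefined [sequencedFrom] property, and foaf:depicts is used as the image-to-subject direction of foaf:depiction.

```python
# Each piece of evidence points toward the thing it was derived from.
# All identifiers are hypothetical; "ex:sequencedFrom" is a placeholder
# property, not a term from an existing vocabulary.
LINKS = {
    "stillImage": ("foaf:depicts", "calf"),
    "livingSpecimen": ("owl:sameAs", "calf"),
    "bloodSample": ("dcterms:isPartOf", "calf"),
    "dnaSample": ("dcterms:isPartOf", "bloodSample"),
    "dnaSequence": ("ex:sequencedFrom", "dnaSample"),
}


def trace(evidence, links=LINKS):
    """Follow links from a piece of evidence until no further link exists."""
    path = [evidence]
    seen = {evidence}
    while path[-1] in links:
        _predicate, target = links[path[-1]]
        if target in seen:
            # Guard against accidental cycles in the link table.
            raise ValueError("cycle detected at " + target)
        path.append(target)
        seen.add(target)
    return path


for item in LINKS:
    print(" -> ".join(trace(item)))
```

Every trace ends at "calf", which is the point of competency question 2: no matter which repository holds a piece of evidence, its provenance can be followed back to the same Individual/BiologicalEntity.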

Competency question 3 (Identification)
As part of the phylogenetic analysis, the DNA sequence could serve as evidence for assigning the calf to a particular taxon.
A mammal expert in Australia might examine the digital StillImage via the web and assert that the calf is a wildebeest.
Another mammal expert in England might examine the calf at the zoo and assert that the calf is a wildebeest (or perhaps look at the calf in the zoo when it is full grown and also look at the StillImage taken at the time it was captured and make the assertion based on two forms of Evidence).
(the DNA sample itself apart from the sequence probably wouldn't be used as evidence for an Identification; the blood sample might if the cytology were distinctive)
-----------
I would assert that the bottom line here is that an entity that falls within the proposed CollectionObject class would need to address at least one of these three competency questions.  It would not be necessary for it to address all three.  The properties of instances of the CollectionObject class would be that they could be connected through object properties to Occurrences, Individuals/BiologicalEntities, or Identifications; and that they could have data properties that we typically assign to collected items such as catalogNumber, collectionCode, preparation, etc.  John has suggested moving DwC terms for such properties from under Occurrence to the proposed CollectionObject class (http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002574.html).  As John has noted, this is a significant change, but I believe that it is an important one if one is to accept the distinction between Occurrences and the evidence that documents them - an imperative in more complex cases such as I outlined above.
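
The "at least one of three" rule can be sketched in a few lines of Python.  This is an illustration only, not a proposed implementation: the attribute names for the object properties (documents_occurrences, evidence_for_entities, supports_identifications) and the example identifiers are assumptions; only catalogNumber and collectionCode are taken from the DwC terms mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionObject:
    # Data properties typically assigned to collected items.
    catalogNumber: str = ""
    collectionCode: str = ""
    # Object properties (hypothetical names) linking to other resources.
    documents_occurrences: list = field(default_factory=list)
    evidence_for_entities: list = field(default_factory=list)
    supports_identifications: list = field(default_factory=list)

    def uses(self):
        """Return the set of evidence uses this object actually serves."""
        found = set()
        if self.documents_occurrences:
            found.add("Occurrence")
        if self.evidence_for_entities:
            found.add("Provenance")
        if self.supports_identifications:
            found.add("Identification")
        return found

    def qualifies(self):
        """An instance must address at least one of the three questions."""
        return len(self.uses()) >= 1


# Hypothetical instances based on the wildebeest example.
still_image = CollectionObject(
    documents_occurrences=["capture-occurrence"],
    evidence_for_entities=["calf"],
    supports_identifications=["determination-1"],
)
dna_sample = CollectionObject(evidence_for_entities=["calf"])

print(sorted(still_image.uses()))
print(sorted(dna_sample.uses()))
```

In this sketch the StillImage serves all three uses while the DNA sample serves only the Provenance use, yet both qualify as CollectionObjects.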
----------
I want to end this email by returning to Rich's outlook on the "Individual" issue.  In his various posts, it seems to me that much of what he wants to accomplish through the "Individual" class falls within what I've defined here as competency question 2 (tracking provenance of resources, in particular physical things that are, include, or are taken from living organisms).  It seems like the CollectionObject class is fully capable of doing much of what he wants to accomplish.  Just get rid of the term "Individual" - as Paul Murray has noted (http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002615.html) it already has a different meaning.  Use "CollectionObject" to address Rich's provenance competency questions (tracking and connecting collected objects) and "BiologicalEntity" to address mine (joining zero-to-many Occurrences to zero-to-many forms of evidence to zero-to-many Identifications).  So then what IS an actual individual organism like the wildebeest calf?  It is a BiologicalEntity if it has been documented as an Occurrence or assigned an Identification.  It is a CollectionObject if it was collected for a zoo, or shot and mounted in a museum.  Or it can be both simultaneously if it is both documented and collected.  If none of these things were done, then it's neither a BiologicalEntity nor a CollectionObject - it's simply a wildebeest calf.  Define the class/type of the thing by the properties that you wish to assert for it (or the competency questions that you can answer for it). 
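
The "define the class by the properties you wish to assert" rule for the calf can be written out as a small Python function.  This is a sketch of the reasoning in the paragraph above, with hypothetical parameter names, and not a definition of either class.

```python
def classify(has_occurrence, has_identification, was_collected):
    """Return the classes that apply to an organism, per the rule above."""
    classes = set()
    # Documented as an Occurrence or assigned an Identification.
    if has_occurrence or has_identification:
        classes.add("BiologicalEntity")
    # Collected for a zoo, or shot and mounted in a museum.
    if was_collected:
        classes.add("CollectionObject")
    return classes


# The zoo calf: photographed at capture (an Occurrence) and collected.
print(sorted(classify(True, False, True)))
# An undocumented, uncollected calf is simply a wildebeest calf.
print(sorted(classify(False, False, False)))
```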

With regard to the issue of taxonomically heterogeneous entities: tracking the provenance of taxonomically heterogeneous CollectionObjects and CollectionObjects that are pieces of organisms is not really a big deal.  A fish fin isPartOf an individual fish, which isPartOf a jar of mixed fish, which isPartOf a marine trawl.  However, tracking and reconciling Identifications of taxonomically heterogeneous collections of things and their subsamples (the second part of what Rich wanted to accomplish) is a more complex task that I cannot at this point wrap my head around (see the "Additional Comments" at http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity for more commentary on this).

Steve


-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu