Depending on how one counts days, we are at or near the end of the 30-day comment period on the proposed DwC changes. I have already voiced support for the proposal for the CollectionObject class (below) because I believe that there are certain DwC terms that are currently mis-categorized under Occurrence (assuming that it means a record of a BiologicalEntity at an Event [=Location+time]), but which don't really have anywhere else to go in the standard as it currently stands. In the spirit of Darwin Core, the relationship of the CollectionObject class to other terms would not be spelled out in the normative RDF document (i.e. no domains, ranges, or subclass relationships). Properties/terms are organized under the new class to indicate that they probably could be used to describe instances of the class, but since there is no formal domain declaration for terms, applying one of those terms to a resource does not imply that the resource is of type CollectionObject.
Having said that, there are efforts underway to create richer descriptions of the relationships among resources using RDF. As those efforts move forward, they will undoubtedly make use of classes defined in DwC as the raw materials for ontologies that describe these relationships clearly. So I wanted to articulate a few thoughts about the proposed CollectionObject class. These came out of discussions that Cam and I had about creating the "token" class, which is the analog of CollectionObject in the darwin-sw ontology.
1. In some sense, we are implying two functions for the new class: to describe things that are in collections, and to describe things that are used as evidence. For CollectionObjects that are specimens, it seems intuitive that those resources should serve both functions - why else do we collect specimens? However, for other kinds of things, such as images, these two functions do not necessarily need to go together. I collect images to document Occurrences and to serve as evidence that a BiologicalEntity exists, but many other people collect images for other purposes (e.g. art) and some people create images that never become part of a "collection" with documented metadata (e.g. cell phone snapshots). So although we can say that DwC types "PreservedSpecimen" and "StillImage" (assuming that StillImage is added to the type vocabulary) "refine" CollectionObject, we cannot say that those types are subclasses of CollectionObject. To do that would imply that all StillImages are also of type CollectionObject, which is not true in the cases of images that aren't in documented collections with catalog numbers, etc. or which aren't used as evidence.
2. Currently, most terms in DwC could easily serve as data properties (i.e. properties whose objects are literals/strings), but it has few terms that are clearly intended to be used as object properties (i.e. properties whose objects are identifiers for other resources). In efforts to create means to describe relationships in RDF, those object properties will have to be created. For example, in darwin-sw, Cam and I created the property dsw:hasEvidence whose subject is a dwc:Occurrence and whose object is a dsw:Token (see the properties section of http://code.google.com/p/darwin-sw/wiki/ClassOccurrence). Since the proposed CollectionObject class is the analog of dsw:Token, one could essentially make the statement in RDF: [dwc:Occurrence] dsw:hasEvidence [dwc:CollectionObject] where the bracketed terms are URIs of instances of the classes in brackets. This is a very important kind of statement that many people would like to make. We do it in a half-hearted way with basisOfRecord, but as I've complained on previous occasions, basisOfRecord doesn't work well in descriptions of complicated relationships, such as when an Occurrence is documented by several specimens and maybe also several images, tissue samples, etc. Initially, I took the position that there wasn't really any need for an explicit Token (a.k.a. CollectionObject) class. If the dsw:hasEvidence property had no range, then any type of resource could be its object, e.g. [dwc:Occurrence] dsw:hasEvidence [an image], or [dwc:Occurrence] dsw:hasEvidence [a preserved specimen], etc. However, Cam convinced me that it made sense to have a class to explicitly type the sorts of things that one could use as evidence. We then defined Token to be the range of dsw:hasEvidence (which may or may not have been a good idea depending on your opinion of whether defining ranges of properties facilitates "naughtiness" when people use the wrong kinds of resources as objects).
One would not have to go that far (i.e. formally define a range), but I think that as a general principle, it adds clarity to have a defined class which describes the types of resources that are appropriate for use with terms we define. In the particular case of a term like hasEvidence, I think it is important to expect that the object will be a thing that is part of a collection. If it is not, then how could we reasonably expect to be able to examine that evidence if we wish to verify the assertion? In other words, making the stronger statement that a StillImage is part of an accessible collection (i.e. it is a CollectionObject) rather than just an ephemeral cell phone picture provides the kind of "anchoring in fact" that we would like to enable by creating a richer vocabulary in DwC than we currently have.
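Both points above turn on what an RDFS reasoner would be entitled to conclude from a subclass or range declaration. As a rough illustration only, here is a minimal sketch in plain Python (no RDF library) of the two relevant RDFS entailment rules. The instance names (ex:occ1, ex:specimen1, ex:image1, ex:phoneSnapshot), the prefixed names written as plain strings, and the helper entail() are all hypothetical. The subclass triple is included only to show the consequence argued against in point 1; it is not part of any proposal.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
RANGE = "rdfs:range"


def entail(triples):
    """Return the closure of `triples` under two RDFS rules:
    subclass membership (rdfs9) and range typing (rdfs3)."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == SUBCLASS_OF:
                # rdfs9: every instance of class s is also an instance of o
                new |= {(x, RDF_TYPE, o) for x, q, c in inferred
                        if q == RDF_TYPE and c == s}
            elif p == RANGE:
                # rdfs3: the object of any triple using property s has type o
                new |= {(y, RDF_TYPE, o) for x, q, y in inferred if q == s}
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical data: one Occurrence documented by two pieces of evidence,
# plus an image that is in no collection and documents nothing.
data = {
    ("ex:occ1", RDF_TYPE, "dwc:Occurrence"),
    ("ex:occ1", "dsw:hasEvidence", "ex:specimen1"),
    ("ex:occ1", "dsw:hasEvidence", "ex:image1"),
    ("ex:specimen1", RDF_TYPE, "dwctype:PreservedSpecimen"),
    ("ex:image1", RDF_TYPE, "dwctype:StillImage"),
    ("ex:phoneSnapshot", RDF_TYPE, "dwctype:StillImage"),
}

# Case A: only a range is declared for dsw:hasEvidence.
range_only = entail(data | {("dsw:hasEvidence", RANGE, "dsw:Token")})

# Case B: StillImage is (hypothetically) declared a subclass of CollectionObject.
with_subclass = entail(
    data | {("dwctype:StillImage", SUBCLASS_OF, "dwc:CollectionObject")}
)

image_is_token = ("ex:image1", RDF_TYPE, "dsw:Token") in range_only
snapshot_is_token = ("ex:phoneSnapshot", RDF_TYPE, "dsw:Token") in range_only
snapshot_is_collection_object = (
    ("ex:phoneSnapshot", RDF_TYPE, "dwc:CollectionObject") in with_subclass
)
```

In case A, only the resources actually used as objects of dsw:hasEvidence come out typed as dsw:Token, so the stray snapshot is left alone. In case B, the snapshot is entailed to be a CollectionObject even though it is in no collection. The range rule also types whatever appears as the object, whether or not it is the right kind of resource, which is the "naughtiness" concern mentioned above.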
I could say more, but I think that the creation of the CollectionObject class is important for reasons beyond just creating a more appropriate category on the DwC terms quick reference guide under which to list terms like catalogNumber and preparations. It also facilitates the development of more complex ontologies that go beyond "simple Darwin Core". The same could be said about BiologicalEntity, which would have relatively few terms/data properties listed under it in the quick reference guide, but which is an important conceptual entity in describing relationships among resources (e.g. what's the thing that we are documenting in an Occurrence?).
Steve
John Wieczorek wrote:
None of this was entered as an issue in the Darwin Core Issue tracker. Rather, it arose from the discussion on DigitalStillImage (Issue 68). The Darwin Core simplifies the representation of concepts related to the occurrence of life in nature. Specifically, there is a Darwin Core Occurrence class, but no separate class to explicitly represent the physical evidence for the Occurrence. This is problematic when trying to be explicit about the attributes of physical or digital objects in the semantic world. Various names have been given to this concept, including "PhysicalObject", "Evidence", "PhysicalEvidence", "Sample", "Specimen", "CollectionObject" in the ASC model, "DataSets/DataSet/Units/Unit/..." in ABCD, "Representation" (Baskauf 2010) and "Token" as a preferred alternative to "Representation" in tdwg-content (http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001742.html).
Open Issues:
Determine if there is a general consensus on the need for a new class for this concept.
Decide the best name for this class.
Create a new Type Vocabulary term referring to this new class.
Change the Darwin Core Type Vocabulary terms that refine Occurrence (FossilSpecimen, PreservedSpecimen, LivingSpecimen) to refine the new Type Vocabulary term referring to this new class.
Decide which properties currently organized under Occurrence should instead be organized under this new class.
As a seed for further discussion, here is a complete proposal.
Term Name: CollectionObject
Identifier: http://rs.tdwg.org/dwc/terms/CollectionObject
Namespace: http://rs.tdwg.org/dwc/terms/
Label: CollectionObject
Definition: The category of information pertaining to persistent evidence of a biological entity (specimen, sample, image, sound, drawing, etc.).
Comment: For discussion see http://code.google.com/p/darwincore/wiki/CollectionObject
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines:
Status: recommended
Date Issued: 2011-07-04
Date Modified: 2011-07-04
Has Domain:
Has Range:
Version: CollectionObject-2011-07-04
Replaces:
Is Replaced By:
Class:
ABCD 2.06: {DataSets/DataSet/Units/Unit/CultureCollectionUnit or DataSets/DataSet/Units/Unit/MycologicalUnit or DataSets/DataSet/Units/Unit/HerbariumUnit or DataSets/DataSet/Units/Unit/BotanicalGardenUnit or DataSets/DataSet/Units/Unit/PlantGeneticResourceUnit or DataSets/DataSet/Units/Unit/ZoologicalUnit or DataSets/DataSet/Units/Unit/PalaeontologicalUnit or DataSets/DataSet/Units/Unit/MultimediaObjects/MultimediaObject}

Term Name: CollectionObject
Identifier: http://rs.tdwg.org/dwc/dwctype/CollectionObject
Namespace: http://rs.tdwg.org/dwc/dwctype/
Label: CollectionObject
Definition: A resource describing an instance of the CollectionObject class.
Comment: For discussion see http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines:
Status: recommended
Date Issued: 2011-07-04
Date Modified: 2011-07-04
Member Of: http://rs.tdwg.org/dwc/terms/DwCType
Has Domain:
Has Range:
Version: CollectionObject-2011-07-04
Replaces:
Is Replaced By:
Class:
ABCD 2.06: not in ABCD

Term Name: collectionObjectID
Identifier: http://rs.tdwg.org/dwc/terms/collectionObjectID
Namespace: http://rs.tdwg.org/dwc/terms/
Label: collectionObjectID
Definition: An identifier for the CollectionObject. In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the collectionObjectID globally unique.
Comment: For a specimen in the absence of a bona fide global unique identifier, for example, use the form: "urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]". Examples: "urn:lsid:nhm.ku.edu:Herps:32", "urn:catalog:FMNH:Mammal:145732". For discussion see http://code.google.com/p/darwincore/wiki/CollectionObject
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines: http://purl.org/dc/terms/identifier
Status: recommended
Date Issued: 2011-07-04
Date Modified: 2011-07-04
Has Domain:
Has Range:
Version: collectionObjectID-2011-07-04
Replaces:
Is Replaced By:
Class: http://rs.tdwg.org/dwc/terms/CollectionObject
ABCD 2.06: DataSets/DataSet/Units/Unit/UnitGUID

Term Name: collectionObjectRemarks
Identifier: http://rs.tdwg.org/dwc/terms/collectionObjectRemarks
Namespace: http://rs.tdwg.org/dwc/terms/
Label: collectionObjectRemarks
Definition: Comments or notes about the Collection Object.
Comment: Example: "custody transferred in 1995 from National Park Service". For discussion see http://code.google.com/p/darwincore/wiki/CollectionObject
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines:
Status: recommended
Date Issued: 2011-07-04
Date Modified: 2009-07-04
Has Domain:
Has Range:
Version: collectionObjectRemarks-2011-07-04
Replaces: SampleRemarks-2009-01-18
Is Replaced By:
Class: http://rs.tdwg.org/dwc/terms/CollectionObject
ABCD 2.06: DataSets/DataSet/Units/Unit/Notes
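As a rough sketch of the fallback described in the collectionObjectID comment above, the following Python function prefers an existing global unique identifier and otherwise builds the "urn:catalog:" form from the institutionCode, collectionCode, catalogNumber triplet. The function name, its parameter names, and the decision to reject empty components are assumptions made for illustration. The proposal does not say how a colon inside a component should be handled, so this sketch does no escaping.

```python
def make_collection_object_id(institution_code, collection_code,
                              catalog_number, guid=None):
    """Return `guid` if one is supplied, otherwise the urn:catalog fallback.

    Hypothetical helper: no escaping is applied to the components.
    """
    if guid:
        # A bona fide global unique identifier always takes precedence.
        return guid
    parts = ["" if p is None else str(p).strip()
             for p in (institution_code, collection_code, catalog_number)]
    if not all(parts):
        raise ValueError(
            "institutionCode, collectionCode and catalogNumber are all required"
        )
    return "urn:catalog:" + ":".join(parts)


# The two examples given in the term comment:
fmnh_id = make_collection_object_id("FMNH", "Mammal", 145732)
# "urn:catalog:FMNH:Mammal:145732"
ku_id = make_collection_object_id("KU", "Herps", 32,
                                  guid="urn:lsid:nhm.ku.edu:Herps:32")
# "urn:lsid:nhm.ku.edu:Herps:32"
```

The institution code "KU" in the second call is a placeholder; it is ignored because a GUID is supplied.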
Note that with the addition of CollectionObject, the InstitutionCode, CollectionCode, CatalogNumber triplet would no longer apply to an Occurrence. This is a big, non-backward-compatible change.
Terms to be organized under CollectionObject rather than Occurrence: catalogNumber, preparations, disposition, otherCatalogNumbers
Terms that might have to be added under CollectionObject to retain capabilities currently achieved under Occurrence: associatedCollectionObjectMedia, associatedCollectionObjectReferences, associatedCollectionObjects, associatedSequences