[tdwg-content] New terms need resolution: "Evidence"
Steve Baskauf
steve.baskauf at vanderbilt.edu
Wed Aug 3 16:02:38 CEST 2011
Depending on how one counts days, we are at or near the end of the 30
day comment period on the proposed DwC changes. I have already voiced
support for the proposal for the CollectionObject class (below) because
I believe that there are certain DwC terms that are currently
mis-categorized under Occurrence (assuming that it means a record of a
BiologicalEntity at an Event [=Location+time]), but which don't really
have anywhere else to go in the standard as it currently stands. In
the spirit of Darwin Core, the relationship of the CollectionObject
class to other terms would not be spelled out in the normative RDF
document (i.e. no domains, ranges, or subclass relationships).
Properties/terms are organized under the new class to indicate that they
probably could be used to describe instances of the class but since
there is no formal domain declaration for terms, applying one of those
terms to a resource does not imply that the resource is of type
CollectionObject.
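
To make the point about domains concrete, here is a minimal Python sketch of the RDFS domain entailment rule, using plain tuples as triples rather than a real RDF library. The resource ex:specimen1 and the domain declaration in the second case are hypothetical; the proposal deliberately declares no domain.

```python
# Sketch of the rdfs:domain entailment rule with tuples standing in for triples.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types_from_domains(triples):
    """Return the rdf:type triples entailed by rdfs:domain declarations."""
    # Collect (property, class) pairs from any domain declarations present.
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set()
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))
    return inferred


# Hypothetical record that uses a term organized under CollectionObject.
data = {("ex:specimen1", "dwc:catalogNumber", "145732")}

# As proposed: no domain is declared, so no type is entailed.
no_domain = infer_types_from_domains(data)

# Hypothetical alternative: IF a domain were declared, a type would follow.
with_domain = infer_types_from_domains(
    data | {("dwc:catalogNumber", RDFS_DOMAIN, "dwc:CollectionObject")}
)
```

In the first case the reasoner learns nothing about the type of ex:specimen1; only in the second, hypothetical case does using catalogNumber make the resource a CollectionObject.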
Having said that, there are efforts underway to create richer
descriptions of the relationships among resources using RDF. As those
efforts move forward, they will undoubtedly be making use of classes
defined in DwC as the raw materials for ontologies that describe clearly
these relationships. So I wanted to articulate a few thoughts about the
proposed CollectionObject class. These came out of discussions that Cam
and I had when we discussed creation of the "token" class which is the
analog of CollectionObject in the darwin-sw ontology.
1. In some sense, we are implying two functions for the new class: to
describe things that are in collections, and to describe things that are
used as evidence. For CollectionObjects that are specimens, it seems
intuitive that those resources should serve both functions - why else do
we collect specimens? However, for other kinds of things, such as
images, these two functions do not necessarily need to go together.
I collect images to document Occurrences and to serve as evidence that a
BiologicalEntity exists, but many other people collect images for other
purposes (e.g. art) and some people create images that never become part
of a "collection" with documented metadata (e.g. cell phone snapshots).
So although we can say that DwC types "PreservedSpecimen" and
"StillImage" (assuming that StillImage is added to the type vocabulary)
"refine" CollectionObject, we cannot say that those types are subclasses
of CollectionObject. To do that would imply that all StillImages are
also of type CollectionObject, which is not true in the cases of images
that aren't in documented collections with catalog numbers, etc. or
which aren't used as evidence.
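
The consequence of a subclass declaration can be sketched in a few lines of Python. This is only an illustration under assumptions: the "dwctype:" prefix, the resource ex:cellPhoneSnapshot, and the subclass axiom itself are hypothetical, and tuples stand in for RDF triples.

```python
# Sketch of rdfs:subClassOf entailment: instances of a subclass are also
# instances of every superclass.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS = "rdfs:subClassOf"


def entailed_types(triples, resource):
    """All classes of `resource`, following rdfs:subClassOf transitively."""
    superclasses = {}
    for s, p, o in triples:
        if p == RDFS_SUBCLASS:
            superclasses.setdefault(s, set()).add(o)
    found = set()
    pending = [o for s, p, o in triples if s == resource and p == RDF_TYPE]
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(superclasses.get(cls, ()))
    return found


# A snapshot that was never accessioned into any documented collection.
snapshot = ("ex:cellPhoneSnapshot", RDF_TYPE, "dwctype:StillImage")
# The axiom argued against above (hypothetical, not part of the proposal).
subclass_axiom = ("dwctype:StillImage", RDFS_SUBCLASS, "dwc:CollectionObject")

without_axiom = entailed_types({snapshot}, "ex:cellPhoneSnapshot")
with_axiom = entailed_types({snapshot, subclass_axiom}, "ex:cellPhoneSnapshot")
```

With the axiom in place, the snapshot is typed as a CollectionObject even though nobody asserted that, which is exactly the unwanted implication.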
2. Currently, most DwC terms could easily serve as data properties
(i.e. properties whose objects are literals/strings), but DwC has few
terms that are clearly intended to be used as object properties (i.e.
properties whose objects are identifiers for other resources). In
efforts to create means to describe relationships in RDF, those object
properties will have to be created. For example, in darwin-sw, Cam and
I created the property dsw:hasEvidence whose subject is a dwc:Occurrence
and whose object is a dsw:Token (see the properties section of
http://code.google.com/p/darwin-sw/wiki/ClassOccurrence). Since the
proposed CollectionObject class is the analog of dsw:Token, then one
could essentially make the statement in RDF:
[dwc:Occurrence] dsw:hasEvidence [dwc:CollectionObject]
where the bracketed terms are URIs of instances of the classes in
brackets. This is a very important kind of statement that many people
would like to make. We do it in a half-hearted way with basisOfRecord,
but as I've complained on previous occasions, basisOfRecord doesn't work
well in descriptions of complicated relationships, such as when an
Occurrence is documented by several specimens and maybe also several
images, tissue samples, etc. Initially, I took the position that there
wasn't really any need for an explicit Token (a.k.a. CollectionObject)
class. If the dsw:hasEvidence property had no range, then any type of
resource could be its object, e.g. [dwc:Occurrence] dsw:hasEvidence [an
image], or [dwc:Occurrence] dsw:hasEvidence [a preserved specimen],
etc. However, Cam convinced me that it made sense to have a class to
explicitly type the sorts of things that one could use as evidence. We
then defined Token to be the range of dsw:hasEvidence (which may or may
not have been a good idea depending on your opinion of whether defining
ranges of properties facilitates "naughtiness" when people use the wrong
kinds of resources as objects). One would not have to go that far (i.e.
formally define a range), but I think that as a general principle, it
adds clarity to have a defined class which describes the types of
resources that are appropriate for use with terms we define. In the
particular case of a term like hasEvidence, I think it is important to
expect that the object will be a thing that is part of a collection. If
it is not, then how could we reasonably expect to be able to examine
that evidence if we wish to verify the assertion? In other words,
making the stronger statement that a StillImage is part of an accessible
collection (i.e. it is a CollectionObject) rather than just an ephemeral
cell phone picture provides the kind of "anchoring in fact" that we
would like to enable by creating a richer vocabulary in DwC than we
currently have.
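
As a rough illustration of why an object property handles the "several specimens plus an image" case better than a single basisOfRecord value, here is a minimal Python sketch. All of the ex: resources and the "dwctype:" prefix are hypothetical, and tuples stand in for RDF triples.

```python
from collections import defaultdict

# One Occurrence documented by two specimens and one image (hypothetical data).
triples = [
    ("ex:occurrence1", "rdf:type", "dwc:Occurrence"),
    ("ex:occurrence1", "dsw:hasEvidence", "ex:specimenA"),
    ("ex:occurrence1", "dsw:hasEvidence", "ex:specimenB"),
    ("ex:occurrence1", "dsw:hasEvidence", "ex:image1"),
    ("ex:specimenA", "rdf:type", "dwctype:PreservedSpecimen"),
    ("ex:specimenB", "rdf:type", "dwctype:PreservedSpecimen"),
    ("ex:image1", "rdf:type", "dwctype:StillImage"),
]


def evidence_by_type(triples, occurrence):
    """Group the evidence linked to an occurrence by each item's rdf:type."""
    types = {s: o for s, p, o in triples if p == "rdf:type"}
    grouped = defaultdict(list)
    for s, p, o in triples:
        if s == occurrence and p == "dsw:hasEvidence":
            grouped[types.get(o, "untyped")].append(o)
    return dict(grouped)


evidence = evidence_by_type(triples, "ex:occurrence1")
```

Each piece of evidence keeps its own identifier and its own type, so nothing has to be squeezed into one string on the Occurrence record.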
I could say more, but I think that the creation of the CollectionObject
class is important for reasons beyond just creating a more appropriate
category on the DwC terms quick reference guide under which to list
terms like catalogNumber and preparations. It also facilitates the
development of more complex ontologies that go beyond "simple Darwin
Core". The same could be said about BiologicalEntity, which would have
relatively few terms/data properties listed under it in the quick
reference guide, but which is an important conceptual entity in
describing relationships among resources (e.g. what's the thing that we
are documenting in an Occurrence?).
Steve
John Wieczorek wrote:
> None of this was entered as an issue in the Darwin Core Issue tracker.
> Rather, it arose from the discussion on DigitalStillImage (Issue 68).
> The Darwin Core simplifies the representation of concepts related to
> the occurrence of life in nature. Specifically, there is a Darwin Core
> Occurrence class, but no separate class to explicitly represent the
> physical evidence for the Occurrence. This is problematic when trying
> to be explicit about the attributes of physical or digital objects in
> the semantic world. Various names have been given to this concept,
> including "PhysicalObject", "Evidence", "PhysicalEvidence", "Sample",
> "Specimen", "CollectionObject" in the ASC model,
> "DataSets/DataSet/Units/Unit/..." in ABCD, "Representation" (Baskauf
> 2010) and "Token" as a preferred alternative to "Representation" in
> tdwg-content (http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001742.html).
>
> Open Issues:
> Determine if there is a general consensus on the need for new class
> for this concept.
>
> Decide the best name for this class.
>
> Create a new Type Vocabulary term referring to this new class.
>
> Change the Darwin Core Type Vocabulary terms that refine Occurrence
> (FossilSpecimen, PreservedSpecimen, LivingSpecimen) to refine the new
> Type Vocabulary term referring to this new class.
>
> Decide which properties currently organized under Occurrence should
> instead be organized under this new class.
>
> As a seed for further discussion, here is a complete proposal.
> ==============
> Term Name: CollectionObject
> Identifier: http://rs.tdwg.org/dwc/terms/CollectionObject
> Namespace: http://rs.tdwg.org/dwc/terms/
> Label: CollectionObject
> Definition: The category of information pertaining to persistent
> evidence of a biological entity (specimen, sample, image, sound,
> drawing, etc.).
> Comment: For discussion see
> http://code.google.com/p/darwincore/wiki/CollectionObject
> Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
> Refines:
> Status: recommended
> Date Issued: 2011-07-04
> Date Modified: 2011-07-04
> Has Domain:
> Has Range:
> Version: CollectionObject-2011-07-04
> Replaces:
> Is Replaced By:
> Class:
> ABCD 2.06: {DataSets/DataSet/Units/Unit/CultureCollectionUnit or
> DataSets/DataSet/Units/Unit/MycologicalUnit or
> DataSets/DataSet/Units/Unit/HerbariumUnit or
> DataSets/DataSet/Units/Unit/BotanicalGardenUnit or
> DataSets/DataSet/Units/Unit/PlantGeneticResourceUnit or
> DataSets/DataSet/Units/Unit/ZoologicalUnit or
> DataSets/DataSet/Units/Unit/PalaeontologicalUnit or
> DataSets/DataSet/Units/Unit/MultimediaObjects/MultimediaObject}
>
> Term Name: CollectionObject
> Identifier: http://rs.tdwg.org/dwc/dwctype/CollectionObject
> Namespace: http://rs.tdwg.org/dwc/dwctype/
> Label: CollectionObject
> Definition: A resource describing an instance of the CollectionObject class.
> Comment: For discussion see
> http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary
> Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
> Refines:
> Status: recommended
> Date Issued: 2011-07-04
> Date Modified: 2011-07-04
> Member Of: http://rs.tdwg.org/dwc/terms/DwCType
> Has Domain:
> Has Range:
> Version: CollectionObject-2011-07-04
> Replaces:
> Is Replaced By:
> Class:
> ABCD 2.06: not in ABCD
>
> Term Name: collectionObjectID
> Identifier: http://rs.tdwg.org/dwc/terms/collectionObjectID
> Namespace: http://rs.tdwg.org/dwc/terms/
> Label: collectionObjectID
> Definition: An identifier for the CollectionObject. In the absence of
> a persistent global unique identifier, construct one from a
> combination of identifiers in the record that will most closely make
> the collectionObjectID globally unique.
> Comment: For a specimen in the absence of a bona fide global unique
> identifier, for example, use the form:
> "urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]".
> Examples: "urn:lsid:nhm.ku.edu:Herps:32",
> "urn:catalog:FMNH:Mammal:145732". For discussion see
> http://code.google.com/p/darwincore/wiki/CollectionObject
> Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
> Refines: http://purl.org/dc/terms/identifier
> Status: recommended
> Date Issued: 2011-07-04
> Date Modified: 2011-07-04
> Has Domain:
> Has Range:
> Version: collectionObjectID-2011-07-04
> Replaces:
> Is Replaced By:
> Class: http://rs.tdwg.org/dwc/terms/CollectionObject
> ABCD 2.06: DataSets/DataSet/Units/Unit/UnitGUID
>
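
The fallback form described in the collectionObjectID comment above can be sketched in Python as follows. The function name catalog_urn and its validation behavior are assumptions made for illustration; the proposal only gives the pattern and the examples.

```python
def catalog_urn(institution_code, collection_code, catalog_number):
    """Build 'urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]'."""
    parts = [institution_code, collection_code, catalog_number]
    # Assumption: refuse to build an identifier if any component is blank.
    if any(not str(part).strip() for part in parts):
        raise ValueError("all three identifier parts are required")
    return "urn:catalog:" + ":".join(str(part).strip() for part in parts)


# Reproduces one of the examples given in the term's comment.
example = catalog_urn("FMNH", "Mammal", 145732)
```

Such an identifier is only as unique as the code triplet it is built from, so it is a stopgap until a persistent global identifier exists.
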
> Term Name: collectionObjectRemarks
> Identifier: http://rs.tdwg.org/dwc/terms/collectionObjectRemarks
> Namespace: http://rs.tdwg.org/dwc/terms/
> Label: collectionObjectRemarks
> Definition: Comments or notes about the Collection Object.
> Comment: Example: "custody transferred in 1995 from National Park
> Service". For discussion see
> http://code.google.com/p/darwincore/wiki/CollectionObject
> Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
> Refines:
> Status: recommended
> Date Issued: 2011-07-04
> Date Modified: 2011-07-04
> Has Domain:
> Has Range:
> Version: collectionObjectRemarks-2011-07-04
> Replaces: SampleRemarks-2009-01-18
> Is Replaced By:
> Class: http://rs.tdwg.org/dwc/terms/CollectionObject
> ABCD 2.06: DataSets/DataSet/Units/Unit/Notes
>
> Note that with the addition of CollectionObject, the InstitutionCode,
> CollectionCode, CatalogNumber triplet would no longer apply to an
> Occurrence. This is a big, non-backward compatible change.
>
> Terms to be organized under CollectionObject rather than Occurrence:
> catalogNumber
> preparations
> disposition
> otherCatalogNumbers
>
> Terms that might have to be added under CollectionObject to retain
> capabilities currently achieved under Occurrence:
> associatedCollectionObjectMedia
> associatedCollectionObjectReferences
> associatedCollectionObjects
> associatedSequences
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu