[tdwg-content] New terms need resolution: "Evidence"

Steve Baskauf steve.baskauf at vanderbilt.edu
Wed Aug 3 16:02:38 CEST 2011


Depending on how one counts days, we are at or near the end of the 
30-day comment period on the proposed DwC changes.  I have already voiced 
support for the proposal for the CollectionObject class (below) because 
I believe that there are certain DwC terms that are currently 
mis-categorized under Occurrence (assuming that it means a record of a 
BiologicalEntity at an Event [=Location+time]), but which don't really 
have anywhere else to go in the standard as it currently stands.  In 
the spirit of Darwin Core, the relationship of the CollectionObject 
class to other terms would not be spelled out in the normative RDF 
document (i.e. no domains, ranges, or subclass relationships).  
Properties/terms are organized under the new class to indicate that they 
probably could be used to describe instances of the class, but since 
there is no formal domain declaration for terms, applying one of those 
terms to a resource does not imply that the resource is of type 
CollectionObject. 
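
To make the point about domains concrete, here is a minimal sketch in 
Python of the RDFS domain rule.  It is only an illustration, not part of 
the proposal: the ex: resource, the prefixed names used as plain strings, 
and the helper infer_types_from_domains are all assumptions made for the 
example.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types_from_domains(triples):
    """RDFS domain rule: (p rdfs:domain C) and (s p o) imply (s rdf:type C)."""
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    return {
        (s, RDF_TYPE, cls)
        for s, p, o in triples
        for prop, cls in domains
        if p == prop
    }


# Hypothetical record: some resource described with dwc:catalogNumber.
data = {("ex:thing1", "dwc:catalogNumber", "12345")}

# As proposed: no domain is declared, so nothing is implied about ex:thing1.
no_domain = infer_types_from_domains(data)

# Counterfactual (NOT in the proposal): a domain declared for catalogNumber.
with_domain = infer_types_from_domains(
    data | {("dwc:catalogNumber", RDFS_DOMAIN, "dwc:CollectionObject")}
)

print(no_domain)    # empty: using the term says nothing about the type
print(with_domain)  # ex:thing1 is now inferred to be a CollectionObject
```

Without the domain declaration, listing catalogNumber under 
CollectionObject is purely organizational.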

Having said that, there are efforts underway to create richer 
descriptions of the relationships among resources using RDF.  As those 
efforts move forward, they will undoubtedly be making use of classes 
defined in DwC as the raw materials for ontologies that clearly describe 
these relationships.  So I wanted to articulate a few thoughts about the 
proposed CollectionObject class.  These came out of discussions that Cam 
and I had when we discussed creation of the "token" class which is the 
analog of CollectionObject in the darwin-sw ontology.

1.  In some sense, we are implying two functions for the new class: to 
describe things that are in collections, and to describe things that are 
used as evidence.  For CollectionObjects that are specimens, it seems 
intuitive that those resources should serve both functions - why else do 
we collect specimens?  However, for other kinds of things, such as 
images, these two functions do not necessarily need to go together.  
I collect images to document Occurrences and to serve as evidence that a 
BiologicalEntity exists, but many other people collect images for other 
purposes (e.g. art) and some people create images that never become part 
of a "collection" with documented metadata (e.g. cell phone snapshots).  
So although we can say that DwC types "PreservedSpecimen" and 
"StillImage" (assuming that StillImage is added to the type vocabulary) 
"refine" CollectionObject, we cannot say that those types are subclasses 
of CollectionObject.  To do that would imply that all StillImages are 
also of type CollectionObject, which is not true in the cases of images 
that aren't in documented collections with catalog numbers, etc. or 
which aren't used as evidence. 
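
The consequence of a subclass declaration can be sketched in a few lines 
of Python.  This is a hedged illustration only: the two ex: images, the 
dwctype: prefixed names (StillImage is assumed to have been added to the 
type vocabulary), and the helper infer_superclass_types are all 
hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def infer_superclass_types(triples):
    """RDFS subclass rule, repeated until no new type statements appear."""
    closed = set(triples)
    while True:
        subclass_pairs = {(s, o) for s, p, o in closed if p == RDFS_SUBCLASS_OF}
        new = {
            (s, RDF_TYPE, parent)
            for s, p, o in closed if p == RDF_TYPE
            for child, parent in subclass_pairs if o == child
        } - closed
        if not new:
            return closed - set(triples)
        closed |= new


# Two hypothetical images: one curated with metadata, one ephemeral.
images = {
    ("ex:herbariumSheetScan", RDF_TYPE, "dwctype:StillImage"),
    ("ex:phoneSnapshot", RDF_TYPE, "dwctype:StillImage"),
}

# No subclass statement: nothing further is implied about either image.
without_subclass = infer_superclass_types(images)

# Counterfactual subclass statement: every StillImage becomes a
# CollectionObject, including the undocumented phone snapshot.
with_subclass = infer_superclass_types(
    images | {("dwctype:StillImage", RDFS_SUBCLASS_OF, "dwctype:CollectionObject")}
)

print(sorted(with_subclass))
```

The second result types the phone snapshot as a CollectionObject, which 
is exactly the unwanted inference described in point 1.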

2. Currently, in DwC most terms could easily serve as data properties 
(i.e. properties whose objects are literals/strings), but it has few 
terms that are clearly intended to be used as object properties (i.e. 
properties whose objects are identifiers for other resources).  In 
efforts to create means to describe relationships in RDF, those object 
properties will have to be created.  For example, in darwin-sw, Cam and 
I created the property dsw:hasEvidence whose subject is a dwc:Occurrence 
and whose object is a dsw:Token (see the properties section of 
http://code.google.com/p/darwin-sw/wiki/ClassOccurrence).  Since the 
proposed CollectionObject class is the analog of dsw:Token, one could 
essentially make the statement in RDF:
[dwc:Occurrence] dsw:hasEvidence [dwc:CollectionObject]
where the bracketed terms are URIs of instances of the classes in 
brackets.  This is a very important kind of statement that many people 
would like to make.  We do it in a half-hearted way with basisOfRecord, 
but as I've complained on previous occasions, basisOfRecord doesn't work 
well in descriptions of complicated relationships, such as when an 
Occurrence is documented by several specimens and maybe also several 
images, tissue samples, etc.  Initially, I took the position that there 
wasn't really any need for an explicit Token (a.k.a. CollectionObject) 
class.  If the dsw:hasEvidence property had no range, then any type of 
resource could be its object, e.g. [dwc:Occurrence] dsw:hasEvidence [an 
image],  or [dwc:Occurrence] dsw:hasEvidence [a preserved specimen], 
etc.  However, Cam convinced me that it made sense to have a class to 
explicitly type the sorts of things that one could use as evidence.  We 
then defined Token to be the range of dsw:hasEvidence (which may or may 
not have been a good idea depending on your opinion of whether defining 
ranges of properties facilitates "naughtiness" when people use the wrong 
kinds of resources as objects).  One would not have to go that far (i.e. 
formally define a range), but I think that as a general principle, it 
adds clarity to have a defined class which describes the types of 
resources that are appropriate for use with terms we define.  In the 
particular case of a term like hasEvidence, I think it is important to 
expect that the object will be a thing that is part of a collection.  If 
it is not, then how could we reasonably expect to be able to examine 
that evidence if we wish to verify the assertion?  In other words, 
making the stronger statement that a StillImage is part of an accessible 
collection (i.e. it is a CollectionObject) rather than just an ephemeral 
cell phone picture provides the kind of "anchoring in fact" that we 
would like to enable by creating a richer vocabulary in DwC than we 
currently have. 
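
As a rough sketch of point 2, the Python below models one Occurrence 
documented by several pieces of evidence, which a single basisOfRecord 
value cannot express, and then applies the RDFS range rule for 
dsw:hasEvidence.  The ex: resources and both helper functions are 
hypothetical; only dsw:hasEvidence, dsw:Token and dwc:Occurrence come 
from the discussion above.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
HAS_EVIDENCE = "dsw:hasEvidence"

# One Occurrence with four hypothetical pieces of evidence.
triples = {
    ("ex:occ1", RDF_TYPE, "dwc:Occurrence"),
    ("ex:occ1", HAS_EVIDENCE, "ex:specimen1"),
    ("ex:occ1", HAS_EVIDENCE, "ex:specimen2"),
    ("ex:occ1", HAS_EVIDENCE, "ex:image1"),
    ("ex:occ1", HAS_EVIDENCE, "ex:tissue1"),
    # The range chosen in darwin-sw for dsw:hasEvidence.
    (HAS_EVIDENCE, RDFS_RANGE, "dsw:Token"),
}


def evidence_for(occurrence, triples):
    """List every resource asserted as evidence for the given Occurrence."""
    return sorted(
        o for s, p, o in triples if s == occurrence and p == HAS_EVIDENCE
    )


def infer_types_from_ranges(triples):
    """RDFS range rule: (p rdfs:range C) and (s p o) imply (o rdf:type C)."""
    ranges = {(s, o) for s, p, o in triples if p == RDFS_RANGE}
    return {
        (o, RDF_TYPE, cls)
        for s, p, o in triples
        for prop, cls in ranges
        if p == prop
    }


evidence = evidence_for("ex:occ1", triples)
inferred = infer_types_from_ranges(triples)

print(evidence)          # all four evidence resources for ex:occ1
print(sorted(inferred))  # each one is inferred to be a dsw:Token
```

Note that the range rule infers a type rather than checking one: any 
resource used as the object of dsw:hasEvidence is asserted to be a 
Token, whether or not that was appropriate.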

I could say more, but I think that the creation of the CollectionObject 
class is important for reasons beyond just creating a more appropriate 
category on the DwC terms quick reference guide under which to list 
terms like catalogNumber and preparations.  It also facilitates the 
development of more complex ontologies that go beyond "simple Darwin 
Core".  The same could be said about BiologicalEntity, which would have 
relatively few terms/data properties listed under it in the quick 
reference guide, but which is an important conceptual entity in 
describing relationships among resources (e.g. what's the thing that we 
are documenting in an Occurrence?).

Steve

John Wieczorek wrote:
> None of this was entered as an issue in the Darwin Core Issue tracker.
> Rather, it arose from the discussion on DigitalStillImage (Issue 68).
> The Darwin Core simplifies the representation of concepts related to
> the occurrence of life in nature. Specifically, there is a Darwin Core
> Occurrence class, but no separate class to explicitly represent the
> physical evidence for the Occurrence. This is problematic when trying
> to be explicit about the attributes of physical or digital objects in
> the semantic world. Various names have been given to this concept,
> including "PhysicalObject", "Evidence", "PhysicalEvidence", "Sample",
> "Specimen", "CollectionObject" in the ASC model,
> "DataSets/DataSet/Units/Unit/..." in ABCD, "Representation" (Baskauf
> 2010) and "Token" as a preferred alternative to "Representation" in
> tdwg-content (http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001742.html).
>
> Open Issues:
> Determine if there is a general consensus on the need for new class
> for this concept.
>
> Decide the best name for this class.
>
> Create a new Type Vocabulary term referring to this new class.
>
> Change the Darwin Core Type Vocabulary terms that refine Occurrence
> (FossilSpecimen, PreservedSpecimen, LivingSpecimen) to refine the new
> Type Vocabulary term referring to this new class.
>
> Decide which properties currently organized under Occurrence should
> instead be organized under this new class.
>
> As a seed for further discussion, here is a complete proposal.
> ==============
> Term Name: CollectionObject
> Identifier:	http://rs.tdwg.org/dwc/terms/CollectionObject
> Namespace:	http://rs.tdwg.org/dwc/terms/
> Label:	CollectionObject
> Definition:	The category of information pertaining to persistent
> evidence of a biological entity (specimen, sample, image, sound,
> drawing, etc.).
> Comment:	 For discussion see
> http://code.google.com/p/darwincore/wiki/CollectionObject
> Type of Term:	http://www.w3.org/2000/01/rdf-schema#Class
> Refines:	
> Status:	recommended
> Date Issued:	2011-07-04
> Date Modified:	2011-07-04
> Has Domain:	
> Has Range:	
> Version:	CollectionObject-2011-07-04
> Replaces:
> Is Replaced By:	
> Class:
> ABCD 2.06:	{DataSets/DataSet/Units/Unit/CultureCollectionUnit or
> DataSets/DataSet/Units/Unit/MycologicalUnit or
> DataSets/DataSet/Units/Unit/HerbariumUnit or
> DataSets/DataSet/Units/Unit/BotanicalGardenUnit or
> DataSets/DataSet/Units/Unit/PlantGeneticResourceUnit or
> DataSets/DataSet/Units/Unit/ZoologicalUnit or
> DataSets/DataSet/Units/Unit/PalaeontologicalUnit or
> DataSets/DataSet/Units/Unit/MultimediaObjects/MultimediaObject}
>
> Term Name: CollectionObject
> Identifier:	http://rs.tdwg.org/dwc/dwctype/CollectionObject
> Namespace:	http://rs.tdwg.org/dwc/dwctype/
> Label:	CollectionObject
> Definition:	A resource describing an instance of the CollectionObject class.
> Comment:	 For discussion see
> http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary
> Type of Term:	http://www.w3.org/2000/01/rdf-schema#Class
> Refines:	
> Status:	recommended
> Date Issued:	2011-07-04
> Date Modified:	2011-07-04
> Member Of:	http://rs.tdwg.org/dwc/terms/DwCType
> Has Domain:	
> Has Range:	
> Version:	CollectionObject-2011-07-04
> Replaces:	
> Is Replaced By:	
> Class:	
> ABCD 2.06:	not in ABCD
>
> Term Name: collectionObjectID
> Identifier:	http://rs.tdwg.org/dwc/terms/collectionObjectID
> Namespace:	http://rs.tdwg.org/dwc/terms/
> Label:	collectionObjectID
> Definition:	An identifier for the CollectionObject. In the absence of
> a persistent global unique identifier, construct one from a
> combination of identifiers in the record that will most closely make
> the collectionObjectID globally unique.
> Comment:	For a specimen in the absence of a bona fide global unique
> identifier, for example, use the form:
> "urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]".
> Examples: "urn:lsid:nhm.ku.edu:Herps:32",
> "urn:catalog:FMNH:Mammal:145732". For discussion see
> http://code.google.com/p/darwincore/wiki/CollectionObject
> Type of Term:	http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
> Refines:	http://purl.org/dc/terms/identifier
> Status:	recommended
> Date Issued:	2011-07-04
> Date Modified:	2011-07-04
> Has Domain:	
> Has Range:	
> Version:	collectionObjectID-2011-07-04
> Replaces:
> Is Replaced By:	
> Class:	http://rs.tdwg.org/dwc/terms/CollectionObject
> ABCD 2.06:	DataSets/DataSet/Units/Unit/UnitGUID
>
> Term Name: collectionObjectRemarks
> Identifier:	http://rs.tdwg.org/dwc/terms/collectionObjectRemarks
> Namespace:	http://rs.tdwg.org/dwc/terms/
> Label:	collectionObjectRemarks
> Definition:	Comments or notes about the Collection Object.
> Comment:	Example: "custody transferred in 1995 from National Park
> Service". For discussion see
> http://code.google.com/p/darwincore/wiki/CollectionObject
> Type of Term:	http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
> Refines:	
> Status:	recommended
> Date Issued:	2011-07-04
> Date Modified:	2009-07-04
> Has Domain:	
> Has Range:	
> Version:	collectionObjectRemarks-2011-07-04
> Replaces:	SampleRemarks-2009-01-18
> Is Replaced By:
> Class:	http://rs.tdwg.org/dwc/terms/CollectionObject
> ABCD 2.06:	DataSets/DataSet/Units/Unit/Notes
>
> Note that with the addition of CollectionObject, the InstitutionCode,
> CollectionCode, CatalogNumber triplet would no longer apply to an
> Occurrence. This is a big, non-backward compatible change.
>
> Terms to be organized under CollectionObject rather than Occurrence:
> catalogNumber
> preparations
> disposition
> otherCatalogNumbers
>
> Terms that might have to be added under CollectionObject to retain
> capabilities currently achieved under Occurrence:
> associatedCollectionObjectMedia
> associatedCollectionObjectReferences
> associatedCollectionObjects
> associateSequences
>   
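The fallback form suggested in the collectionObjectID comment, 
"urn:catalog:[institutionCode]:[collectionCode]:[catalogNumber]", could 
be assembled along the following lines.  This is a minimal sketch: the 
function name and the choice to reject empty parts are assumptions, and 
it makes no attempt to escape colons inside the codes.

```python
def make_collection_object_id(institution_code, collection_code, catalog_number):
    """Build a urn:catalog identifier from the three codes in a record."""
    parts = [
        str(part).strip()
        for part in (institution_code, collection_code, catalog_number)
    ]
    # Assumption: a missing part makes the identifier too ambiguous to use.
    if any(not part for part in parts):
        raise ValueError("all three codes are required")
    return "urn:catalog:" + ":".join(parts)


# Reproduces one of the examples given in the term's comment.
example_id = make_collection_object_id("FMNH", "Mammal", "145732")
print(example_id)
```

Such an identifier is only as unique as the triplet it is built from, so 
it remains a stopgap in the absence of a true global unique identifier.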

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu


