My apologies to John Wieczorek for the panicked tone of my previous email to the list. It appeared to me (apparently incorrectly) that the issue was being closed without addressing the concern that I raised when I initially brought this up in my post to the MRTG wiki (http://www.keytonature.eu/wiki/MRTGv08_Type_term_inconsistent_with_DwC). I have the following general concerns about what is being proposed. I will follow with a particular use-case comment in a separate email.
*Hierarchy clarification needed.* If I understand the proposal as John has summarized it, there would be three terms that could apply to a resource with metadata subject to DwC. A new DwC term /recordClass/ has controlled values corresponding to the TDWG classes: "Occurrence", "Event", "Location", "Taxon". /dcterms:type/ is another term, which could theoretically have DCMI Type values of: "Collection", "Dataset", "Event", "Image", "InteractiveResource", "MovingImage", "PhysicalObject", "Service", "Software", "Sound", "StillImage", or "Text" (although not all may be appropriate in the DwC context). It is not clear to me from John's proposal whether the assignment of /dcterms:type/ is intended to be independent of the value of /recordClass/, or whether /dcterms:type/ is intended to be a subclass of /recordClass/ Occurrence (i.e. applicable only to resources that represent things that can document Occurrences, such as Images and PhysicalObjects). In Gregor's email initiating this discussion, he indicated that he felt they should be independent. But in John's proposal, the third term, /basisOfRecord/, is clearly intended to be a subclass of /dcterms:type/ with possible values of "StillImage", "MovingImage", "Sound", "PreservedSpecimen", "FossilSpecimen", "LivingSpecimen", "HumanObservation", "MachineObservation". Since all of these /basisOfRecord/ objects are the bases for documenting Occurrences, that implies that their parent /dcterms:type/ terms should fall under /recordClass/ Occurrence.
*Problems with calling /basisOfRecord/ a subclass of /dcterms:type/.* PreservedSpecimen, FossilSpecimen, and LivingSpecimen can clearly fall under PhysicalObject, but what do we do with the rest? The /basisOfRecord/ terms StillImage and MovingImage could be subtypes of /dcterms:type/ Image, but what about a Sound? Is /basisOfRecord/ Sound a subtype of /dcterms:type/ Sound? What about a 35mm slide picturing an organism? It is an Image, but it is also a PhysicalObject. Gregor suggested that /basisOfRecord/ HumanObservation and /basisOfRecord/ MachineObservation could be subtypes of /dcterms:type/ Event, but I disagree. Just like observations, StillImages and PreservedSpecimens have Event information associated with them (the time and location of their creation), but we don't classify them as Events. An Event is a conceptually different thing from the resource that is created at that Event.
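To make the mismatch concrete, here is a minimal sketch in Python. The parent assignments are only the candidates raised in this thread, not an official mapping, and the names CANDIDATE_PARENTS, DISPUTED, subtype_problems and the value "Slide35mm" are hypothetical, introduced purely for illustration.

```python
# DCMI Type values as listed earlier in this message.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}

# Candidate dcterms:type parents for each basisOfRecord value.
# Assumption: these pairings come from the discussion, not from any standard.
CANDIDATE_PARENTS = {
    "PreservedSpecimen": {"PhysicalObject"},
    "FossilSpecimen": {"PhysicalObject"},
    "LivingSpecimen": {"PhysicalObject"},
    "StillImage": {"Image"},
    "MovingImage": {"Image"},
    "Sound": {"Sound"},
    "HumanObservation": {"Event"},    # Gregor's suggestion, disputed above
    "MachineObservation": {"Event"},  # Gregor's suggestion, disputed above
    "Slide35mm": {"Image", "PhysicalObject"},  # hypothetical value
}

# Values whose suggested parent is contested in this thread.
DISPUTED = {"HumanObservation", "MachineObservation"}


def subtype_problems(basis):
    """Return the reasons a basisOfRecord value fits poorly under dcterms:type."""
    problems = []
    if len(CANDIDATE_PARENTS[basis]) > 1:
        problems.append("more than one candidate parent")
    if basis in DCMI_TYPES:
        problems.append("name is already a DCMI Type value")
    if basis in DISPUTED:
        problems.append("parent is disputed")
    return problems


for basis in CANDIDATE_PARENTS:
    print(basis, subtype_problems(basis))
```

Under these assumptions only the three specimen values come back with an empty list; every other value trips at least one check.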
*How does this change facilitate machine processing?* In John's request for comments, he quoted "3.3. Semantic changes in Darwin Core terms", which mentioned "if ... such changes of meaning are likely to have substantial impact on either machine processing of Darwin Core terms... " It is not at all clear to me how the proposed reorganization of these three terms (/recordClass/, /dcterms:type/, and /basisOfRecord/) will facilitate machine processing, in particular because of the problems in associating particular /basisOfRecord/ terms with /dcterms:type/ terms, as I discussed in the previous paragraph. From a machine-processing standpoint, it makes a lot more sense to subclass /recordClass/ Occurrence as follows:
- *PhysicalObject* (including /basisOfRecord/ "PreservedSpecimen", "FossilSpecimen", "LivingSpecimen", and any other relevant material objects such as Seeds, FilmImage, etc.)
- *DigitalObject* (including /basisOfRecord/ "StillImage", "MovingImage", "Sound", and any other relevant file-representable objects such as DnaSequences)
- *NoObject* (including /basisOfRecord/ "HumanObservation", "MachineObservation")
The consuming application that receives the metadata can then know that, if the record involves an Occurrence and the record's subclass is:
- PhysicalObject, then there is an object somewhere that is not deliverable through the Internet but which could be visited in an herbarium or museum.
- DigitalObject, then there is a representational file that should be retrieved and presented to the user of the application in an appropriate way.
- NoObject, then there will only be metadata, including measurements, but nothing for the user to see, hear, etc.
Alternatively, rather than creating the three subclasses, simply create a property for resources of the class Occurrence called "objectType" and allow it to have values of PhysicalObject, DigitalObject, or NoObject. This would serve the same purpose of informing the consuming application of the nature of the resource without having to create another hierarchical layer. From a machine-processing standpoint, I don't see a great benefit to presenting a biodiversity-related consuming application with both /dcterms:type/ and /basisOfRecord/.
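As a rough illustration of how a consuming application might act on such an "objectType" property, here is a minimal Python sketch. The record layout (a plain dict with "recordClass" and "objectType" keys), the ObjectType enum, and the function name describe_handling are all assumptions made for this example; nothing here is part of DwC.

```python
from enum import Enum


class ObjectType(Enum):
    # The three values proposed above for the hypothetical objectType property.
    PHYSICAL_OBJECT = "PhysicalObject"
    DIGITAL_OBJECT = "DigitalObject"
    NO_OBJECT = "NoObject"


def describe_handling(record):
    """Say what a consuming application would do with one metadata record."""
    if record.get("recordClass") != "Occurrence":
        return "not an Occurrence: objectType does not apply"

    # Raises ValueError if the value is outside the three proposed ones.
    object_type = ObjectType(record["objectType"])

    if object_type is ObjectType.PHYSICAL_OBJECT:
        return "physical: refer the user to the herbarium or museum"
    if object_type is ObjectType.DIGITAL_OBJECT:
        return "digital: retrieve the file and present it to the user"
    return "none: show the metadata and measurements only"


print(describe_handling({"recordClass": "Occurrence",
                         "objectType": "DigitalObject"}))
```

The application branches on a single three-valued property and never needs to reconcile /dcterms:type/ with /basisOfRecord/ to decide what to show.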
*An alternative.* To me, trying to merge /dcterms:type/ and its DCMI Type values with the new DwC /recordClass/ and /basisOfRecord/ is like trying to fit a square peg in a round hole. DCMI wasn't created with biodiversity records in mind, and its vocabulary really doesn't mesh very well with DwC. I fully support the idea of creating the new DwC term /recordClass/ to contain what DwC formerly put in /dcterms:type/. However, I think it would be better to leave the actual /dcterms:type/ and its DCMI Type values as an independent thing, without trying to make DwC /basisOfRecord/ its subtype. Biodiversity media providers who need their databases to mesh with non-biodiversity records should assign /dcterms:type/ values to their records; non-media providers can do so if they want. Biodiversity data providers (including those who provide media) should assign /recordClass/ and /basisOfRecord/ values to their records.
Steve Baskauf
John R. WIECZOREK wrote:
Gregor,
That sounds like a good solution to all of the problems. I would propose that the basisOfRecord IS the same thing as your proposed dwc:subtype, so we should keep basisOfRecord.
Net solution:
1) keep dcterms:type
2) use DCType vocabulary to control dcterms:type (so, StillImage, PhysicalObject, Event, etc.)
3) keep basisOfRecord
4) use our DwC-specific subtypes (PreservedSpecimen, FossilSpecimen, HumanObservation, etc.) as the controlled vocabulary for basisOfRecord without a formal type vocabulary (very close to how it is now, just some of the terms would go to dcterms:type).
5) add a recordClass term
6) use the DwCType vocabulary to control the recordClass term instead of the dcterms:type term.
This solution fixes the Dublin Core - Darwin Core controlled vocabulary problem, retains all existing terms, and isolates the controlled vocabulary that is specific to our domain, making it very easy to expand without changes to the standard.
Any objections?
John
On Fri, Oct 23, 2009 at 12:33 PM, Gregor Hagedorn g.m.hagedorn@gmail.com wrote:
How about we retain basisOfRecord, but have it refine dcterms:type, drop dcterms:type and add a "recordClass" term in its place that is governed exactly as dcterms:type is currently being used in the recently ratified version of the Core?
recordClass for Taxon/Occurrence/Event sounds good.
I am less sure about keeping the "perspective-dependent" basisOfRecord-term in a place where dcterms:type. The dcterms:type vocabulary is, in principle, extensible, and meant to be extended. Except, of course, some specific XML implementations of Dublin Core prevent this... To avoid problems with this, one might desire to have only the strict resource type vocabulary in dcterms:type. Then this could be PhysicalObject/Event, with a dwc:subtype added to express PreservedSpecimen/MachineObservation etc. Essentially, MRTG intends to use such an mrtg:subtype as well to differentiate different StillImage or Text subtypes: http://www.keytonature.eu/wiki/MRTG_Schema_v0.8#Subtype
This would then mean DarwinCore might support: dwc:recordClass, dcterms:type, dwc:subtype
Gregor