[tdwg-content] Conflict between DarwinCore and DublinCore usage of dcterms:type / basisOfRecord
Steve Baskauf
steve.baskauf at vanderbilt.edu
Sat Oct 24 09:16:27 CEST 2009
My apologies to John Wieczorek for the panicked tone of my previous
email to the list. It appeared to me (apparently incorrectly) that the
issue was being closed without addressing the concern that I raised when
I initially brought this up in my post to the MRTG wiki
(http://www.keytonature.eu/wiki/MRTGv08_Type_term_inconsistent_with_DwC).
I have the following general concerns about what is being proposed. I
will follow with a particular use-case comment in a separate email.
*Hierarchy clarification needed.* If I am understanding the proposal as
John has summarized it, there would be three terms that could apply to a
resource with metadata subject to DwC. A new DwC term /recordClass/ has
controlled values corresponding to the TDWG classes: "Occurrence",
"Event", "Location", "Taxon". /dcterms:type/ is another term which
could theoretically have DCMI Type values of: "Collection", "Dataset",
"Event", "Image", "InteractiveResource", "MovingImage",
"PhysicalObject", "Service", "Software", "Sound", "StillImage", or
"Text" (although not all may be appropriate in the DwC context). It is
not clear to me in John's proposal whether the assignment of
/dcterms:type/ is intended to be independent of the value of
/recordClass/, or if /dcterms:type/ is intended to be a subclass of
/recordClass/ Occurrence (i.e. applicable only to resources that
represent things that can document Occurrences such as Images and
PhysicalObjects). In Gregor's email initiating this discussion he
indicated that he felt that they should be independent. But in John's
proposal, the third term, /basisOfRecord/, is clearly intended to be a
subclass of /dcterms:type/ with possible values of "StillImage",
"MovingImage", "Sound", "PreservedSpecimen", "FossilSpecimen",
"LivingSpecimen", "HumanObservation", "MachineObservation". Since all of
these /basisOfRecord/ objects are the bases for documenting Occurrences,
that implies that their parent /dcterms:type/ terms should fall under
/recordClass/ Occurrence.
*Problems with calling /basisOfRecord/ a subclass of /dcterms:type/.*
PreservedSpecimen, FossilSpecimen, and LivingSpecimen can clearly fall
under PhysicalObject, but what do we do with the rest? The /basisOfRecord/
terms StillImage and MovingImage could be subtypes of /dcterms:type/
Image, but what about a Sound? Is /basisOfRecord/ Sound a subtype of
/dcterms:type/ Sound? What about a 35mm slide picturing an organism?
It is an Image, but is also a PhysicalObject. Gregor suggested that
/basisOfRecord/ HumanObservation and /basisOfRecord/ MachineObservation
could be subtypes of /dcterms:type/ Event, but I disagree. Just like
observations, StillImages and PreservedSpecimens have Event information
associated with them (the time and location of their creation) but we
don't classify them as Events. An Event is a conceptually different
thing from the resource that is created at that Event.
*How does this change facilitate machine processing?* In John's request
for comments, he quoted "3.3. Semantic changes in Darwin Core terms"
which mentioned "if ... such changes of meaning are likely to have
substantial impact on either machine processing of Darwin Core terms...
" It is not at all clear to me how the proposed reorganization of these
three terms (/recordClass/, /dcterms:type/, and /basisOfRecord/) will
facilitate machine processing, in particular because of the problems in
associating particular /basisOfRecord/ terms with /dcterms:type/ terms
as I discussed in the previous paragraph. From a machine-processing
standpoint, it makes a lot more sense to subclass /recordClass/
Occurrence as follows:
*PhysicalObject* (including /basisOfRecord/ "PreservedSpecimen",
"FossilSpecimen", "LivingSpecimen", and any other relevant material
objects such as Seeds, FilmImage, etc.)
*DigitalObject* (including /basisOfRecord/ "StillImage", "MovingImage",
"Sound", and any other relevant file-representable objects such as
DnaSequences)
*NoObject* (including /basisOfRecord/ "HumanObservation",
"MachineObservation")
The consuming application that receives the metadata then knows, for any
record that involves an Occurrence, that if the record's subclass is:
- PhysicalObject, then there is an object somewhere that is not
deliverable through the Internet but which could be visited in an
herbarium or museum.
- DigitalObject, then there is a representational file that should be
retrieved and presented to the user of the application in an appropriate
way.
- NoObject, then there will only be metadata, including measurements, but
nothing for the user to see, hear, etc.
Alternatively, rather than creating the three subclasses, simply create
a property for resources of the class Occurrence called "objectType" and
allow it to have values of PhysicalObject, DigitalObject, or NoObject.
This would serve the same purpose of informing the consuming application
of the nature of the resource without having to create another
hierarchical layer. From a machine-processing standpoint, I don't see a
great benefit to presenting a biodiversity-related consuming application
with both /dcterms:type/ and /basisOfRecord/.
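
To make the machine-processing argument concrete, here is a minimal sketch
of how a consuming application might use such an "objectType" value. It
assumes records arrive as simple key/value dictionaries. The names
OBJECT_TYPE and describe_delivery are hypothetical and are not part of
Darwin Core; the grouping of /basisOfRecord/ values simply follows the
three categories listed above.

```python
# Hypothetical grouping of basisOfRecord values into the three proposed
# categories. Nothing here is defined by Darwin Core itself.
OBJECT_TYPE = {
    "PreservedSpecimen": "PhysicalObject",
    "FossilSpecimen": "PhysicalObject",
    "LivingSpecimen": "PhysicalObject",
    "StillImage": "DigitalObject",
    "MovingImage": "DigitalObject",
    "Sound": "DigitalObject",
    "HumanObservation": "NoObject",
    "MachineObservation": "NoObject",
}


def describe_delivery(record):
    """Say what a consuming application can expect for an Occurrence record."""
    if record.get("recordClass") != "Occurrence":
        return "not an Occurrence; objectType does not apply"
    # Prefer an explicit objectType; otherwise derive it from basisOfRecord.
    object_type = record.get("objectType") or OBJECT_TYPE.get(
        record.get("basisOfRecord")
    )
    if object_type == "PhysicalObject":
        return "physical object held somewhere; not deliverable online"
    if object_type == "DigitalObject":
        return "file to retrieve and present to the user"
    if object_type == "NoObject":
        return "metadata only; nothing to see or hear"
    return "unknown objectType"
```

The application only has to branch on three values, regardless of how many
/basisOfRecord/ terms are added to the vocabulary later.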
*An alternative.* To me, trying to merge /dcterms:type/ and its
DCMI Type values with the new DwC /recordClass/ and
/basisOfRecord/ is like trying to fit a square peg in a round hole.
DCMI wasn't created with biodiversity records in mind and its vocabulary
really doesn't mesh very well with DwC. I fully support the idea of
creating the new DwC term /recordClass/ to contain what DwC formerly put
in /dcterms:type/. However, I think it would just be better to leave
the actual /dcterms:type/ and its DCMI Type values as an independent
thing without trying to make DwC /basisOfRecord/ its subtype.
Biodiversity media providers who need their databases to mesh with
non-biodiversity records should assign /dcterms:type/ values to their
records; non-media providers can if they want. Biodiversity data
providers (including those who provide media) should assign
/recordClass/ and /basisOfRecord/ values to their records.
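
As a rough illustration of keeping the terms independent, the sketch below
checks each term against its own vocabulary and never against the others.
It assumes records are plain dictionaries keyed by term name. The function
name check_record is hypothetical, and the /basisOfRecord/ list is only the
set of values mentioned in John's proposal, not a ratified vocabulary.

```python
# DCMI Type values as listed earlier in this message.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}
# recordClass values corresponding to the TDWG classes.
RECORD_CLASSES = {"Occurrence", "Event", "Location", "Taxon"}
# basisOfRecord values mentioned in the proposal (assumed, not ratified).
BASIS_OF_RECORD = {
    "StillImage", "MovingImage", "Sound", "PreservedSpecimen",
    "FossilSpecimen", "LivingSpecimen", "HumanObservation",
    "MachineObservation",
}


def check_record(record):
    """Return a list of problems; each term is validated on its own."""
    problems = []
    if record.get("recordClass") not in RECORD_CLASSES:
        problems.append("recordClass missing or not recognized")
    if record.get("basisOfRecord") not in BASIS_OF_RECORD:
        problems.append("basisOfRecord missing or not recognized")
    # dcterms:type is optional and is never compared with basisOfRecord.
    dc_type = record.get("dcterms:type")
    if dc_type is not None and dc_type not in DCMI_TYPES:
        problems.append("dcterms:type is not a DCMI Type value")
    return problems
```

A media provider could send all three terms, while a specimen database
could omit /dcterms:type/ entirely and still pass the check.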
Steve Baskauf
John R. WIECZOREK wrote:
> Gregor,
>
> That sounds like a good solution to all of the problems. I would
> propose that the basisOfRecord IS the same thing as your proposed
> dwc:subtype, so we should keep basisOfRecord.
>
> Net solution:
>
> 1) keep dcterms:type
> 2) use DCType vocabulary to control dcterms:type (so, StillImage,
> PhysicalObject, Event, etc.)
> 3) keep basisOfRecord
> 4) use our DwC-specific subtypes (PreservedSpecimen, FossilSpecimen,
> HumanObservation, etc.) as the controlled vocabulary for basisOfRecord
> without a formal type vocabulary (very close to how it is now, just
> some of the terms would go to dcterms:type).
> 5) add a recordClass term
> 6) use the DwCType vocabulary to control the recordClass term instead
> of the dcterms:type term.
>
> This solution fixes the Dublin Core - Darwin Core controlled
> vocabulary problem, retains all existing terms, isolates the
> controlled vocabulary that is specific to our domain, making it very
> easy to expand without changes to the standard.
>
> Any objections?
>
> John
>
> On Fri, Oct 23, 2009 at 12:33 PM, Gregor Hagedorn
> <g.m.hagedorn at gmail.com> wrote:
>
>>> How about we retain basisOfRecord, but have it refine dcterms:type,
>>> drop dcterms:type and add a "recordClass" term in its place that is
>>> governed exactly as dcterms:type is currently being used in the
>>> recently ratified version of the Core?
>>>
>> recordClass for Taxon/Occurrence/Event sounds good.
>>
>> I am less sure about keeping the "perspective-dependent"
>> basisOfRecord-term in a place where dcterms:type. The dcterms:type
>> vocabulary is, in principle, extensible, and meant to be extended.
>> Except, of course, some specific xml-implementation of dublin core
>> prevent this... To avoid problems with this one might desire to have
>> only the strict resource type vocabulary in dcterms:type. Then this
>> could be PhysicalObject/Event and a dwc:subtype added to express
>> PreservedSpecimen/MachineObservation etc. Essentially, MRTG intends to
>> use such a mrtg:subtype as well to differentiate different StillImage
>> or Text subtypes:
>> http://www.keytonature.eu/wiki/MRTG_Schema_v0.8#Subtype
>>
>> This would then mean, DarwinCore might support:
>> dwc:recordClass
>> dcterms:type
>> dwc:subtype
>>
>> Gregor
>>
>>
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu