[tdwg-content] Public Review of the Darwin Core 'standard'

John R. WIECZOREK tuco at berkeley.edu
Fri Jul 24 20:48:51 CEST 2009

I'm drawing together Kevin's post with one posted by Hilmar Lapp some
time ago on the same subject. Both are included below.

Despite the fact that Dublin Core does actually use the hasDomain
property for some terms (admittedly very few), I believe that the
hasDomain property has been incorrectly applied in DwC as it currently
stands. The definition of hasDomain from Dublin Core is "A Class of
which a resource described by the term is an Instance." As Hilmar
points out (below), using the domains as currently defined in DwC
terms, a resource described by a dwc:collectionCode would be an
instance of a dwc:Occurrence (the hasDomain property of
dwc:collectionCode is http://rs.tdwg.org/dwc/terms/Occurrence).
Similarly, a resource described by a dwc:scientificName would be an
instance of a dwc:Taxon. Clearly we will want to describe Occurrences
with dwc:scientificName without asserting that the Occurrence is a
dwc:Taxon, so the domain assignment is too constraining.
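
To make the consequence concrete, here is a minimal sketch of the domain
rule as a reasoner would apply it. It uses plain Python tuples rather than
an RDF library, and the record identifier ex:occurrence1 and the name
value are made up for illustration; only the two domain assignments come
from the discussion above.

```python
# Minimal RDFS-style domain inference over plain tuples (no RDF library).
RDF_TYPE = "rdf:type"

# Domain assignments as described in the paragraph above.
DOMAINS = {
    "dwc:collectionCode": "dwc:Occurrence",
    "dwc:scientificName": "dwc:Taxon",
}


def infer_types(triples, domains):
    """Return the rdf:type triples implied by the domain rule:
    if (s, p, o) holds and p has domain C, then s is an instance of C."""
    return {(s, RDF_TYPE, domains[p]) for (s, p, o) in triples if p in domains}


# Hypothetical occurrence record described with a scientific name.
data = [
    ("ex:occurrence1", RDF_TYPE, "dwc:Occurrence"),
    ("ex:occurrence1", "dwc:scientificName", "Puma concolor"),
]

inferred = infer_types(data, DOMAINS)
print(sorted(inferred))
```

Under these assumptions the only inferred statement is that
ex:occurrence1 is a dwc:Taxon, which is exactly the unintended assertion.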

The intention in DwC was much looser than this - the domains were
set simply to logically group the properties, not to assert instances
of classes. Yet, the groupings in the Darwin Core documentation and
schemas are extremely useful. I propose that we maintain those
informal associations, but remove the hasDomain assignments so that
misleading ontological assertions are not made.

Now, getting to Kevin's agent conundrum. If we use a common agent
property and have no domains for the terms, how would we distinguish
specific roles? Instead, the specific roles are elaborated in their
own terms that all refine http://rs.tdwg.org/dwc/terms/accordingTo
(see http://rs.tdwg.org/dwc/terms/history/index.htm#accordingTo-2009-01-21),
the definition of which is "Abstract term to attribute information to
a source." This isn't the same as dcterms:source or dcterms:Agent,
though if the distinction seems too subtle, each of the roles could be
amended to be given http://purl.org/dc/terms/Agent as the value of the
hasRange property.
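
As a rough sketch of how the refinement approach keeps roles distinct while
still allowing a generic query, the following applies a sub-property rule
over plain tuples. Which role terms refine accordingTo is an assumption
here (identifiedBy and geoReferencedBy are borrowed from Kevin's message),
and the record identifier and person names are invented.

```python
# Sketch: role terms that refine a common abstract term (sub-property rule).
# ASSUMPTION: this mapping is illustrative, not the published DwC definitions.
SUBPROPERTY_OF = {
    "dwc:identifiedBy": "dwc:accordingTo",
    "dwc:geoReferencedBy": "dwc:accordingTo",
}


def with_superproperties(triples, subproperty_of):
    """Add (s, parent, o) for every (s, p, o) where p refines parent.
    Assumes the refinement chain has no cycles."""
    out = set(triples)
    for s, p, o in triples:
        parent = subproperty_of.get(p)
        while parent is not None:
            out.add((s, parent, o))
            parent = subproperty_of.get(parent)
    return out


def values(triples, subject, prop):
    """Sorted objects of all (subject, prop, o) triples."""
    return sorted(o for s, p, o in triples if s == subject and p == prop)


# Hypothetical record with two distinct roles.
data = [
    ("ex:occurrence1", "dwc:identifiedBy", "A. Botanist"),
    ("ex:occurrence1", "dwc:geoReferencedBy", "G. Mapper"),
]

closed = with_superproperties(data, SUBPROPERTY_OF)
all_sources = values(closed, "ex:occurrence1", "dwc:accordingTo")
identifiers = values(closed, "ex:occurrence1", "dwc:identifiedBy")
print(all_sources, identifiers)
```

The specific role stays queryable on its own term, and the generic term
collects every attributed source, without any domain being declared.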

About the ontology, again, until there is active work in that area, I
think the pragmatic way forward is to let it be the burden of the
ontology to make use of terms from actual standards.

On Sun, Jul 12, 2009 at 5:05 PM, Kevin
Richards<RichardsK at landcareresearch.co.nz> wrote:
> I'm not sure if this is my misunderstanding of RDF and ontologies, so correct me if I am wrong...
> I noticed that the http://rs.tdwg.org/dwc/terms/geoReferencedBy property has domain http://purl.org/dc/terms/Location.
> Is this the standard way of working with Dublin Core?
> I would have thought it would be better to define a "common" person/agent type property (or better still reuse FOAF or DC Agent types - http://purl.org/dc/terms/Agent).  This property has no range and can therefore be applied to any RDF class/element.  This would remove the need for repetitive elements that refer to Agent type properties, eg geoReferencedBy, identifiedBy, measurementDeterminedBy, etc.
> This was part of the intention of the TDWG Common Vocab. (http://rs.tdwg.org/ontology/voc/Common.rdf), which brings me to my next question - how do the DwC terms relate to the TDWG ontology? - do we need a "mapping" to them?  I suppose, because the DwC ontology needs to be ratified as a standard, it can't really rely on an un-ratified ontology - but it would be good to specify how the two should work together, or map to each other.
> Otherwise I think this is looking excellent - a fantastic step towards an overarching ontology for TDWG data objects.
> Kevin

On Jun 5, 2009, at 9:20 PM, John R. WIECZOREK wrote:

[...] The first is organization of the standard. The domains help
organize the properties in ways that help people to understand their
meaning and purpose.

Hilmar replied:
That's a worthwhile goal but isn't that using the wrong means? I.e.,
specifying the domain is telling something - and in an unambiguous
manner - to machines, not humans.

You can of course create user interfaces or renderings that turn the
domain specification into something meaningful to humans, too. But you
could use any term property for that, couldn't you? I.e., you could
use a property that doesn't imply specific and possibly far reaching
machine inferences. (I haven't had time to check yet but I suspect
that there are - W3C or not - conventions for how to add
human-targeted class and property annotations to vocabularies.)
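
A small sketch of that idea: group terms for documentation with an
annotation property that carries no inference semantics. The property name
ex:organizedInClass is hypothetical, and the term-to-group assignments are
only illustrative.

```python
from collections import defaultdict

# HYPOTHETICAL annotation property used purely for human-facing grouping.
GROUPING = "ex:organizedInClass"

# Illustrative vocabulary annotations (not the published DwC definitions).
vocab = [
    ("dwc:collectionCode", GROUPING, "dwc:Occurrence"),
    ("dwc:scientificName", GROUPING, "dwc:Taxon"),
    ("dwc:catalogNumber", GROUPING, "dwc:Occurrence"),
]


def group_terms(vocab_triples, grouping=GROUPING):
    """Build {group: [terms]} for rendering documentation sections."""
    groups = defaultdict(list)
    for term, prop, group in vocab_triples:
        if prop == grouping:
            groups[group].append(term)
    return {g: sorted(terms) for g, terms in groups.items()}


groups = group_terms(vocab)
for group, terms in sorted(groups.items()):
    print(group, "->", ", ".join(terms))
```

Because the grouping property has no defined meaning for a reasoner, using
dwc:scientificName on some resource implies nothing about that resource's
class; the grouping only drives how the documentation is laid out.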

(BTW for comparison purposes, note that DC doesn't assert any domain
or range for any of its terms, allowing the broadest possible use.)

On Jun 5, 2009, at 9:20 PM, John R. WIECZOREK wrote:
[...] The second reason for the domain assignments is that we lack a
formal ontology, and this is an attempt to have one to govern at least
the terms within this standard.

Hilmar replied:
I'd argue that it's worth distinguishing between a metadata vocabulary
and a formal ontology, and that having one product try to be both may
limit its ability to fully satisfy both.

A standard metadata vocabulary with the primary purpose that we all
call the same thing by the same name is broadly useful and can help
enormously with data integration across fields as diverse as genetics,
genomics, systematics, ecology, and taxonomy (and more, as we heard in
London). A formal ontology that supports inferences over integrated
data is also highly useful, but not necessarily at the same breadth.

On Jun 5, 2009, at 9:20 PM, John R. WIECZOREK wrote:
What is the potentially problematic future case of asserting that a
specimen is a dwcterms:Occurrence? It is one.

Hilmar replied:
My example was indeed a weak one, as we are indeed referencing
specimens. For example, I couldn't use dwcterms:collectionCode to
assert a code for a museum collection, because it would imply that the
collection is a dwcterms:Occurrence.

Maybe you don't want people to use DwCTerms for anything else other
than describing specimen records, but I'd argue that without such
limitations the standard could become much more broadly useful.

As a comparison, dc:creator and dc:title are meanwhile being used in
vastly more contexts than the original authors of DC had probably
imagined. I'm not sure this would have also happened if applying
dc:title implied specific assertions about the nature of what it is
being applied to.

Just my $0.02.
