The current DwC Terms version (at http://rs.tdwg.org/dwc/rdf/dwcterms.rdf) defines the domain for (at least some) object properties.
Unless I am missing something, this means that when I use DwC terms as properties I implicitly assert the class of the subject. So either I can use DwC terms only for subjects of the type they are intended for (and why would one want to limit DwC's use in this way?), or reasoning based on RDF or OWL extractions leads to problems.
For example, in the Phenoscape project [1] we would like to link characters, states, or OTUs to the specimens on which a systematist based a (or all) character(s) or state(s). The specimens would be described in a NeXML document [2] with embedded RDFa annotation by institutionCode, collectionCode, and catalogNumber. The RDF extracted from that, when run through a reasoner (which we will do), would implicitly assert that the specimen is a DwCTerms:Occurrence. Right now that probably doesn't hurt much in this case, but it might in the future, and I'm not sure what is gained from forcing those implicit assertions in the vocabulary.
Should I post this as an issue to the tracker on Google Code, too?
-hilmar
[1] http://phenoscape.org
[2] http://nexml.org
So would the solution be to import DwC into CDAO and make annotations be (allowed to be) DwC terms?
On Fri, Jun 5, 2009 at 5:31 PM, Hilmar Lapp hlapp@duke.edu wrote:
[...]
On Jun 5, 2009, at 9:01 PM, Rutger Vos wrote:
So would the solution be to import DwC into CDAO and make annotations be (allowed to be) DwC terms?
I don't think DwC would need to be imported into CDAO in order to allow annotations using DwC terms in NeXML, would it? It'd just be an additional namespace to declare.
But maybe you were asking something different?
-hilmar
On Sat, Jun 6, 2009 at 9:49 AM, Hilmar Lapp hlapp@duke.edu wrote:
[...] But maybe you were asking something different?
Mmmm... not sure - I guess I misunderstood, can DwC predicates apply to CDAO subjects?
On Jun 6, 2009, at 3:47 PM, Rutger Vos wrote:
can DwC predicates apply to CDAO subjects?
They can. The "caveat" right now is that for most DwC predicates (object properties), doing so will assert that the CDAO subject instantiates whatever the domain of the DwC term is declared as.
So if you apply dwcterms:catalogNumber to a CDAO subject term, you would implicitly assert (in addition to whatever CDAO asserts about it) that the subject term is an instance of dwcterms:Occurrence (because dwcterms:catalogNumber has domain dwcterms:Occurrence). Whether that's a problem or not depends on the context, I suppose.
-hilmar
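The inference described above is the standard RDFS domain rule: if a property has a declared rdfs:domain, every subject that uses the property is entailed to be an instance of that class. A minimal sketch in plain Python follows, assuming triples are represented as tuples of prefixed-name strings. The subject ex:otu1, the class ex:SomeCdaoClass, and the catalog number value are hypothetical placeholders, and this is not how any particular reasoner is implemented.

```python
# Minimal sketch of the RDFS domain entailment rule (rdfs2):
#   (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
# Prefixed names are plain strings here; no RDF library is used.

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types_from_domains(triples):
    """Return the rdf:type triples entailed by rdfs:domain declarations."""
    # Collect every declared domain: property -> set of classes.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)

    # Any subject that uses a property with a declared domain
    # is entailed to be an instance of that domain class.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


# Vocabulary: the domain declaration discussed in this thread.
vocabulary = {
    ("dwcterms:catalogNumber", RDFS_DOMAIN, "dwcterms:Occurrence"),
}

# Data: a hypothetical CDAO-typed subject annotated with a DwC term.
# "ex:otu1", "ex:SomeCdaoClass" and "12345" are placeholders.
data = {
    ("ex:otu1", RDF_TYPE, "ex:SomeCdaoClass"),
    ("ex:otu1", "dwcterms:catalogNumber", "12345"),
}

inferred = infer_types_from_domains(vocabulary | data)
print(sorted(inferred))
```

The subject keeps its asserted type and gains dwcterms:Occurrence in addition, which is the implicit assertion in question.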
No, this one isn't appropriate for the issue tracker because it is still in the discussion stage: there is no definitive solution to propose yet. There are a couple of quick answers to why the property terms have domains. The first is organization of the standard. The domains help organize the properties in ways that help people to understand their meaning and purpose. Some of them have no domain because they could apply to multiple domains, but they are few in number. The second reason for the domain assignments is that we lack a formal ontology, and this is an attempt to have one to govern at least the terms within this standard.
What is the potentially problematic future case of asserting that a specimen is a dwcterms:Occurrence? It is one.
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
On Jun 5, 2009, at 9:20 PM, John R. WIECZOREK wrote:
[...] The first is organization of the standard. The domains help organize the properties in ways that help people to understand their meaning and purpose.
That's a worthwhile goal but isn't that using the wrong means? I.e., specifying the domain is telling something - and in an unambiguous manner - to machines, not humans.
You can of course create user interfaces or renderings that turn the domain specification into something meaningful to humans, too. But you could use any term property for that, couldn't you? I.e., you could use a property that doesn't imply specific and possibly far-reaching machine inferences. (I haven't had time to check yet, but I suspect that there are conventions, W3C or not, for how to add human-targeted class and property annotations to vocabularies.)
(BTW for comparison purposes, note that DC doesn't assert any domain or range for any of its terms, allowing the broadest possible use.)
[...] The second reason for the domain assignments is that we lack a formal ontology, and this is an attempt to have one to govern at least the terms within this standard.
I'd argue that it's worth distinguishing between a metadata vocabulary and a formal ontology, and that having one product try to be both may limit its ability to fully satisfy both.
A standard metadata vocabulary with the primary purpose that we all call the same thing by the same name is broadly useful and can help enormously with data integration across fields as diverse as genetics, genomics, systematics, ecology, and taxonomy (and more, as we heard in London). A formal ontology that supports inferences over integrated data is also highly useful, but not necessarily at the same breadth.
What is the potentially problematic future case of asserting that a specimen is a dwcterms:Occurrence? It is one.
My example was indeed a weak one, as we are indeed referencing specimens. A better example: I couldn't use dwcterms:collectionCode to assert a code for a museum collection, because it would imply that the collection is a dwcterms:Occurrence.
Maybe you don't want people to use DwCTerms for anything other than describing specimen records, but I'd argue that without such limitations the standard could become much more broadly useful.
As a comparison, dc:creator and dc:title are by now being used in vastly more contexts than the original authors of DC had probably imagined. I'm not sure this would have happened if applying dc:title implied specific assertions about the nature of what it is being applied to.
Just my $0.02.
-hilmar
Hilmar et al.,
I have removed class references from the hasDomain attribute of the Darwin Core terms and added a different attribute (organizedInClass) with which to organize the terms for human consumption. These changes will appear when I make the next commit to subversion.
Thanks for the discussion leading to this important change.
John
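To illustrate why this change removes the implicit assertions, here is a rough sketch contrasting the two vocabulary variants. It reuses the museum collection example from earlier in the thread and applies only the RDFS domain rule. The prefix dwcattr:, the subject ex:fishCollection, and the code "FISH" are assumptions made up for illustration; the thread names only the attribute organizedInClass, not its namespace.

```python
# Sketch: a domain declaration triggers type inference, while a plain
# organizing attribute does not. Names marked "hypothetical" are not
# taken from the Darwin Core vocabulary itself.

RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
# Hypothetical prefix; only the local name "organizedInClass" is from the thread.
ORGANIZED_IN_CLASS = "dwcattr:organizedInClass"


def entailed_types(triples):
    """Apply the RDFS domain rule and return the entailed rdf:type triples."""
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    return {
        (s, RDF_TYPE, cls)
        for s, p, o in triples
        for cls in domains.get(p, ())
    }


# A museum collection (hypothetical subject and code) described with a DwC term.
data = {("ex:fishCollection", "dwcterms:collectionCode", "FISH")}

# Before the change: the class is attached via rdfs:domain.
vocab_before = {
    ("dwcterms:collectionCode", RDFS_DOMAIN, "dwcterms:Occurrence"),
}
# After the change: the class is attached via an attribute with no
# inference semantics, used only to group terms for human readers.
vocab_after = {
    ("dwcterms:collectionCode", ORGANIZED_IN_CLASS, "dwcterms:Occurrence"),
}

types_before = entailed_types(vocab_before | data)
types_after = entailed_types(vocab_after | data)

print("with rdfs:domain:      ", sorted(types_before))
print("with organizedInClass: ", sorted(types_after))
```

With the domain declaration, the collection is entailed to be a dwcterms:Occurrence; with the organizing attribute, nothing is entailed about its type, while the grouping information remains available to tools that render the vocabulary.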
On Sat, Jun 6, 2009 at 7:48 AM, Hilmar Lapp hlapp@duke.edu wrote:
[...]
participants (3)
- Hilmar Lapp
- John R. WIECZOREK
- Rutger Vos