domains of object properties. was Re: DwC for the semantic web
I've been in Ecuador for two weeks with only a few hours of internet contact. Flying from Quito tonight. So I've only scanned the Design Document and may have done so too hastily. I'll have more time next week to look at it and the ontology. Meanwhile, I've posted the following unsurprising comment there, and hope that anyone who responds here will also respond there.
-----
It's a little hard to understand why DSW object properties should have domains but data properties should not. Keep in mind that the formal semantics of rdfs:domain is that if P rdfs:domain C, then any use a P x forces a to have rdfs:type C. This constrains the extensibility of P and probably the use of other ontologies where P might be useful. Specifying domains tends to close the world somewhat, and I see no advantage to that.... Indeed, this is more naughty than anti-naughty sensu Bob Morris.
-----
--Bob Morris
On Wed, Apr 27, 2011 at 9:01 PM, Steve Baskauf <steve.baskauf@vanderbilt.edu> wrote:
Dear colleagues,
With the exciting development of Semantic Web technologies, many of us already need a way to consistently express DwC in RDF. In particular, we need it to meet the requirements for GUID resolution (as expressed in the TDWG applicability guide) and to be able to share and aggregate diverse kinds of biodiversity metadata in the Linked Open Data world.
After several months of development, stimulated by the tdwg-content discussions of last Fall, we would like to offer an ontology for consideration, based on Darwin Core terms:
Darwin-SW: General site: http://code.google.com/p/darwin-sw/ Using namespace dsw = "http://purl.org/dsw/"
It is a candidate for general usage, but we do not claim it is _the_ solution, and greatly respect the efforts by others to develop similar ontologies, from which we have learned much. However, for DSW, we wanted to use existing DwC terms for classes and data properties whenever possible and only create new terms when there were no existing terms that would do the job. We did feel that there was a need for clarity in how resources should be typed (i.e. rdf:type property) and for object properties that expressed the relationships among classes unambiguously. Please see the Rationale, DesignPrinciples and ClassesAndTypes wiki pages at the above address.
In the ontology, we sought to embody relationships among classes based on our perception of the community consensus of what the classes represent and how they are related to each other, as expressed in posts to the tdwg-content list. Thus each class is documented carefully on the wiki with hyperlinked references to specific tdwg-content posts. We also sought to clarify or resolve issues that were raised in the list discussion, most notably the relationship among Occurrences and the evidence that documents them (i.e. tokens), and the role of "individuals". See wiki pages for the Token, Occurrence, and IndividualOrganism classes.
The ontology is now in use at http://bioimages.vanderbilt.edu/, and we intend to use it more widely. We would value your opinions on the fitness of this ontology as a general solution for consistently expressing DwC concepts in RDF.
Sincerely,
Steve Baskauf and Cam Webb steve.baskauf@Vanderbilt.Edu cwebb@oeb.harvard.edu
_______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
-- Robert A. Morris Emeritus Professor of Computer Science UMASS-Boston 100 Morrissey Blvd Boston, MA 02125-3390 IT Staff Filtered Push Project Department of Organismal and Evolutionary Biology Harvard University email: morris.bob@gmail.com web: http://efg.cs.umb.edu/ web: http://etaxonomy.org/mw/FilteredPush http://www.cs.umb.edu/~ram phone (+1) 857 222 7992 (mobile)
Ooo! I knew Bob would have something to say about this! Welcome back Bob, by the way. I should probably let Cam comment on this, but he's going to be in the field without Internet for about a week, so I'll try to respond.
We were very conscious of the connection between domain/range declarations and how they might cause unintended rdf:type assignments (see http://code.google.com/p/darwin-sw/wiki/ClassesAndTypes for more on this), and of the history of this issue in the discussion that led up to the Darwin Core standard. In fact, the issue of whether to have any range and domain declarations was probably the first thing that Cam and I discussed when we started the project, and we initially didn't agree on it. In the end, we decided that DSW is different from DwC in an important way. DwC is designed to be a general-purpose vocabulary where it is OK to use a term like dwc:genus as a property for pretty much anything that you want. DSW is NOT a general-purpose ontology. It is an ontology for people who see the biodiversity informatics realm according to the (maybe) consensus that came out in the discussion last Oct/Nov, which we have attempted to embody in the DSW ontology. So people who see that realm in a different way SHOULD NOT use DSW. They should create their own ontology and describe the biodiversity informatics world the way that they see it.
I see an RDF representation of metadata as fundamentally different from a delimited text or generic XML file. In those latter cases, there is generally an agreement between the sender and receiver about what the content of the file "means", perhaps articulated formally in a schema or just informally in an email. In the case of RDF, the sender does not necessarily know who the receiver will be, and the receiver does not know what the content of the RDF means unless the sender uses classes and properties that are widely understood.
That is why I am somewhat skeptical about just marking metadata up as RDF any old way and throwing it out into the cloud. What good does that do if "the cloud" has no idea what the metadata means and can't interpret it? DSW is an attempt to create RDF that is consistent with (what we believe to be) the way the TDWG community "understands" the DwC classes and their relationships (at least as far as we could infer it from the tdwg-content discussion from September to now). I realize that this closes the world a lot. But a major part of what Cam and I were trying to do was to create a situation where aggregators who want to collect metadata on resources identified by GUIDs can feel reasonably confident that resources of a particular rdf:type from diverse sources actually "mean" the same thing to all of the providers. The fact that using the wrong object properties to describe a resource might be "naughty" will (hopefully) get people to think carefully about the (rdf:)type of the thing they are trying to describe.
Most of the object properties that Cam and I defined in DSW were intended specifically to describe the relationship between two classes in the ontology and nothing else. For example, dsw:idBasedOn is supposed to describe the Token (evidence) on which an Identification is based. We don't intend for that term to be used creatively by people outside the biodiversity informatics community. We intend for it to be used by people who want to clearly state that a particular object resource served as evidence for the subject Identification. Hence, the domain is dwc:Identification and the range is dsw:Token. Any other use is abuse, and if the user applies it as a property of a cat, then that user is stating that the cat is a dwc:Identification. That is silly, and anyone that careless should not use DSW.
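The cat example can be sketched in a few lines of Python. This is only an illustration of the RDFS domain rule, not how DSW or any reasoner is implemented: triples are plain tuples of strings, no RDF library is used, and the resources ex:myCat and ex:photo1 are hypothetical.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

def infer_domain_types(triples):
    """RDFS domain rule: (P rdfs:domain C) and (a P x) entail (a rdf:type C)."""
    triples = set(triples)
    # Collect every (property, class) pair declared with rdfs:domain.
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set()
    for prop, cls in domains:
        for s, p, o in triples:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))
    # Return only the triples that were not already stated.
    return inferred - triples

# The domain declaration described in the message above.
ontology = {("dsw:idBasedOn", RDFS_DOMAIN, "dwc:Identification")}
# A careless (hypothetical) use of the property on a cat.
data = {("ex:myCat", "dsw:idBasedOn", "ex:photo1")}

new_triples = infer_domain_types(ontology | data)
print(new_triples)
```

Note that nothing is rejected: the domain declaration does not stop the triple from being written, it only adds the conclusion that the cat is a dwc:Identification.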
On the other hand, saying that dsw:idBasedOn has the range dsw:Token allows me to indicate that the basis of an Identification (whose evidence was described by somebody else) is rdf:type dsw:Token, even if I did not write the RDF description of that evidence myself. Likewise, applying the term dsw:locatedAt (which has the range dcterms:Location) to a place described by somebody else allows me to assert that the object is a dcterms:Location even if the person who created the RDF description of the place chose not to explicitly rdf:type the place as a dcterms:Location.
In contrast, property terms from the DwC standard (which we feel are in most cases appropriate for use as data rather than object properties) do not necessarily have clear domains and ranges. Some do (e.g. dwc:identifiedBy clearly has the domain dwc:Identification) but others don't. Does dwc:genus have the domain dwc:Taxon or tn:TaxonName? Can it be used as a property of a dwc:Identification? I don't know. Is there a single class that is the domain of dwc:catalogNumber? Probably not. Is there ANY existing class that is the domain of dwc:establishmentMeans? Based on the many posts about that term, I doubt it. The point is that I don't think that there is (yet at least) a consensus on what the domain and range are for many DwC property terms. So Cam and I did not feel comfortable assigning domains and ranges for them, nor dooming people to the sin of naughtiness if they disagreed with us.
I hope this explains our reasoning somewhat. There is, of course, the possibility that we just made a bad call and that none of our object properties should have ranges and domains. Perhaps someone can make counterarguments to what I've just said.
Steve
P.S. BTW it makes me feel so much better to know that I'm not the only one to slip and say "rdfs:type"! Thanks for the careful read, Bob, and we're looking forward to you discovering misstatements and inaccuracies which are undoubtedly present in the DSW documentation.
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A. delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235 office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
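The range argument in the reply above (typing a place that somebody else described) can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: triples are plain string tuples, and the resources their:place42 and my:event7 are hypothetical, as is the choice of dwc:country for the other provider's description.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

def infer_range_types(triples):
    """RDFS range rule: (P rdfs:range C) and (a P x) entail (x rdf:type C)."""
    triples = set(triples)
    ranges = {(s, o) for s, p, o in triples if p == RDFS_RANGE}
    inferred = {
        (o, RDF_TYPE, cls)
        for prop, cls in ranges
        for s, p, o in triples
        if p == prop
    }
    return inferred - triples

# Range declaration mentioned in the message.
ontology = {("dsw:locatedAt", RDFS_RANGE, "dcterms:Location")}
# Somebody else's description of a place, with no explicit rdf:type.
their_data = {("their:place42", "dwc:country", "Ecuador")}
# My own statement that links to their place.
my_data = {("my:event7", "dsw:locatedAt", "their:place42")}

new_triples = infer_range_types(ontology | their_data | my_data)
print(new_triples)
```

The only new triple types the other provider's resource as a dcterms:Location, even though that provider never said so.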
Using a domain for a property will only be a problem if disjoints are stated. Moreover, someone else could add further domains to trigger further rdf:type inferences.
We should be careful, nonetheless, when using domains.
I'm saying this because the word "constraint" does not quite represent what a domain is, and we should avoid it, since it confuses OWL newcomers.
Very good point though.
-- Mikel Egaña Aranguren, PhD http://mikeleganaaranguren.com Marie Curie post-doc at Ontology Engineering Group, UPM http://www.oeg-upm.net/
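A rough Python sketch of the point made in this message: several domain declarations simply add several inferred types, and an inconsistency only appears once two of those classes are declared disjoint. All names here (ex:prop, ex:A, ex:B, ex:thing) are hypothetical, and the consistency check is a toy stand-in for what an OWL reasoner would do.

```python
def types_from_domains(triples):
    """Map each subject to the classes implied by rdfs:domain declarations."""
    domains = {}
    for s, p, o in triples:
        if p == "rdfs:domain":
            domains.setdefault(s, set()).add(o)
    inferred = {}
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.setdefault(s, set()).add(cls)
    return inferred

def is_consistent(types, disjoint_pairs):
    """True unless the resource belongs to both classes of some disjoint pair."""
    return not any(a in types and b in types for a, b in disjoint_pairs)

triples = [
    ("ex:prop", "rdfs:domain", "ex:A"),   # original declaration
    ("ex:prop", "rdfs:domain", "ex:B"),   # domain added later by someone else
    ("ex:thing", "ex:prop", "ex:other"),
]

inferred = types_from_domains(triples)
print(inferred["ex:thing"])                                   # both classes are inferred
print(is_consistent(inferred["ex:thing"], []))                # no disjoints stated
print(is_consistent(inferred["ex:thing"], [("ex:A", "ex:B")]))  # disjoint classes
```

With no disjointness stated, the extra domain is harmless; the domain acts as a source of inferences rather than as a constraint that rejects data.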
Well, at the risk of being scolded, we actually DID declare some classes to be disjoint in the ontology. Those classes are Location, Event, Occurrence, IndividualOrganism, Identification, and Taxon. Based on the discussion of Oct/Nov, it was clear to us that each of those is a distinct type of thing. From the standpoint of DSW, it is "wrong" to type something as both a Location and an Occurrence, or a Taxon and an Event. If one does, a reasoner should have a fit. That is why we have taken care on the Wiki pages to define what we consider each class to be and to support that reasoning with references to either publications or discussion that took place on the list. As I said in a previous email, the point of DSW is to provide clarity about what an RDF file says, not to encourage creativity.
You should note that Token is not included on the list of classes that are mutually disjoint. Another point of discussion between Cam and me was whether there actually should be a class called "Token". In the end, we decided to include it with the understanding that a Token would probably have at least two rdf:types most of the time. There is nothing wrong with that. A digital image can be a dsw:Token and also be a mrtg:MultiMediaObject, dctype:StillImage, and foaf:Image if one cares to assign it all of those types directly, or indirectly via making it the object of a property that has such a range declaration (such as foaf:depiction). A Token can also be an IndividualOrganism if it's a living specimen. I consider typing something to be a message to users which informs them of what kinds of properties to expect to find out about it. If a resource is a Token, we would expect it to have the kinds of properties that evidence has, such as dsw:isBasisForId or dsw:evidenceFor and maybe dwc:catalogNumber.
If it is also a dsw:IndividualOrganism, then we expect it to have properties like dsw:hasIdentification and dsw:hasOccurrence, but if it's a still image then it should have properties like dcterms:creator and intellectual rights properties.
Perhaps we do deserve a scolding for doing this. But as I have said earlier, the purpose of DSW is to facilitate clear and unambiguous communication in RDF, primarily for people who want to combine metadata from different institutions based on diverse kinds of tokens (preserved specimens, images, living specimens, observations, tissue samples, etc.). That really isn't possible if each institution has its own personal idea of what something like an Occurrence is. People who have other objectives may be best served by creating a different DwC-based ontology that meets their own needs.
Steve
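As a rough illustration of the disjointness declarations described in this reply, the following Python sketch flags resources typed with two of the mutually disjoint classes while leaving multiply-typed Tokens alone. It is a toy check, not a reasoner; the dwc: prefix for Event and Occurrence is an assumption, and the ex: resources are hypothetical.

```python
from itertools import combinations

# Classes the message lists as mutually disjoint (prefixes partly assumed).
DISJOINT = {
    "dcterms:Location", "dwc:Event", "dwc:Occurrence",
    "dsw:IndividualOrganism", "dwc:Identification", "dwc:Taxon",
}

def disjoint_clashes(triples):
    """Return {resource: [(classA, classB), ...]} for clashing rdf:type pairs."""
    types_by_resource = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types_by_resource.setdefault(s, set()).add(o)
    clashes = {}
    for resource, types in types_by_resource.items():
        # Any two distinct classes from the disjoint set are a clash.
        pairs = list(combinations(sorted(types & DISJOINT), 2))
        if pairs:
            clashes[resource] = pairs
    return clashes

data = {
    # A digital image: Token plus another type, which is fine.
    ("ex:image1", "rdf:type", "dsw:Token"),
    ("ex:image1", "rdf:type", "dctype:StillImage"),
    # A living specimen: Token and IndividualOrganism, also fine.
    ("ex:livingTree", "rdf:type", "dsw:Token"),
    ("ex:livingTree", "rdf:type", "dsw:IndividualOrganism"),
    # Typed as both a Location and an Occurrence: "wrong" in DSW terms.
    ("ex:oops", "rdf:type", "dcterms:Location"),
    ("ex:oops", "rdf:type", "dwc:Occurrence"),
}

clashes = disjoint_clashes(data)
print(clashes)
```

Because dsw:Token is not in the disjoint set, the image and the living specimen pass, and only the resource typed as both a Location and an Occurrence is reported.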
participants (3)
- Bob Morris
- Mikel Egaña Aranguren
- Steven J. Baskauf