DwC for the semantic web
Dear colleagues,
With the exciting development of Semantic Web technologies, many of us already need a way to consistently express DwC in RDF. In particular, we need it to meet the requirements for GUID resolution (as expressed in the TDWG applicability guide) and to be able to share and aggregate diverse kinds of biodiversity metadata in the Linked Open Data world.
After several months of development, stimulated by the tdwg-content discussions of last Fall, we would like to offer an ontology for consideration, based on Darwin Core terms:
Darwin-SW
General site: http://code.google.com/p/darwin-sw/
Using namespace dsw = "http://purl.org/dsw/"
It is a candidate for general usage, but we do not claim it is _the_ solution, and greatly respect the efforts by others to develop similar ontologies, from which we have learned much. However, for DSW, we wanted to use existing DwC terms for classes and data properties whenever possible and only create new terms when there were no existing terms that would do the job. We did feel that there was a need for clarity in how resources should be typed (i.e. rdf:type property) and for object properties that expressed the relationships among classes unambiguously. Please see the Rationale, DesignPrinciples and ClassesAndTypes wiki pages at the above address.
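To make the typing idea concrete, here is a minimal sketch in plain Python of how a prefixed name such as dsw:IndividualOrganism expands to a full URI, and how an rdf:type statement could be written as an N-Triples line. The dsw namespace is the one given above, and dwc and rdf are the published Darwin Core and RDF namespaces. The subject URI under example.org and the helper names `expand` and `type_triple` are hypothetical.

```python
# Namespace prefixes. "dsw" is the namespace given in this message; "dwc" and
# "rdf" are the published Darwin Core and RDF namespaces.
PREFIXES = {
    "dsw": "http://purl.org/dsw/",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dsw:Token' into a full URI."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES or not local:
        raise ValueError(f"cannot expand {curie!r}")
    return PREFIXES[prefix] + local


def type_triple(subject_uri: str, class_curie: str) -> str:
    """Return one N-Triples line asserting rdf:type for a resource."""
    return f"<{subject_uri}> <{expand('rdf:type')}> <{expand(class_curie)}> ."


# Hypothetical subject URI, for illustration only.
print(type_triple("http://example.org/organism/1", "dsw:IndividualOrganism"))
```

Keeping the prefix table in one place means a change of namespace only has to be made once.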
In the ontology, we sought to embody relationships among classes based on our perception of the community consensus of what the classes represent and how they are related to each other, as expressed in posts to the tdwg-content list. Thus each class is documented carefully on the wiki with hyperlinked references to specific tdwg-content posts. We also sought to clarify or resolve issues that were raised in the list discussion, most notably the relationship among Occurrences and the evidence that documents them (i.e. tokens), and the role of "individuals". See wiki pages for the Token, Occurrence, and IndividualOrganism classes.
The ontology is now in use at http://bioimages.vanderbilt.edu/, and we intend to use it more widely. We would value your opinions on the fitness of this ontology as a general solution for consistently expressing DwC concepts in RDF.
Sincerely,
Steve Baskauf and Cam Webb
steve.baskauf@Vanderbilt.Edu
cwebb@oeb.harvard.edu
Hi;
Perhaps this has been discussed before, but I think there should be a class for population.
Different populations of the same taxon are located in different places, each population with its own circumstances (e.g. some will be at risk and some not).
Cheers
On Thu, 28 Apr 2011, 04:01, Steve Baskauf wrote:
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Mikel, thanks for your suggestion. There was quite a bit of discussion last October and November about how to handle groups of organisms above the level of an individual organism. I won't repeat it here because it's referenced in the DSW wiki pages, particularly http://code.google.com/p/darwin-sw/wiki/ClassIndividual . As a point of clarification, the class dsw:IndividualOrganism as we have defined it in DSW does not specify that an instance of the class must actually be an individual organism. It should be understood to be a taxonomically homogeneous entity which can be assigned zero to many dwc:Identification instances, be recorded by zero to many dwc:Occurrence instances, and be documented by zero to many dsw:Token instances. So really anything that meets those criteria could be considered a dsw:IndividualOrganism, including clonal organisms, colonial organisms, small populations, and I suppose also parts of organisms (such as tissue cultures) if one wanted. At one point we considered using the name TaxonomicallyHomogeneousEntity, but that seemed unwieldy.
So my response would be that a population could be considered an instance of dsw:IndividualOrganism. It would need to be well-defined enough that it could potentially be documented as a dwc:Occurrence, but since dwc:Event and dcterms:Location (which are associated with a dwc:Occurrence) can potentially have a very broad scope, I think that there would not be a problem with classing a population as a dsw:IndividualOrganism. Certainly entities such as a stable animal herd or immobile plant population could be handled by it, and maybe other things as well.
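The definition above can be sketched as a small data structure. This is only an illustration in Python, not part of DSW: the field names (`identifications`, `occurrences`, `tokens`) and all example.org identifiers are hypothetical, and each list simply holds identifiers of dwc:Identification, dwc:Occurrence, and dsw:Token instances.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndividualOrganism:
    """A taxonomically homogeneous entity, as described for dsw:IndividualOrganism.

    Each list may be empty ("zero to many"). Field names are illustrative only.
    """
    uri: str
    identifications: List[str] = field(default_factory=list)  # dwc:Identification
    occurrences: List[str] = field(default_factory=list)      # dwc:Occurrence
    tokens: List[str] = field(default_factory=list)           # dsw:Token


# A single tree documented by a specimen, and a plant population with two
# recorded occurrences but no tokens yet. Both fit the same class.
tree = IndividualOrganism(
    uri="http://example.org/ind/tree-1",
    identifications=["http://example.org/id/1"],
    occurrences=["http://example.org/occ/1"],
    tokens=["http://example.org/specimen/1"],
)
population = IndividualOrganism(
    uri="http://example.org/ind/population-7",
    occurrences=["http://example.org/occ/2", "http://example.org/occ/3"],
)

for entity in (tree, population):
    print(entity.uri, len(entity.identifications),
          len(entity.occurrences), len(entity.tokens))
```

Nothing in the structure records whether the entity is one organism or many, which mirrors the scale independence described above.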
I should also note that the page http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity presents the topic of how to model taxonomically heterogeneous groups of organisms, which was the topic of an extended discussion in November. We did not include a class to handle such groups in our basic ontology, mostly because we didn't know how to do it. However, the wiki page just mentioned suggests a possible approach and perhaps people with more skill in defining OWL ontologies than Cam and I have could suggest an approach that would actually be able to handle that more complex situation.
I did not state in my first email that DSW is essentially a draft intended to foster discussion (such as this). We make no claims that it is or should be "THE" ontology. Cam and I needed something functional for our projects, so we just made DSW to serve that purpose.
Thanks again for the comment/suggestion!
Steve
Hi;
I see, so dsw:IndividualOrganism should do for populations. I'm asking because I have some biodiversity data that perhaps I will publish as Linked Data (depending on the funding :-) and the data always follow the taxon-population pattern, having for each taxon many populations. I would like to use dsw as vocabulary, most probably extending it to accommodate further concepts like Situation (at risk, etc.).
Have you considered including this ontology in Open Biological and Biomedical Ontologies (http://www.obofoundry.org/)?
Cheers
Comments inline
Mikel Egaña Aranguren wrote:
I see, so dsw:IndividualOrganism should do for populations. I'm asking because I have some biodiversity data that perhaps I will publish as Linked Data (depending on the funding :-) and the data always follow the taxon-population pattern, having for each taxon many populations. I would like to use dsw as vocabulary, most probably extending it to accommodate further concepts like Situation (at risk, etc.).
I think that I would be correct in saying that it was our intention that DSW not be any more restrictive than necessary to provide clarity about the types (i.e. rdf:type) of resources and how classes are related to each other (by means of the object properties we defined to specify the relationships among classes). In that sense DSW would "allow" a population to be typed as an IndividualOrganism. Whether that is a good idea or not would, I guess, be up to you. The one issue that comes to my mind is whether you might at some time intend to define smaller units that are subsets of your populations (e.g. subpopulations or actual individual organisms). I think you could do that in the same way that you could have a single dsw:IndividualOrganism instance represent a whole organism and pieces of it that were specimens or tissue samples. This independence of scale of an "Individual" was an idea that Rich Pyle talked about in some of his posts in November. We tried to deal with it through the separation of the somewhat abstract dsw:IndividualOrganism class (representing the relationship between the entity and the Occurrence and Identification classes) from the dsw:Token class that can represent the actual physical "thing" itself (a preserved specimen, living specimen, tissue sample, etc.), all of which can be connected to the same IndividualOrganism instance by the dsw:derivedFrom property. You might be able to do something like that with populations, subpopulations, individual organisms, etc., but we haven't tried modeling that up to this point. Some of the diagrams on the page http://code.google.com/p/darwin-sw/wiki/TokenIssues might be analogous to that.
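As a rough sketch of that separation, the snippet below links several tokens to one dsw:IndividualOrganism instance with dsw:derivedFrom and then groups them by individual. All example.org URIs are hypothetical, and the direction shown (token as subject, individual as object) is an assumption based on the description above.

```python
from collections import defaultdict

DSW = "http://purl.org/dsw/"
DERIVED_FROM = DSW + "derivedFrom"

# (subject, predicate, object) triples. Subjects are hypothetical tokens
# (a preserved specimen, a tissue sample, a living specimen); objects are
# the individuals they are assumed to be derived from.
triples = [
    ("http://example.org/token/specimen-1", DERIVED_FROM, "http://example.org/ind/A"),
    ("http://example.org/token/tissue-1", DERIVED_FROM, "http://example.org/ind/A"),
    ("http://example.org/token/living-1", DERIVED_FROM, "http://example.org/ind/B"),
]


def tokens_by_individual(statements):
    """Group token URIs under the individual they are derived from."""
    grouped = defaultdict(list)
    for subject, predicate, obj in statements:
        if predicate == DERIVED_FROM:
            grouped[obj].append(subject)
    return dict(grouped)


for individual, tokens in tokens_by_individual(triples).items():
    print(individual, "<-", tokens)
```

The same grouping would apply unchanged if the objects were populations or subpopulations rather than single organisms.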
In the spirit of Linked Data, there wouldn't be anything that would prevent you from assigning to populations other properties that were outside of DSW and Darwin Core, or that you defined yourself, such as a hasSituation property or something like that.
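For instance, a population typed as dsw:IndividualOrganism could carry a locally defined property. The sketch below builds a small Turtle document in Python. The `ex:` namespace, the `ex:hasSituation` property, the `ex:AtRisk` value, and the population URI are all hypothetical names invented for illustration; they are not DSW or Darwin Core terms.

```python
def population_turtle(population_id: str, situation: str) -> str:
    """Return Turtle typing a population as dsw:IndividualOrganism and
    attaching a locally defined (hypothetical) ex:hasSituation value."""
    lines = [
        "@prefix dsw: <http://purl.org/dsw/> .",
        "@prefix ex: <http://example.org/terms/> .",  # hypothetical local vocabulary
        "",
        f"<http://example.org/population/{population_id}>",
        "    a dsw:IndividualOrganism ;",
        f"    ex:hasSituation ex:{situation} .",
    ]
    return "\n".join(lines)


print(population_turtle("pop-1", "AtRisk"))
```

Because the extra property lives in its own namespace, consumers that only understand DSW and Darwin Core can ignore it without breaking.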
Have you considered including this ontology in Open Biological and Biomedical Ontologies (http://www.obofoundry.org/)?
At this point, I think that DSW needs to be played with (or shot at??) quite a bit more before we'd post it as a "mature" ontology. Cam and I have played with it enough to know that it will validate and can be read by Linked Data browsers. It works as a way to give people access to metadata when they try to resolve HTTP URI GUIDs. But does it actually "do" everything people want in a "Semantic Web" sense? I really don't know. It hasn't been put in a triple store, tested with SPARQL queries, etc. yet.
Steve
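One way to start that kind of testing, short of loading a real triple store, is to run simple pattern matches over a handful of triples. The sketch below is plain Python, not SPARQL: `match` is a hypothetical helper in which `None` plays the role of a query variable, roughly like the pattern `?s rdf:type dsw:IndividualOrganism`. The example.org URIs are made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DSW = "http://purl.org/dsw/"
DWC = "http://rs.tdwg.org/dwc/terms/"

# A tiny in-memory stand-in for a triple store.
store = [
    ("http://example.org/ind/1", RDF_TYPE, DSW + "IndividualOrganism"),
    ("http://example.org/ind/2", RDF_TYPE, DSW + "IndividualOrganism"),
    ("http://example.org/occ/1", RDF_TYPE, DWC + "Occurrence"),
    ("http://example.org/token/1", DSW + "derivedFrom", "http://example.org/ind/1"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly: SELECT ?s WHERE { ?s rdf:type dsw:IndividualOrganism }
individuals = [s for s, _, _ in match(store, p=RDF_TYPE, o=DSW + "IndividualOrganism")]
print(individuals)
```

A real evaluation would load the published RDF into a triple store and run actual SPARQL queries; this only shows the shape of the question being asked.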
2011/4/28 Steve Baskauf steve.baskauf@vanderbilt.edu:
As a point of clarification, the class dsw:IndividualOrganism as we have defined it in DSW does not specify that an instance of the class must actually be an individual organism [...] At one point we considered using the name TaxonomicallyHomogeneousEntity, but that seemed unwieldy.
If that's what it is, then that's what it should be called, in my opinion. To have the term IndividualOrganism not actually mean "individual organism" is asking for trouble. Call it what it is, then try to think of a better term. But don't use the wrong term just because it's less unwieldy.
That's how I name things, as a programmer (not an ontology-creator). In general, if I find that a thing is hard to name, then I don't really understand what it represents. Some concepts are simply unwieldy, but the name should still be accurate.
My two cents - no doubt worth less. :)
///ark Web Applications Developer Center for Applied Biodiversity Informatics California Academy of Sciences
This is a good point and one with which I generally agree. In this particular case, the term IndividualOrganism has its roots in the discussion of last October/November. At that time, I had proposed adding a class to Darwin Core called dwc:Individual which would be the object of the existing term dwc:individualID. In that discussion, various options were thrown around for the class name, but in the end it was left as Individual to correspond with the individualID term (in the same way that dwc:occurrenceID goes with dwc:Occurrence, dwc:eventID goes with dwc:Event, etc.). Since Cam's and my intention was for the class dsw:IndividualOrganism to mean the same thing as I had originally wanted the proposed dwc:Individual class to mean, we stuck with the "Individual" part even though the definition always allowed for small groups of organisms (see the definition of dwc:individualID at http://rs.tdwg.org/dwc/terms/index.htm#individualID). However, since DSW is designed for the semantic web/LOD world and since "individual" has a different technical meaning in that context, we thought "IndividualOrganism" was clearer than just "Individual" (I think Pete DeVries first suggested IndividualOrganism somewhere in the discussion). Whether that was the best choice, I don't know. Often "IndividualOrganism" actually does correspond to an individual organism, so calling it that does allow for a certain amount of visualization of what it represents, although as the discussion of Oct/Nov showed, there may be problems associated with trying to put too much concrete meaning on something that is essentially abstract.
By the way, as far as I know, the proposal for the dwc:Individual class is still on the table, since the TAG has taken no action to submit it to an up or down vote. Given that, I suppose there is still some rationale for maintaining a connection between dsw:IndividualOrganism and the proposed dwc:Individual class, although as the definition of the dwc:Individual class was left when the subject was dropped, it was allowed to have instances which were taxonomically heterogeneous (unlike dsw:IndividualOrganism). In that sense, the proposed dwc:Individual class is more like what we called "LivingEntity" in the alternative ontology (see http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity for more on this). At this point, I have lost interest in the dwc:Individual class proposal, since with DSW we are accomplishing what I wanted to do with that proposal (i.e. provide clearly defined class names that could be used with rdf:type, provide a means to infer duplicates, and allow for resampling). In particular, we reached the conclusion that the dwc:xxxxxxID terms were not useful in the context of RDF since their meaning was not clear. Instead, we defined new terms to connect the classes that had clear domains and ranges, so dwc:individualID is not really relevant to the discussion in the context of DSW.
Steve
On Apr 28, 2011, at 3:55 PM, Mark Wilden wrote:
If that's what it is, then that's what it should be called, in my opinion.
I would completely echo this. Although in some ways the label of a class is arbitrary and what counts is the definition, we communicate meaning through language, and thus how you name something matters.
Perhaps we can just try to pool ideas for a better name and see what comes up?
-hilmar
A paraphrase of a famous saying in my field is this:
There are only two hard problems in Computer Science: cache invalidation, naming things and off-by-one errors
///ark Web Applications Developer Center for Applied Biodiversity Informatics California Academy of Sciences
Well, I wouldn't be opposed to that, particularly since the class currently called dsw:IndividualOrganism isn't really the same thing as the proposed dwc:Individual class as it is defined anyway.
Steve
participants (4)
- Hilmar Lapp
- Mark Wilden
- Mikel Egaña Aranguren
- Steve Baskauf