Embedding specimen (and other) annotations in NeXML
Hi Roger and everyone else involved in DwC and the core ontology,
in preparation for the upcoming hackathon [1] here we've worked our way through a number of use-cases for attaching metadata to data elements in NeXML [2] in a way that is semantically defined. The results are here:
http://evoinfo.nescent.org/Database_Interop_Hackathon/Metadata_Support
If you can take a critical look specifically at the section on 'Specimens within collections' (http://tinyurl.com/djdby3) that'd be great.
Specific questions related to that:
1) Is this using DarwinCore in a correct and/or sanctioned way?
2) Is there a TDWG vocabulary (in RDF or OWL) that has a relation for referring to a specimen record, or are you aware of another one that has this?
3) I could only find the XML schema for DarwinCore. Are there any plans for (or did I miss the existence of) a corresponding ontology defined in OWL or RDF, similar to Dublin Core (which has an RDF ontology at http://purl.org/dc/elements/1.1 and a DCterms ontology at http://purl.org/dc/terms)?
4) It wasn't immediately clear how one would use the TDWG core ontology for doing the same thing in RDF (or OWL) - is this ontology supposed to be used yet, and are there usage examples, or does someone have a recommendation for how one would write the same example using the core ontology?
I know it's a rather tiny subset of DwC that we're using here but that's only a start and one use-case.
Of course, any other feedback to or suggestions for this or any of the other stuff on that page is welcome too!
Cheers,
-hilmar
[1] http://evoinfo.nescent.org/Database_Interop_Hackathon
[2] http://nexml.org
Hi Hilmar,
Just a quick answer right now to point you to rdf for the DwC (similar to Dublin Core's rdf) that has just begun the review process as a TDWG Standard.
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf
This is a preview, but I suspect it will not change too much. I'd call your attention to the Sample class as the nearest thing to a Specimen (because it would also be an Observation). Rendered documentation can be found at http://rs.tdwg.org/dwc/terms/index.htm#Sample.
Cheers,
John
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
On Feb 22, 2009, at 5:12 PM, John R. WIECZOREK wrote:
Just a quick answer right now to point you to rdf for the DwC (similar to Dublin Core's rdf) that has just begun the review process as a TDWG Standard.
Awesome! This is going to be really useful I think.
-hilmar
BTW the file at the above URL doesn't parse (at least not in Protege).
"[line=127:column=796] URI 'http://rs.tdwg.org/dwc/curatorial/YearIdentified-2003-06-17, MonthIdentified-2003-06-17, DayIdentified-2003-06-17' cannot be resolved against curent base URI http://rs.tdwg.org/dwc/rdf/dwcterms.rdf"
(The root exception is java.net.URISyntaxException: Illegal character in path at index 60: http://rs.tdwg.org/dwc/curatorial/YearIdentified-2003-06-17, MonthIdentified-2003-06-17, DayIdentified-2003-06-17)
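The index in the exception can be checked by hand. The following is only a sketch, written in Python rather than the Java that Protege runs on, and it assumes a deliberately simplified notion of "illegal character" (whitespace only). It locates the first offending character in the attribute value quoted in the error message:

```python
# The attribute value reported by the parser: three term URIs packed
# into a single rdf:resource value, separated by ", ".
value = (
    "http://rs.tdwg.org/dwc/curatorial/YearIdentified-2003-06-17, "
    "MonthIdentified-2003-06-17, DayIdentified-2003-06-17"
)


def first_whitespace_index(uri: str) -> int:
    """Return the index of the first whitespace character, or -1 if none.

    Simplification (assumption): a real URI parser rejects more than
    whitespace, but whitespace is enough to explain this error.
    """
    for index, char in enumerate(uri):
        if char.isspace():
            return index
    return -1


position = first_whitespace_index(value)
print(position, repr(value[position]))
```

The first whitespace sits at index 60, right after the comma that ends the first term URI, which agrees with the index in the exception. This suggests that the problem is the shape of the value rather than whether the URIs resolve.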
Those URIs will not resolve; they are URIs of elements that only ever appeared in XML schemas in the past. They are needed in there for completeness.
Well they prevent the ontology from loading in Protege. Are you saying that the java.net.URI parser has a bug?
-hilmar
Hilmar, Protege seems to be right: it is clearly not a valid URI, and we should probably use three separate rdfs:replaces statements instead in this case. I was just surprised to see that the W3C RDF validator validates the file perfectly fine. A strange world.
Markus
No, no bug. I missed the salient point of your previous message in my haste. I see now that the problem is the content of the rdfs:replaces rdf:resource="", not a problem of resource resolution as I originally assumed. Interesting that the document was valid using the w3c RDF validator.
I have updated the rdf document with fixes that separate the replaced terms into their own instances of rdfs:replaces. Let me know if you have any other issues with the document. It is very useful having someone actually trying to make use of it at this stage in the review.
Thanks,
John
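The fix described above, one rdfs:replaces per replaced term, can be sketched as follows. This is an illustration and not the script used to build the actual file. It assumes that the second and third names in the combined value belong to the same http://rs.tdwg.org/dwc/curatorial/ namespace as the first, and the helper name split_replaced_terms is made up for the example.

```python
from typing import List
from urllib.parse import urljoin

# Assumption: all three replaced terms live in the curatorial namespace.
BASE = "http://rs.tdwg.org/dwc/curatorial/"

combined = (
    "http://rs.tdwg.org/dwc/curatorial/YearIdentified-2003-06-17, "
    "MonthIdentified-2003-06-17, DayIdentified-2003-06-17"
)


def split_replaced_terms(value: str, base: str) -> List[str]:
    """Split a comma-separated value into one absolute URI per term."""
    parts = [part.strip() for part in value.split(",") if part.strip()]
    # urljoin leaves absolute URIs alone and resolves bare names against base.
    return [urljoin(base, part) for part in parts]


for uri in split_replaced_terms(combined, BASE):
    print(f'<rdfs:replaces rdf:resource="{uri}"/>')
```

Each emitted element then carries exactly one URI with no embedded comma or whitespace.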
Thanks John - this now works and loads into Protege. Great!
A couple of random comments from a first inspection:
- There are lots of individuals that seem to correspond to classes and object properties (and seem to be replaced by them, which sounds odd). Is this by intention?
- There is a class 'none'. Why? (It doesn't have a definition either. It seems to be used to indicate that some properties don't have a domain, but if that's true why not simply not provide a domain for those?)
- seeAlso often has the value 'not in ABCD', which is probably the wrong kind of value for this annotation.
- There are two classes with label 'Taxon' (but different URIs). Intentional?
- Some properties and some classes have the date suffix in their label. Intentional?
- Many classes and all object properties that aren't a subProperty have a date suffix in their identifier. Why, and if there is a good reason, why not all? Is term versioning the motivation? If so, are there so many updates expected that versions couldn't be handled by ontology releases (i.e., versioning through the ontology namespace, such as in http://purl.org/dc/elements/1.1)?
The date suffix will become part of the element name in OWL instance documents:
<dwcterms:Taxon-2008-11-29>
  <dwcterms:ScientificName-2009-01-21>Ictalurus punctatus</dwcterms:ScientificName-2009-01-21>
</dwcterms:Taxon-2008-11-29>
Or am I missing something?
-hilmar
Comments inline...
On Mon, Feb 23, 2009 at 8:29 AM, Hilmar Lapp hlapp@duke.edu wrote:
Thanks John - this now works and loads into Protege. Great!
A couple of random comments from a first inspection:
- There are lots of individuals that seem to correspond to classes and
object properties (and seem to be replaced by them, which sounds odd). Is this by intention?
I don't know what you mean by this. Can you give me an example?
- There is a class 'none'. Why? (It doesn't have a definition either. It
seems to be used to indicate that some properties don't have a domain, but if that's true why not simply not provide a domain for those?)
Shouldn't be a class 'none'. I'm removing it.
- seeAlso often has the value 'not in ABCD', which is probably the wrong
kind of value for this annotation.
None of those values is technically very useful, as they are references to xpaths in a schema. As such, "not in ABCD" is no different functionally speaking from any of the other values. If seeAlso is not appropriate for this type of content (where is the reasoning for that?), then there really should be another attribute for this commentary. I just didn't want to have to make one if it wasn't necessary.
- There are two classes with label 'Taxon' (but different URIs).
Intentional?
I see only one. What are the URIs?
- Some properties and some classes have the date suffix in their label.
Intentional?
Every one of them has a date suffix in its label as far as I can see. What are the exceptions? Is it intentional to have the suffix? Yes. It is a direct consequence of the versioning mechanism.
- Many classes and all object properties that aren't a subProperty have a
date suffix in their identifier. Why, and if there is a good reason, why not all? Is term versioning the motivation? If so, are there so many updates expected that versions couldn't be handled by ontology releases (i.e., versioning through the ontology namespace, such as in http://purl.org/dc/elements/1.1)?
Again, all of them have the date suffix as far as I can see. Yes, term versioning is the motivation. They are all in the one document so that the entirety of the documentation, including history, can be produced from it. It isn't so much that there are a lot of updates expected as that there is a lot of historical nonsense to resolve unequivocally. Here it is all in one place.
The date suffix will become part of the element name in OWL instance documents:
<dwcterms:Taxon-2008-11-29>
  <dwcterms:ScientificName-2009-01-21>Ictalurus punctatus</dwcterms:ScientificName-2009-01-21>
</dwcterms:Taxon-2008-11-29>
Or am I missing something?
No, you aren't missing anything. You're right. So, what is lacking in the rdf is a set of current term names without date suffixes, such as dwcterms:Taxon and dwcterms:ScientificName that can persist despite versioning. That was the whole point of the persistent URIs for the terms - so that dwcterms:ScientificName would persist despite the details of its attributes. My automated construction of the rdf document doesn't take this into account. The solution, I think, is to add each base term without version information and give these terms rdfs:replaces with the value of the term with the version number. This would give a complete progression for any term up to the one currently in use. I have done that in a copy of the file just committed. Does that solve the problem adequately? Does it create any new ones?
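The relationship between a versioned term and its persistent base term can be sketched mechanically. The snippet below is an illustration only, not the generator behind the actual file. It assumes that every version suffix has the form -YYYY-MM-DD, and the function name base_term is made up for the example.

```python
import re

# Matches a trailing date suffix such as "-2009-01-21" on a term name.
DATE_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def base_term(versioned: str) -> str:
    """Strip the date suffix; names without a suffix are returned unchanged."""
    return DATE_SUFFIX.sub("", versioned)


versioned_terms = ["Taxon-2008-11-29", "ScientificName-2009-01-21"]
for term in versioned_terms:
    # Each persistent term points back at the dated version it supersedes.
    print(f"dwcterms:{base_term(term)} replaces dwcterms:{term}")
```

With base terms like these, instance documents could use dwcterms:Taxon and dwcterms:ScientificName as element names, and the dated names would remain available for tracing history.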
On Feb 23, 2009, at 5:55 PM, John R. WIECZOREK wrote:
On Mon, Feb 23, 2009 at 8:29 AM, Hilmar Lapp hlapp@duke.edu wrote:
Thanks John - this now works and loads into Protege. Great!
A couple of random comments from a first inspection:
- There are lots of individuals that seem to correspond to classes
and object properties (and seem to be replaced by them, which sounds odd). Is this by intention?
I don't know what you mean by this. Can you give me an example?
As far as I can see the replaces relationships to classes have gone away. For object properties, an example is AcceptedTaxon:
<rdf:Description rdf:ID="AcceptedTaxon">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
  <rdfs:domain rdf:resource="http://rs.tdwg.org/dwc/terms/Taxon"/>
  <rdfs:replaces rdf:resource="http://rs.tdwg.org/dwc/terms/AcceptedTaxon-2008-11-19"/>
</rdf:Description>
This seems fine.
BTW note that unless I'm missing something the base URLs http://rs.tdwg.org/dwc/terms/ and http://rs.tdwg.org/dwc/rdf/dwcterms.rdf result in different URIs for concepts (hence making them effectively different) for what you may want to be the same (the URLs seem to be used interchangeably - but maybe the idea behind this is different?).
[...]
- seeAlso often has the value 'not in ABCD', which is probably the
wrong kind of value for this annotation.
None of those values is technically very useful, as they are references to xpaths in a schema. As such, "not in ABCD" is no different functionally speaking from any of the other values. If seeAlso is not appropriate for this type of content (where is the reasoning for that?)
According to the RDF Schema spec, rdfs:seeAlso is "[...] used to indicate a resource that might provide additional information about the subject resource." The value of the property is supposed to be an rdfs:Resource (i.e., a URI to/for a resource).
See http://www.w3.org/TR/rdf-schema/#ch_seealso
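A rough way to spot seeAlso values that are free text rather than resource references is sketched below. The check is a simplification made up for this illustration: it only accepts values with both a scheme and a host, so it would also reject legitimate URIs without a host part (for example URNs).

```python
from urllib.parse import urlparse


def looks_like_uri(value: str) -> bool:
    """Heuristic: has a scheme and a host, and contains no whitespace."""
    if any(char.isspace() for char in value):
        return False
    parsed = urlparse(value)
    return bool(parsed.scheme) and bool(parsed.netloc)


candidates = [
    "not in ABCD",
    "http://www.w3.org/TR/rdf-schema/#ch_seealso",
]
for candidate in candidates:
    print(f"{candidate!r}: {looks_like_uri(candidate)}")
```

Values that fail such a check are better carried by a literal-valued annotation property than by rdfs:seeAlso.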
[...]
- There are two classes with label 'Taxon' (but different URIs).
Intentional?
I see only one. What are the URIs?
Now there are actually three:
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#Taxon
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#Taxon-2008-11-19
http://rs.tdwg.org/dwc/terms/Taxon
All three have label "Taxon", and are classes, and there is no asserted relationship between them.
- Some properties and some classes have the date suffix in their
label. Intentional?
Every one of them has a date suffix in its label as far as I can see. What are the exceptions?
There are no classes anymore now with a suffix in the label, but many of the properties have it. For example:
SamplingEvent-2008-11-19 LivingSpecimen-2008-11-19 PreservedSpecimen-2008-11-19
For those there seems to be another term without the suffix.
[...] Again, all of them have the date suffix as far as I can see. Yes, term versioning is the motivation. They are all in the one document so that the entirety of the documentation, including history, can be produced from it. It isn't so much that there are a lot of updates expected as that there is a lot of historical nonsense to resolve unequivocally. Here it is all in one place.
Maybe it would be worth producing two versions - one with the history that allows the maintainers to resolve issues, and one for public consumption that has the cruft removed?
[...] The solution, I think, is to add each base term without version information and give these terms rdfs:replaces with the value of the term with the version number. This would give a complete progression for any term up to the one currently in use. I have done that in a copy of the file just committed. Does that solve the problem adequately? Does it create any new ones?
Looks a lot better now - thanks!
BTW note that the server returns the ontology with mime-type text/html rather than application/rdf+xml. Not a big deal, just the browser will try to display it rather than downloading.
-hilmar
Comments inline...
On Tue, Feb 24, 2009 at 10:52 AM, Hilmar Lapp hlapp@duke.edu wrote:
On Feb 23, 2009, at 5:55 PM, John R. WIECZOREK wrote:
On Mon, Feb 23, 2009 at 8:29 AM, Hilmar Lapp hlapp@duke.edu wrote:
Thanks John - this now works and loads into Protege. Great!
A couple of random comments from a first inspection:
- There are lots of individuals that seem to correspond to classes
and object properties (and seem to be replaced by them, which sounds odd). Is this by intention?
I don't know what you mean by this. Can you give me an example?
As far as I can see the replaces relationships to classes have gone away. For object properties, an example is AcceptedTaxon:
<rdf:Description rdf:ID="AcceptedTaxon">
  <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
  <rdfs:domain rdf:resource="http://rs.tdwg.org/dwc/terms/Taxon"/>
  <rdfs:replaces rdf:resource="http://rs.tdwg.org/dwc/terms/AcceptedTaxon-2008-11-19"/>
</rdf:Description>
This seems fine.
BTW note that unless I'm missing something the base URLs http://rs.tdwg.org/dwc/terms/ and http://rs.tdwg.org/dwc/rdf/dwcterms.rdf result in different URIs for concepts (hence making them effectively different) for what you may want to be the same (the URLs seem to be used interchangeably - but maybe the idea behind this is different?).
I think this is a result of not including xmlns="http://rs.tdwg.org/dwc/terms/" in the rdf:RDF section. Now all of those terms with IDs and no namespaces should have the xmlns="http://rs.tdwg.org/dwc/terms/" namespace and the unintended collision will go away. Correct?
[...]
- seeAlso often has the value 'not in ABCD', which is probably the
wrong kind of value for this annotation.
None of those values is technically very useful, as they are references to xpaths in a schema. As such, "not in ABCD" is no different functionally speaking from any of the other values. If seeAlso is not appropriate for this type of content (where is the reasoning for that?)
According to the RDF Schema spec, rdfs:seeAlso is "[...] used to indicate a resource that might provide additional information about the subject resource." The value of the property is supposed to be an rdfs:Resource (i.e., a URI to/for a resource).
Changed this to a dwc-defined attribute abcdEquivalence, defined in a new version of http://rs.tdwg.org/dwc/terms/attributes/dwcattributes.rdf
[...]
- There are two classes with label 'Taxon' (but different URIs).
Intentional?
I see only one. What are the URIs?
Now there are actually three:
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#Taxon
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#Taxon-2008-11-19
http://rs.tdwg.org/dwc/terms/Taxon
All three have label "Taxon", and are classes, and there is no asserted relationship between them.
There should only be Taxon and its historical version Taxon-2008-11-19, both in the namespace http://rs.tdwg.org/dwc/terms now. The labels are intentionally the same.
- Some properties and some classes have the date suffix in their
label. Intentional?
Every one of them has a date suffix in its label as far as I can see. What are the exceptions?
There are no classes anymore now with a suffix in the label, but many of the properties have it. For example:
SamplingEvent-2008-11-19 LivingSpecimen-2008-11-19 PreservedSpecimen-2008-11-19
For those there seems to be another term without the suffix.
Vocabulary has been removed to its own file http://rs.tdwg.org/dwc/rdf/dwctype.rdf.
[...] Again, all of them have the date suffix as far as I can see. Yes, term versioning is the motivation. They are all in the one document so that the entirety of the documentation, including history, can be produced from it. It isn't so much that there are a lot of updates expected as that there is a lot of historical nonsense to resolve unequivocally. Here it is all in one place.
Maybe it would be worth producing two versions - one with the history that allows the maintainers to resolve issues, and one for public consumption that has the cruft removed?
Good suggestion. That's what I have done. There are now three files:
http://rs.tdwg.org/dwc/rdf/dwctype.rdf for types,
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf for the current terms, and
http://rs.tdwg.org/dwc/rdf/dwctermshistory.rdf for all terms with their historical versions.
[...] The solution, I think, is to add each base term without version information and give these terms rdfs:replaces with the value of the term with the version number. This would give a complete progression for any term up to the one currently in use. I have done that in a copy of the file just committed. Does that solve the problem adequately? Does it create any new ones?
Looks a lot better now - thanks!
BTW note that the server returns the ontology with mime-type text/html rather than application/rdf+xml. Not a big deal, just the browser will try to display it rather than downloading.
Updated the mime-types in the rs.tdwg.org repository. These had only been set in the development repository.
On Feb 25, 2009, at 3:44 PM, John R. WIECZOREK wrote:
[...] I think this is a result of not including xmlns="http://rs.tdwg.org/dwc/terms/" in the rdf:RDF section. Now all of those terms with IDs and no namespaces should have the xmlns="http://rs.tdwg.org/dwc/terms/" namespace and the unintended collision will go away. Correct?
That doesn't help with that really. I think what you want here is using xml:base rather than xmlns="...".
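The reason a default xmlns declaration does not help is that it only qualifies unprefixed element names. Attribute values such as the one in rdf:ID are not touched by it; in RDF/XML those are resolved against the base URI, which is what xml:base sets. A small sketch with Python's plain XML parser (not an RDF parser) shows the first half of this. The element name exampleProperty is hypothetical and exists only for the demonstration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A tiny document with a default namespace declared on rdf:RDF.
DOC = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://rs.tdwg.org/dwc/terms/">
  <rdf:Description rdf:ID="Taxon">
    <exampleProperty>illustration only</exampleProperty>
  </rdf:Description>
</rdf:RDF>
"""

root = ET.fromstring(DOC)
description = root[0]
child = description[0]

# The default namespace is applied to the unprefixed element name ...
print(child.tag)
# ... but the value of the rdf:ID attribute is left exactly as written.
print(description.attrib[f"{{{RDF_NS}}}ID"])
```

The element name comes back qualified with http://rs.tdwg.org/dwc/terms/, while the rdf:ID value is still the bare string Taxon.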
[...] Vocabulary has been removed to its own file http://rs.tdwg.org/dwc/rdf/dwctype.rdf.
Looks great, and the proper inheritance is there. Cool!
[...] Updated the mime-types in the rs.tdwg.org repository. These had only been set in the development repository.
Oddly enough the mime-type is still text/html (maybe you need to tell apache to serve them with the right mime type?).
The file at http://rs.tdwg.org/dwc/rdf/dwcterms.rdf now has a syntax error in the rdf:RDF element attributes: the last xmlns="..." attribute lacks the closing double quote, which of course makes the XML parser bomb.
When I add that, the content looks great, except that all terms are duplicated, once with a local URI and once with the http://rs.tdwg.org/dwc/terms/ URI. Replacing the last xmlns="..." attribute with xml:base (while keeping the value) fixes this partially. Terms are still duplicated, now once with and once without the leading hash but otherwise the same base URI.
I can fix that by replacing rdf:ID with rdf:about. rdf:ID creates local instances, so maybe that's what you wanted to do, but if not, rdf:about would be the more appropriate attribute anyway (i.e., you are describing a resource, namely the term).
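The difference between the two attributes can be approximated with ordinary relative-URI resolution: in RDF/XML, rdf:ID="x" stands for the URI formed from the base plus "#x", while rdf:about takes a URI reference that is resolved against the base as is. The sketch below uses Python's urljoin as a stand-in for what an RDF parser does, and the two helper names are made up for the illustration.

```python
from urllib.parse import urljoin

XML_BASE = "http://rs.tdwg.org/dwc/terms/"
DOCUMENT_URL = "http://rs.tdwg.org/dwc/rdf/dwcterms.rdf"


def uri_for_rdf_id(base: str, name: str) -> str:
    """rdf:ID="name" identifies the resource <base>#name."""
    return urljoin(base, "#" + name)


def uri_for_rdf_about(base: str, reference: str) -> str:
    """rdf:about="reference" is resolved as an ordinary URI reference."""
    return urljoin(base, reference)


# No xml:base: the document's own URL serves as the base.
print(uri_for_rdf_id(DOCUMENT_URL, "Taxon"))
# With xml:base set, rdf:ID still inserts a hash ...
print(uri_for_rdf_id(XML_BASE, "Taxon"))
# ... whereas rdf:about yields the intended term URI.
print(uri_for_rdf_about(XML_BASE, "Taxon"))
```

The three results are http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#Taxon, http://rs.tdwg.org/dwc/terms/#Taxon and http://rs.tdwg.org/dwc/terms/Taxon, which matches the duplicates seen with and without the leading hash.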
After that, the only remaining issue is that term URIs with the http://rs.tdwg.org/dwc/terms/ base URI don't resolve. They aren't mandated to, but using term URIs that don't resolve doesn't allow anyone to retrieve any semantics (or even the label). Can you instruct the web server (through redirect, for example) to make them resolve?
-hilmar
Comments inline...
On Wed, Feb 25, 2009 at 6:23 PM, Hilmar Lapp hlapp@duke.edu wrote:
On Feb 25, 2009, at 3:44 PM, John R. WIECZOREK wrote:
[...] I think this is a result of not including xmlns="http://rs.tdwg.org/dwc/terms/" in the rdf:RDF section. Now all of those terms with IDs and no namespaces should have the xmlns="http://rs.tdwg.org/dwc/terms/" namespace and the unintended collision will go away. Correct?
That doesn't help with that really. I think what you want here is using xml:base rather than xmlns="...".
Right. I have made it so.
[...] Vocabulary has been removed to its own file http://rs.tdwg.org/dwc/rdf/dwctype.rdf.
Looks great, and the proper inheritance is there. Cool!
[...] Updated the mime-types in the rs.tdwg.org repository. These had only been set in the development repository.
Oddly enough the mime-type is still text/html (maybe you need to tell apache to serve them with the right mime type?).
I don't have access to the web server configuration on the TDWG site. Markus? I wonder if you have the same issue from the development repository. See http://code.google.com/p/darwincore/source/browse/trunk/rdf/dwctype.rdf and http://code.google.com/p/darwincore/source/browse/trunk/rdf/dwcterms.rdf and note the file properties.
The file at http://rs.tdwg.org/dwc/rdf/dwcterms.rdf now has a syntax error in the rdf:RDF element attributes: the last xmlns="..." attribute lacks the closing double quote, which of course makes the XML parser bomb.
Fixed.
When I add that, the content looks great, except that all terms are duplicated, once with a local URI and once with the http://rs.tdwg.org/dwc/terms/ URI. Replacing the last xmlns="..." attribute with xml:base (while keeping the value) fixes this partially. Terms are still duplicated, now once with and once without the leading hash but otherwise the same base URI.
I can fix that by replacing rdf:ID with rdf:about. rdf:ID creates local instances, so maybe that's what you wanted to do, but if not, rdf:about would be the more appropriate attribute anyway (i.e., you are describing a resource, namely the term).
That is indeed what I want. Now it is done.
After that, the only remaining issue is that term URIs with the http://rs.tdwg.org/dwc/terms/ base URI don't resolve. They aren't mandated to, but using term URIs that don't resolve doesn't allow anyone to retrieve any semantics (or even the label). Can you instruct the web server (through redirect, for example) to make them resolve?
I can't, again because I don't have access to the web server. Markus?
Was this particular question answered?
"2) Is there a TDWG vocabulary (in RDF or OWL) that has a relation for referring to a specimen record, or are you aware of another one that has this?"
Is the Occurrence RDF vocab at http://rs.tdwg.org/ontology/voc/TaxonOccurrence what you are looking for? I'm not sure I fully understood what you are after, but have a look and see if it matches your requirements.
Kevin
-----Original Message----- From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Hilmar Lapp Sent: Monday, 23 February 2009 11:04 a.m. To: Technical Architecture Group mailing list; rogerhyam Hyam Cc: Enrico Pontelli; Rutger A. Vos; Arlin Stoltzfus; nexml-discuss@lists.sourceforge.net; Brandon Chisham Subject: [tdwg-tag] Embedding specimen (and other) annotations in NeXML
On Feb 23, 2009, at 2:26 PM, Kevin Richards wrote:
Was this particular question answered?
No, not yet.
"2) Is there a TDWG vocabulary (in RDF or OWL) that has a relation for referring to a specimen record, or are you aware of another one that has this?"
Is the Occurrence RDF vocab at http://rs.tdwg.org/ontology/voc/TaxonOccurrence what you are looking for? I'm not sure I fully understood what you are after, but have a look and see if it matches your requirements.
Thanks for the link. It seems nearly identical to DwC or shares at least a lot of the terms, no? But it doesn't seem to have an appropriate relation either.
What I am looking for, formally speaking, is an object property that would have a specimen as its range, and hence allows me to relate a specimen record to some entity (a class or an individual). Am I making sense?
There are some pieces along those lines in the DwC terms, but they don't (seem to) connect:
- There are classes PreservedSpecimen, FossilSpecimen, and LivingSpecimen, but they don't have a supertype (other than owl:Thing) that you could make the range.
- There is a class Sample, but that doesn't have any of the Specimen classes as subtypes. Maybe that's correct since a sample and a specimen aren't really the same thing (rather, the former is obtained from the latter).
- There is an object property Sample-2008-11-19 (domain is the class 'none'), but there is no further annotation as to what that property should mean.
- There are object properties LivingSpecimen-2008-11-19 and PreservedSpecimen-2008-11-19 that have PhysicalObject as their super-property (why not Specimen?), and there is FossilSpecimen-2008-11-19 which has PreservedSpecimen (no date suffix) as super-property.
-hilmar
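To make the request above concrete, here is a minimal sketch in plain Python (no RDF library) of an object property whose range is a specimen class. Only the three specimen class names come from the message. The common supertype `ex:Specimen`, the property `ex:hasSpecimen`, and the individuals `ex:otu1` and `ex:specimen42` are hypothetical and are not part of DwC or the TDWG ontology. The sketch also treats rdfs:range as a consistency check, which is a simplification: in RDFS a range statement licenses an inference rather than enforcing a constraint.

```python
# Triples are (subject, predicate, object) tuples; all "ex:" names are hypothetical.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
RANGE = "rdfs:range"

schema = {
    # Assumed common supertype for the three specimen classes.
    ("ex:PreservedSpecimen", SUBCLASS_OF, "ex:Specimen"),
    ("ex:FossilSpecimen", SUBCLASS_OF, "ex:Specimen"),
    ("ex:LivingSpecimen", SUBCLASS_OF, "ex:Specimen"),
    # The object property being asked for: its range is the supertype.
    ("ex:hasSpecimen", RANGE, "ex:Specimen"),
}

data = {
    ("ex:specimen42", RDF_TYPE, "ex:FossilSpecimen"),
    ("ex:otu1", "ex:hasSpecimen", "ex:specimen42"),
}


def superclasses(cls, triples):
    """Return cls and every class reachable from it via rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(o for s, p, o in triples if s == current and p == SUBCLASS_OF)
    return seen


def range_violations(data, schema):
    """List data triples whose object is not typed within the property's range."""
    ranges = {s: o for s, p, o in schema if p == RANGE}
    violations = []
    for s, p, o in data:
        if p not in ranges:
            continue
        types = {t for subj, pred, t in data if subj == o and pred == RDF_TYPE}
        reachable = set()
        for t in types:
            reachable |= superclasses(t, schema)
        if ranges[p] not in reachable:
            violations.append((s, p, o))
    return violations
```

With a shared supertype in place, a single property can point at any kind of specimen. Without it, the range would have to be declared once per specimen class or left as owl:Thing.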
Ok, got you. This occurrence vocab is part of the larger tdwg architecture vocabs (http://wiki.tdwg.org/twiki/bin/view/TAG/LsidVocs). So there are references to Specimen objects/records, but only as a link to another TDWG object, eg http://rs.tdwg.org/ontology/voc/TaxonConcept#circumscribedBy.
Can you just use an rdf:Resource with type http://rs.tdwg.org/ontology/voc/TaxonOccurrence ?
Kevin
On Feb 23, 2009, at 4:43 PM, Kevin Richards wrote:
Can you just use an rdf:Resource with type http://rs.tdwg.org/ontology/voc/TaxonOccurrence ?
That wouldn't make a relation (object property), would it?
-hilmar
Comments inline...
On Mon, Feb 23, 2009 at 11:49 AM, Hilmar Lapp hlapp@duke.edu wrote:
On Feb 23, 2009, at 2:26 PM, Kevin Richards wrote:
Was this particular question answered?
No, not yet.
"2) Is there a TDWG vocabulary (in RDF or OWL) that has a relation for referring to a specimen record, or are you aware of another one that has this?"
Is the Occurrence RDF vocab at http://rs.tdwg.org/ontology/voc/TaxonOccurrence what you are looking for? I'm not sure I fully understood what you are after, but have a look and see if it matches your requirements.
Thanks for the link. It seems nearly identical to DwC or shares at least a lot of the terms, no? But it doesn't seem to have an appropriate relation either.
It is no accident that they have a lot in common. The idea of a TaxonOccurrence as a concept was at the core of Darwin Core, which the TDWG Ontology used as a model for its properties. At the time the Ontology was actively developed Darwin Core had no notion of classes, so its elements consisted of what are now properties of several Darwin Core classes.
What I am looking for, formally speaking, is an object property that would have a specimen as its range, and hence allows me to relate a specimen record to some entity (a class or an individual). Am I making sense?
There are some pieces along those lines in the DwC terms, but they don't (seem to) connect:
- There are classes PreservedSpecimen, FossilSpecimen, and LivingSpecimen, but they don't have a supertype (other than owl:Thing) that you could make the range.
- There is a class Sample, but that doesn't have any of the Specimen classes as subtypes. Maybe that's correct since a sample and a specimen aren't really the same thing (rather, the former is obtained from the latter).
Following what Dublin Core does with type vocabulary for dcterms:type, the classes PreservedSpecimen, FossilSpecimen, LivingSpecimen, and others are terms that act as the controlled vocabulary for dwcterm:BasisOfRecord (see http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm). FossilSpecimen refines PreservedSpecimen as PreservedSpecimen refines dcterms:PhysicalObject. FossilSpecimen has neither PreservedSpecimen nor Sample (the class equivalent to a specimen in Darwin Core) as its range because it is a controlled vocabulary term. Instances of Samples having PreservedSpecimen, LivingSpecimen, or FossilSpecimen as the BasisOfRecord could be considered specimens, as they all refine dcterms:PhysicalObject. PhysicalObjects as Samples in Darwin Core are specimens.
- There is an object property Sample-2008-11-19 (domain is the class 'none'), but there is no further annotation as to what that property should mean.
I have removed all references to the domain class 'none'. In Darwin Core Sample has no domain. The details of the relationship of Sample to other classes is left to implementation as there is more than one reasonable viewpoint - a Sample could be a property of a SamplingEvent (something resulting from an event) or a SamplingEvent could be a property of a Sample (the place where the Sample was gathered). Darwin Core makes no commitment to one viewpoint or the other outside of implementation.
- There are object properties LivingSpecimen-2008-11-19 and PreservedSpecimen-2008-11-19 that have PhysicalObject as their super-property (why not Specimen?), and there is FossilSpecimen-2008-11-19 which has PreservedSpecimen (no date suffix) as super-property.
These are refinements showing the relationships between the options for controlled vocabulary of the BasisOfRecord property, nothing more (see also http://dublincore.org/documents/dcmi-type-vocabulary/).
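The refinement relationships described above can be sketched in a few lines of plain Python. The three links in `REFINES` are the ones stated in this message and in the quoted bullets. The dictionary layout, the function names, and the `ex:OtherType` value used in the usage note are hypothetical illustrations, not part of Darwin Core.

```python
# Refinement links as described in the message (term -> the term it refines).
REFINES = {
    "FossilSpecimen": "PreservedSpecimen",
    "PreservedSpecimen": "dcterms:PhysicalObject",
    "LivingSpecimen": "dcterms:PhysicalObject",
}


def refines(term, ancestor):
    """True if term equals ancestor or reaches it by following REFINES."""
    seen = set()
    while term is not None and term not in seen:
        if term == ancestor:
            return True
        seen.add(term)
        term = REFINES.get(term)
    return False


def is_specimen(sample):
    """A Sample counts as a specimen if its BasisOfRecord refines PhysicalObject."""
    return refines(sample.get("BasisOfRecord"), "dcterms:PhysicalObject")
```

Under these assumptions, `is_specimen({"BasisOfRecord": "FossilSpecimen"})` is true, while a Sample with a BasisOfRecord outside the chain (say, a hypothetical `ex:OtherType`) or with no BasisOfRecord at all is not treated as a specimen.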
On Feb 23, 2009, at 6:33 PM, John R. WIECZOREK wrote:
[...] Following what Dublin Core does with type vocabulary for dcterms:type, the classes PreservedSpecimen, FossilSpecimen, LivingSpecimen, and others are terms that act as the controlled vocabulary for dwcterm:BasisOfRecord
That's a good idea. But do they then need to be included in the dwcterms ontology? For someone looking at the dwcterms ontology in an ontology editor it's not obvious that this would be their intended use.
As for the hierarchy, the problem I was trying to point out is that the PreservedSpecimen that is the super-property of FossilSpecimen is not the same PreservedSpecimen that is the sub-property of PhysicalObject, because they have different URIs (http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen and http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#PreservedSpecimen, respectively).
[...] I have removed all references to the domain class 'none'.
Indeed, thanks!
[...] The details of the relationship of Sample to other classes is left to implementation as there is more than one reasonable viewpoint - a Sample could be a property of a SamplingEvent
BTW did you inadvertently delete the Sample property? I can't find it anymore, it's a class now. (But maybe that's what you intended?)
-hilmar
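The URI mismatch raised in this message can be illustrated with a small sketch in plain Python. The two PreservedSpecimen URIs are quoted from the message. The identifiers `ex:FossilSpecimen` and `dcterms:PhysicalObject` are placeholders, since the message does not give full URIs for those two terms, and the `chain` helper is hypothetical.

```python
# The two PreservedSpecimen URIs quoted in the message.
DWCTYPE_PRESERVED = "http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen"
DWCTERMS_PRESERVED = "http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#PreservedSpecimen"

# Placeholder identifiers; the message does not spell out these URIs.
FOSSIL = "ex:FossilSpecimen"
PHYSICAL_OBJECT = "dcterms:PhysicalObject"

# Super-property links as described (property -> super-property).
SUBPROPERTY_OF = {
    FOSSIL: DWCTYPE_PRESERVED,             # FossilSpecimen points at the dwctype URI
    DWCTERMS_PRESERVED: PHYSICAL_OBJECT,   # only the dwcterms.rdf URI leads to PhysicalObject
}


def chain(uri):
    """Follow super-property links from uri, comparing URIs as exact strings."""
    path = [uri]
    while path[-1] in SUBPROPERTY_OF and SUBPROPERTY_OF[path[-1]] not in path:
        path.append(SUBPROPERTY_OF[path[-1]])
    return path
```

Because RDF identifies resources by URI, `chain(FOSSIL)` stops at the dwctype PreservedSpecimen and never reaches PhysicalObject, while `chain(DWCTERMS_PRESERVED)` does. Sharing a local name is not enough to connect the two.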
Comments inline...
On Tue, Feb 24, 2009 at 9:58 AM, Hilmar Lapp hlapp@duke.edu wrote:
On Feb 23, 2009, at 6:33 PM, John R. WIECZOREK wrote:
[...] Following what Dublin Core does with type vocabulary for dcterms:type, the classes PreservedSpecimen, FossilSpecimen, LivingSpecimen, and others are terms that act as the controlled vocabulary for dwcterm:BasisOfRecord
That's a good idea. But do they then need to be included in the dwcterms ontology? For someone looking at the dwcterms ontology in an ontology editor it's not obvious that this would be their intended use.
I have removed the type vocabulary from dwcterms.rdf and put it into dwctype.rdf in the same folder. This fits with what Dublin Core does as well, so good suggestion.
As for the hierarchy, the problem I was trying to point out is that the PreservedSpecimen that is the super-property of FossilSpecimen is not the same PreservedSpecimen that is the sub-property of PhysicalObject, because they have different URIs (http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen and http://rs.tdwg.org/dwc/rdf/dwcterms.rdf#PreservedSpecimen, respectively).
I believe this is solved with the new arrangement of rdf schemas and proper namespace usage.
[...] I have removed all references to the domain class 'none'.
Indeed, thanks!
[...] The details of the relationship of Sample to other classes is left to implementation as there is more than one reasonable viewpoint - a Sample could be a property of a SamplingEvent
BTW did you inadvertently delete the Sample property? I can't find it anymore, it's a class now. (But maybe that's what you intended?)
Sample was intended to be a Class. I believe it is correct now.
Given that this ontology work has to be formally correct in several ways, I'd like to suggest (strongly) that the name of the "TaxonOccurrence" concept be changed to "OrganismOccurrence".
One of the properties of OrganismOccurrence would be the TaxonomicIdentification.
If this has a parent "thing", it should be something like the ObservationMeasurement thing that Simon Cox (or other larger ontologies) described at last year's meeting.
-Stan
This thread has prompted me to ask some naive questions about the process under which the vocabularies are formed. Maybe I'm the only one who is confused about the vocabularies, their status, and the process of forming new terms, but it seems maybe I'm not alone. And clarification on some of these points will help me with our direction on the development of the Observation Ontology under the OSR group, which I think will fit right in with Stan's point about fitting some of the concepts into a broader Observation framework.
For me there is a lot of confusion over the TDWG vocabularies, partly because they capture concepts that are present in existing TDWG standards, but are generally incomplete. For example, the TCS standard provides the field 'Specimens/Specimen', which I think is relevant to Hilmar's question. However, the listed TDWG vocabulary for TCS is the TaxonConcept vocabulary (http://rs.tdwg.org/ontology/voc/TaxonConcept), which does not provide a class for the TCS 'Specimen' concept. In addition, the ABCD TDWG standard also seems to have a way for specimens to be represented, but they are generalized as 'Unit's with a 'RecordBasis' of 'PreservedSpecimen'. So there are at least two official TDWG 'standards' for representing Specimen information, in addition to whatever DwC does. It seems to me that the best thing to do would be to finish the LSID vocabularies for TCS and ABCD so that they completely represent the concepts in TCS and ABCD, then get them approved as a valid way to represent these TDWG standards. In the process, one could try to resolve the differences in modeling approaches employed by the different standards, such as mapping the Specimen concept in TCS to its corresponding concept in DwC and ABCD. This would help avoid multiple TDWG standards defining overlapping versions of these concepts, and let people use the vocabularies in place of the XML schema versions of these standards.
What is the process for approval of the LSID vocabularies? They seem to be bypassing the normal TDWG standards track. Some of the vocabularies have a status of 'Available' (like TaxonConcept, even though it is incomplete), while others are marked as 'Developmental'.
The page on OntologyGovernance (http://wiki.tdwg.org/twiki/bin/view/TAG/TDWGOntologyGovernance) states: "Relationship Between TDWG Standards and the Ontology -- Concepts are standardized by being included in TDWG Standards. Once they have been mentioned in a standard the Ontology Manager has the responsibility of maintaining their URIs and descriptions as per the standard. Concepts must be promoted to the live branch before the standard enters the standards process. "
So it seems that the OntologyManager replaces the standards process for the purpose of the vocabularies. Is this correct? And does the OntologyManager make sure that concepts like 'Specimen' that are defined in TCS make it into the corresponding LSID vocabulary before it is classified as 'Available'? And how does the OntologyManager decide which concept and representation for 'Specimen' to use -- the one from TCS or the one from ABCD? Does 'Available' have the same weight as a published TDWG standard, and if so, shouldn't these vocabularies be listed on the Standards page as well? Finally, does the existence of a concept such as 'Specimen' in TCS have any bearing on the development of new standards such as DwC that may want to define the concept differently, or more completely?
Matt
Matt, all ontology work within TDWG so far should be classified as experimental in my opinion. As you have pointed out, none of this work has gone through a proper standards process (with the exception of NCD maybe), but was driven by immediate needs for LSID resolution.
Also there is a large overlap between the XML schema based TDWG standards such as ABCD, TCS, DWC, NCD and SDD. Plus pretty much all of them use different schema design patterns, so it wouldn't be trivial to integrate them into one. Some standards like TCS actually have placeholders (TCS even has a PlaceholderType complex type) to refer to specimens or publications, as it was not considered the job of TCS to define those. But then again one needed to express the relationship to those objects at a time when those standards were not yet finalised or used different, incompatible design patterns.
To me the only way out would be to step back and try to reconcile all those standards into a single ontology with a corresponding XML schema. The new DarwinCore terms don't exactly try to do that, but at least they represent a consistent way of representing the basics of our domain, in particular specimens, names & taxonomies. Please keep in mind when looking at the DwC terms that the development had the flexibility and simplicity of Dublin Core in mind rather than a true RDF ontology. It therefore also tries to be technology independent in its core, but provides guidelines for the different implementation/serialisation technologies like XML, (X)HTML, RDF or tab delimited text files that make use of the same term definitions.
I am really glad that you pointed out all these problems, as I think we still have serious work in front of us.
Markus
On Feb 23, 2009, at 21:51, Matt Jones wrote:
This thread has prompted me to ask some naive questions about the process under which the vocabularies are formed. Maybe I'm the only one who is confused about the vocabularies, their status, and the process of forming new terms, but it seems maybe I'm not alone. And clarification on some of these points will help me with our direction on the development of the Observation Ontology under the OSR group, which I think will fit right in with Stan's point about fitting some of the concepts into a broader Observation framework.
For me there is a lot of confusion over the TDWG vocabularies, partly because they capture concepts that are present in existing TDWG standards, but are generally incomplete. For example, the TCS standard provides the field 'Specimens/Specimen', which I think is relevant to Hilmar's question. However, the listed TDWG vocabulary for TCS is the TaxonConcept vocabulary (http://rs.tdwg.org/ontology/voc/TaxonConcept), which does not provide a class for the TCS 'Specimen' concept. In addition, the ABCD TDWG standard also seem to have a way for specimens to be represented, but they are generalized as 'Unit's with a 'RecordBasis' of 'PreservedSpecimen'. So there are at least two official TDWG 'standards' for representing Specimen information, in addition to whatever DwC does. It seems to me that the best thing to do would be to finish the LSID vocabularies for TCS and ABCD so that they completely represent the concepts in TCS and ABCD, then get that approved as a valid way to represent these TDWG standards. In the process, one could try to resolve the differences in modeling approaches employed by the different standards, such as mapping the Specimen concept in TCS to its corresponding concept in DwC and ABCD. This would help avoid multiple TDWG standards defining overlapping versions of these concepts, and let people use the vocabularies in place of the XML schema versions of these standards.
What is the process for approval of the LSID vocabularies? They seem to be bypassing the normal TDWG standards track. Some of the vocabularies have a status of 'Available' (like TaxonConcept, even though it is incomplete), while others are marked as 'Developmental'.
The page on OntologyGovernance (http://wiki.tdwg.org/twiki/bin/view/TAG/TDWGOntologyGovernance) states: "Relationship Between TDWG Standards and the Ontology -- Concepts are standardized by being included in TDWG Standards. Once they have been mentioned in a standard the Ontology Manager has the responsibility of maintaining their URIs and descriptions as per the standard. Concepts must be promoted to the live branch before the standard enters the standards process. "
So it seems that the OntologyManager replaces the standards process for the purpose of the vocabularies. Is this correct? And does the OntologyManager make sure that concepts like 'Specimen' that are defined in TCS make it into the corresponding LSID vocabulary before it is classified as 'Available'? And how does the OntologyManager decide which concept and representation for 'Specimen' to use -- the one from TCS or the one from ABCD? Does 'Available' have the same weight as a published TDWG standard, and if so, shouldn't these vocabularies be listed on the Standards page as well? Finally, does the existence of a concept such as 'Specimen' in TCS have any bearing on the development of new standards such as DwC that may want to define the concept differently, or more completely?
Matt
On Mon, Feb 23, 2009 at 11:22 AM, Blum, Stan sblum@calacademy.org wrote:
Given that this ontology work has to be formally correct in several ways, I'd like to suggest (strongly) that the name of the "TaxonOccurrence" concept be changed to "OrganismOccurrence".
One of the properties of OrganismOccurrence would be the TaxonomicIdentification.
If this has a parent "thing", it should be something like the ObservationMeasurement thing that Simon Cox (or other larger ontologies) described at last year's meeting.
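A minimal Python sketch of the shape being proposed, assuming nothing beyond the three names in the message: the occurrence is about an organism, the taxonomic identification is one of its properties, and an ObservationMeasurement-like class sits above it. All field names and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaxonomicIdentification:
    """One determination of an organism to a taxon (illustrative fields)."""
    taxon_name: str
    identified_by: Optional[str] = None


@dataclass
class ObservationMeasurement:
    """Stand-in for the broader parent 'thing'; the field is an assumption."""
    observed_on: Optional[str] = None  # ISO date string, illustrative only


@dataclass
class OrganismOccurrence(ObservationMeasurement):
    """An occurrence of an organism; the taxon is a property, not the identity."""
    identifications: List[TaxonomicIdentification] = field(default_factory=list)

    def current_identification(self) -> Optional[TaxonomicIdentification]:
        # Treat the most recently appended identification as the current one.
        return self.identifications[-1] if self.identifications else None


occ = OrganismOccurrence(observed_on="2009-02-23")
occ.identifications.append(TaxonomicIdentification("Puma concolor", "A. Collector"))
occ.identifications.append(TaxonomicIdentification("Puma concolor couguar"))
```

Under this reading an occurrence can be re-identified without changing what it is, which is the point of moving "Taxon" out of the class name.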
-Stan
-----Original Message----- From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Kevin Richards Sent: Monday, February 23, 2009 11:26 AM To: Hilmar Lapp; Technical Architecture Group mailing list; rogerhyam Hyam Cc: Enrico Pontelli; Rutger A. Vos; Arlin Stoltzfus; Brandon Chisham; nexml-discuss@lists.sourceforge.net Subject: Re: [tdwg-tag] Embedding specimen (and other) annotations in NeXML
Was this particular question answered?
"2) Is there a TDWG vocabulary (in RDF or OWL) that has a relation for referring to a specimen record, or are you aware of another one that has this?"
Is the Occurrence RDF vocab at http://rs.tdwg.org/ontology/voc/TaxonOccurrence what you are looking for? I'm not sure I fully understood what you are after, but have a look and see if it matches your requirements.
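For illustration only, the following Python sketch shows what "a relation for referring to a specimen record" could look like as RDF triples. The `example.org` namespace, the `hasSpecimenRecord` predicate, and the `#TaxonOccurrence` class fragment are all assumptions; only the vocabulary's base URL comes from this thread, and the vocabulary itself should be checked for the actual term names.

```python
# Base URL from the thread; the "#TaxonOccurrence" fragment is an assumption.
OCCURRENCE_CLASS = "http://rs.tdwg.org/ontology/voc/TaxonOccurrence#TaxonOccurrence"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace and predicate, not part of any TDWG vocabulary.
EX = "http://example.org/terms#"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical identifiers for a NeXML OTU and a specimen record.
otu = "http://example.org/nexml/otu/t1"
specimen = "http://example.org/collection/specimen/12345"

triples = [
    (otu, EX + "hasSpecimenRecord", specimen),
    (specimen, RDF_TYPE, OCCURRENCE_CLASS),
]

print(to_ntriples(triples))
```

The first triple is the relation the question asks about; the second only types the target, which a class-only vocabulary can supply without providing the linking predicate.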
Kevin
-----Original Message----- From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Hilmar Lapp Sent: Monday, 23 February 2009 11:04 a.m. To: Technical Architecture Group mailing list; rogerhyam Hyam Cc: Enrico Pontelli; Rutger A. Vos; Arlin Stoltzfus; nexml-discuss@lists.sourceforge.net; Brandon Chisham Subject: [tdwg-tag] Embedding specimen (and other) annotations in NeXML
Hi Roger and everyone else involved in DwC and the core ontology,
in preparation for the upcoming hackathon [1] here we've worked our way through a number of use-cases for attaching metadata to data elements in NeXML [2] in a way that is semantically defined. The results are here:
http://evoinfo.nescent.org/Database_Interop_Hackathon/Metadata_Support
If you can take a critical look specifically at the section on 'Specimens within collections' (http://tinyurl.com/djdby3) that'd be great.
Specific questions related to that:
1) Is this using DarwinCore in a correct and/or sanctioned way?
2) Is there a TDWG vocabulary (in RDF or OWL) that has a relation for referring to a specimen record, or are you aware of another one that has this?
3) I could only find the XML schema for DarwinCore. Are there any plans for (or did I miss the existence of) a corresponding ontology defined in OWL or RDF, similar to Dublin Core (which has an RDF ontology at http://purl.org/dc/elements/1.1 and a DCterms ontology at http://purl.org/dc/terms)?
4) It wasn't immediately clear how one would use the TDWG core ontology for doing the same thing in RDF (or OWL) - is this ontology supposed to be used yet, and are there usage examples, or does someone have a recommendation for how one would write the same example using the core ontology?
I know it's a rather tiny subset of DwC that we're using here but that's only a start and one use-case.
Of course, any other feedback to or suggestions for this or any of the other stuff on that page is welcome too!
Cheers,
-hilmar
[1] http://evoinfo.nescent.org/Database_Interop_Hackathon [2] http://nexml.org
--
: Hilmar Lapp -:- Durham, NC -:- hlapp at duke dot edu :
--
Matthew B. Jones Director of Informatics Research and Development National Center for Ecological Analysis and Synthesis (NCEAS) UC Santa Barbara jones@nceas.ucsb.edu Ph: 1-907-523-1960 http://www.nceas.ucsb.edu/ecoinfo
The TDWG 'house' is organic and messy, if not in disarray. I'll explain this state with a descriptive history instead of a logical exposition.
* We changed the way TDWG approves standards (in 2006). ABCD, TCS, and SDD were the last standards approved by the old method (a vote by members). New standards simply have to pass expert and public review (as judged by the executive committee). ABCD, TCS, and SDD aren't consistent because...(blah, blah; life is hard).
* The TDWG ontology or vocabulary is NOT a standard. The terms are supposed to be taken from standards, and perhaps even non-standard schemas that are widely used. The ontology managers are supposed to apply the filter of use; terms in the ontology should be widely used. (We expect that some terms in approved standards will not be widely used.)
* With the ontology/vocabulary comes a shift or at least a broadening towards RDF (away from XML schemas), though there were/are some skeptics.
The existing pages on the TDWG vocabulary are pretty old, and I don't think there has been any kind of review "event" for these. Editing and reviewing these is on our to-do list for this year.
I think IT WOULD BE VERY HELPFUL if we put up a list of REQUIRED READING before getting into this review, and perhaps even the current review of DarwinCore. John Wieczorek and his core collaborators have drawn heavily from the DCMI initiative in shaping the most recent draft of the DarwinCore. I would really appreciate references to any other good statements about RDF and ontologies. These should go on the TAG homepage.
Matt, thanks for pointing out the emperor's clothes!
-Stan
Hi Guys,
I think I should chip a bit of perspective/history in as I am at least partly responsible for the mess. Matt hits a few nails on the head as usual.
In the beginning there were a bunch of more or less unrelated XML Schemas. Kings amongst these were DwC in its different flavours and ABCD (in all its complexity and a couple of versions).
Back in 2005 I was given the job of taking TCS (the XML Schema version) through the old standards process. TaxonConcepts and TaxonNames are really joining objects. They have very few literal properties but just point out to lots of other objects like people and specimens. I soon became frustrated with not being able to simply include chunks of other schemas (either ABCD or DwC but which?) without importing them wholesale and so hard linking to specific versions, i.e. effectively building a single XML Schema for our little world. And of course once we had one for our world the process would repeat with the rest of the world. I refuse to define fields for someone's contact details again ....
Having got TCS through the old TDWG standards process I was given the job of "TDWG Technical Architect" for two years. By this time I had had my conversion on the road to Damascus and decided that RDF and semantic technologies were really the best tools for this kind of semantic integration. There are use cases that are not supported though and are more appropriately handled with XML validated in some way, but there is nothing to stop this XML mapping directly to RDF or even being an actual XML RDF serialization. (We are just talking frame-based modeling at this level, not OWL.)
TDWG had just 'got' XML Schema and had big investment in it. DiGIR, BioCASe and TAPIR protocols all relied on it. I could not say "Don't do XML Schema for this stuff" or "Change all your schemas so they map to an RDF or a frame-based model". So I spent two years trying to lead people down a weaving path that involved integrating XML based 'application schemas' with semantics in RDF. The message was far too complex for much of the audience and the momentum of what was going on was too great to get all the way to the goal, but we certainly ended up with people talking in terms of modeling in a different way and concepts being identified by URIs. This is still part of that conversation.
A group of us went through a modeling process with Jessie Kennedy to start building an ontology for the whole of TDWG but this was just like building one big XML Schema and became bogged down - though I don't think Jessie would agree with that as she felt we could have completed something with a couple more meetings. It was a good modeling exercise but we didn't finish it. If we had finished we would then have had to impose it on the rest of the organization. An illustration of the complexity of managing ontologies is that we were talking of observations being separate from specimens and specimens being more linked to collections management. I made the recommendation that there should be a separate observations interest group and a specimen/collection curation interest group but the executive committee didn't support this approach. So we have the same group potentially discussing preservation techniques and GPS locations of organisms because historically observations have been extracted from specimens - unless you are a birder or ecologist (or anyone outside TDWG) in which case it is nuts.
At the same time we were introducing the notion of GUIDs, which is integrally linked with the RDF way of looking at things. We needed RDF based return types for LSIDs which were being issued by the nomenclators. I took TCS (XML Schema version) and created two RDF/OWL vocabularies out of it so we at least had the namespaces set up that could be used in the RDF about names and taxa.
http://rs.tdwg.org/ontology/voc/TaxonName
http://rs.tdwg.org/ontology/voc/TaxonConcept
I have been out of the loop on another project for a year and it is beyond me to develop and understand the issues across the whole of the TDWG space but I am willing to stand by these, develop them further and try and take them through a ratification process. They are not perfect (certainly not good OWL) but I think they could be useful. Although they are still experimental they are being used to some extent.
At the same time I put together the TaxonOccurrence vocabulary
http://rs.tdwg.org/ontology/voc/TaxonOccurrence
This is shamefully stolen from the schema version of DwC and written (i.e. cut and pasted) in a day or so. It was meant as a straw man that I hoped would be knocked down by the Observations and Specimens Group. We do need an ontology/vocabulary for specimens and possibly another for occurrence data but there are people with greater domain knowledge than me who should be doing it. I'd be happy if we took this down.
I believe the way forward is with small, modular ontologies that have as little entailment in them as possible - ideal for importing into larger ontologies without blowing them up. There seems to be a modeling approach which considers upper classes the most important. I believe the class hierarchy is actually only part of the ontology. It is perfectly possible to define functional classes first, standardize them and then import them into other ontologies that assert the class hierarchies. In fact this is the only way you can have shared semantics where we agree on an object but it has a different place in your world to the place it has in my world. The alternative is to get everyone to see the world from a single view of reality and the only way to do that is to pay them lots of money to see it that way!
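As a rough illustration of that idea, the Python sketch below models ontologies as plain sets of triples. A shared module declares classes but asserts no hierarchy; two importing ontologies then place the same class under different parents. The prefixes (`ex:`, `cur:`), the class names, and the `import_module` helper are all hypothetical and stand in for real OWL imports.

```python
SUBCLASS_OF = "rdfs:subClassOf"

# Shared module: declares the classes but asserts no class hierarchy.
shared_module = {
    ("ex:Specimen", "rdf:type", "owl:Class"),
    ("ex:Observation", "rdf:type", "owl:Class"),
}


def import_module(module, local_axioms):
    """Build an ontology as the union of an imported module and local axioms."""
    return set(module) | set(local_axioms)


def parents(ontology, cls):
    """Return the direct superclasses asserted for `cls` in `ontology`."""
    return {o for s, p, o in ontology if s == cls and p == SUBCLASS_OF}


# A curation-oriented view: a specimen is a kind of collection object.
collections_view = import_module(
    shared_module, {("ex:Specimen", SUBCLASS_OF, "cur:CollectionObject")}
)

# An ecology-oriented view: a specimen is a kind of observation.
ecology_view = import_module(
    shared_module, {("ex:Specimen", SUBCLASS_OF, "ex:Observation")}
)
```

Both views agree on what `ex:Specimen` refers to because they import the same declaration, yet each gives it a different place in its own hierarchy; the shared module stays free of either commitment.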
We need versioning and standardization of constructs at a very fine grained level - even individual property level. If we don't take this approach nothing will be standardized till we agree on everything i.e. never.
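One way to picture property-level versioning is a registry in which every term URI carries its own version history and status, independent of any other term. The Python sketch below is only that picture: the class names, the status labels ("draft", "standard"), and the example URIs are assumptions, not an existing TDWG mechanism.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TermVersion:
    uri: str
    version: int
    status: str  # hypothetical labels such as "draft" or "standard"


class TermRegistry:
    """Tracks a separate version history for each individual term."""

    def __init__(self) -> None:
        self._history: Dict[str, List[TermVersion]] = {}

    def publish(self, uri: str, status: str) -> TermVersion:
        # Each term gets its own counter, so terms advance independently.
        versions = self._history.setdefault(uri, [])
        term = TermVersion(uri, len(versions) + 1, status)
        versions.append(term)
        return term

    def latest(self, uri: str) -> TermVersion:
        return self._history[uri][-1]

    def standardized(self) -> List[str]:
        # Terms whose latest version is a standard, regardless of the rest.
        return sorted(
            uri for uri, versions in self._history.items()
            if versions[-1].status == "standard"
        )


registry = TermRegistry()
registry.publish("http://example.org/voc#collector", "draft")
registry.publish("http://example.org/voc#collector", "standard")
registry.publish("http://example.org/voc#preparation", "draft")
```

Here one property reaches "standard" while another is still a draft, so agreement on one term does not have to wait for agreement on all of them.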
Sorry for the long post and a tendency to over justify!
All the best,
Roger
On 23 Feb 2009, at 22:29, Blum, Stan wrote:
The TDWG 'house' is organic and messy, if not in disarray. I'll explain this state with a descriptive history instead of a logical exposition.
- We changed the way TDWG approves standards (in 2006). ABCD, TCS, and SDD were the last standards approved by the old method (a vote by members). New standards simply have to pass expert and public review (as judged by the executive committee). ABCD, TCS, and SDD aren't consistent because...(blah, blah; life is hard).
- The TDWG ontology or vocabulary is NOT a standard. The terms are supposed to be taken from standards, and perhaps even non-standard schemas that are widely used. The ontology managers are supposed to apply the filter of use; terms in the ontology should be widely used. (We expect that some terms in approved standards will not be widely used.)
- With the ontology/vocabulary comes a shift or at least a broadening towards RDF (away from XML schemas), though there were/are some skeptics.
The existing pages on the TDWG vocabulary are pretty old, and I don't think there has been any kind of review "event" for these. Editing and reviewing these is on our to-do list for this year.
I think IT WOULD BE VERY HELPFUL if we put up a list of REQUIRED READING before getting into this review, and perhaps even the current review of DarwinCore. John Wieczorek and his core collaborators have drawn heavily from the DCMI initiative in shaping the most recent draft of the DarwinCore. I would really appreciate references to any other good statements about RDF and ontologies. These should go on the TAG homepage.
Matt, thanks for pointing out the emperor's clothes!
-Stan
------------------------------------------------------------- Roger Hyam Roger@BiodiversityCollectionsIndex.org http://www.BiodiversityCollectionsIndex.org ------------------------------------------------------------- Royal Botanic Garden Edinburgh 20A Inverleith Row, Edinburgh, EH3 5LR, UK Tel: +44 131 552 7171 ext 3015 Fax: +44 131 248 2901 http://www.rbge.org.uk/ -------------------------------------------------------------
Roger,
2009/2/24 Roger Hyam rogerhyam@mac.com wrote:
I believe the way forward is with small, modular ontologies that have as little entailment in them as possible - ideal for importing into larger ontologies without blowing them up. There seems to be a modeling approach which considers upper classes the most important. I believe the class hierarchy is actually only part of the ontology. It is perfectly possible to define functional classes first, standardize them and then import them into other ontologies that assert the class hierarchies. In fact this is the only way you can have shared semantics where we agree on an object but it has a different place in your world to the place it has in my world. The alternative is to get everyone to see the world from a single view of reality and the only way to do that is to pay them lots of money to see it that way!
We need versioning and standardization of constructs at a very fine grained level - even individual property level. If we don't take this approach nothing will be standardized till we agree on everything i.e. never.
I also happen to believe that we have no alternative other than to adopt this approach. It has worked well for us in the past.
We simply cannot afford to continue propagating schema that do not build upon a standardized core of common elements.
Our current crop of endorsed and proposed standards are more application schema than standards. For the most part they have little relevance outside of the communities promoting them, and their adoption is rarely achievable without compromise or modification. We are still quite happily using HISPID because it constrains content to suit our purpose in a way that ABCD and DwC cannot.
We are about to experiment with a wiki based, one per page, approach to collaboratively working through this real core of "fine grained" concepts for species profiles. Page trees will be used to represent alternative classifications or higher level groupings. A full on ontological thread to follow this up will still be necessary but starting from here (again) we need to be as open and accessible as possible. The formal ontology will document outcomes rather than driving them. I suggest that we try the same approach at the TDWG level. We may be surprised.
My group's current efforts to develop services (SOAP, TAPIR, etc) delivering TCS and profiles from disparate data sources through an intermediate repository based on the TDWG vocabularies, including TaxonName and TaxonConcept, demonstrate that even this stuff is still coarser than it needs to be to be really useful. I should be able to apply them without compromising requirements or precluding our ability to participate freely in communities using orthogonal schema.
greg
participants (8)
- "Markus Döring (GBIF)"
- Blum, Stan
- greg whitbread
- Hilmar Lapp
- John R. WIECZOREK
- Kevin Richards
- Matt Jones
- Roger Hyam