Re-organisation of TDWG Ontology: Danger silence will == acquiescence!
Hi All,
I need to do some work on the Taxon Name and Taxon Concept vocabularies and believe I have come up with a good way of organising the TDWG ontology space (everything within http://rs.tdwg.org/ontology).
The following are the changes I suggest:
- All files should be OWL DL compliant.
- All files should be openable in Protege 4 (I believe this is now good enough to use for editing these small ontologies).
- We take a highly structured modular approach. I call this the Bricks and Mortar design pattern.
  - Some files are 'Bricks' and as such import or reference no other files, classes or individuals, e.g. TaxonName does not mention a higher 'Name' object in the class hierarchy.
  - Other files are 'Mortar'. These files import Bricks and stipulate relationships between things. Because we are using OWL it is easy to define things like the class hierarchy or the range of a property in a separate file from the file the original class or property was defined in.
  - This pattern gives us maximum re-usability as the same Brick could be used in different ways. It does not bind us to any one implementation of one object.
  - An example of the usage pattern would be to define TaxonName, TaxonConcept, Rank and NomenclaturalCode as separate bricks that don't reference each other at all, then create a TCS ontology that imports these 4 bricks and defines their relationships.
- We move to some other method of presenting the ontologies online, possibly the OWLDoc plug-in for Protege. This would lose us the branded look we have at the moment but would be more flexible and consistent in the long run.
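To make the Bricks and Mortar idea concrete, here is a minimal sketch in plain Python rather than OWL. It is only an illustration under stated assumptions: the prefixes (tn:, tc:, rank:, code:, tcs:, base:), the property tcs:hasName and the OntologyFile class are all hypothetical and are not taken from the existing TDWG files. A Brick is modelled as a file that imports nothing and mentions only its own terms (plus the built-in RDF/RDFS/OWL terms); a Mortar file imports Bricks and adds the relationships.

```python
from dataclasses import dataclass, field

RDF_TYPE = "rdf:type"
OWL_CLASS = "owl:Class"
RDFS_SUBCLASS = "rdfs:subClassOf"
# Core RDF/RDFS/OWL terms are treated as built in, not as references to other files.
BUILTIN_PREFIXES = ("rdf:", "rdfs:", "owl:")


@dataclass
class OntologyFile:
    """Hypothetical stand-in for one ontology file: a prefix, triples and imports."""
    namespace: str
    triples: set = field(default_factory=set)
    imports: list = field(default_factory=list)

    def is_brick(self) -> bool:
        """A Brick imports nothing and mentions only its own or built-in terms."""
        if self.imports:
            return False
        return all(
            term.startswith(self.namespace) or term.startswith(BUILTIN_PREFIXES)
            for triple in self.triples
            for term in triple
        )

    def closure(self) -> set:
        """All triples visible from this file: its own plus those of its imports."""
        merged = set(self.triples)
        for imported in self.imports:
            merged |= imported.closure()
        return merged


# Four Bricks that do not reference each other at all (prefixes are made up).
taxon_name = OntologyFile("tn:", {("tn:TaxonName", RDF_TYPE, OWL_CLASS)})
taxon_concept = OntologyFile("tc:", {("tc:TaxonConcept", RDF_TYPE, OWL_CLASS)})
rank = OntologyFile("rank:", {("rank:Rank", RDF_TYPE, OWL_CLASS)})
nom_code = OntologyFile("code:", {("code:NomenclaturalCode", RDF_TYPE, OWL_CLASS)})
bricks = [taxon_name, taxon_concept, rank, nom_code]

# A Mortar file: imports the Bricks and states how they relate.
tcs = OntologyFile(
    "tcs:",
    {
        ("tcs:hasName", RDF_TYPE, "owl:ObjectProperty"),
        ("tcs:hasName", "rdfs:domain", "tc:TaxonConcept"),
        ("tcs:hasName", "rdfs:range", "tn:TaxonName"),
    },
    imports=bricks,
)

# Not a Brick: it mentions a higher 'Name' class from another namespace.
not_a_brick = OntologyFile("tn:", {("tn:TaxonName", RDFS_SUBCLASS, "base:Name")})

if __name__ == "__main__":
    print([b.is_brick() for b in bricks])
    print(tcs.is_brick(), not_a_brick.is_brick())
    print(len(tcs.closure()))
```

The point of the split is that a different Mortar file could import the same four Bricks and relate them in another way without touching the Brick files themselves.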
As I need to do this for the TaxonName and TaxonConcept vocabularies I volunteer to manage the space this year if people are happy going down this route.
From the point of view of deployed systems (the nomenclators) there may be a need for a namespace change on some properties but I would review what is in use and this would be trivial - if necessary at all.
What do you think? I will take silence as acquiescence on the grounds that any movement is better than none - though I don't suppose I will get round to doing anything about changes till after e-Biosphere in June.
All the best,
Roger
----------------------------------------------------- Roger Hyam - Project Officer WP4 Pan European Species Infrastructure +44 75 90 60 80 16 -----------------------------------------------------
silence
:-)
Gregor
I think this is a good idea. However, I would be more specific about what compliance means. Perhaps the most future-proof is something like: "The ontology must be valid OWL DL as determined by the Manchester OWL Lint plugin to Protege 4." Probably this is equivalent to requiring determination by the Manchester WonderWeb OWL validator.
The biggest recurring problem I've had with identifying the OWL species of TDWG ontologies is that frameworks try to resolve some of the legacy DC terms to the old versions of DC, many of which are not OWL DL for relatively simple reasons. Try, e.g., http://rs.tdwg.org/ontology/voc/SpeciesProfileModel.rdf or http://rs.tdwg.org/ontology/voc/InstitutionType.rdf in WonderWeb, which quickly indicts some older DC that can easily be replaced with the 2008 versions. (It also finds some similar lint in TDWG OWL files.)
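As a rough first pass before running a real OWL validator, one could simply list which namespaces an RDF/XML file actually uses, so that references to Dublin Core stand out. The sketch below uses only the Python standard library; the sample document and its example.org URIs are made up for illustration, and the code does not decide which DC version is OWL DL, it only reports what is referenced.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up RDF/XML fragment; only the namespace URIs of RDF, OWL and DC are real.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <owl:Ontology rdf:about="http://example.org/voc/Sample">
    <dc:title>Sample vocabulary</dc:title>
  </owl:Ontology>
  <owl:Class rdf:about="http://example.org/voc/Sample#Thing"/>
</rdf:RDF>
"""


def namespaces_used(rdf_xml: str) -> Counter:
    """Count how often each XML namespace appears in element and attribute names."""
    counts = Counter()
    for elem in ET.fromstring(rdf_xml).iter():
        # ElementTree writes qualified names as "{namespace-uri}local-name".
        for name in [elem.tag, *elem.attrib]:
            if name.startswith("{"):
                counts[name[1:].split("}", 1)[0]] += 1
    return counts


if __name__ == "__main__":
    for uri, n in sorted(namespaces_used(SAMPLE).items()):
        print(n, uri)
```

This only inspects the XML element and attribute names, so terms that appear solely inside rdf:about or rdf:resource values are not counted; a proper check would still need the validator.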
Aside from that, I remain silent, in agreement with Gregor.
On Tue, May 12, 2009 at 4:59 AM, Roger Hyam rogerhyam@mac.com wrote:
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
Roger,
We have been working up to a similar suggestion. The plan was to implement it in the first instance using the TDWG wiki itself, if it is up to it: a page for each brick, and mortar pages to build them into the graph in different ways. OWL DL files and Protege 4 compliance may be where we need to be, and perhaps individual effort will be required initially to take these pages there, but there are many working in this space with the will to contribute and we need to accommodate them with a more collaborative approach to ontology development and an appropriate level of accessibility. Opening up the process might just lead to some standardized TDWG standards.
greg
2009/5/12 Bob Morris morris.bob@gmail.com:
-- Robert A. Morris Professor of Computer Science UMASS-Boston ram@cs.umb.edu http://bdei.cs.umb.edu/ http://www.cs.umb.edu/~ram http://www.cs.umb.edu/~ram/calendar.html phone (+1)617 287 6466
Dear Roger,
This is probably off topic (and maybe I missed something), but it really worries me that much of the TDWG vocabulary is unique to TDWG. Journal publishers are busy pumping out Dublin Core and PRISM (sometimes with FOAF). I'm trying to link this to the records that nomenclators pump out. I seem to recall some statement that where possible TDWG would reuse existing vocabularies, but I don't see this happening. In the meantime, I'm busy mapping our idiosyncratic vocabulary to that used by publishers.
Regards
Rod
--------------------------------------------------------- Roderic Page Professor of Taxonomy DEEB, FBLS Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
Several people have espoused this existing vocabulary reuse idea and it seems on the surface to make sense (why reinvent, duplicate, etc.) Has this been adopted as a general working principle for TDWG? Or should it?
If it has been adopted, de facto or de jure, is there an agreed priority in which the vocabularies should be raided for terms? Or is this just being too anally retentive and taxonomist?
jim
On Wed, May 13, 2009 at 12:17 AM, Roderic Page r.page@bio.gla.ac.uk wrote:
In response to Rod's suggestion about vocabulary reuse: Absolutely, we should make every effort to re-use existing vocabularies. That promotes interoperability whereas duplication does not. This is a case of the golden rule -- we should adopt others' vocabularies as we would have them adopt ours.
In response to Jim's question about which neighboring or encompassing domain vocabularies we should re-use: all that are compatible and demonstrably more general and well established; e.g., Dublin Core, FOAF, etc.
I support Roger's plans for methods, BUT...
I strongly suggest that we take this opportunity to dispense with (reject, clean out) the naming convention that used prefixes on terms like "base" and "core". Just name the concepts to optimize common understanding and specificity. Trying to convey ontological structure with names seems ill-advised. The prefixes put people off and imposed an artificial constraint (only two levels).
-Stan
I agree. The less we define the better the ontology will be.
Roger
On 12 May 2009, at 15:17, Roderic Page wrote:
------------------------------------------------------------- Roger Hyam Roger@BiodiversityCollectionsIndex.org http://www.BiodiversityCollectionsIndex.org ------------------------------------------------------------- Royal Botanic Garden Edinburgh 20A Inverleith Row, Edinburgh, EH3 5LR, UK Tel: +44 131 552 7171 ext 3015 Fax: +44 131 248 2901 http://www.rbge.org.uk/ -------------------------------------------------------------
As an aside, there is a technological aspect to this approach which is symptomatic of where TDWG finds itself at the moment. Unless a taxonomist speaks fluent OWL and owns and can drive a copy of Protege, they will not be able to participate, effectively excluding most taxonomists on the planet. (On second thought, this might actually be a good thing...)
How do we propose to engage those who work daily with the 'bricks and mortar' of taxonomy but who are just not equipped to understand what is being done with and presented by the technology? Or has the fabric of taxonomy now finally become too important and too complicated to be left to taxonomists?
Perhaps a translation of the ontologies into a non-technical format that taxonomists could read and comment on might be a way to get greater engagement from a wider range of taxonomists? Given the opportunity and the means, they might be able to offer significant narrative for the vocabularies. Or maybe not...
In the absence of a workable alternative to what is being proposed, I too must offer silence...
jim
I like to keep things simple, and like to play ball outside our community. A wonderful blog post by Benjamin Good hints at this (http://i9606.blogspot.com/2009/05/cwa-at-ymca.html). I suspect we may be guilty of over-engineering things a bit (that said, I really appreciate the effort that's gone into this so far).
However, what I find more pressing than OWL + Protege, etc. would be tools to validate metadata that's already in the wild. For example, Zoobank has minted a LSID, urn:lsid:zoobank.org:pub:20B00870-7416-4583-ADE0-4302E5571B66 , that includes the triple:
tpub:url "doi: 10.3897/zookeys.10.157"
Now, "doi: 10.3897/zookeys.10.157" (note the space between "doi:" and the start of the identifier) isn't, by any criterion, a URL. Furthermore, there's actually a term in the vocabulary tpub:doi that could be used for the DOI (but I'd argue we should use prism:doi). But we also need to be careful how we write them (note the extraneous space in the example above). And don't get me started on what can end up in date fields.
Without some tools to quickly check the metadata that is being produced, I suspect we are going to get a bit of a mess. A tool such as http://feedvalidator.org would be a great help.
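A very small sketch of the sort of check meant here, using only the Python standard library. The helper names, the subject label and the DOI pattern are assumptions made for illustration: the pattern is a deliberately loose shape test (a "10." prefix, a registrant code, a slash, a suffix without spaces), not an official DOI grammar, and the URL test only accepts http and https.

```python
import re
from urllib.parse import urlparse

# Loose DOI shape (an assumption, not the official syntax):
# "10.", a numeric registrant code, "/", then a non-empty suffix with no whitespace.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_url(value: str) -> bool:
    """True if the value parses as an http(s) URL with a host and no spaces."""
    parts = urlparse(value)
    return parts.scheme in {"http", "https"} and bool(parts.netloc) and " " not in value


def looks_like_doi(value: str) -> bool:
    """True if the value is a bare DOI such as 10.3897/zookeys.10.157."""
    return DOI_PATTERN.match(value) is not None


# Which check applies to which predicate; unknown predicates are not checked.
CHECKS = {
    "tpub:url": looks_like_url,
    "tpub:doi": looks_like_doi,
    "prism:doi": looks_like_doi,
}


def validate(triples):
    """Return the (subject, predicate, value) triples whose value fails its check."""
    problems = []
    for subject, predicate, value in triples:
        check = CHECKS.get(predicate)
        if check is not None and not check(value):
            problems.append((subject, predicate, value))
    return problems


if __name__ == "__main__":
    record = "example-record"  # placeholder subject for the illustration
    triples = [
        (record, "tpub:url", "doi: 10.3897/zookeys.10.157"),  # the value quoted above
        (record, "prism:doi", "10.3897/zookeys.10.157"),      # same DOI, written bare
    ]
    for problem in validate(triples):
        print("suspect value:", problem)
```

Run against the two triples in the example, only the first is reported, since "doi: 10.3897/zookeys.10.157" is neither an http(s) URL nor a bare DOI.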
Regards
Rod
On 12 May 2009, at 15:22, Jim Croft wrote:
-- _________________ Jim Croft ~ jim.croft@gmail.com ~ +61-2-62509499 ~ http://www.google.com/profiles/jim.croft
"Words, as is well known, are the great foes of reality."
- Joseph Conrad, author (1857-1924)
"I know that you believe that you understood what you think I said, but I am not sure you realize that what you heard is not what I meant."
- attributed to Robert McCloskey, US State Department spokesman
No time to comment on this very interesting thread to the extent that I should (perhaps more later), but I will make one comment:
For example, Zoobank has minted a LSID, urn:lsid:zoobank.org:pub:20B00870-7416-4583-ADE0-4302E5571B66 , that includes the triple:
tpub:url "doi: 10.3897/zookeys.10.157"
Just for the record:
Do NOT, by any means, regard ZooBank as a production LSID provider/resolver! ZooBank is, and remains, very much a prototype in all senses of the word (keep in mind that it's being developed entirely by a guy who knows painfully little about web service development, and who seldom achieves more than four contiguous hours of sleep in any 24-hour period -- volunteer developers and funding sources are most, most, most welcome!). So, the point is, there are MANY MANY aspects of ZooBank that are known to be non-functional, or mis-functional, and (as in this case) there are many opportunities for users to enter information incorrectly or into the wrong fields (the doi obviously should have been in the identifier field; not the URL field).
This is NOT directed at Rod -- in fact, perhaps more than just about anyone else, Rod has been instrumentally supportive of the ZooBank cause (perhaps without realizing it). Moreover, I welcome cases of ZooBank's shortcomings being used in this fashion as an exemplar of how not to do something (it helps me become aware of a problem that needs to be fixed). But I've seen in other discussion venues the misperception that ZooBank is some sort of standard bearer, when clearly it is not -- at least not at the technology level (not at any level, really).
Having said that, I do want to echo Rod's point that this is only one example of many where data content "in the wild" will fail to conform with expectations as documented in various technology specifications. I see this as an unavoidable consequence of a situation where the number of people who want to "play the game" (in terms of providing standards-compliant data online) vastly exceeds the number of people with the skills and knowledge to do so correctly. Until biodiversity informatics garners the same level of societal support (and commensurate funding) as, say, global climate change, this may not improve very quickly.
In our context, I believe the obvious path to salvation is through metadata validation tools of the sort Rod endorses, plus the masterful work of people like Markus for developing software tools to conceal the complexities that render most of us glassy-eyed.
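As a rough illustration of the kind of quick check meant here, the following is a minimal Python sketch, not any existing validation tool. It flags a value such as "doi: 10.3897/zookeys.10.157" when it turns up in a field that is supposed to hold a URL. The function name, the messages and the DOI pattern are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for a DOI, optionally prefixed with "doi:".
# It is deliberately loose and is not taken from any specification.
DOI_PATTERN = re.compile(r"^(?:doi:\s*)?(10\.\d{4,9}/\S+)$", re.IGNORECASE)


def check_url_field(value):
    """Return a list of problems found in a value meant to hold a URL."""
    text = value.strip()

    # A DOI in the URL field is the specific mistake discussed in the thread.
    doi_match = DOI_PATTERN.match(text)
    if doi_match:
        return ["looks like a DOI (%s), not a URL" % doi_match.group(1)]

    # Otherwise require an absolute http(s) URL with a host part.
    parsed = urlparse(text)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return ["not an absolute http(s) URL"]
    return []
```

Calling `check_url_field("doi: 10.3897/zookeys.10.157")` returns one problem, while a value such as `http://iphylo.blogspot.com` returns an empty list.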
Gotta go...
Aloha, Rich
P.S. After writing (but before sending) the above, I went through the rest of the thread ("Newcomer's", etc.). I most definitely agree with Lynette, and the other people who agree with Lynette. But it's tricky -- you need a space for the techies to run full-throttle, without being encumbered by the need for frequent translations to the vernacular. I'm speaking here in the context of a non-techie for semantic web; but as a techie in other domains (e.g., subaquatic respiration facilitation technologies). Then you need a space for the "in plain [insert your native language here] explanations", for the rest of us to understand what's going on. But the most critical piece -- and the one that's often missing -- is the translators who can bridge the gap (and keep the latter space updated in response to progress in the former space). Such translators not only require robust communication skills in both spaces, but also must have both the time and the inclination to play that role.
I had said:
So, the point is, there are MANY MANY aspects of ZooBank that are known to be non-functional, or mis-functional, and (as in this case) there are many opportunities for users to enter information incorrectly or into the wrong fields (the doi obviously should have been in the identifier field; not the URL field).
I want to retract that last bit! I just now discovered that the problem is with the code behind the user interface, and NOT the user putting information in the wrong field! My apologies for the misleading statement (unintentional though it was). Of course, there is still ample opportunity on the ZooBank website for someone to enter the wrong information in the wrong field, but the specific case pointed out by Rod was not an example of such.
Aloha, Rich
1) Seeing data marked up in a standard can make it a lot easier to figure out how to implement that standard yourself. Otherwise it is difficult to see if you are interpreting the standard correctly, and you sometimes find mistakes in the examples, like the one listed above.
For instance, I saw an example where the Datum was recorded as "epsg:4326". I understood that it should be interpreted as "WGS84". But, I did not expect that Datum would appear in that form, and I would not have written a parser to recognize that.
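To make the parser point concrete, here is a minimal Python sketch of a datum normaliser that accepts both forms. The alias table is an assumption for illustration: it covers only the value seen in the example plus a couple of spellings of WGS84, and a real table would need far more entries.

```python
# Assumed alias table, for illustration only. Keys are lower-cased with
# whitespace collapsed; values are the canonical names the parser emits.
DATUM_ALIASES = {
    "epsg:4326": "WGS84",
    "wgs84": "WGS84",
    "wgs 84": "WGS84",
}


def normalise_datum(raw):
    """Map a datum string to a canonical name, or return None if unknown."""
    key = " ".join(raw.strip().lower().split())
    return DATUM_ALIASES.get(key)
```

Returning `None` for an unrecognised value, rather than guessing, lets the caller report the record instead of silently storing the wrong datum.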
In regards to RDF/OWL, having a sample data set allows you to see if you can ask the questions that you want to.
These were the reasons why I was looking for some sample data a while back.
For data that is going to be used in triple stores and queried with SPARQL, it seems best to replace literals with URIs as much as possible. The system seems faster, and more predictable. It appears the system finds a long but standard URI preferable to a string of characters. This makes sense if you think about how this is stored, but it is the opposite of how most people think.
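A minimal Python sketch of that substitution, writing N-Triples lines by hand: a value with a known URI is emitted as a URI, anything else stays a string literal. The mapping table and the choice of the OGC URI for EPSG:4326 are assumptions for the example, not something TDWG has agreed on, and the escaping handles only backslashes and double quotes.

```python
# Assumed mapping from literal values to URIs, for illustration only.
LITERAL_TO_URI = {
    "WGS84": "http://www.opengis.net/def/crs/EPSG/0/4326",
}


def to_ntriple(subject, predicate, value):
    """Serialise one statement as an N-Triples line.

    The object is written as a URI when the value has a known URI,
    otherwise as a plain string literal.
    """
    uri = LITERAL_TO_URI.get(value)
    if uri is not None:
        obj = "<%s>" % uri
    else:
        # Minimal escaping: backslashes and double quotes only.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        obj = '"%s"' % escaped
    return "<%s> <%s> %s ." % (subject, predicate, obj)
```

With a hypothetical subject and predicate under `http://example.org/`, the value "WGS84" comes out as a URI object, so every record that uses that datum points at the same node rather than at separately spelled strings.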
I am a little concerned about adopting a standard which people have not actually tried to work with.
2) It might be advantageous to split out some parts to speed this up.
If we could come up with a proposed representation for location, it would not be too difficult to generate some sample data sets that people could then test.
In the testing phase, we should see if there are any problems with the standard, and what kinds of errors people make when trying to produce records that comply with the standards.
Also, we might solicit feedback from some of the other people in the Semantic Web community. This might help improve the location standard and improve the chances that it might get more widely adopted.
Finally, we could produce example code in several different programming languages and environments that demonstrates how to produce the location markup. For instance, code that creates location records in Ruby on Rails, Python, Java, maybe even FileMaker. That code could then be added as libraries for each of those languages.
If we have a standard table format for location, then the data can be output in a number of different formats including variations of OWL/RDF based on different ontologies.
Maybe something like:
Simple RDF - with ontology (data mixes easily with other data sets)
OWL/RDF - with DL ontology (may only work with DarwinCore Data)
OWL/RDF - experimental ontology (things that people would like to try without complete ratification)
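As a sketch of the first option only, assuming a tab-separated location table with hypothetical columns `id`, `latitude` and `longitude`, the Python below turns each row into N-Triples using the `lat` and `long` terms of the W3C Basic Geo vocabulary. Typing the values as `xsd:decimal` is my choice for the example, and the other two outputs would be further functions over the same rows.

```python
import csv
import io

WGS84_POS = "http://www.w3.org/2003/01/geo/wgs84_pos#"  # W3C Basic Geo vocabulary
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def read_rows(table_text):
    """Parse the tab-separated location table into a list of dicts."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return list(reader)


def row_to_simple_rdf(row):
    """Render one row as N-Triples lines using the Basic Geo lat/long terms."""
    subject = "<%s>" % row["id"]  # the id column is assumed to hold a URI
    return [
        '%s <%slat> "%s"^^<%s> .' % (subject, WGS84_POS, row["latitude"], XSD_DECIMAL),
        '%s <%slong> "%s"^^<%s> .' % (subject, WGS84_POS, row["longitude"], XSD_DECIMAL),
    ]
```

Keeping the table as the single source means a change of ontology only touches the output function, not the data.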
- Pete
I'm not too alarmed that domain specialists might not be able to use the tools that create ontologies if, as you say, they can appropriately provide the input about the domain to those defining the ontology, and can use simple tools for critiquing the result. To make an analogy: OWL ontologies are analogous to the formal specification of a programming language, say Java. OWL instance documents, e.g. the representation of a specimen record, are analogous to programs in that language, say Java programs. You don't need to know how to be a compiler writer, or even be able to read a document like http://java.sun.com/docs/books/jls/third_edition/html/syntax.html#18.1 to be able to write Java programs. What you need are good tools, such as compilers, libraries, API documentation, integrated development environments.
Rod put it more simply than I (he usually does---maybe everybody usually does): "Without some tools to quickly check the metadata that is being produced, I suspect we are going to get a bit of a mess." Or, as I tried to say: Domain specialists should be most concerned about the utility and correctness of instance documents, not formal specifications.
Bob
On Tue, May 12, 2009 at 10:22 AM, Jim Croft jim.croft@gmail.com wrote:
As an aside, there is a technological aspect to this approach which is symptomatic of where TDWG finds itself at the moment. Unless a taxonomist speaks fluent OWL and owns and can drive a copy of Protege, they will not be able to participate, effectively excluding most taxonomists on the planet. (on second thought, this might actually be a good thing...)
How do we propose to engage those who work daily with the 'bricks and mortar' of taxonomy but who are just not equipped to understand what is being done with and presented by the technology? Or has the fabric of taxonomy now finally become too important and too complicated to be left to taxonomists?
Perhaps a translation of the ontologies into a non-technical format that taxonomists could read and comment on might be a way to get greater engagement from a wider range of taxonomists? Given the opportunity and the means, they might be able to offer significant narrative for the vocabularies. Or maybe not...
In the absence of a workable alternative to what is being proposed, I too must offer silence...
jim
Hi Roger, I was wondering if you needed any help?
I have been using Protege. And most recently I have been trying to make my ontology work with the other major ontologies and TDWG.
The items that I needed, that were not there originally, were a SpeciesConceptID (different from the name) and the ability to play well with the LinkedData community.
These two issues seem to have been resolved. :-)
- Pete
Also, Protege is free http://protege.stanford.edu/
participants (9)
- Blum, Stan
- Bob Morris
- greg whitbread
- Gregor Hagedorn
- Jim Croft
- Peter DeVries
- Richard Pyle
- Roderic Page
- Roger Hyam