Information about how controlled vocabularies would be handled under the draft Standards Documentation Specification
At the TDWG meeting in December, I led an informative session describing the main points of the draft Standards Documentation Specification (SDS) and its sister standard, the draft Vocabulary Management Specification. At that session, some participants seemed to be taken aback by the SDS's prescription that controlled vocabularies should be SKOS concept schemes rather than ontologies. There wasn't enough time at the meeting to fully explore that issue, and I hoped that it would come up for further discussion during the public comment period.
We are now midway through the 30-day public comment period on the SDS, and so far that issue has not come up. I was recently listening to the recording of John Wieczorek's nice Darwin Core Hour presentation on controlled vocabularies, and it was apparent to me that the creation of controlled vocabularies is an issue of interest to many in the community. So I've written a blog post (http://baskauf.blogspot.com/2017/03/controlled-values-again.html) that attempts to explain in non-technical terms how the SDS specifies that controlled vocabularies should be expressed in machine-readable form. For those who are interested in the gory details, I've also included at the end a more detailed explanation of the rationale for specifying that controlled vocabularies should, in most cases, be described as SKOS concept schemes rather than ontologies.
If you care about the creation of controlled vocabularies, you should take a look at this post and create an official comment if there are things you don't like about the approach taken in the proposed specification. Our review manager, Dag Endresen, has requested that issues be raised on the issue tracker at https://github.com/tdwg/vocab/issues . However, historically back-and-forth discussion about proposed standards has also taken place on this list, so I think that responding here for clarification and discussion would be appropriate prior to submitting an official comment.
Steve
Steve,
Well, I'm not sure if +1s are worthwhile, but I think you defined the issues nicely.
I hope that people don't get caught up in the "preferred labels" issue. If it is the major issue, then an ontology may be overkill (e.g. why bother minting a URI if, for all intents and practical purposes, you're going to require a unique string reference). I think that CVs could be ontologies or SKOS concept schemes; either should lead to improvements, and both are in theory better than simple strings.
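To make the URI-versus-string point concrete, here is a minimal sketch in Python of a single controlled value modeled as a SKOS concept with preferred labels in two languages. The `example.org` URIs, the `Concept` class, the helper names, and the label strings are all assumptions made up for illustration; none of them come from the SDS or from Darwin Core. The idea is only that the URI stays fixed while the human-readable labels can vary by language.

```python
from dataclasses import dataclass

# Hypothetical scheme URI, for illustration only (not a real TDWG IRI).
SCHEME_URI = "http://example.org/cv/basisOfRecord"


@dataclass
class Concept:
    """One controlled value: a stable URI plus one preferred label per language."""
    uri: str
    pref_labels: dict  # language tag -> preferred label


def to_turtle(concept, scheme_uri):
    """Serialize a concept as Turtle using skos:Concept, skos:inScheme, skos:prefLabel.

    Assumes the labels contain no double quotes or backslashes (no escaping is done).
    """
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"<{concept.uri}> a skos:Concept ;",
        f"    skos:inScheme <{scheme_uri}> ;",
    ]
    labels = [
        f'    skos:prefLabel "{label}"@{lang}'
        for lang, label in sorted(concept.pref_labels.items())
    ]
    lines.append(" ;\n".join(labels) + " .")
    return "\n".join(lines)


def resolve(label, lang, concepts):
    """Map a label string in a given language back to the concept URI, if any."""
    for concept in concepts:
        if concept.pref_labels.get(lang) == label:
            return concept.uri
    return None


# Illustrative concept; the labels are placeholders, not agreed translations.
preserved = Concept(
    uri=SCHEME_URI + "/PreservedSpecimen",
    pref_labels={"en": "Preserved Specimen", "es": "Ejemplar preservado"},
)

if __name__ == "__main__":
    print(to_turtle(preserved, SCHEME_URI))
    print(resolve("Ejemplar preservado", "es", [preserved]))
```

In this sketch a data provider could keep typing a string, and a consumer would resolve it to the same URI whichever language the label was in. Whether that is enough, or whether the additional machinery of an ontology is wanted, is the choice under discussion.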
* I'm a user who is unhappy with what is being expressed under a DC term
* I want to propose an improvement (let's call this a CV)
* I understand what I want to emphasize (e.g. precision, simplicity of reference, ability to infer, ability to compute, multi-lingual support, ability to quickly amend, ability to prevent quick changes, ability to quickly select, etc.)
* I consult a bullet point list that describes the pros/cons of using a SKOS or an ontology
* I select one or the other formalization, and use it to propose the improvement
A simple bullet point list that specifically points out the pros/cons of using SKOS vs. an ontology would be key in this scenario: if I use X, I get Y and Z; if I use A, I get B and C, where Y, Z, B, and C are things the user wants to emphasize.
Matt
On Sat, Mar 11, 2017 at 8:50 AM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: PMB 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 322-4942 If you fax, please phone or email so that I will know to look for it. http://bioimages.vanderbilt.edu http://vanderbilt.edu/trees
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Matt,
I think there has been sentiment in the past to develop additional informative documents to go along with (but not be a part of) our standards that would provide more examples, how-tos, etc. That would certainly be helpful for a group wanting to develop a controlled vocabulary. At this point we want to get things right in the standards documents themselves (particularly the normative parts). The helpful ancillary documents can be developed as we get more experience applying the standard, much like what's going on with the DwC Q&A site.
Steve