RE: tdwg-tag Digest, Vol 22, Issue 2
Sorry, I only saw the latest additions to this thread now.
The example database that Markus presents reflects a common way that people would be providing data. I think that if we can agree on a common flat vocabulary of broad terms (as begun at the SPM workshop), we can at least provide his alphabetically sorted list of categorised statements back to a consumer (still pretty useful, I think). This could be expressed as a SKOS vocabulary, as suggested by Roger, but we could eventually take it further, as he also suggests, and develop an OWL ontology for reasoning over the categories.
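As a rough illustration of the "alphabetically sorted list of categorised statements" idea, the Python sketch below groups free-text statements under a flat vocabulary and returns them in alphabetical order of category. The vocabulary terms, the example rows, and the names `VOCABULARY` and `group_by_category` are all made up for illustration; nothing here is taken from the SPM workshop list or from the example database itself.

```python
from collections import defaultdict

# Hypothetical flat vocabulary of broad terms (not the actual SPM list).
VOCABULARY = {"Dispersal", "Reproduction", "Size"}


def group_by_category(statements, vocabulary=VOCABULARY):
    """Return [(category, [texts])] sorted alphabetically by category.

    Statements whose category is not in the vocabulary are collected
    under "Uncategorised" rather than dropped.
    """
    grouped = defaultdict(list)
    for category, text in statements:
        key = category if category in vocabulary else "Uncategorised"
        grouped[key].append(text)
    return sorted(grouped.items())


# Made-up rows standing in for a provider's database.
rows = [
    ("Size", "Up to 2 m tall."),
    ("Dispersal", "Seeds are wind-dispersed."),
    ("Habit", "Perennial herb."),
    ("Size", "Leaves 5-10 cm long."),
]

report = group_by_category(rows)
for category, texts in report:
    print(category)
    for text in texts:
        print("  " + text)
```

A flat list like this needs no reasoning at all; an OWL ontology would only come into play once the categories have to be related to each other.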
Gregor asked if TAPIR finds "it easier to search elements having specific attribute values than elements having child elements with specific value" like so:

<hasInformation>
  <InfoItem category="voc.x.org/DescriptiveConcepts/Size">
    <....>
  </InfoItem>
This could be important if we were to adopt a model where an InfoItem can have multiple categories, like so:
<hasInformation>
  <InfoItem>
    <category="voc.x.org/DescriptiveConcepts/Dispersal"/>
    <category="voc.x.org/DescriptiveConcepts/Reproduction"/>
    <....>
  </InfoItem>
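To make the two query shapes concrete, here is a minimal Python sketch using the standard library's ElementTree and its limited XPath support. It says nothing about how TAPIR itself evaluates filters; it only shows that both forms can be expressed as a single predicate. The `id` attributes, the `<category>` child element carrying the term as text, and the function names are assumptions added for the example.

```python
import xml.etree.ElementTree as ET

DISPERSAL = "voc.x.org/DescriptiveConcepts/Dispersal"
REPRODUCTION = "voc.x.org/DescriptiveConcepts/Reproduction"
SIZE = "voc.x.org/DescriptiveConcepts/Size"

# One category per InfoItem, carried as an attribute value.
ATTRIBUTE_STYLE = ET.fromstring(f"""
<hasInformation>
  <InfoItem id="a" category="{SIZE}"/>
  <InfoItem id="b" category="{DISPERSAL}"/>
</hasInformation>
""")

# Several categories per InfoItem, carried as repeated child elements.
# (Assumed layout: the term is the text content of <category>.)
CHILD_STYLE = ET.fromstring(f"""
<hasInformation>
  <InfoItem id="a"><category>{SIZE}</category></InfoItem>
  <InfoItem id="b">
    <category>{DISPERSAL}</category>
    <category>{REPRODUCTION}</category>
  </InfoItem>
</hasInformation>
""")


def ids_by_attribute(root, term):
    """Ids of InfoItems whose category attribute equals term."""
    return [i.get("id") for i in root.findall(f"InfoItem[@category='{term}']")]


def ids_by_child(root, term):
    """Ids of InfoItems having a <category> child whose text equals term."""
    return [i.get("id") for i in root.findall(f"InfoItem[category='{term}']")]


print(ids_by_attribute(ATTRIBUTE_STYLE, SIZE))
print(ids_by_child(CHILD_STYLE, REPRODUCTION))
```

The attribute form can hold only one value per InfoItem, so the multi-category model more or less forces the child-element form (or some list-valued attribute convention).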
Éamonn
-----Original Message-----
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of tdwg-tag-request@lists.tdwg.org
Sent: 09 October 2007 12:00
To: tdwg-tag@lists.tdwg.org
Subject: tdwg-tag Digest, Vol 22, Issue 2
Today's Topics:
1. Re: SPM Categories or Subclassing - again. (Gregor Hagedorn)
2. Re: SPM Categories or Subclassing - again. (Döring, Markus)
----------------------------------------------------------------------
Message: 1
Date: Mon, 8 Oct 2007 14:41:39 +0200
From: "Gregor Hagedorn" g.m.hagedorn@gmail.com
Subject: Re: [tdwg-tag] SPM Categories or Subclassing - again.
To: tdwg-tag@lists.tdwg.org
Message-ID: 5ebbead70710080541l3cba340cyf2ebced50b4b9598@mail.gmail.com
Content-Type: text/plain; charset="iso-8859-1"
In SDD we used what is here called the "tagging approach", i.e. for descriptive concepts and characters we use
<Text ref="some.resource.org/123"><Content>Free form text</Content></Text>
In SDD, all element/attribute names refer to classes/attributes in UML or Java, whereas the xml values are corresponding values. Expressing concepts and characters as classes was considered impractical if the aim is generic software (classes are creatable at runtime, but at a price) because the number of characters is often very large.
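The trade-off can be sketched in Python (used here only as a stand-in for the UML/Java setting described above). In the tagging approach there is one generic class and the concept is just a data value; in the class approach each concept becomes its own class, which can indeed be created at runtime. The names `Text`, `Statement`, `make_concept_class` and the `Character0` ... `Character999` identifiers are hypothetical, and the figure of 1000 simply mirrors the upper range mentioned in this thread.

```python
from dataclasses import dataclass


# Tagging approach: one generic class; the concept is a value of "ref".
@dataclass
class Text:
    ref: str      # identifier of the descriptive concept or character
    content: str  # free-form text


# Class approach: one class per concept, derived from a common base.
class Statement:
    def __init__(self, content):
        self.content = content


def make_concept_class(name):
    """Create a new subclass of Statement at runtime."""
    return type(name, (Statement,), {})


character_names = [f"Character{i}" for i in range(1000)]

# 1000 characters handled by a single class ...
tagged = [Text(ref=name, content="Free form text") for name in character_names]

# ... versus 1000 generated classes that generic software must discover
# and handle by name.
concept_classes = {name: make_concept_class(name) for name in character_names}

print(len({type(t) for t in tagged}), "class in the tagging approach")
print(len(concept_classes), "classes in the class approach")
```

Generating the classes is cheap; the "price" is more likely in everything downstream (serialisation, user interfaces, queries) that has to cope with types it has never seen before.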
My experience shows that the number of characters in identification sets varies between 50 and over 1000. Some studies in manual data integration (in LIAS) convinced me that the number of "integratable" characters is often smaller than imagined. A similar case study may be the FRIDA data sets, which limit the number of common characters (common to all plant families) to 200, and from there on use separate characters for each family.
Clearly, when limiting the approach to just "very high level concepts", both class and value approaches are possible. The benefit of the SPM 0.2 approach is that it enables reasoning (and may also simplify TAPIR usage). However, as presented in Bratislava, I am doubtful that this limit holds. In my experience the boundary between free-form text and categorical data tends to be diffuse rather than sharp, and my intuition is that the SPM mechanism, being extensible, will be extended.
I would like to note that the current TAG strategy is to provide both RDF and xml schema. The schema approach to SPM, however, seems to be more and more undesirable as the number of descriptive concepts grows.
I have a question for TAPIR:
Does TAPIR find it easier to search elements having specific attribute values than elements having child elements with specific value:
<hasInformation>
  <InfoItem category="voc.x.org/DescriptiveConcepts/Size">
    <....>
  </InfoItem>
It might be useful to know. Although RDF allows this for literals, it seems to require the rdf:resource="" step for categorical data, so this may or may not help.
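For reference, the difference looks roughly like this in RDF/XML: a literal value may be abbreviated as a property attribute on the node element, whereas a value that is itself a resource is normally written as a child property element carrying rdf:resource. The sketch below reads such a fragment with Python's standard ElementTree, which is a plain XML parser and not an RDF parser, so it shows only the syntax. The `ex:` namespace, the property names `ex:content` and `ex:category`, the item URI, and the `http://` prefix on the vocabulary term are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/spm#"  # hypothetical namespace

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <ex:InfoItem rdf:about="http://example.org/item/1"
               ex:content="Free form text">
    <ex:category rdf:resource="http://voc.x.org/DescriptiveConcepts/Size"/>
  </ex:InfoItem>
</rdf:RDF>
"""

root = ET.fromstring(DOC)
item = root.find(f"{{{EX_NS}}}InfoItem")

# Literal value: available directly as an attribute of the item.
literal = item.get(f"{{{EX_NS}}}content")

# Resource value: one step further down, on a child element.
category = item.find(f"{{{EX_NS}}}category").get(f"{{{RDF_NS}}}resource")

print(literal)
print(category)
```

So in the RDF serialisation the category ends up as an attribute value again, but on a child element rather than on the InfoItem itself.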
----------
PS: SDD calls the concepts "characters" based on tradition, and "character" is clearly inappropriate. However, "Category" to me seems to express nothing other than the type of the value. All contextValue and Value in SPM version 0.2 refer to categories. Can anyone suggest a better term? What is the opposite of "value"?
I can think of "Class" (value = instance), or, though perhaps too limited to descriptions, "Feature" (also in GML). Anything else?
Gregor
-----------------------------------------------------
Gregor Hagedorn (G.M.Hagedorn@gmail.com)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203
Hi Guys
While I am far from understanding the technicalities, I do know that there must be a clear path from SKOS (if you start there) to Semantic Web technologies like OWL (lite or whatever). Maybe there is already.
Lee
Lee Belbin
Manager, TDWG Infrastructure Project
Email: lee@tdwg.org
Phone: +61(0)419 374 133

-----Original Message-----
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Eamonn O Tuama
Sent: Wednesday, 10 October 2007 2:25 AM
To: tdwg-tag@lists.tdwg.org
Subject: [tdwg-tag] RE: tdwg-tag Digest, Vol 22, Issue 2
[...]
SKOS Compatibility with OWL is a requirement under discussion, but not yet accepted as of http://www.w3.org/TR/2007/WD-skos-ucr-20070516/#Candidate
Suggested design patterns for doing so are in http://isegserv.itd.rl.ac.uk/public/skos/2007/10/f2f/skos-owl-patterns.html
All of this was to have been discussed at a meeting last weekend http://www.w3.org/2006/07/SWD/wiki/AmsterdamAgenda
SKOS claims to be about "concepts" in general, but most of its history, and its most successful examples, seem to be about modeling thesauri, not descriptions. See especially the SKOS vs OWL discussion in the SKOS Core Guide http://www.w3.org/TR/swbp-skos-core-guide/#secmodellingrdf There the concept of Henry VIII is painted as a SKOS concern, while the description of Henry VIII is painted as an OWL concern. The most telling part of the Guide is the paragraph: "There is a subtle difference between SKOS Core and other RDF applications like FOAF [FOAF], in terms of what they allow you to model. SKOS Core allows you to model a set of concepts (essentially a set of meanings) as an RDF graph. Other RDF applications, such as FOAF, allow you to model things like people, organisations, places etc. as an RDF graph. Technically, SKOS Core introduces a layer of indirection into the modelling." At the moment, I believe this means SKOS would probably do OK for taxa but not specimens, for ecosystem generalities but not specific ecosystems, and not for observations, etc.
The current SKOS draft, http://www.w3.org/TR/swbp-skos-core-spec/, is heavily slanted towards properties and away from classes. Such a slant in SPM 0.1 was, I thought, what brought us to this point in the first place, so SKOS could end up being part of the problem, not the solution.
The number of distinct authors appearing in http://www.w3.org/2004/02/skos/references is alarmingly small for something TDWG would encourage.
The tools seem few and narrowly focused. http://esw.w3.org/topic/SkosDev/ToolShed (So, what else is new....)
An interesting discussion is at http://ontolog.cim3.net/forum/ontologizing/2007-06/msg00006.html
Bob
On 10/10/07, Lee Belbin leebel@netspace.net.au wrote:
[...]
One of the most interesting (but elusive) documents I came across while educating myself about SKOS is the Working Draft "Best Practice Recipes for Publishing RDF Vocabularies" http://www.w3.org/TR/swbp-vocab-pub/
I see a reference Roger made to it on the tdwg-tapir mailing list, but no other TDWG discussion of it. Perhaps TAG should look at it for TDWG applicability.
Gregor will especially like it. :-)
Bob
> http://www.w3.org/TR/swbp-vocab-pub/
> Gregor will especially like it. :-)
:-)
Indeed, the document convinced me that HTTP + PURLs + content negotiation (which is the chosen protocol for the semantic web!) would have been a better choice than LSIDs. The document is referenced on several Wiki pages, e.g. http://wiki.tdwg.org/twiki/bin/view/GUID/CommunityPracticesForHttp-basedGUID...; it helped me understand why the semantic web can be based on HTTP at all, and that the perceived problems that initially led us to opt for LSIDs probably do not exist.
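For anyone who has not looked at content negotiation, the core idea is small: one HTTP identifier, and the server picks a representation (RDF for machines, HTML for people) from the client's Accept header. The Python sketch below is a deliberately simplified, offline illustration of that selection step. It handles q-values and `*/*` but ignores partial wildcards such as `text/*`, and it is not taken from the W3C recipes. The `example.org` URLs and the names `parse_accept`, `negotiate` and `REPRESENTATIONS` are hypothetical.

```python
def parse_accept(header):
    """Return [(media_type, q)] from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


# Hypothetical representations of one vocabulary term.
REPRESENTATIONS = {
    "application/rdf+xml": "http://example.org/vocab/Size.rdf",
    "text/html": "http://example.org/vocab/Size.html",
}


def negotiate(accept_header, available=REPRESENTATIONS, default="text/html"):
    """Pick the URL of the best available representation, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return available[media_type]
        if media_type == "*/*":
            return available[default]
    return None


print(negotiate("application/rdf+xml"))
print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
```

A semantic web client asking for `application/rdf+xml` and a browser asking for `text/html` would thus be sent to different documents for the same identifier.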
Gregor
participants (4)
- Bob Morris
- Eamonn O Tuama
- Gregor Hagedorn
- Lee Belbin