RE: tdwg-tag Digest, Vol 22, Issue 2
Markus,
I'm trying to relate your revised model below to the present description on http://rs.tdwg.org/ontology/voc/SpeciesProfileModel.rdf .
You state - "The problem we saw with TAPIR is that there are only 2 new concepts defined by SPM - category and value. If you do not want to map anything more than this in TAPIR models, you probably can produce SPM RDF."
But what about the other properties defined for InfoItem? This is the full list: associatedTaxon, category, context, contextOccurrence, contextValue, hasContent, hasValue.
Are you suggesting that we drop 'associatedTaxon', the three context properties (context, contextOccurrence, contextValue) and the 'hasValue' property (information about a taxon in the form of a controlled vocabulary term), leaving just 'category' and 'hasContent'? Is it possible to add additional columns in your database to accommodate these fields, or do we run into trouble with TAPIR?
Éamonn
-----Original Message----- From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of tdwg-tag-request@lists.tdwg.org Sent: 09 October 2007 12:00 To: tdwg-tag@lists.tdwg.org Subject: tdwg-tag Digest, Vol 22, Issue 2
Send tdwg-tag mailing list submissions to tdwg-tag@lists.tdwg.org
To subscribe or unsubscribe via the World Wide Web, visit http://lists.tdwg.org/mailman/listinfo/tdwg-tag or, via email, send a message with subject or body 'help' to tdwg-tag-request@lists.tdwg.org
You can reach the person managing the list at tdwg-tag-owner@lists.tdwg.org
When replying, please edit your Subject line so it is more specific than "Re: Contents of tdwg-tag digest..."
Today's Topics:
1. Re: SPM Categories or Subclassing - again. (Gregor Hagedorn)
2. Re: SPM Categories or Subclassing - again. (Döring, Markus)
----------------------------------------------------------------------
Message: 1
Date: Mon, 8 Oct 2007 14:41:39 +0200
From: "Gregor Hagedorn" g.m.hagedorn@gmail.com
Subject: Re: [tdwg-tag] SPM Categories or Subclassing - again.
To: tdwg-tag@lists.tdwg.org
Message-ID: 5ebbead70710080541l3cba340cyf2ebced50b4b9598@mail.gmail.com
Content-Type: text/plain; charset="iso-8859-1"
In SDD we used what is here called the "tagging approach", i.e. for descriptive concepts and characters we use
<Text ref="some.resource.org/123"><Content>Free form text</Content></Text>
In SDD, all element/attribute names refer to classes/attributes in UML or Java, whereas the xml values are corresponding values. Expressing concepts and characters as classes was considered impractical if the aim is generic software (classes are creatable at runtime, but at a price) because the number of characters is often very large.
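The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not SDD or SPM code: the names TaggedText, SizeText and make_concept_class, and the second identifier some.resource.org/124, are hypothetical.

```python
from dataclasses import dataclass


# Tagging approach: one generic class; the concept is carried as data.
@dataclass
class TaggedText:
    ref: str      # identifier of the descriptive concept or character
    content: str  # free-form text


# Class approach: one class per concept, so each new concept needs a new class.
@dataclass
class SizeText:
    content: str


def make_concept_class(name: str) -> type:
    """Create a concept class at runtime. This is possible, but generic
    software must then cope with a type it has never seen before."""
    def __init__(self, content: str) -> None:
        self.content = content
    return type(name, (), {"__init__": __init__})


# Hypothetical data; the second ref is invented for illustration.
tagged = [
    TaggedText("some.resource.org/123", "Free form text"),
    TaggedText("some.resource.org/124", "More free form text"),
]

# Generic code can select by concept without any concept-specific type.
matches = [t for t in tagged if t.ref == "some.resource.org/123"]
```

With the tagging approach, adding a thousand characters adds a thousand identifiers but no new types; with the class approach, each one becomes a class that generic software has to discover.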
My experience shows that the number of characters in identification sets varies between 50 and over 1000. Some studies in manual data integration (in LIAS) convinced me that the number of "integratable" characters is often smaller than imagined. A similar case study may be the FRIDA data sets, which limit the number of common characters (common to all plant families) to 200, and from there on use separate characters for each family.
Clearly, when limiting the approach to just "very high level concepts", both class and value approaches are possible. The benefit of the SPM 0.2 approach is that it enables reasoning (and may also simplify TAPIR usage). However, as presented in Bratislava, I am doubtful that this limit holds. In my experience the boundary between free-form text and categorical data tends to be diffuse rather than sharp, and my intuition is that the SPM mechanism, being extensible, will be extended.
I would like to note that the current TAG strategy is to provide both RDF and XML Schema. The schema approach to SPM, however, seems to become more and more undesirable as the number of descriptive concepts grows.
I have a question for TAPIR:
Does TAPIR find it easier to search for elements that have a specific attribute value than for elements that have a child element with a specific value? For example:
<hasInformation>
  <InfoItem category="voc.x.org/DescriptiveConcepts/Size">
    <....>
  </InfoItem>
It might be useful to know. Although RDF allows this for literals, it seems to require the rdf:resource="" step for categorical data, so this may or may not help.
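To make the two shapes concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. It says nothing about how TAPIR itself evaluates filters; it only shows the two searches side by side. The hasContent child is a placeholder for the elided InfoItem body, and the child-element variant of category is an assumed layout, not taken from SPM.

```python
import xml.etree.ElementTree as ET

SIZE = "voc.x.org/DescriptiveConcepts/Size"

# Variant 1: the category is an attribute of InfoItem.
ATTRIBUTE_STYLE = """
<hasInformation>
  <InfoItem category="voc.x.org/DescriptiveConcepts/Size">
    <hasContent>Free form text</hasContent>
  </InfoItem>
</hasInformation>
"""

# Variant 2 (assumed layout): the category is a child element of InfoItem.
CHILD_STYLE = """
<hasInformation>
  <InfoItem>
    <category>voc.x.org/DescriptiveConcepts/Size</category>
    <hasContent>Free form text</hasContent>
  </InfoItem>
</hasInformation>
"""

# Predicate on an attribute value.
by_attribute = ET.fromstring(ATTRIBUTE_STYLE).findall(
    f".//InfoItem[@category='{SIZE}']"
)

# Predicate on the text of a child element.
by_child = ET.fromstring(CHILD_STYLE).findall(
    f".//InfoItem[category='{SIZE}']"
)
```

In an XPath-style engine both searches are a single predicate, so any difference for TAPIR would come from how its concept mapping treats attributes versus child elements rather than from the XML shape as such.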
----------
PS: SDD calls the concepts "characters" by tradition, and "character" is clearly inappropriate here. However, "Category" seems to me to express nothing other than the type of the value. All contextValue and Value in SPM version 0.2 refer to categories. Can anyone suggest a better term? What is the opposite of "value"?
I can think of "Class" (value = instance), or - but perhaps too limited to descriptions - "Feature" (also in GML). Anything else?
Gregor
-----------------------------------------------------
Gregor Hagedorn (G.M.Hagedorn@gmail.com)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19           Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203
Éamonn, this was just a simplified example to illustrate the TAPIR problem. It was not meant to be a new suggestion of any kind.
That said, if you have read my last post in response to Gregor, I tend to think that we would be better off creating a basic set of 3 or 4 InfoItem classes, which would be more concise than offering different options for expressing the same thing. If you look at the existing properties, don't they boil down to 4 attributes?
associatedTaxon                  -> ORGANISM INTERACTION
category, context, contextValue  -> TAG
contextOccurrence                -> REGION
hasContent, hasValue             -> INFO
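As a rough sketch, the reduction can be written as a lookup table in Python. It assumes that context and contextValue fold into TAG together with category, and that hasValue folds into INFO together with hasContent; the names PROPERTY_TO_ATTRIBUTE and group_by_attribute are hypothetical.

```python
# Assumed reading of the mapping: seven InfoItem properties, four attributes.
PROPERTY_TO_ATTRIBUTE = {
    "associatedTaxon": "ORGANISM INTERACTION",
    "category": "TAG",
    "context": "TAG",
    "contextValue": "TAG",
    "contextOccurrence": "REGION",
    "hasContent": "INFO",
    "hasValue": "INFO",
}


def group_by_attribute(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Invert the mapping: attribute -> list of properties it absorbs."""
    groups: dict[str, list[str]] = {}
    for prop, attribute in mapping.items():
        groups.setdefault(attribute, []).append(prop)
    return groups


groups = group_by_attribute(PROPERTY_TO_ATTRIBUTE)
```

Under that reading, the groups dictionary has four keys, one per proposed attribute.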
Markus
-----Original Message----- From: tdwg-tag-bounces@lists.tdwg.org on behalf of Eamonn O Tuama Sent: Wed 10/10/2007 2:39 PM To: tdwg-tag@lists.tdwg.org Cc: Subject: [tdwg-tag] RE: tdwg-tag Digest, Vol 22, Issue 2
participants (2)
-
Döring, Markus
-
Eamonn O Tuama