[tdwg-tag] RE: tdwg-tag Digest, Vol 22, Issue 2

Bob Morris morris.bob at gmail.com
Wed Oct 10 07:13:30 CEST 2007


SKOS Compatibility with OWL is a requirement under discussion, but not
yet accepted as of
http://www.w3.org/TR/2007/WD-skos-ucr-20070516/#Candidate

Suggested design patterns for doing so are in
http://isegserv.itd.rl.ac.uk/public/skos/2007/10/f2f/skos-owl-patterns.html

All of this was to have been discussed at a meeting last weekend
http://www.w3.org/2006/07/SWD/wiki/AmsterdamAgenda

SKOS claims to be about "concepts" in general, but most of its
history, and the most successful examples, seem to be about modeling
thesauri, not descriptions. See especially the SKOS vs OWL discussion
in the SKOS Core Guide
http://www.w3.org/TR/swbp-skos-core-guide/#secmodellingrdf  There the
concept of Henry VIII is painted as a SKOS concern, while the
description of Henry VIII is painted as an OWL concern.
The most telling part of the Guide is the paragraph: "There is a
subtle difference between SKOS Core and other RDF applications like
FOAF [FOAF], in terms of what they allow you to model. SKOS Core
allows you to model a set of concepts (essentially a set of meanings)
as an RDF graph. Other RDF applications, such as FOAF, allow you to
model things like people, organisations, places etc. as an RDF graph.
Technically, SKOS Core introduces a layer of indirection into the
modelling."  At the moment, I believe this means SKOS would probably
do OK for taxa but not specimens, for ecosystem generalities but
not specific ecosystems, and not for observations, and so on.
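
To make the "layer of indirection" concrete, here is a minimal sketch
using plain Python tuples as triples. The example.org URIs are invented
for illustration; only the SKOS and FOAF term names are real. The point
is that the concept and the person are two different resources, and
thesaurus relations such as skos:broader attach to the concept only.

```python
# Assumed, illustrative namespaces; EX is a made-up example namespace.
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"

triples = {
    # The thing itself: a person, described directly (the FOAF/OWL side).
    (EX + "people/henry8", RDF_TYPE, FOAF + "Person"),
    (EX + "people/henry8", FOAF + "name", "Henry VIII"),
    # The concept: a unit of meaning in a thesaurus (the SKOS side).
    (EX + "concepts/henry8", RDF_TYPE, SKOS + "Concept"),
    (EX + "concepts/henry8", SKOS + "prefLabel", "Henry VIII"),
    (EX + "concepts/henry8", SKOS + "broader", EX + "concepts/tudor-monarchs"),
}


def types_of(subject, graph):
    """Return the set of rdf:type values asserted for a subject."""
    return {o for s, p, o in graph if s == subject and p == RDF_TYPE}


def subjects_with(predicate, graph):
    """Return every subject that uses the given predicate."""
    return {s for s, p, o in graph if p == predicate}


print(types_of(EX + "concepts/henry8", triples))
print(subjects_with(SKOS + "broader", triples))
```

By the same reading, a taxon name used as a subject heading fits the
concept side, while a specimen or an observation is a thing and sits on
the other side.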

The current SKOS draft, http://www.w3.org/TR/swbp-skos-core-spec/, is
heavily slanted towards properties and away from classes. Such a slant
in SPM 0.1 was, I thought, what brought us to this point in the first
place, so SKOS could end up being part of the problem, not the
solution.

The number of distinct authors appearing in
http://www.w3.org/2004/02/skos/references is alarmingly small for
something TDWG would encourage.

The tools seem few and narrowly focused.
http://esw.w3.org/topic/SkosDev/ToolShed (So, what else is new....)

An interesting discussion is at
http://ontolog.cim3.net/forum/ontologizing/2007-06/msg00006.html
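
For reference, the search Donald describes in the quoted thread below
(return any InfoItem where at least one ./category/@rdf:resource equals
a given URI) can be sketched outside TAPIR. This is only an
illustration with Python's standard library, not TAPIR filter syntax,
and the sample document and its value texts are invented.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RESOURCE = "{%s}resource" % RDF_NS  # expanded name of the rdf:resource attribute

# Invented sample: the first InfoItem carries two category tags.
SAMPLE = """<SpeciesProfileModel xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/123"/>
      <category rdf:resource="some.resource.org/124"/>
      <value>disperses by wind</value>
    </InfoItem>
  </hasInformation>
  <hasInformation>
    <InfoItem>
      <category rdf:resource="some.resource.org/125"/>
      <value>nocturnal</value>
    </InfoItem>
  </hasInformation>
</SpeciesProfileModel>"""


def info_items_tagged(root, category_uri):
    """Return InfoItems with at least one category pointing at category_uri."""
    return [
        item
        for item in root.iter("InfoItem")
        if any(c.get(RESOURCE) == category_uri for c in item.findall("category"))
    ]


root = ET.fromstring(SAMPLE)
matches = info_items_tagged(root, "some.resource.org/123")
print([m.findtext("value") for m in matches])
```

An InfoItem with several category tags is matched by any one of them,
which is the behaviour Donald asks about.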


Bob





On 10/10/07, Lee Belbin <leebel at netspace.net.au> wrote:
> Hi Guys
>
> While I am far from understanding the technicalities, I do know that there
> must be a clear path from SKOS (if you start there) to Semantic Web
> technologies like OWL (lite or whatever). Maybe there is already.
>
> Lee
>
>
> Lee Belbin
> Manager, TDWG Infrastructure Project
> Email: lee at tdwg.org
> Phone: +61(0)419 374 133
> -----Original Message-----
> From: tdwg-tag-bounces at lists.tdwg.org
> [mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of Eamonn O Tuama
> Sent: Wednesday, 10 October 2007 2:25 AM
> To: tdwg-tag at lists.tdwg.org
> Subject: [tdwg-tag] RE: tdwg-tag Digest, Vol 22, Issue 2
>
> Sorry, I only saw the latest additions to this thread now.
>
> The example database that Marcus presents represents a common way that
> people would be providing data. I think that if we can agree on a common flat
> vocabulary of broad terms (as begun at the SPM workshop) we can at least
> provide his alphabetically sorted list of categorised statements back to a
> consumer (still pretty useful, I think). This could be expressed as a SKOS
> vocabulary as suggested by Roger but we could eventually take it further as
> he also suggests and develop an OWL ontology for reasoning over the
> categories.
>
> Gregor asked if TAPIR finds  "it easier to search elements having specific
> attribute values than elements having child elements with specific value"
> like so -
> <hasInformation>
>   <InfoItem category="voc.x.org/DescriptiveConcepts/Size">
>     <....>
>   </InfoItem>
> </hasInformation>
>
> This could be important if we were to adopt a model where an InfoItem can
> have multiple categories, like so:
>
> <hasInformation>
>   <InfoItem>
>     <category rdf:resource="voc.x.org/DescriptiveConcepts/Dispersal"/>
>     <category rdf:resource="voc.x.org/DescriptiveConcepts/Reproduction"/>
>     <....>
>   </InfoItem>
> </hasInformation>
>
> Éamonn
>
> -----Original Message-----
> From: tdwg-tag-bounces at lists.tdwg.org
> [mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of
> tdwg-tag-request at lists.tdwg.org
> Sent: 09 October 2007 12:00
> To: tdwg-tag at lists.tdwg.org
> Subject: tdwg-tag Digest, Vol 22, Issue 2
>
> Send tdwg-tag mailing list submissions to
>         tdwg-tag at lists.tdwg.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://lists.tdwg.org/mailman/listinfo/tdwg-tag
> or, via email, send a message with subject or body 'help' to
>         tdwg-tag-request at lists.tdwg.org
>
> You can reach the person managing the list at
>         tdwg-tag-owner at lists.tdwg.org
>
> When replying, please edit your Subject line so it is more specific than
> "Re: Contents of tdwg-tag digest..."
>
>
> Today's Topics:
>
>    1. Re: SPM Categories or Subclassing - again. (Gregor Hagedorn)
>    2. Re: SPM Categories or Subclassing - again. (Döring, Markus)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 8 Oct 2007 14:41:39 +0200
> From: "Gregor Hagedorn" <g.m.hagedorn at gmail.com>
> Subject: Re: [tdwg-tag] SPM Categories or Subclassing - again.
> To: tdwg-tag at lists.tdwg.org
> Message-ID:
>         <5ebbead70710080541l3cba340cyf2ebced50b4b9598 at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> In SDD we used what is here called the "tagging approach", i.e. for
> descriptive concepts and characters we use
>
> <Text ref="some.resource.org/123"><Content>Free form text</Content></Text>
>
> In SDD, all element/attribute names refer to classes/attributes in UML or
> Java, whereas the xml values are corresponding values. Expressing concepts
> and characters as classes was considered impractical if the aim is generic
> software (classes are creatable at runtime, but at a price) because the
> number of characters is often very large.
>
> My experience shows that the number of characters in identification sets
> varies between 50 and over 1000. Some studies in manual data integration (in
> LIAS) convinced me that the number of "integratable" characters is often
> less than imagined. A similar case study may also be the FRIDA data sets,
> which limit the number of common characters (common to all plant families)
> to 200, and from thereon use separate characters for each family.
>
> Clearly, when limiting the approach to just "very high level concepts", both
> class and value approaches are possible. The benefit of the SPM 0.2 approach
> is that it does enable reasoning (and may also simplify TAPIR usage).
> However, as presented in Bratislava, I am doubtful that this limit holds. In
> my experience the limit between free form text and categorical data tends to be
> diffuse rather than sharp and my intuition is that the SPM mechanism, being
> extensible, will be extended.
>
> I would like to note that the current TAG strategy is to provide both RDF
> and xml schema. The schema approach to SPM, however, seems to be more and
> more undesirable as the number of descriptive concepts grows.
>
> I have a question for TAPIR:
>
> Does TAPIR find it easier to search elements having specific attribute
> values than elements having child elements with specific value:
>
> <hasInformation>
>   <InfoItem category="voc.x.org/DescriptiveConcepts/Size">
>     <....>
>   </InfoItem>
> </hasInformation>
>
> It might be useful to know. Although RDF allows this for literals, it seems
> to require the rdf:resource="" step for categorical data, so this may or may
> not help.
>
> ----------
>
> PS: SDD calls the concepts "characters" based on tradition, and "character"
> is clearly inappropriate here. However, "Category" to me seems to express nothing
> other than the type of the value. All contextValue and Value in SPM in
> version 0.2 refer to categories. Can anyone suggest a better term? What is
> the opposite of "value"?
>
> I can think of "Class" (value = instance), or - but perhaps too limited
> to descriptions - "Feature" (also in GML). Anything else?
>
> Gregor
>
> -----------------------------------------------------
> Gregor Hagedorn (G.M.Hagedorn at gmail.com) Institute for Plant Virology,
> Microbiology, and Biosafety Federal Research Center for Agriculture and
> Forestry (BBA)
> Königin-Luise-Str. 19      Tel: +49-30-8304-2220
> 14195 Berlin, Germany      Fax: +49-30-8304-2203
>
> ------------------------------
>
> Message: 2
> Date: Mon, 08 Oct 2007 16:39:16 +0200
> From: "Döring, Markus" <m.doering at BGBM.org>
> Subject: Re: [tdwg-tag] SPM Categories or Subclassing - again.
> To: Donald Hobern <dhobern at gbif.org>,   Roger Hyam <roger at tdwg.org>
> Cc: "Kohlbecker, Andreas" <a.kohlbecker at BGBM.org>,
>         tdwg-tag at lists.tdwg.org
> Message-ID: <C3300DB4.76C3%m.doering at BGBM.org>
> Content-Type: text/plain;       charset="US-ASCII"
>
> Donald, Roger,
> The problem we saw with TAPIR is that there are only 2 new concepts defined
> by SPM - category and value. If you do not want to map anything more than
> this in TAPIR models, you probably can produce SPM RDF. But if you want to
> only map the descriptions or distributions you are in trouble.
>
> Imagine a simple database table with 3 columns: taxon, category and value.
> You could map them to 3 concepts namely SPM/taxon, SPM/category and
> SPM/value. You can then create an output model similar to this one here:
>
> <outputModel>
>     <structure>
>       <schema location="http://rs.tdwg.org/tapir/rs/SPM.xsd"/>
>     </structure>
>     <indexingElement path="/SpeciesProfileModel/hasInformation/InfoItem"/>
>     <mapping>
>         <node path="/SpeciesProfileModel/aboutTaxon/@rdf:resource">
>             <concept id="SPM/taxon"/>
>         </node>
>         <node path="/SpeciesProfileModel/hasInformation/InfoItem/category/@rdf:resource">
>             <concept id="SPM/category"/>
>         </node>
>         <node path="/SpeciesProfileModel/hasInformation/InfoItem/value">
>             <concept id="SPM/value"/>
>         </node>
>     </mapping>
> </outputModel>
>
>
> With current provider implementations you will get a new SpeciesProfileModel
> element for every record in your database, i.e. repeating taxa for every
> InfoItem. This is valid SPM and you can actually search on it too with TAPIR
> like Donald wanted to. So here you go. You would just need to know (and
> understand) what categories this provider is using. You can get those
> through an inventory instead of the capabilities, not a real problem.
>
> If someone has a table with a taxon and a column for description, one for
> economic use and one for something else they would have to transform it into
> a simple 3 column table.
>
> My understanding in the beginning was that an initial broad vocabulary
> should be part of SPM. I still think this is very much needed; otherwise
> all the "mashup" will be done by the human end user, who is
> presented with a nice long list of facts, sorted alphabetically.
>
>
> Personally I am undecided about which approach to take. The tagging seems more
> flexible, so yes, let's go for it!
>
> But I can't see why OWL classes are not suited for defining a reusable,
> external vocabulary. Isn't this what they are made for and what GO and all
> the other ontologies use them for rather than just defining data exchange
> formats? At least the tagging approach would free us from imposing a certain
> definition language. I guess some can use SKOS, some OWL and some others
> just RDFS. And of course there will be many different semantic definitions
> in all those languages.
>
>
> Markus
>
>
>
>
>
>
> "Donald Hobern" wrote on 08.10.2007 13:53 Uhr:
>
> > Roger,
> >
> > Your analysis (and the four points) seem perfectly correct to me.  I prefer a
> > plain tagging approach and am particularly dubious about the long term
> > benefits of subclasses which represent multiple inheritance because this would
> > seem to place a greater burden on consumers.
> >
> > However I am not so sure that the tagging approach does prevent the use of
> > TAPIR.  Surely the user is expected to be searching for InfoItems and each
> > InfoItem will include a relative path "./category/@rdf:resource" which could
> > be searchable.  In other words you could search for InfoItems where the
> > concept identified by "./category/@rdf:resource" has the value
> > "some.resource.org/123". Won't that work?  Won't it return any InfoItem which
> > has any combination of category elements provided at least one is identified
> > to the given resource?
> >
> > Of course searching for InfoItems like this may prevent us from using search
> > elements outside the scope of the InfoItem - would this stop us from using a
> > ScientificName specified against the SpeciesProfileModel element?  In other
> > words, would it be better to move the aboutTaxon down to the InfoItem?
> >
> > Donald
> >
> > Roger Hyam wrote:
> >> Hi All,
> >>
> >> I'd like to re-ignite a debate we had over the summer in the light of what
> >> was discussed at Bratislava and things I am thinking about now.
> >>
> >> The Species Profile Model was originally structured something like the
> >> attached UML diagram tagging.png
> >>
> >> We didn't define a class "Category" but just left it as being any valid URI
> >> in the RDF style of doing things.
> >>
> >> Markus and several others pointed out that this would not work with TAPIR
> >> very well as all the InfoItems will look the same. Something like this:
> >>
> >> <SpeciesProfileModel>
> >>     <hasInformation>
> >>         <InfoItem>
> >>             <category rdf:resource="some.resource.org/123" />
> >>             <....>
> >>         </InfoItem>
> >>     </hasInformation>
> >>     <hasInformation>
> >>         <InfoItem>
> >>             <category rdf:resource="some.resource.org/124" />
> >>             <....>
> >>         </InfoItem>
> >>     </hasInformation>
> >>     <hasInformation>
> >>         <InfoItem>
> >>             <category rdf:resource="some.resource.org/125" />
> >>             <....>
> >>         </InfoItem>
> >>     </hasInformation>
> >> </SpeciesProfileModel>
> >>
> >> The xpaths to the data represented by <...> would all be the same.
> >>
> >> They suggested something like the next diagram:
> >>
> >> Where info item is subclassed. This allows RDF to be serialized like this:
> >>
> >> <SpeciesProfileModel>
> >>     <hasInformation>
> >>         <Ecology>
> >>             <....>
> >>         </Ecology>
> >>     </hasInformation>
> >>     <hasInformation>
> >>         <Behaviour>
> >>             <....>
> >>         </Behaviour>
> >>     </hasInformation>
> >>     <hasInformation>
> >>         <BehaviouralEcology>
> >>             <....>
> >>         </BehaviouralEcology>
> >>     </hasInformation>
> >> </SpeciesProfileModel>
> >>
> >> The xpaths to the <...> are all different and the model becomes TAPIR
> >> friendly.
> >>
> >> At the time I wasn't too bothered either way because I saw this as an
> >> artifact of serialization. The same thing could be written:
> >>
> >> <SpeciesProfileModel>
> >>     <hasInformation>
> >>         <InfoItem>
> >>             <rdf:type rdf:resource="some.resource.org/Ecology" />
> >>             <....>
> >>         </InfoItem>
> >>     </hasInformation>
> >>     <hasInformation>
> >>         <InfoItem>
> >>             <rdf:type rdf:resource="some.resource.org/Behaviour" />
> >>             <....>
> >>         </InfoItem>
> >>     </hasInformation>
> >>     <hasInformation>
> >>         <InfoItem>
> >>             <rdf:type rdf:resource="some.resource.org/BehaviouralEcology" />
> >>             <....>
> >>         </InfoItem>
> >>     </hasInformation>
> >> </SpeciesProfileModel>
> >>
> >> With more or less the same meaning and looking just like a tagging example.
> >>
> >> This was a mistake on my part because of course it isn't the same meaning,
> >> just a similar serialization - an RDF type really needs to point to a class
> >> and can't point to any old vocabulary term, as it could if it were treated
> >> like a tag.
> >>
> >> I have just added the category property back into InfoItem for discussion
> >> purposes.
> >>
> >> I am concerned about going down the subclassing route for a few reasons.
> >>
> >> 1) Building a class hierarchy is difficult. The discussions we had in
> >> Bratislava highlighted this. The examples put together in the current
> >> vocabulary sparked a debate as to how things should be organized. There are
> >> issues with the diamond problem in multiple inheritance - if contradictory
> >> semantics occur in multiple inheritance routes to a class, which has priority?
> >> If we don't have multiple inheritance, how do we have InfoItems that are about
> >> both Ecology and Behaviour, for example, or any other two subjects such as
> >> Dispersal and Asexual reproduction?
> >>
> >> 2) If a class hierarchy is required for analysis/inference it can be
> >> superimposed on a tag based transfer protocol using OWL necessary and
> >> sufficient properties. In fact this is arguably a better way of approaching
> >> the situation than building a class hierarchy a priori.
> >>
> >> 3) The motivation for going down the subclassing route is largely so it will
> >> work with the transport protocol - which sets alarm bells ringing for me.
> >>
> >> 4) It precludes us from using something like SKOS to build our categories.
> >>
> >> "SKOS or Simple Knowledge Organisation System is a family of formal languages
> >> designed for representation of thesauri, classification schemes, taxonomies,
> >> subject-heading systems, or any other type of structured controlled
> >> vocabulary. SKOS is built upon RDF and RDFS, and its main objective is to
> >> enable easy publication of controlled structured vocabularies for the
> >> Semantic Web. SKOS is currently developed within the W3C framework."
> >> (http://en.wikipedia.org/wiki/SKOS)
> >>
> >> I would really like to be able to say to "domain experts" that they "just"
> >> need to build a SKOS vocabulary for the terms used in their domain and then
> >> use the URIs of these terms for tagging information that is passed between
> >> providers, rather than specify that they need to construct a more formal
> >> ontology of classes. I appreciate that one could argue that SKOS terms are
> >> like classes but they form part of a structure that is specifically designed
> >> for having the debates that TDWG needs to have around the vocabulary part of
> >> the standards (rather than the exchange parts).
> >>
> >> I would be grateful for people's thoughts and criticisms.
> >>
> >> All the best,
> >>
> >> Roger
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> tdwg-tag mailing list
> >> tdwg-tag at lists.tdwg.org
> >> http://lists.tdwg.org/mailman/listinfo/tdwg-tag
> >>
>
>
>
>
> ------------------------------
>
> _______________________________________________
> tdwg-tag mailing list
> tdwg-tag at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-tag
>
>
> End of tdwg-tag Digest, Vol 22, Issue 2
> ***************************************
>
>
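
On Markus's point above that a provider with one column per kind of
information would have to transform it into a simple 3 column table
(taxon, category, value): a rough sketch of that reshaping follows. The
column names and sample rows are invented, and a real provider would
presumably map each column name to a category URI from an agreed
vocabulary rather than use the bare name.

```python
def to_three_columns(rows, taxon_column="taxon"):
    """Turn one-column-per-category rows into (taxon, category, value) rows.

    Empty cells are skipped so that no InfoItem is produced for them.
    """
    out = []
    for row in rows:
        for column, value in row.items():
            if column == taxon_column or value in (None, ""):
                continue
            out.append((row[taxon_column], column, value))
    return out


# Invented sample rows; column names stand in for category identifiers.
wide = [
    {"taxon": "Abies alba", "description": "Tall conifer", "economic_use": "Timber"},
    {"taxon": "Bellis perennis", "description": "Small daisy", "economic_use": ""},
]

for record in to_three_columns(wide):
    print(record)
```

Each output row corresponds to one InfoItem, which is also why the taxon
repeats for every InfoItem in the output Markus describes.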


-- 
Robert A. Morris
Professor of Computer Science
UMASS-Boston
ram at cs.umb.edu
http://bdei.cs.umb.edu/
http://www.cs.umb.edu/~ram
http://www.cs.umb.edu/~ram/calendar.html
phone (+1)617 287 6466


