[tdwg-content] Use of dwc:xxxxID terms in RDF, was Re: dwc:associatedOccurrences

Bob Morris morris.bob at gmail.com
Tue Sep 7 16:30:28 CEST 2010


I've produced a limping OWL2 version in connection with several
projects I'm involved with. Later this week I'll have it tuned a
little more and put it on the table for discussion. It likely could
be informed by and/or inform any advice about use of the current
dwcterms.rdf.

Bob


On Tue, Sep 7, 2010 at 6:42 AM, Steve Baskauf
<steve.baskauf at vanderbilt.edu> wrote:
> Markus,
> This all makes sense to me.  I think that putting the xxxxID terms in the
> "All" class would solve the problem I brought up and would help to clarify
> their use.
>
> I would be happy to take a shot at writing a guide for using DwC in RDF if
> others are willing to give feedback.  I have a couple other things I need to
> finish first, but should be able to create something in the near term -
> probably before the TDWG meeting.
>
> Steve
>
> Markus Döring wrote:
>
> Steve,
> I was about to fix the xsl when I thought we might first want to decide on
> what the whole dwc "site" should be offering.
>
> Currently we have:
> - normative html definitions for all current and historical term versions
> - 2 guidelines (xml & text) on how to use dwc terms with the respective
> technology including xml schemas and other files needed for that.
> - dwcterms.rdf and dwctermshistory.rdf for rdf definitions of the terms.
> Primarily these are intended to support the use of dwc terms in rdf and
> allow a resolution of the term URIs into rdf.
>
> It would be really nice if you or someone else would add a new guideline on
> how to use dwc terms in the context of rdf.
> For that I am wondering if dwctermshistory.rdf is needed at all or whether
> it's not good enough to maintain the html version only.
> And in the light of already having normative html versions I support the
> idea of removing the xsl stylesheet from the rdf files.
>
> In regards to the resolution of the term URIs I have set up some content
> negotiation that takes you to the rdf in case you request
> "application/rdf+xml" - otherwise it defaults to the html definitions. I
> think that is one more reason that we don't need any xsl on the rdf anymore.
>
> http://rs.tdwg.org/dwc/terms/taxonID
> now gets redirected to either:
> http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf
> http://darwincore.googlecode.com/svn/trunk/terms/index.htm#taxonID
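
The redirect rule described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not the actual server configuration: the function name redirect_target is made up, and the Accept header handling is deliberately simplified (it ignores q-values and wildcards).

```python
# Minimal sketch of the described content negotiation.
# Assumption: redirect_target is a hypothetical helper, not the real
# rs.tdwg.org setup; q-values and wildcards in Accept are ignored.
TERM_PREFIX = "http://rs.tdwg.org/dwc/terms/"
RDF_TARGET = "http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf"
HTML_TARGET = "http://darwincore.googlecode.com/svn/trunk/terms/index.htm"


def redirect_target(term_uri: str, accept: str = "") -> str:
    """Return the URL a term URI is redirected to for an Accept header."""
    if not term_uri.startswith(TERM_PREFIX):
        raise ValueError("not a Darwin Core term URI: " + term_uri)
    local_name = term_uri[len(TERM_PREFIX):]
    # Split the header into bare media types, dropping parameters.
    media_types = [m.split(";")[0].strip().lower() for m in accept.split(",")]
    if "application/rdf+xml" in media_types:
        return RDF_TARGET
    # Everything else falls back to the html definition of the term.
    return HTML_TARGET + "#" + local_name


print(redirect_target(TERM_PREFIX + "taxonID", "application/rdf+xml"))
print(redirect_target(TERM_PREFIX + "taxonID", "text/html"))
```
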
>
>
> For the domains of the ID terms it's good to have a more in-depth discussion.
> To me those ID terms can always be either a primary or a foreign key, depending
> on where you use them. For RDF, as you say, there is a natural primary key with
> rdf:about for all rdf resources, so the purpose of the dwc ID terms should be
> restricted to acting as "foreign keys", i.e. they should have no domain at all
> (unless we want to go into the discussion of what the definitive set of accepted
> domains is and how they actually relate - something we had decided dwc will not
> do, waiting instead for the proper ontology work to take it on). For that
> discussion I'd also like to point out that, analogous to rdf, we have created
> guidelines for using dwc terms in normalised xml that define "classes". In the
> matching xml schema all ID terms are actually allowed in every class:
> http://rs.tdwg.org/dwc/terms/guides/xml/index.htm#classes
> http://rs.tdwg.org/dwc/xsd/tdwg_dwc_class_terms.xsd
>
> In the html definitions we assign a "class" to all terms, though for some the
> class is "all" (for example all record level ones). We should consider a Class
> of "all" for all ID terms too, so we can use them as "foreign keys" as well.
> In the historical html file we additionally have a "Has Domain" definition.
> It seems to me this is a leftover from the initial definitions and can
> be removed.
>
> In the dwcterms.rdf definitions we don't use rdfs:domain at all right now.
> Btw, I've changed rdfs:replaces to dcterms:replaces in the rdf files.
>
> If we all agree I would propose to make the following additional changes:
> - remove xsl from rdf files
> - remove dwctermshistory.rdf in favor of html alone?
> - remove "Has Domain" from term history.
>
>
>
> Markus
>
>
> On Sep 6, 2010, at 2:38, Steve Baskauf wrote:
>
>
>
> Thanks for the references, John.  Sorry for bringing up a settled issue - I
> didn't mean to plow old ground again (I think I wasn't on the list yet when
> these posts came out).  I can see the advantage of not actually specifying a
> domain for the DwC terms in terms of not squelching innovative uses of the
> DwC terms.  However, rdfs:domain issue aside, if I'm understanding the cited
> archived messages correctly, the point of putting the DwC terms into classes
> at all is to help to clarify what they mean and how they should be used.
> The point of my post was that the dwc:xxxxID terms have two possible uses,
> one of which is consistent with their current placement in classes and one
> (the one that I think is most useful) that is not.
>
> I guess the reason that I'm bringing this up is that I'm using these terms
> pretty heavily in a way that I'm not sure is the way they are intended.
> What I would like is either some confirmation from the community that this
> use is OK or else we need to define some other terms that will get the job
> done.  Darwin Core doesn't at this point have an "instruction book" except
> for the Google Code Wiki and that Wiki has no descriptions yet for most of
> the terms in the vocabulary.  I don't feel comfortable creating my own
> meanings for terms that don't have clear guidelines.  After all, the point
> of having Darwin Core terms is so that people use them to refer to the same
> "thing".
>
> I would also like to make two suggestions about the RDF file at
> http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf .  One is that
> either someone needs to fix the "human.xsl" file so that it renders the RDF
> in a web browser, or remove the xml-stylesheet tag and just let web browsers
> display the raw RDF XML.  This has not rendered properly since the rdf was
> moved over from the tdwg domain (I think at least six months ago).  It
> doesn't matter to me since I know how to use "view source" to see the source
> XML, but it's really pretty shoddy form for the definition of a major
> vocabulary.  The second thing is that if the DwC terms are not going to have
> rdfs:domains specified, then somebody should take those erroneous comments
> out of the dwcterms.rdf file.  I think that would be allowable without
> requiring the issuing of a new version because it's correcting an error and
> doesn't actually change the DwC standard itself.
>
> Steve
>
> John Wieczorek wrote:
>
>
> For background on the state of and reasons for the Darwin Core domains, here
> are relevant messages in the tdwg-content list archives from the public
> review period for the standard as it currently stands:
>
> assertions in DwC terms:
>
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000019.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000021.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000022.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000024.html
>
> Request for Decision for Public Review of DarwinCore Draft Standard:
>
> http://lists.tdwg.org/pipermail/tdwg-content/2009-July/000035.html
>
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000088.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000089.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000098.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000092.html
>
> On Tue, Aug 31, 2010 at 12:34 PM, Steve Baskauf
> <steve.baskauf at vanderbilt.edu> wrote:
> Markus,
> Thanks!  After I posted the other night I was thinking that the way I had
> constructed the RDF was a bit convoluted.  Your example is much more
> straightforward.  (although except for a lack of typing on my part the two
> examples actually result in the same triples).
>
> As you noted, the other thing that my example has is that the record for the
> offspring makes reference to the dwc:resourceRelationshipID as a way to
> assert that the offspring has that relationship.  I would be interested in
> some discussion about the appropriate use of the various dwc:xxxxID terms
> (e.g. dwc:occurrenceID, dwc:institutionID, etc.).  As was the case with the
> resourceRelationship terms, there aren't a lot of examples showing the
> proper use of these terms, with the noteworthy exception of
> http://rs.tdwg.org/dwc/terms/guides/xml/index.htm .  As I see it, there are
> two possible uses of these terms.  One (the "first" use) is to identify the
> actual resource that is the subject of a record.  The other (the "second"
> use) is as an "IDRef" term that connects the subject resource to another
> record of a different type.  In the XML example, we see both uses.  In the
> first example in section 2.7 we see
>
>     <dcterms:Location>
>         <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
>              ... more metadata...
>     </dcterms:Location>
>     <dwc:Occurrence>
>              ... more metadata ...
>         <dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
>              ... more metadata ...
>         <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
>     </dwc:Occurrence>
>
> where in the dcterms:Location record, dwc:locationID indicates the
> identifier for the Location itself (i.e. the subject), while in the
> dwc:Occurrence record, dwc:locationID refers to the object of the location
> property of the Occurrence.  Because of the freewheeling nature of generic
> XML files, I guess this is OK.  However, when I'm trying to think clearly
> about how to express relationships in RDF, I've come to decide that I don't
> like this dual use.  I would express the above relationships in RDF as
> follows:
>
> <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>     <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>        ...metadata...
> </rdf:Description>
> <rdf:Description rdf:about="http://resolver.org/urn:catalog:MVZ:Mammals:14523">
>     <rdf:type rdf:resource="http://rs.tdwg.org/dwc/terms/Occurrence"/>
>        ... metadata...
>     <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
> </rdf:Description>
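
To make the resulting triples explicit, here is a rough Python sketch that reads a complete version of this fragment and lists subject, predicate and object. The rdf:RDF wrapper with its namespace declarations, the omission of the elided metadata, and the triples helper are additions for illustration; typing uses rdf:type, the standard RDF typing property. The helper only understands rdf:Description elements with rdf:about and simple property children, so it is not a general RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DWC = "http://rs.tdwg.org/dwc/terms/"

# Assumption: wrapper element and namespace declarations added so that
# the fragment parses on its own; the "...metadata..." parts are left out.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
    <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://resolver.org/urn:catalog:MVZ:Mammals:14523">
    <rdf:type rdf:resource="http://rs.tdwg.org/dwc/terms/Occurrence"/>
    <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Hypothetical helper: list (subject, predicate, object) tuples."""
    root = ET.fromstring(rdf_xml)
    result = []
    for desc in root.findall("{%s}Description" % RDF):
        subject = desc.get("{%s}about" % RDF)
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join the parts.
            predicate = prop.tag[1:].replace("}", "")
            obj = prop.get("{%s}resource" % RDF)
            if obj is None:
                # No rdf:resource attribute: treat the text as a literal.
                obj = (prop.text or "").strip()
            result.append((subject, predicate, obj))
    return result


for triple in triples(DOC):
    print(triple)
```

Here the Occurrence is the subject, dwc:locationID is the predicate, and the Location URI is the object, which is the "second" (IDRef) use.
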
>
> To me this makes the most sense.  It corresponds to the "second" use in the
> XML example (it indicates the Location property of the Occurrence, i.e.
> serves as the predicate connecting the subject Occurrence to the object
> Location).   One could also do something like
>
> <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>     <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
>     <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>        ...metadata...
> etc.
>
> or
>
> <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>     <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
>     <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>        ...metadata...
> etc.
>
> But this seems rather pointless because in the first case, the rdf:about
> attribute already provides information about the subject of the relationship
> in a Linked Data manner.  In the second case, why have a special term to
> indicate a literal version of the identifier when there is a more generic
> and well-known term that is in common use? e.g.
>
> <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>     <dcterms:identifier>http://guid.mvz.org/sites/arg/127</dcterms:identifier>
>     <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>        ...metadata...
> etc.
>
> In contrast, there is no simple alternative that I can see (with the
> possible exception of the dwc:relatedResource, which I wouldn't call a
> SIMPLE alternative) to using dwc:locationID as the predicate that points to
> the Location object of an Occurrence subject.  Darwin Core doesn't have any
> terms specifically designed for this (such as "hasLocation" or
> "hasIdentification" for example) but I don't see any reason why the
> dwc:xxxxID terms couldn't be used this way.
>
> The main issue I can see with the use of these dwc:xxxxID terms is their
> placement in classes which I guess is related to their rdfs:domain .
> Although the term description comments at
> http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf (is this the
> actual place where DwC is officially defined?) say that each term has a
> rdfs:domain property, they actually don't in the RDF.  So the only real clue
> to the intended subjects of the dwc:xxxxID terms is the placement of the
> dwc:xxxxID terms into classes.  In most cases, a dwc:xxxxID term is placed
> in the class of the thing that the term is identifying (e.g. occurrenceID is
> in the Occurrence class, identificationID is in the Identification class,
> etc.).  This would hint that the terms were intended to be used with
> instances of their class as their subject.  However, if the terms were used
> as IDRefs (the "second" use shown in the XML example) then they really should
> be in different classes.  For
> example, dwc:locationID could be a member of the class Occurrence, since an
> Occurrence could have dwc:locationID as a property with an instance of a
> Location as its object.  But then there might be the question of whether an
> Event could also have a dwc:locationID as a property and dwc:locationID
> shouldn't be in two classes.  In the absence of defined rdfs:domains, I
> guess the terms can be used with pretty much any subject a user thinks makes
> sense.  As Bob Morris pointed out in a recent post, there aren't very many
> applications that are doing much in the way of semantic reasoning at this
> point.  But I suppose (hope?) that there could be some in the future, so
> achieving some kind of consensus on how the terms should be used in RDF
> would probably be good.
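
One way to see what is at stake with rdfs:domain is the standard RDFS entailment rule: from (p rdfs:domain C) and (s p o) a reasoner concludes (s rdf:type C). The Python sketch below applies that rule to the Occurrence example. The domain declaration in it is purely hypothetical (dwcterms.rdf declares no domains), and infer_types_from_domains is an illustrative helper, not part of any library.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DWC = "http://rs.tdwg.org/dwc/terms/"
LOCATION = "http://purl.org/dc/terms/Location"
OCCURRENCE = "http://resolver.org/urn:catalog:MVZ:Mammals:14523"

# The Occurrence points at its Location with dwc:locationID (IDRef use).
DATA = [(OCCURRENCE, DWC + "locationID", "http://guid.mvz.org/sites/arg/127")]

# Hypothetical assumption: what would follow IF dwc:locationID had
# rdfs:domain dcterms:Location. The real dwcterms.rdf has no such statement.
HYPOTHETICAL_DOMAINS = {DWC + "locationID": LOCATION}


def infer_types_from_domains(triples, domains):
    """Apply the rdfs:domain rule: (p domain C), (s p o) => (s rdf:type C)."""
    inferred = set()
    for subject, predicate, _obj in triples:
        if predicate in domains:
            inferred.add((subject, RDF_TYPE, domains[predicate]))
    return inferred


# With the hypothetical domain, the Occurrence is inferred to be a Location.
print(infer_types_from_domains(DATA, HYPOTHETICAL_DOMAINS))
# With no domains declared, nothing is inferred about the subject.
print(infer_types_from_domains(DATA, {}))
```

Under that hypothetical declaration the IDRef use would make a reasoner type the Occurrence as a Location, whereas with no domains declared the subject stays unconstrained.
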
>
> At this point, the dwc:xxxxID terms serve a useful purpose for me in
> connecting one kind of resource to another (examples in
> http://bioimages.vanderbilt.edu/ind-baskauf/41870.rdf and
> http://bioimages.vanderbilt.edu/baskauf/51363.rdf), so unless somebody tells
> me that's "wrong" and comes up with a better term to describe the
> relationships I need to describe, I'll keep using the dwc:xxxxID terms as
> IDRefs.
>
> Comments?
> Steve
>
> Markus Döring wrote:
>
>
> I must say it feels strange to use the id terms in rdf to refer to
> rdf:resources instead of holding ID literals.
> More natural to the rdf language would be <dwc:resource rdf:resource="..."
> /> and <dwc:relatedResource rdf:resource="..." /> I suppose, but the dwc
> vocabulary was built for different technologies, not rdf alone. So I guess
> that's the price we have to pay.
>
> Markus
>
>
>
>
> --
> Steven J. Baskauf, Ph.D., Senior Lecturer
> Vanderbilt University Dept. of Biological Sciences
>
> postal mail address:
> VU Station B 351634
> Nashville, TN  37235-1634,  U.S.A.
>
> delivery address:
> 2125 Stevenson Center
> 1161 21st Ave., S.
> Nashville, TN 37235
>
> office: 2128 Stevenson Center
> phone: (615) 343-4582,  fax: (615) 343-6707
>
> http://bioimages.vanderbilt.edu
>
>
>



-- 
Robert A. Morris
Emeritus Professor  of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: morris.bob at gmail.com
web: http://bdei.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1) 857 222 7992 (mobile)


