[tdwg-content] Use of dwc:xxxxID terms in RDF, was Re: dwc:associatedOccurrences

Steve Baskauf steve.baskauf at vanderbilt.edu
Mon Sep 6 02:38:10 CEST 2010


Thanks for the references, John.  Sorry for bringing up a settled issue 
- I didn't mean to plow old ground again (I think I wasn't on the list 
yet when these posts came out).  I can see the advantage of not actually 
specifying a domain for the DwC terms, since leaving it open avoids 
squelching innovative uses of the terms.  However, the rdfs:domain issue aside, if 
I'm understanding the cited archived messages correctly, the point of 
putting the DwC terms into classes at all is to help to clarify what 
they mean and how they should be used.  The point of my post was that 
the dwc:xxxxID terms have two possible uses, one of which is consistent 
with their current placement in classes and one (the one that I think is 
most useful) that is not. 
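
To make the distinction concrete, here is a rough sketch with plain tuples standing in for triples.  The URIs are the ones from my earlier message quoted below; the helper name is_idref is made up for illustration and is not part of Darwin Core.

```python
# Sketch only: triples as plain (subject, predicate, object) tuples.
DWC = "http://rs.tdwg.org/dwc/terms/"

location = "http://guid.mvz.org/sites/arg/127"
occurrence = "http://resolver.org/urn:catalog:MVZ:Mammals:14523"

# "First" use: the ID term restates the identifier of the subject itself.
first_use = (location, DWC + "locationID", location)

# "Second" use: the ID term links a subject of one type (an Occurrence)
# to an object of another type (a Location), i.e. it acts as an IDRef.
second_use = (occurrence, DWC + "locationID", location)


def is_idref(triple):
    """Hypothetical helper: True when the object is a resource other than the subject."""
    subject, _predicate, obj = triple
    return subject != obj


print(is_idref(first_use))   # False
print(is_idref(second_use))  # True
```

Only the second triple says something that the subject URI does not already say by itself.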

I guess the reason that I'm bringing this up is that I'm using these 
terms pretty heavily in a way that I'm not sure is the way they are 
intended.  What I would like is either some confirmation from the 
community that this use is OK, or else agreement that we need to define 
some other terms that will get the job done.  Darwin Core doesn't at this point have an 
"instruction book" except for the Google Code Wiki and that Wiki has no 
descriptions yet for most of the terms in the vocabulary.  I don't feel 
comfortable creating my own meanings for terms that don't have clear 
guidelines.  After all, the point of having Darwin Core terms is so that 
people use them to refer to the same "thing". 

I would also like to make two suggestions about the RDF file at 
http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf .  One is 
that either someone needs to fix the "human.xsl" file so that it renders 
the RDF in a web browser, or remove the xml-stylesheet tag and just let 
web browsers display the raw RDF XML.  This has not rendered properly 
since the RDF was moved over from the tdwg domain (I think at least six 
months ago).  It doesn't matter to me since I know how to use "view 
source" to see the source XML, but it's really pretty shoddy form for 
the definition of a major vocabulary.  The second thing is that if the 
DwC terms are not going to have rdfs:domains specified, then somebody 
should take those erroneous comments out of the dwcterms.rdf file.  I 
think that would be allowable without requiring the issuing of a new 
version because it's correcting an error and doesn't actually change the 
DwC standard itself.
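
For anyone who wants to check the comments against the actual declarations, something along these lines might do.  It is a minimal sketch using only the Python standard library.  The SAMPLE fragment is hypothetical and NOT copied from dwcterms.rdf, and I am assuming each term is declared as a direct child of rdf:RDF with an rdf:about attribute.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical fragment for illustration only (not the real dwcterms.rdf):
# the comment talks about a domain, but no rdfs:domain property is present.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://rs.tdwg.org/dwc/terms/locationID">
    <rdfs:comment>A made-up comment that mentions a domain.</rdfs:comment>
  </rdf:Description>
</rdf:RDF>"""


def terms_without_domain(rdf_xml):
    """Return the rdf:about URI of every top-level term lacking rdfs:domain."""
    root = ET.fromstring(rdf_xml)
    missing = []
    for term in root:  # each direct child of rdf:RDF is treated as one term
        if term.find(f"{{{RDFS}}}domain") is None:
            missing.append(term.get(f"{{{RDF}}}about"))
    return missing


print(terms_without_domain(SAMPLE))
# ['http://rs.tdwg.org/dwc/terms/locationID']
```

Any term this reports, but whose comment claims a domain, would be a candidate for the comment cleanup suggested above.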

Steve

John Wieczorek wrote:
> For background on the state of and reasons for the Darwin Core 
> domains, here are relevant messages in the tdwg-content list archives 
> from the public review period for the standard as it currently stands:
>
> assertions in DwC terms:
>
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000019.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000021.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000022.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000024.html
>
>
>   Request for Decision for Public Review of DarwinCore Draft Standard:
>
>
>   http://lists.tdwg.org/pipermail/tdwg-content/2009-July/000035.html
>
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000088.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000089.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000098.html
> http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000092.html
>
> On Tue, Aug 31, 2010 at 12:34 PM, Steve Baskauf 
> <steve.baskauf at vanderbilt.edu> wrote:
>
>     Markus,
>     Thanks!  After I posted the other night I was thinking that the
>     way I had constructed the RDF was a bit convoluted.  Your example
>     is much more straightforward.  (although except for a lack of
>     typing on my part the two examples actually result in the same
>     triples). 
>
>     As you noted, the other thing that my example has is that the
>     record for the offspring makes reference to the
>     dwc:resourceRelationshipID as a way to assert that the offspring
>     has that relationship.  I would be interested in some discussion
>     about the appropriate use of the various dwc:xxxxID terms (e.g.
>     dwc:occurrenceID, dwc:institutionID, etc.).  As was the case with
>     the resourceRelationship terms, there aren't a lot of examples
>     showing the proper use of these terms, with the noteworthy
>     exception of http://rs.tdwg.org/dwc/terms/guides/xml/index.htm . 
>     As I see it, there are two possible uses of these terms.  One (the
>     "first" use) is to identify the actual resource that is the
>     subject of a record.  The other (the "second" use) is as an
>     "IDRef" term that connects the subject resource to another record
>     of a different type.  In the XML example, we see both uses.  In
>     the first example in section 2.7 we see
>
>         <dcterms:Location>
>             <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
>                  ... more metadata...
>         </dcterms:Location>
>         <dwc:Occurrence>
>                  ... more metadata ...
>             <dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
>                  ... more metadata ...
>             <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
>         </dwc:Occurrence>
>
>     where in the dcterms:Location record, dwc:locationID indicates the
>     identifier for the Location itself (i.e. the subject), while in
>     the dwc:Occurrence record, dwc:locationID refers to the object of
>     the location property of the Occurrence.  Because of the
>     freewheeling nature of generic XML files, I guess this is OK. 
>     However, when I'm trying to think clearly about how to express
>     relationships in RDF, I've come to decide that I don't like this
>     dual use.  I would express the above relationships in RDF as follows:
>
>     <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>         <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>            ...metadata...
>     </rdf:Description>
>     <rdf:Description rdf:about="http://resolver.org/urn:catalog:MVZ:Mammals:14523">
>         <rdf:type rdf:resource="http://rs.tdwg.org/dwc/terms/Occurrence"/>
>            ... metadata...
>         <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
>     </rdf:Description>
>
>     To me this makes the most sense.  It corresponds to the "second"
>     use in the XML example (it indicates the Location property of the
>     Occurrence, i.e. serves as the predicate connecting the subject
>     Occurrence to the object Location).   One could also do something like
>
>     <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>         <dwc:locationID rdf:resource="http://guid.mvz.org/sites/arg/127"/>
>         <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>            ...metadata...
>     etc.
>
>     or
>
>     <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>         <dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
>         <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>            ...metadata...
>     etc.
>
>     But this seems rather pointless because in the first case, the
>     rdf:about attribute already provides information about the subject
>     of the relationship in a Linked Data manner.  In the second case,
>     why have a special term to indicate a literal version of the
>     identifier when there is a more generic and well-known term that
>     is in common use? e.g.
>
>     <rdf:Description rdf:about="http://guid.mvz.org/sites/arg/127">
>         <dcterms:identifier>http://guid.mvz.org/sites/arg/127</dcterms:identifier>
>         <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
>            ...metadata...
>     etc.
>
>     In contrast, there is no simple alternative that I can see (with
>     the possible exception of the dwc:relatedResource, which I
>     wouldn't call a SIMPLE alternative) to using dwc:locationID as the
>     predicate that points to the Location object of an Occurrence
>     subject.  Darwin Core doesn't have any terms specifically designed
>     for this (such as "hasLocation" or "hasIdentification" for example)
>     but I don't see any reason why the dwc:xxxxID terms couldn't be
>     used this way. 
>
>     The main issue I can see with the use of these dwc:xxxxID terms is
>     their placement in classes which I guess is related to their
>     rdfs:domain .  Although the term description comments at
>     http://darwincore.googlecode.com/svn/trunk/rdf/dwcterms.rdf (is
>     this the actual place where DwC is officially defined?) say that
>     each term has a rdfs:domain property, they actually don't in the
>     RDF.  So the only real clue to the intended subjects of the
>     dwc:xxxxID terms is the placement of the dwc:xxxxID terms into
>     classes.  In most cases, a dwc:xxxxID term is placed in the class
>     of the thing that the term is identifying (e.g. occurrenceID is in
>     the Occurrence class, identificationID is in the Identification
>     class, etc.).  This would hint that the terms were intended to be
>     used with instances of their class as their subject.  However, if
>     the terms were used as IDRefs (the "second" use shown in the XML
>     example) then they really should be in different classes.  For
>     example, dwc:locationID could be a member of the class Occurrence,
>     since an Occurrence could have dwc:locationID as a property with
>     an instance of a Location as its object.  But then there might be
>     the question of whether an Event could also have a dwc:locationID
>     as a property and dwc:locationID shouldn't be in two classes.  In
>     the absence of defined rdfs:domains, I guess the terms can be used
>     with pretty much any subject a user thinks makes sense.  As Bob
>     Morris pointed out in a recent post, there aren't very many
>     applications that are doing much in the way of semantic reasoning
>     at this point.  But I suppose (hope?) that there could be some in
>     the future, so achieving some kind of consensus on how the terms
>     should be used in RDF would probably be good.  
>
>     At this point, the dwc:xxxxID terms serve a useful purpose for me
>     in connecting one kind of resource to another (examples in
>     http://bioimages.vanderbilt.edu/ind-baskauf/41870.rdf and
>     http://bioimages.vanderbilt.edu/baskauf/51363.rdf), so unless
>     somebody tells me that's "wrong" and comes up with a better term
>     to describe the relationships I need to describe, I'll keep using
>     the dwc:xxxxID terms as IDRefs.
>
>     Comments?
>     Steve
>
>     Markus Döring wrote:
>>     I must say it feels strange to use the id terms in rdf to refer to rdf:resources instead of holding ID literals.
>>     More natural to the rdf language would be <dwc:resource rdf:resource="..." /> and <dwc:relatedResource rdf:resource="..." /> I suppose, but the dwc vocabulary was built for different technologies, not rdf alone. So I guess that's the price we have to pay.
>>
>>     Markus
>>       
>     -- 
>     Steven J. Baskauf, Ph.D., Senior Lecturer
>     Vanderbilt University Dept. of Biological Sciences
>
>     postal mail address:
>     VU Station B 351634
>     Nashville, TN  37235-1634,  U.S.A.
>
>     delivery address:
>     2125 Stevenson Center
>     1161 21st Ave., S.
>     Nashville, TN 37235
>
>     office: 2128 Stevenson Center
>     phone: (615) 343-4582,  fax: (615) 343-6707
>     http://bioimages.vanderbilt.edu
>         
>
>
