[tdwg-tag] Darwin Core generic XML with attributes

Roderic Page r.page at bio.gla.ac.uk
Thu May 20 09:55:05 CEST 2010

I think part of the problem here results from trying to do two things
at once: model the data, and have something that is easy to read (i.e.,
having both a URI and a literal for the same tag). The result is messy
and inconsistent.

I think Peter Ansell's first option is a good RDF solution, namely:


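<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dwc="http://rs.tdwg.org/dwc/terms/">

    <!-- occurrence -->
    <rdf:Description rdf:about="http://herbarium.org/hb123456">
       <dwc:recordedBy rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"/>
    </rdf:Description>

    <!-- person -->
    <rdf:Description rdf:about="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me">
       <rdfs:label>Steve Baskauf</rdfs:label>
    </rdf:Description>

</rdf:RDF>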


This document says:

OCCURRENCE <http://herbarium.org/hb123456> --> recorded by --> PERSON
<http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me> --> who has name --> "Steve Baskauf"

which I assume is what we want. You can see this graph in the W3C RDF  
validator here http://tinyurl.com/39nqho2

The RDF has all the information a linked data client needs in order to
say this, and we could also write an XSLT style sheet to render this in
HTML for people to read.
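
As a rough illustration of what a client would pull out of that RDF,
here is a minimal Python sketch using only the standard library. It is
not a general RDF/XML parser (a real client would use an RDF library);
it only understands rdf:Description elements with rdf:about, and child
properties that carry either rdf:resource or literal text. The rdf:RDF
wrapper and namespace declarations in the embedded document, and the
name simple_triples, are my assumptions for the sake of the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The occurrence/person example, wrapped in rdf:RDF so that it is well-formed.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <rdf:Description rdf:about="http://herbarium.org/hb123456">
    <dwc:recordedBy rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me">
    <rdfs:label>Steve Baskauf</rdfs:label>
  </rdf:Description>
</rdf:RDF>
"""


def simple_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for a tiny subset of RDF/XML."""
    root = ET.fromstring(rdf_xml.encode("utf-8"))
    triples = []
    for node in root.findall(f"{{{RDF}}}Description"):
        subject = node.get(f"{{{RDF}}}about")
        for prop in node:
            # ElementTree names tags "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            resource = prop.get(f"{{{RDF}}}resource")
            if resource is not None:
                triples.append((subject, predicate, resource))  # object is a URI
            else:
                literal = (prop.text or "").strip()
                triples.append((subject, predicate, literal))  # object is a literal
    return triples


for s, p, o in simple_triples(DOC):
    print(s, "-->", p, "-->", o)
```

The two triples that come back are the two arrows in the graph above:
occurrence to person, and person to name.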

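Adding <dwc:recordedBy>Steve Baskauf</dwc:recordedBy> is a hack that
breaks the model: the object of dwc:recordedBy would then be a person
in one statement and a plain string in another.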

Note also that if the person doesn't have a URI we still shouldn't use  
dwc:recordedBy as a literal. Instead we can do this:


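<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dwc="http://rs.tdwg.org/dwc/terms/">

    <!-- occurrence -->
    <rdf:Description rdf:about="http://herbarium.org/hb123456">
       <!-- person -->
       <dwc:recordedBy rdf:parseType="Resource">
          <rdfs:label>Steve Baskauf</rdfs:label>
       </dwc:recordedBy>
    </rdf:Description>

</rdf:RDF>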

Here we are saying that the occurrence was recorded by a person called
Steve Baskauf. If you paste this into the W3C validator you get the
same model:

OCCURRENCE <http://herbarium.org/hb123456> --> recorded by --> PERSON  
<xxx> --> who has name --> "Steve Baskauf"

Since "Steve Baskauf" in this example doesn't have a URI we get a  
"bnode" with a local identifier. See http://tinyurl.com/392qzb3 .

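To make the point that the two cases have the same shape, here is a
small Python sketch. The function name recorded_by, the tuple layout,
and the "_:b1" style of blank node label are my own choices for
illustration, not anything defined by Darwin Core or produced by the
W3C validator.

```python
import itertools

DWC_RECORDED_BY = "http://rs.tdwg.org/dwc/terms/recordedBy"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Source of local identifiers for people who have no URI (hypothetical scheme).
_bnode_ids = itertools.count(1)


def recorded_by(occurrence_uri, name, person_uri=None):
    """Triples saying that occurrence_uri was recorded by a person called name.

    If person_uri is None, a blank node with a local identifier stands in
    for the person; the name is never used as the object of recordedBy.
    """
    person = person_uri if person_uri is not None else f"_:b{next(_bnode_ids)}"
    return [
        (occurrence_uri, DWC_RECORDED_BY, person),
        (person, RDFS_LABEL, name),
    ]


with_uri = recorded_by(
    "http://herbarium.org/hb123456",
    "Steve Baskauf",
    "http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me",
)
without_uri = recorded_by("http://herbarium.org/hb123456", "Steve Baskauf")

# Same shape in both graphs: the object of recordedBy is the subject of the label.
for graph in (with_uri, without_uri):
    assert graph[0][2] == graph[1][0]
    print(graph)
```

Only the identifier of the person node differs between the two graphs;
the predicates and the literal are identical.
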
In both cases (person with or without a URI) we are saying the same
thing. If you want a literal for <dwc:recordedBy> (say for ease of
display) then I think you want a different tag that is expressly
defined to do just that. For example,
http://rs.tdwg.org/ontology/voc/TaxonOccurrence has <identifiedTo> to
point to a URI for a taxon, and <identifiedToString> if you want the
literal. I don't know if dwc has anything equivalent for recordedBy
(and can somebody please tell me why we now have so many vocabularies
for the same things?)

Personally I think that starting with XML and trying to generate RDF  
and HTML from that is going to lead to a world of hurt. I suspect it  
makes more sense to:

a) model what we want to say
b) say it in RDF
c) write an XSLT to convert it to HTML for humans
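
Step c) can be sketched roughly as follows. I am using Python here as a
stand-in for the XSLT, purely so the example is easy to run; the
function name recorded_by_html, the list markup, and the "_:" test for
blank nodes are assumptions of mine. The point is only that the display
name comes from the person's rdfs:label, so the HTML does not need a
literal inside dwc:recordedBy.

```python
from html import escape

DWC_RECORDED_BY = "http://rs.tdwg.org/dwc/terms/recordedBy"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# a) and b): the model, already expressed as (subject, predicate, object) triples.
TRIPLES = [
    ("http://herbarium.org/hb123456", DWC_RECORDED_BY,
     "http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"),
    ("http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me", RDFS_LABEL,
     "Steve Baskauf"),
]


def recorded_by_html(triples, occurrence_uri):
    """c) Render the recorders of one occurrence as an HTML list for humans."""
    labels = {s: o for s, p, o in triples if p == RDFS_LABEL}
    items = []
    for s, p, o in triples:
        if s != occurrence_uri or p != DWC_RECORDED_BY:
            continue
        name = escape(labels.get(o, o))  # fall back to the identifier if unlabeled
        if o.startswith("_:"):
            # Blank node: there is no URI to link to, so show the name only.
            items.append(f"<li>Recorded by {name}</li>")
        else:
            items.append(f'<li>Recorded by <a href="{escape(o)}">{name}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(recorded_by_html(TRIPLES, "http://herbarium.org/hb123456"))
```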



On 20 May 2010, at 06:35, Bob Morris wrote:

> Per my discussion in answer to the original problem, I think what you
> are tripping on is that the way you want to do this is effectively
> trying to make a triple with two objects.  I believe it is not really a
> modeling question, but rather a question of how RDF/XML is translated
> into triples.
> Bob Morris
> On Wed, May 19, 2010 at 9:52 PM, Steve Baskauf
> <steve.baskauf at vanderbilt.edu> wrote:
>> This recent discussion reminds me of a question that I have been
>> wondering about for several months and hadn't gotten around to bringing
>> up: can you have a Darwin Core XML representation where an element has a
>> literal value and an attribute? If the XML is RDF, then I think the
>> answer is pretty much "no" as I just found out with the W3C Validator.
>> However, in generic XML I don't think there is any rule that says that
>> one can't have any attribute that one wants.  The only guidance I know
>> of on the subject is:
>> http://rs.tdwg.org/dwc/terms/guides/xml/index.htm#implement
>> It states that the value of a Darwin Core property should be the content
>> of the element rather than stating the value as an attribute.  However,
>> I have the situation where I want to store or transfer two somewhat
>> equivalent representations of the value of a property: a string literal
>> form and a URI form.  In the example we've been talking about, I would
>> like my generic (non-RDF) XML to do something like this:
>> <?xml version="1.0" encoding="UTF-8"?>
>> <dwr:SimpleDarwinRecordSet
>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>     xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
>>     xmlns:dwr="http://rs.tdwg.org/dwc/xsd/simpledarwincore/">
>> <dwr:SimpleDarwinRecord>
>>     <dwc:occurrenceID rdf:resource="http://herbarium.org/hb123456">http://herbarium.org/hb123456</dwc:occurrenceID>
>>     <dwc:recordedBy rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me">Steve Baskauf</dwc:recordedBy>
>>     <dwc:basisOfRecord rdf:resource="http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen">PreservedSpecimen</dwc:basisOfRecord>
>>     ... more elements, mostly with string literal values...
>> </dwr:SimpleDarwinRecord>
>> </dwr:SimpleDarwinRecordSet>
>> This would meet the basic guidelines of the Darwin Core XML Guide in
>> that the literal values would be the contents of the elements.  What I
>> don't know is if the inclusion of the rdf:resource attributes would
>> invalidate the XML if it were validated against someone's schema that
>> was silent about attributes or if the schema would have to explicitly
>> say that having an rdf:resource attribute was a valid option.  I think I
>> don't know enough about XML schemas ...
>>
>> The reason why I would like to maintain/transfer both types of values
>> (literal and URI) is so that I could use the XML data to generate both
>> HTML and RDF if I wanted.  The HTML would tell humans that the
>> occurrence was a PreservedSpecimen, but the RDF would tell a linked data
>> client that the occurrence was a
>> http://rs.tdwg.org/dwc/dwctype/PreservedSpecimen .  I realize that for
>> my own internal use, the XML can have any format I want, but if I were
>> exporting XML for general public use, would it be bad to use the
>> approach above?
>> Steve
>> As an aside, I wanted to see exactly what the definition was for
>> rdf:resource.  However, the usual namespace for rdf:
>> (http://www.w3.org/1999/02/22-rdf-syntax-ns#) doesn't seem to include
>> "resource" in the defined properties.  Very odd!  Maybe I'm just missing
>> something...

Roderic Page
Professor of Taxonomy
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK

Email: r.page at bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962 at aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
