[tdwg-tag] literal enumerated values in TCS-RDF

Roderic Page r.page at bio.gla.ac.uk
Mon Oct 2 15:21:19 CEST 2006

From an RDF perspective, I would vote for referring to the botanical
code (or any other code) using a URI, not a literal, and not a namespace
reference (such as "tn:botanical"), which, as Steve pointed out, not all
RDF/XML parsers can handle.
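
To make the problem with "tn:botanical" concrete, here is a minimal sketch using Python's standard XML parser. The namespace URI bound to the tn prefix is my assumption (based on the URI Steve quotes); the point is only that an XML parser expands prefixes in element and attribute names, while a prefix inside an attribute value is handed on as plain text.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed binding for the tn prefix; the real vocabulary may differ.
TN = "http://tdwg.org/2006/03/12/TaxonNames#"

doc = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:tn="{TN}">
  <tn:nomenclaturalCode rdf:resource="tn:botanical" />
</rdf:RDF>"""

elem = ET.fromstring(doc)[0]

# The element name is expanded against the declared namespace...
element_name = elem.tag
# ...but the attribute value is not touched: the prefix stays a literal string.
resource = elem.get(f"{{{RDF}}}resource")

print(element_name)
print(resource)
```

Anything downstream that wants a full URI from that value has to do the expansion itself, which is why the fully qualified form is the safer choice.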

By referencing a URI, the semantics are clear (I can fetch whatever the  
URI points to and figure out what it means).

In other words, something like

<tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />

as Steve proposed (so long as the URI is actually resolvable, which is
another difference between XML Schema and RDF: in RDF, namespaces are
expected to be resolvable).
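
For comparison, the "&tn;#botanical" form in the quoted IPNI output only parses when the document declares the tn entity. The sketch below assumes the entity value is the TaxonNames URI mentioned in the thread; that value is my guess, not something taken from the IPNI server.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TN = "http://tdwg.org/2006/03/12/TaxonNames"  # assumed entity value

body = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:tn="{TN}#">
  <tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
</rdf:RDF>"""


def resource_of(xml_text):
    """Return the rdf:resource value of the first child element."""
    root = ET.fromstring(xml_text)
    return root[0].get(f"{{{RDF}}}resource")


# Without a declaration, &tn; is an undefined entity and parsing fails.
try:
    resource_of(body)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# With the entity declared in the internal DTD subset, the parser expands it.
doctype = f'<!DOCTYPE rdf:RDF [<!ENTITY tn "{TN}">]>\n'
expanded = resource_of(doctype + body)

print(undeclared_ok)
print(expanded)
```

Note that, under this assumption, the expanded value is a hash URI (TaxonNames#botanical) rather than the slash form above, so the two spellings would not name the same resource.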

I haven't really followed what Roger proposes, but IMHO anything
motivated by an attempt to salvage the use of XML Schema is bound to
end in tears. It's a hack (anything that exposes internal identifiers
of a class has got to be a hack, surely).
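
As far as I can read it, the convention in Roger's message (quoted below) amounts to something like the following sketch. The function names are hypothetical and only the URI patterns come from his message; it is not an implementation anyone has proposed.

```python
import re

# Single namespace proposed in the quoted message.
BASE = "http://rs.tdwg.org/ontology/"

# Local part: "tdwg" + class id + "_" + class name, property name, or instance id.
_LOCAL = re.compile(r"^tdwg(\d+)_(.+)$")


def class_uri(class_id, class_name):
    return f"{BASE}tdwg{class_id}_{class_name}"


def property_uri(class_id, property_name):
    # Uses the id of the owning class, not an id for the property itself.
    return f"{BASE}tdwg{class_id}_{property_name}"


def instance_uri(class_id, instance_id):
    return f"{BASE}tdwg{class_id}_{instance_id}"


def split_uri(uri):
    """Return (class_id, local_name) for a URI that follows the convention."""
    if not uri.startswith(BASE):
        raise ValueError(f"not in the ontology namespace: {uri}")
    match = _LOCAL.match(uri[len(BASE):])
    if match is None:
        raise ValueError(f"does not follow the tdwgNNN_ pattern: {uri}")
    return int(match.group(1)), match.group(2)


print(class_uri(123, "MyClass"))
print(split_uri(instance_uri(123, 789)))
```

Every identifier built this way carries the internal class id, so renumbering a class changes the URIs of its properties and instances as well.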



On 2 Oct 2006, at 13:52, Sally Hinchcliffe wrote:

> I'm trying to nail down a final spec for the IPNI LSID server if I
> can.
> Steve Perry raised the point, over on the tdwg-guid mailing list,
> that the following bit of output from the IPNI server was an error:
> <tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
> This syntax has been used in a number of places - for the value for
> the nomenclaturalCode, for the nomenclaturalNote noteType (e.g.
> basedOn) and for the typeOfType (e.g. holo) - basically wherever there
> was an enumeration in the original schema.
> Roger's response is below - but I'm not quite clear how that will
> work in terms of replacing the syntax Steve found was failing. I'm
> not sure from Roger's email whether he really meant we should use
> numeric ids for all literal enumerated values, or whether those were
> merely placeholders - but if so it would seem to make the resulting
> rdf willfully unreadable.
> Roger - can you explain with a more concrete example?
> Sally
>> Hi Everyone,
>> In the not very comforting words of software vendors these are "known
>> issues" and will be resolved in the next release ;)
>> But seriously. The namespaces in the TCS names vocabulary do not resolve
>> because we didn't have a policy at that time. The use of entity
>> references was 'borrowed' from an example (probably Protege output) and
>> I wouldn't mind doing away with it.
>> I have been thinking long and hard about the namespace issues in the
>> last few weeks and believe I have a solution that I will propose at
>> St Louis. It would be good to have face to face discussions about it and
>> make a decision there.
>> To briefly summarize: The issue is getting a namespace convention that
>> will work across technologies. Suppose we want to serve data in a
>> technology that "isn't very good at namespaces". As an example - if we
>> were to have separate namespaces for TaxonNames, TaxonConcepts,
>> Specimens, Metadata, GeospatialStuff, Collections and we wanted to
>> validate a document using XML Schema that contained all these things, it
>> would require 6 independent schemas, each with its own target namespace.
>> If you have ever tried to debug something like this you will know what
>> total madness it is. We can't just abandon XML Schema because it would
>> rule out not only our existing technologies but GML and probably
>> others... It may also be desirable to express our ontology in things
>> that aren't even XML.
>> The only solution I can think of is that for the TDWG Ontology we should
>> have a single formal namespace of: http://rs.tdwg.org/ontology/
>> Within the ontology we have a convention for concepts that goes like this.
>> A class would be: http://rs.tdwg.org/ontology/tdwg123_MyClass
>> where 123 is the internal id of the class and MyClass is the class name.
>> A property in MyClass would be:
>> http://rs.tdwg.org/ontology/tdwg123_myProperty
>> where 123 is the *class* id not the property id and myProperty is the
>> property name.
>> An instance would be: http://rs.tdwg.org/ontology/tdwg123_789
>> where 123 is the *class* id and 789 is the instance id (there is nothing
>> stable about an instance that we could use as an identifier unless we
>> force a label property and make it immutable).
>> In a way this is using the part before the _ as a pseudo namespace.
>> I think this hits the balance between something that is technology
>> independent and something that will produce reasonably human readable
>> documents.
>> It is radical, which is why I thought it would be good to talk about it.
>> Anyone got an alternative?
>> We have to have a solution for this by the end of the St Louis meeting
>> as it is critical path for ontology work.
>> Most grateful for your patience and any thoughts you have.
>> Roger
>> Sally Hinchcliffe wrote:
>>> Hi Steve /all
>>> We took that syntax straight from Roger's RDF/TCS examples. I think
>>> Roger was going to do more work on tidying up those sorts of loose
>>> ends. I have to admit that my knowledge of RDF and particularly RDFS
>>> is pretty superficial.
>>> We can switch to either the shorter format or the safer fully
>>> qualified URI - what do people think would be better?
>>> Sally
>>>> By the way, the IPNI example you cite has an error:
>>>> <tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
>>>> Many RDF/XML parsers will see &tn; as an entity which cannot be
>>>> resolved.  Since I don't have a copy of the ontology (and
>>>> http://tdwg.org/2006/03/12/TaxonNames does not resolve), I can only take
>>>> a guess that it should look something like:
>>>> <tn:nomenclaturalCode rdf:resource="tn:botanical" />
>>>> However, using XML namespace prefixes in resource references inside
>>>> RDF/XML documents tends to cause problems because not all RDF/XML
>>>> parsers are smart enough to dereference the namespace prefix and build a
>>>> fully-qualified resource URI.  A safer form of the above would be the
>>>> fully qualified resource URI which looks like:
>>>> <tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
>>>> -Steve
>>> *** Sally Hinchcliffe
>>> *** Computer section, Royal Botanic Gardens, Kew
>>> *** tel: +44 (0)20 8332 5708
>>> *** S.Hinchcliffe at rbgkew.org.uk
>> --  
>> -------------------------------------
>>  Roger Hyam
>>  Technical Architect
>>  Taxonomic Databases Working Group
>> -------------------------------------
>>  http://www.tdwg.org
>>  roger at tdwg.org
>>  +44 1578 722782
>> -------------------------------------
> *** Sally Hinchcliffe
> *** Computer section, Royal Botanic Gardens, Kew
> *** tel: +44 (0)20 8332 5708
> *** S.Hinchcliffe at rbgkew.org.uk
Professor Roderic D. M. Page
Editor, Systematic Biology
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone:    +44 141 330 4778
Fax:      +44 141 330 2792
email:    r.page at bio.gla.ac.uk
web:      http://taxonomy.zoology.gla.ac.uk/rod/rod.html
iChat:    aim://rodpage1962
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic
Biologists Website:  http://systematicbiology.org
Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
Find out what we know about a species: http://ispecies.org
Rod's rants on phyloinformatics: http://iphylo.blogspot.com
Rod's rants on ants: http://semant.blogspot.com
