[Tdwg-guid] Jena examples?
Hi Everyone,
Does anyone have or know of any good examples of using Jena to output RDF metadata with non-typical namespaces such as TCS/RDF?
Currently I have the following code for my metadata method (I know it's wrong!).
String TN_NS = "http://tdwg.org/2006/03/12/TaxonNames/";
ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
Model model = ModelFactory.createDefaultModel();
Resource res = model.createResource("http://test");
res.addProperty(model.createProperty(TN_NS, "testProperty"), "test");
model.write(byteStream, "RDF/XML-ABBREV");
return new MetadataResponse(new ByteArrayInputStream(byteStream.toByteArray()), null, MetadataResponse.RDF_FORMAT);
...which gives me.
<rdf:RDF xmlns:j.0="http://tdwg.org/2006/03/12/TaxonNames/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://test">
    <j.0:testProperty>test</j.0:testProperty>
  </rdf:Description>
</rdf:RDF>
...whereas I could really do with something along the lines of the IPNI metadata.
<rdf:RDF xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:tn="http://tdwg.org/2006/03/12/TaxonNames/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<!-- example family (has an unlinked typifying name with non-standard format data) -->
<tn:TaxonName rdf:about="urn:lsid:ipni.org:names:30000959-2:1.1.2.1">
  <tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
  <dc:title>Amaryllidaceae J.St.-Hil.</dc:title>
  <dcterms:created>2004-01-20 00:00:00.0</dcterms:created>
  <dcterms:modified>2005-06-23 15:45:33.0</dcterms:modified>
  <tn:rankString>fam.</tn:rankString>
  <tn:nameComplete>Amaryllidaceae</tn:nameComplete>
  <tn:uninomial>Amaryllidaceae</tn:uninomial>
  <tn:authorship>J.St.-Hil.</tn:authorship>
  <tn:publishedIn>Expos. Fam. Nat. 1: 134. 1805 [Feb-Apr 1805]</tn:publishedIn>
  <tn:year>1805</tn:year>
  <tn:typifiedBy>
    <tn:NomenclaturalType>
      <dc:title>Amaryllis Linnaeus, nom. cons.</dc:title>
    </tn:NomenclaturalType>
  </tn:typifiedBy>
</tn:TaxonName>
I think the main problem I have is creating resources that aren't defined as vocabularies within Jena already. I can't seem to find a way to create that <tn:TaxonName> root resource element.
I'm starting to think that just outputting vanilla text could be the best option here, or am I barking up the wrong tree?
Many thanks, Peter.
Peter Hollas MSc BSc(hons) (Peter.Hollas@thomson.com) Software Engineer /Systems Administrator Thomson Zoological Innovation Centre York Science Park Heslington York YO10 5DG
Tel: 01904-435113 Fax: 01904-435114
Hi Peter, In RDF/XML-ABBREV mode, the serializer uses a resource's rdf:type for its element name. Since the resource in your example doesn't have an rdf:type, it falls back to the default rdf:Description element.
- Ben
Ben Szekely IBM Software Engineer Advanced Internet Technology, Cambridge, MA bhszekel@us.ibm.com
tdwg-guid-bounces@mailman.nhm.ku.edu wrote on 09/25/2006 11:02:30 AM:
TDWG-GUID mailing list TDWG-GUID@mailman.nhm.ku.edu http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid
Hi Peter,
Ben is right. To get a TaxonName element (instead of an rdf:Description), add the following line of code:
res.addProperty(model.createProperty("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"), model.createResource(TN_NS + "TaxonName"));
This should give you output like:
<rdf:RDF xmlns:j.0="http://tdwg.org/2006/03/12/TaxonNames/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <j.0:TaxonName rdf:about="http://test">
    <j.0:testProperty>test</j.0:testProperty>
  </j.0:TaxonName>
</rdf:RDF>
XML namespace prefixes are not preserved in RDF so Jena will supply its own prefixes when serializing to one of the XML formats (hence the j.0 instead of tn).
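For anyone who wants the tn prefix in the output anyway: Jena's Model also carries a prefix mapping that its writers consult, so the generated j.0 can usually be avoided. The following is a minimal sketch, not a tested drop-in for Peter's metadata method. The class name TaxonNameExample and the method buildRdfXml are made up for illustration, and the imports assume a current Apache Jena release (org.apache.jena.*); releases from around the time of this thread used the com.hp.hpl.jena.* packages instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

// Hypothetical example class; requires Apache Jena on the classpath.
public class TaxonNameExample {
    static final String TN_NS = "http://tdwg.org/2006/03/12/TaxonNames/";

    public static String buildRdfXml() {
        Model model = ModelFactory.createDefaultModel();

        // Register the prefix so the writer can use "tn" instead of a generated one.
        model.setNsPrefix("tn", TN_NS);

        Resource res = model.createResource("http://test");

        // The type is added as a resource (not a string literal) so that the
        // abbreviated writer can use it as the element name.
        res.addProperty(RDF.type, model.createResource(TN_NS + "TaxonName"));
        res.addProperty(model.createProperty(TN_NS, "testProperty"), "test");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        model.write(out, "RDF/XML-ABBREV");
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildRdfXml());
    }
}
```

The prefix is only a serialization hint. Any consumer that parses the RDF sees the same triples whether the element is written as tn:TaxonName or j.0:TaxonName.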
By the way, the IPNI example you cite has an error:
<tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
Many RDF/XML parsers will see &tn; as an entity which cannot be resolved. Since I don't have a copy of the ontology (and http://tdwg.org/2006/03/12/TaxonNames does not resolve), I can only take a guess that it should look something like:
<tn:nomenclaturalCode rdf:resource="tn:botanical" />
However, using XML namespace prefixes in resource references inside RDF/XML documents tends to cause problems because not all RDF/XML parsers are smart enough to dereference the namespace prefix and build a fully-qualified resource URI. A safer form of the above would be the fully qualified resource URI which looks like:
<tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
-Steve
Hi Steve /all
We took that syntax straight from Roger's RDF/TCS examples. I think Roger was going to do more work on tidying up those sorts of loose ends. I have to admit that my knowledge of RDF and particularly RDFS is pretty superficial.
We can switch to either the shorter format or the safer fully qualified URI - what do people think would be better?
Sally
*** Sally Hinchcliffe *** Computer section, Royal Botanic Gardens, Kew *** tel: +44 (0)20 8332 5708 *** S.Hinchcliffe@rbgkew.org.uk
Hi Everyone,
In the not very comforting words of software vendors these are "known issues" and will be resolved in the next release ;)
But seriously. The namespaces in the TCS names vocabulary do not resolve because we didn't have a policy at that time. The use of entity references was 'borrowed' from an example (probably Protege output) and I wouldn't mind doing away with it.
I have been thinking long and hard about the namespace issues in the last few weeks and believe I have a solution that I will propose at TDWG St Louis. It would be good to have face to face discussions about it and make a decision there.
To briefly summarize: The issue is getting a namespace convention that will work across technologies. Suppose we want to serve data in a technology that "isn't very good at namespaces". As an example - if we were to have separate namespaces for TaxonNames, TaxonConcepts, Specimens, Metadata, GeospatialStuff and Collections, and we wanted to use XML Schema to validate a document that contained all these things, it would require 6 independent schemas, each with its own target namespace. If you have ever tried to debug something like this you will know what total madness it is. We can't just abandon XML Schema because it would rule out not only our existing technologies but GML and probably others... It may also be desirable to express our ontology in things that aren't even XML.
The only solution I can think of is that for the TDWG Ontology we should have a single formal namespace of: http://rs.tdwg.org/ontology/
Within the ontology we have a convention for concepts that goes like this.
A class would be: http://rs.tdwg.org/ontology/tdwg123_MyClass
where 123 is the internal id of the class and MyClass is the class name
A property in MyClass would be: http://rs.tdwg.org/ontology/tdwg123_myProperty
where 123 is the *class* id not the property id and myProperty is the property name.
An instance would be http://rs.tdwg.org/ontology/tdwg123_789
where 123 is the *class* id and 789 is the instance id (there is nothing stable about an instance that we could use as an identifier unless we force a label property and make it immutable).
In a way this is using the part before the _ as a pseudo namespace.
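To make the convention concrete, here is a small sketch of how such identifiers could be assembled and how the pseudo namespace could be read back out. It only illustrates the string pattern described above. The class TdwgOntologyUri and its method names are hypothetical, and 123 and 789 are the same placeholder ids used in the description.

```java
/** Illustrative helper for the proposed single-namespace convention (hypothetical names). */
final class TdwgOntologyUri {
    static final String ONTOLOGY_NS = "http://rs.tdwg.org/ontology/";

    private TdwgOntologyUri() {}

    /** Class identifier: class id followed by the class name. */
    static String classUri(int classId, String className) {
        return ONTOLOGY_NS + "tdwg" + classId + "_" + className;
    }

    /** Property identifier: id of the owning class followed by the property name. */
    static String propertyUri(int classId, String propertyName) {
        return ONTOLOGY_NS + "tdwg" + classId + "_" + propertyName;
    }

    /** Instance identifier: id of the class followed by the instance id. */
    static String instanceUri(int classId, int instanceId) {
        return ONTOLOGY_NS + "tdwg" + classId + "_" + instanceId;
    }

    /** Returns the part of the local name before the first underscore. */
    static String pseudoNamespace(String uri) {
        if (!uri.startsWith(ONTOLOGY_NS)) {
            throw new IllegalArgumentException("Not in the ontology namespace: " + uri);
        }
        String local = uri.substring(ONTOLOGY_NS.length());
        int sep = local.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("No pseudo namespace in: " + uri);
        }
        return local.substring(0, sep);
    }

    public static void main(String[] args) {
        System.out.println(classUri(123, "MyClass"));
        System.out.println(propertyUri(123, "myProperty"));
        System.out.println(instanceUri(123, 789));
        System.out.println(pseudoNamespace(classUri(123, "MyClass")));
    }
}
```

Note that a class, its properties and its instances all share the same prefix (tdwg123 here), which is what lets the part before the underscore act as a grouping key.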
I think this hits the balance between something that is technology independent and something that will produce reasonably human readable documents.
It is radical which is why I thought it would be good to talk about it.
Anyone got an alternative?
We have to have a solution for this by the end of the St Louis meeting as it is critical path for ontology work.
Most grateful for your patience and any thoughts you have.
Roger
I'm trying to nail down a final spec for the IPNI LSID server if I can.
Steve Perry raised the point, over on the tdwg-guid mailing list, that the following bit of output from the IPNI server was an error:
<tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
This syntax has been used in a number of places - for the value for the nomenclaturalCode, for the nomenclaturalNote noteType (eg. basedOn) and for the typeOfTyp (eg. holo) - basically wherever there was an enumeration in the original schema.
Roger's response is below - but I'm not quite clear how that will work in terms of replacing the syntax Steve found was failing. I'm not sure from Roger's email whether he really meant we should use numeric ids for all literal enumerated values, or whether those were merely place holders - but if so it would seem to make the resulting rdf willfully unreadable.
Roger - can you explain with a more concrete example? Sally
--
Roger Hyam Technical Architect Taxonomic Databases Working Group
http://www.tdwg.org roger@tdwg.org
+44 1578 722782
*** Sally Hinchcliffe *** Computer section, Royal Botanic Gardens, Kew *** tel: +44 (0)20 8332 5708 *** S.Hinchcliffe@rbgkew.org.uk
From an RDF perspective, I would vote for referring to the botanical code (or any other code) using a URI, not a literal, and not a namespace reference (such as "tn:botanical") which, as Steve pointed out, RDF parsers can't handle.
By referencing a URI, the semantics are clear (I can fetch whatever the URI points to and figure out what it means).
In other words, something like
<tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
as Steve proposed (so long as the URI is actually resolvable, which is another difference between XML schema and RDF --- in RDF the name spaces must be resolvable).
I haven't really followed what Roger proposes, but IMHO anything motivated by an attempt to salvage the use of XML Schema is bound to end in tears. It's a hack (anything that exposes internal identifiers of a class has got to be a hack, surely).
Regards
Rod
On 2 Oct 2006, at 13:52, Sally Hinchcliffe wrote:
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
------------------------------------------------------------------------ ---------------------------------------- Professor Roderic D. M. Page Editor, Systematic Biology DEEB, IBLS Graham Kerr Building University of Glasgow Glasgow G12 8QP United Kingdom
Phone: +44 141 330 4778 Fax: +44 141 330 2792 email: r.page@bio.gla.ac.uk web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html iChat: aim://rodpage1962 reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/ Find out what we know about a species: http://ispecies.org Rod's rants on phyloinformatics: http://iphylo.blogspot.com Rod's rants on ants: http://semant.blogspot.com
Steve writes:
However, using XML namespace prefixes in resource references inside RDF/XML documents tends to cause problems because not all RDF/XML parsers are smart enough to dereference the namespace prefix and build a fully-qualified resource URI. A safer form of the above would be
the fully qualified resource URI which looks like: <tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
It seems the discussion confuses QNames ("namespace-colon") and XML-Entities (ampersand-semicolon) - or am I confused???
An attempt to clarify: From my understanding, XML itself has no such thing as a namespace-colon in literals; XML Schema introduced it as a convenience (QName). Support for XML entities, however, is a requirement of XML 1.0 itself. I agree with Rod that URIs are the correct way for RDF:
rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical"
and under no circumstances (even with RDF-xml-Schema) can we use
rdf:resource="tn:/botanical"
because RDF does not use QNames. However, if abbreviation is an issue, we can use an entity reference, provided we supply an entity definition for it (as the Protege examples do):
rdf:resource="&tn;/botanical"
If RDF-parsers fail to deal with the latter, they are grossly non-interoperable with xml as a whole.
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn@bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19, 14195 Berlin, Germany
Tel: +49-30-8304-2220 Fax: +49-30-8304-2203
I agree with Rod and Gregor that we ought to use full URIs to indicate nomenclatural codes.
Gregor is right that we could use
rdf:resource="&tn;/botanical"
as an abbreviated reference to such a URI, so long as we provide an entity definition for tn in an embedded DTD. If this is done properly, RDF parsers can handle it (because the entity is expanded during XML parsing). However, it's easy to lose the DTD when copying and pasting, when creating your own RDF using someone else's as an example, or when using text processing routines on serialized RDF/XML.
DTDs are defined at the document level while non-anonymous/non-blank RDF resources are global and the full description of a global resource can be split among multiple files. To cut down the likelihood of errors we should probably keep it simple and suggest the use of full URIs as a best practice.
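Both halves of this point can be seen with nothing more than the XML parser that ships with the JDK. The sketch below is only an illustration: the class EntityExpansionDemo and its members are made-up names, and it assumes the tn entity is declared without a trailing slash so that "&tn;/botanical" expands to the full URI. With the internal DTD in place the attribute value arrives fully expanded; with the DTD stripped, the same document is no longer well-formed and parsing fails.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

/** Hypothetical demo: entity references are expanded by the XML parser itself. */
final class EntityExpansionDemo {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String TN_NS = "http://tdwg.org/2006/03/12/TaxonNames/";

    static final String HEADER = "<?xml version=\"1.0\"?>\n";

    // Assumption: the entity value has no trailing slash, matching "&tn;/botanical".
    static final String DTD =
          "<!DOCTYPE rdf:RDF [\n"
        + "  <!ENTITY tn \"http://tdwg.org/2006/03/12/TaxonNames\">\n"
        + "]>\n";

    static final String BODY =
          "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:tn=\"" + TN_NS + "\">\n"
        + "  <tn:TaxonName rdf:about=\"urn:lsid:ipni.org:names:30000959-2:1.1.2.1\">\n"
        + "    <tn:nomenclaturalCode rdf:resource=\"&tn;/botanical\"/>\n"
        + "  </tn:TaxonName>\n"
        + "</rdf:RDF>\n";

    static final String WITH_DTD = HEADER + DTD + BODY;
    static final String WITHOUT_DTD = HEADER + BODY;

    private EntityExpansionDemo() {}

    /** Parses the document and returns the rdf:resource value of tn:nomenclaturalCode. */
    static String resourceOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element code =
                (Element) doc.getElementsByTagNameNS(TN_NS, "nomenclaturalCode").item(0);
            return code.getAttributeNS(RDF_NS, "resource");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Document could not be parsed", e);
        }
    }

    /** True if the document parses; false if the XML parser rejects it. */
    static boolean isParseable(String xml) {
        try {
            resourceOf(xml);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("with DTD:    " + resourceOf(WITH_DTD));
        System.out.println("without DTD: parseable = " + isParseable(WITHOUT_DTD));
    }
}
```

An RDF/XML parser sits on top of this layer, so it never sees "&tn;" at all: it either receives the expanded URI or nothing, depending on whether the DTD survived.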
-Steve
Gregor Hagedorn wrote:
Steve writes:
However, using XML namespace prefixes in resource references inside RDF/XML documents tends to cause problems because not all RDF/XML parsers are smart enough to dereference the namespace prefix and build a fully-qualified resource URI. A safer form of the above would be
the fully qualified resource URI which looks like: <tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
It seems the discussion confuses QNames ("namespace-colon") and XML-Entities (ampersand-semicolon) - or am I confused???
An attempt to clarify: From my understanding, xml itself has no such thing as a namespace-colon in literals - xml-schema has introduced it as a convenient thing (QName). However, the use of xml-entities is a requirement of xml 1.0 itself. I agree with Rod that URIs are correct way for RDF:
rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical"
and under no circumstances (even with RDF-xml-Schema) can we use
rdf:resource="tn:/botanical"
because RDF does not use QNames. However, we can use (if abbreviation is an issue, and providing an entity definition for it, as the protege examples do.)
rdf:resource="&tn;/botanical"
If RDF-parsers fail to deal with the latter, they are grossly non-interoperable with xml as a whole.
Gregor---------------------------------------------------------- Gregor Hagedorn (G.Hagedorn@bba.de) Institute for Plant Virology, Microbiology, and Biosafety Federal Research Center for Agriculture and Forestry (BBA) Königin-Luise-Str. 19 Tel: +49-30-8304-2220 14195 Berlin, Germany Fax: +49-30-8304-2203
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
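The entity-versus-QName distinction discussed above can be checked with nothing more than the JDK's built-in XML parser. The following is a minimal sketch, not part of the original thread: the class and method names (`EntityExpansionDemo`, `document`, `resourceOf`) are made up for illustration, and the entity value is assumed to be the TaxonNames namespace without its trailing slash. It parses three variants of the same statement and reports what an RDF layer sitting on top of the XML parser would actually receive as the `rdf:resource` value.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EntityExpansionDemo {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String TN_NS = "http://tdwg.org/2006/03/12/TaxonNames/";

    // Internal DTD subset declaring the tn entity (assumed value, for illustration).
    static final String DTD =
        "<!DOCTYPE rdf:RDF [ <!ENTITY tn \"http://tdwg.org/2006/03/12/TaxonNames\"> ]>\n";

    /** Builds a one-statement RDF/XML document around the given rdf:resource value. */
    static String document(String prolog, String resourceValue) {
        return prolog
            + "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\" xmlns:tn=\"" + TN_NS + "\">\n"
            + "  <tn:TaxonName rdf:about=\"urn:lsid:ipni.org:names:30000959-2:1.1.2.1\">\n"
            + "    <tn:nomenclaturalCode rdf:resource=\"" + resourceValue + "\"/>\n"
            + "  </tn:TaxonName>\n"
            + "</rdf:RDF>\n";
    }

    /** Returns rdf:resource as the XML parser reports it, or null if the XML is not well-formed. */
    static String resourceOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element code =
                (Element) doc.getElementsByTagNameNS(TN_NS, "nomenclaturalCode").item(0);
            return code.getAttributeNS(RDF_NS, "resource");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Entity with its DTD: expanded by the XML parser to the full URI.
        System.out.println(resourceOf(document(DTD, "&tn;/botanical")));
        // Prefix inside an attribute value: passed through untouched as "tn:botanical".
        System.out.println(resourceOf(document("", "tn:botanical")));
        // Entity whose DTD was lost: the document no longer parses, so null.
        System.out.println(resourceOf(document("", "&tn;/botanical")));
    }
}
```

The third case is the copy-and-paste hazard Steve describes: once the DTD is dropped, the entity reference is not merely unresolved, the document stops being well-formed XML.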
I'm happy with using rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical", and we'll update our example documents and our LSID server accordingly.
We'll set up our templates so that if the path changes (e.g. the tdwg.org/2006/03/12/TaxonNames/ part) then we can easily update it.
Thanks
Sally
*** Sally Hinchcliffe *** Computer section, Royal Botanic Gardens, Kew *** tel: +44 (0)20 8332 5708 *** S.Hinchcliffe@rbgkew.org.uk
Hi Sally,
No problem. The task was to create a prototype LSID resolver, not to solve all the KR issues surrounding taxon concepts. However, I do think it's time we start talking about these issues. I worry that the prototype resolvers we set up will become de facto reference implementations, that other people will start to construct services modeled on the prototypes without us ever having gone back to talk about what worked and what didn't.
I know there are several versions of TCS-in-RDF floating around. I think Roger's is an RDFS document. Rob Gales created an OWL-DL version for the GBIF demonstration project that Jessie and he worked on this summer. Early this year I created a partial implementation in OWL-Lite (that I've since discarded). While each one is "TCS", they're all substantially different in the way they represent TCS classes and properties, in part because the different representation languages (RDFS, OWL-Lite, OWL-DL) have different language features and expressive powers.
It would be nice if we could devise one standard RDF implementation of TCS. I don't care which one we use, but I would like to narrow the field to one so we can get the details sorted out. I'm talking about details like resolvable namespaces, typed versus non-typed literals, the use of anonymous resources, and serialization issues like the references-to-resources problem that cropped up in the IPNI example Peter Hollas is working from. These details are quite important because certain decisions taken here can affect the larger network of linked data providers.
Take the anonymous resources issue: If you look at the example Peter cites, the typifiedBy property refers to an anonymous NomenclaturalType that has a dc:title. Within a single data provider, this is no big deal because many different data objects can refer to this NomenclaturalType. However, the use of anonymous resources can cause big problems when you try to harvest and index the data from multiple providers. It also causes problems for the caching use case.
It would be nice to discuss some of these things, perhaps within TAG.
-Steve
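One way to see the harvesting problem with anonymous resources is to model it directly. The sketch below is an illustration only and does not come from the thread: it uses plain strings for triples rather than an RDF library, the second provider's LSID (`urn:lsid:example.org:names:1`) and the named type LSID (`urn:lsid:example.org:types:1`) are hypothetical, and the renaming scheme in `scope` is just one simple way to keep blank nodes from different documents apart. It relies on the fact that a blank node label such as `_:b0` only means something inside the document it came from, so a harvester has to rename blank nodes per source when merging.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class BlankNodeMergeDemo {
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String TN_NS = "http://tdwg.org/2006/03/12/TaxonNames/";
    static final String DC_TITLE = "http://purl.org/dc/elements/1.1/title";

    /** A minimal triple of plain strings; ids starting with "_:" stand for blank nodes. */
    static final class Triple {
        final String subject, predicate, object;
        Triple(String s, String p, String o) { subject = s; predicate = p; object = o; }
    }

    /** Blank node labels are document-local, so tag them with their source document. */
    static String scope(String node, int docIndex) {
        return node.startsWith("_:") ? node + "-doc" + docIndex : node;
    }

    /** Merges documents, renaming blank nodes per source; URIs and literals pass through. */
    static List<Triple> merge(List<List<Triple>> documents) {
        List<Triple> merged = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            for (Triple t : documents.get(i)) {
                merged.add(new Triple(scope(t.subject, i), t.predicate, scope(t.object, i)));
            }
        }
        return merged;
    }

    /** Collects the distinct nodes that carry the given rdf:type. */
    static Set<String> nodesOfType(List<Triple> triples, String typeUri) {
        Set<String> nodes = new LinkedHashSet<>();
        for (Triple t : triples) {
            if (t.predicate.equals(RDF_TYPE) && t.object.equals(typeUri)) {
                nodes.add(t.subject);
            }
        }
        return nodes;
    }

    /** One provider's description of a name typified by the given type node. */
    static List<Triple> provider(String nameLsid, String typeNode) {
        return Arrays.asList(
            new Triple(nameLsid, TN_NS + "typifiedBy", typeNode),
            new Triple(typeNode, RDF_TYPE, TN_NS + "NomenclaturalType"),
            new Triple(typeNode, DC_TITLE, "Amaryllis Linnaeus, nom. cons."));
    }

    /** Number of distinct NomenclaturalType nodes after harvesting two providers. */
    static int countTypes(String typeNodeA, String typeNodeB) {
        List<Triple> merged = merge(Arrays.asList(
            provider("urn:lsid:ipni.org:names:30000959-2:1.1.2.1", typeNodeA),
            provider("urn:lsid:example.org:names:1", typeNodeB))); // hypothetical LSID
        return nodesOfType(merged, TN_NS + "NomenclaturalType").size();
    }

    public static void main(String[] args) {
        // Anonymous types: the index ends up with two unrelated nodes.
        System.out.println(countTypes("_:b0", "_:b0"));
        // A shared (hypothetical) identifier: both descriptions land on one node.
        System.out.println(countTypes("urn:lsid:example.org:types:1",
                                      "urn:lsid:example.org:types:1"));
    }
}
```

With anonymous type nodes the harvester has no way to tell that both providers mean the same NomenclaturalType, short of comparing titles; with a global identifier the two descriptions merge without any extra work.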
Sally Hinchcliffe wrote:
Hi Steve /all
We took that syntax straight from Roger's RDF/TCS examples. I think Roger was going to do more work on tidying up those sorts of loose ends. I have to admit that my knowledge of RDF and particularly RDFS is pretty superficial.
We can switch to either the shorter format or the safer fully qualified URI - what do people think would be better?
Sally
By the way, the IPNI example you cite has an error:
<tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
Many RDF/XML parsers will see &tn; as an entity which cannot be resolved. Since I don't have a copy of the ontology (and http://tdwg.org/2006/03/12/TaxonNames does not resolve), I can only take a guess that it should look something like:
<tn:nomenclaturalCode rdf:resource="tn:botanical" />
However, using XML namespace prefixes in resource references inside RDF/XML documents tends to cause problems because not all RDF/XML parsers are smart enough to dereference the namespace prefix and build a fully-qualified resource URI. A safer form of the above would be the fully qualified resource URI which looks like:
<tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
-Steve
Hi Steve/all
I agree that we should start trying to nail down the TCS/RDF more firmly. I know that Roger wants to discuss this at TDWG but I think some firm proposals are needed to take to the meeting, and some robust examples as well, for those of us who struggle with RDFS, OWL et al and need to see actual implementations to properly engage with the proposals. Otherwise, as you say, we will end up with some de facto standards based on LSID implementations which were never intended to be anything more than proofs of concept.
Now is a good time to tackle this for us, because we're re-opening the LSID implementation that we did for IPNI and trying to bring it up to something that will be properly releasable and can be more widely publicised. As part of that work it will be easy to make any changes to the metadata that's produced. Once released, pressures of time and other projects will make it hard to re-open the work and change the metadata output.
Is TAG the best place to discuss this? I'm not on that mailing list but I can probably join if the discussion is going to be over there.
Let me know
Sally
I suggest we move the discussion to the TAG list as it is wider than LSIDs.
Please join the TAG list Sally - we would appreciate your thoughts
http://lists.tdwg.org/mailman/listinfo/tdwg-tag
I have set up a wiki page here:
http://wiki.tdwg.org/twiki/bin/view/TAG/EstablishingLsidVocabularies
where I am brainstorming the issues. Everyone is invited to add ideas/issues to this page.
I would like to just take a series of decisions so that we have a solution to these issues as soon as possible but I am aware that we must move forward as a group. This is slower but has to be done.
The situation with TCS/RDF isn't as bad as it might seem. Rob Gales's version is based very heavily on the RDFS version that I did earlier in the year for GUID-2 (which IPNI and IF use), so provided Steve keeps his version buried we really only have two versions in the wild and they are really two ports of the same thing.
If I could have answers to two questions I could probably get the TCS/RDF vocabularies set up in the right place pretty damn quickly.
- How do we divide the namespace? I suggested an approach earlier.
- What language do we use for the RDF vocabularies in metadata responses? RDFS or OWL?
I have run out of time here today so I will not have a chance to add anything more to the wiki but will look out the RDFS and OWL versions of TCS Names part tomorrow and try and provide a commentary on any differences.
More later,
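Roger
--
Roger Hyam Technical Architect Taxonomic Databases Working Group
http://www.tdwg.org roger@tdwg.org
+44 1578 722782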
TDWG-GUID mailing list TDWG-GUID@mailman.nhm.ku.edu http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid
OK - have joined. Whoopee, more email ...
Hi,
Could someone advise on which TCS/RDF ontology would be the best to implement for metadata coming from nomenclatural sources such as ZooBank? I've yet to come across anything other than a TCS XML Schema in public circulation.
I've decided to do away with using the Jena API altogether for returning metadata responses from ZooBank; it seems to be a rather heavy-handed approach to returning what is basically a simple structured text document. A much more flexible way to go is with a page templating system, especially when the schemata are in constant flux. A Spring Framework MVC/JSTL endpoint will allow for schemata changes to be implemented without recompilation. The LSID metadata class will just act as a façade/decorator to the templating system.
Many thanks, Peter.
Hi Peter,
I was just writing a response to your question about Jena, but I'll scrap it now. Unfortunately there's no standard representation for TCS in RDF at this time. Hopefully there will be one in the near future.
Jena is well suited to consuming RDF metadata, but I agree with you that it's a bit heavyweight if all you want to do is produce many instances of a single class.
Visualizing the problem as one of templating and using Spring MVC is a neat approach. I use Spring mostly for dependency injection and hadn't considered that it might be used to isolate this kind of software from changes in the schema.
-Steve
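The templating idea can be sketched without Spring or JSTL. What follows is a hedged illustration rather than Peter's actual implementation: a plain string with `${...}` placeholders stands in for the page template, the class and method names (`RdfTemplateDemo`, `render`, `escapeXml`, `example`) are invented here, and the sample values are taken from the IPNI record quoted at the start of the thread. In a real deployment the template text would presumably live in an external file so that it can change without recompiling.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RdfTemplateDemo {
    // Inline stand-in for a template file; the namespace appears only as a placeholder.
    static final String TEMPLATE =
          "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"\n"
        + "         xmlns:tn=\"${tnNamespace}\">\n"
        + "  <tn:TaxonName rdf:about=\"${lsid}\">\n"
        + "    <tn:nomenclaturalCode rdf:resource=\"${tnNamespace}botanical\"/>\n"
        + "    <tn:nameComplete>${nameComplete}</tn:nameComplete>\n"
        + "    <tn:authorship>${authorship}</tn:authorship>\n"
        + "  </tn:TaxonName>\n"
        + "</rdf:RDF>\n";

    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Escapes the five characters that are special in XML text and attribute values. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Replaces each ${name} with its escaped value; fails on unknown placeholders. */
    static String render(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for placeholder: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(escapeXml(value)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Renders the template with values from the IPNI record quoted in the thread. */
    static String example() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("tnNamespace", "http://tdwg.org/2006/03/12/TaxonNames/");
        values.put("lsid", "urn:lsid:ipni.org:names:30000959-2:1.1.2.1");
        values.put("nameComplete", "Amaryllidaceae");
        values.put("authorship", "J.St.-Hil.");
        return render(TEMPLATE, values);
    }

    public static void main(String[] args) {
        System.out.print(example());
    }
}
```

Because the namespace is a single template value, a change to the tdwg.org/2006/03/12/TaxonNames/ path touches one entry, and the output uses a typed `tn:TaxonName` element with full URIs in `rdf:resource`, which is the form the thread settles on. The trade-off is that nothing checks the result is valid RDF, so escaping every substituted value matters.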
Hi Steve,
I can't seem to get your example to give me <tn:TaxonName rdf:about="lsid info">, it just seems to add a property.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="urn:lsid:zoobank.org:taxon:123">
    <rdf:type>http://tdwg.org/2006/03/12/TaxonNames/TaxonName</rdf:type>
  </rdf:Description>
</rdf:RDF>
From my code...
ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
Model model = ModelFactory.createDefaultModel();
Resource res = model.createResource(lsid.toString());
res.addProperty(model.createProperty("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"), TN_NS + "TaxonName");
model.write(byteStream, "RDF/XML-ABBREV");
return new MetadataResponse(new ByteArrayInputStream(byteStream.toByteArray()), null, MetadataResponse.RDF_FORMAT);
Many thanks, Peter.
-----Original Message-----
From: Steve Perry [mailto:smperry@ku.edu]
Sent: 25 September 2006 20:54
To: Hollas, Peter (TS UK); tdwg-guid@mailman.nhm.ku.edu
Subject: Re: [Tdwg-guid] Jena examples?
Hi Peter,
Ben is right. To get a TaxonName element (instead of an rdf:Description), add the following line of code:
res.addProperty(model.createProperty("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"), TN_NS + "TaxonName");
This should give you output like:
<rdf:RDF xmlns:j.0="http://tdwg.org/2006/03/12/TaxonNames/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <j.0:TaxonName rdf:about="http://test">
    <j.0:testProperty>test</j.0:testProperty>
  </j.0:TaxonName>
</rdf:RDF>
XML namespace prefixes are not preserved in RDF so Jena will supply its own prefixes when serializing to one of the XML formats (hence the j.0 instead of tn).
By the way, the IPNI example you cite has an error:
<tn:nomenclaturalCode rdf:resource="&tn;#botanical" />
Many RDF/XML parsers will see &tn; as an entity which cannot be resolved. Since I don't have a copy of the ontology (and http://tdwg.org/2006/03/12/TaxonNames does not resolve), I can only take a guess that it should look something like:
<tn:nomenclaturalCode rdf:resource="tn:botanical" />
However, using XML namespace prefixes in resource references inside RDF/XML documents tends to cause problems because not all RDF/XML parsers are smart enough to dereference the namespace prefix and build a fully-qualified resource URI. A safer form of the above would be the fully qualified resource URI which looks like:
<tn:nomenclaturalCode rdf:resource="http://tdwg.org/2006/03/12/TaxonNames/botanical" />
-Steve
Benjamin H Szekely wrote:
Hi Peter,
In RDF/XML-ABBREV mode, the serializer uses rdf:type values for element names. Since your example doesn't have an rdf:type, it falls back to the default rdf:Description element.
- Ben
Ben Szekely IBM Software Engineer Advanced Internet Technology, Cambridge, MA bhszekel@us.ibm.com
tdwg-guid-bounces@mailman.nhm.ku.edu wrote on 09/25/2006 11:02:30 AM:
I think the main problem I have is creating resources that aren't defined as vocabularies within Jena already. I can't seem to find a way to create that <tn:TaxonName root resource element.
I'm starting to think that just outputting vanilla text could be the best option here, or am I barking up the wrong tree?
Many thanks, Peter.
Peter Hollas MSc BSc(hons) (Peter.Hollas@thomson.com) Software Engineer /Systems Administrator Thomson Zoological Innovation Centre York Science Park Heslington York YO10 5DG
Tel: 01904-435113 Fax: 01904-435114
participants (7)
- Benjamin H Szekely
- Gregor Hagedorn
- peter.hollas@thomson.com
- Roderic Page
- Roger Hyam
- Sally Hinchcliffe
- Steve Perry