[tdwg-tapir] Output model definition and mixed name spaces.
Hi Everyone,
In the few days between Christmas and New Year I am doing some thinking about architecture and the general problem of validating (controlling) data alongside using a more RDF-based approach. This is really an open world / closed world problem, but it relates to Tapir as follows. Suppose I want to specify an output model like this:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description>
    <dc:creator>
      <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
        <rdfs:label>Corky Crystal</rdfs:label>
        <vCard:FN>Corky Crystal</vCard:FN>
        <vCard:N rdf:parseType="Resource">
          <vCard:Family>Crystal</vCard:Family>
          <vCard:Given>Corky</vCard:Given>
          <vCard:Other>Jacky</vCard:Other>
          <vCard:Prefix>Dr</vCard:Prefix>
        </vCard:N>
        <vCard:BDAY>1980-01-01</vCard:BDAY>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
This is taken from: http://dublincore.org/documents/dcq-rdf-xml/#sec3 and is a good example of using a mix of 'other' vocabularies to describe something. Dublin Core and vCard are pretty standard and should really not be re-invented.
I can't see how to define this structure in a single, simple XML Schema. A schema document allows only one target namespace, so one needs at least 4 XSD documents (1 main + 3 imported) to define this tightly. Here is an example of how to do this kind of thing:
http://www-128.ibm.com/developerworks/xml/library/x-tipschnm.html
We have the additional problem that there aren't actual public XML Schema definitions of these things so one would have to make up schemas defining the elements used. It all gets quite horrible.
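As a rough illustration of the "1 + 3 imported" count, the following Python sketch (standard library only) groups the elements of the example document by namespace and prints the skeleton of a main schema with one xsd:import per foreign namespace. The function names and the choice of the rdf namespace as the main schema's target namespace are assumptions made for this sketch. The schemaLocation attribute is left out because, as noted above, there are no public XSDs to point at.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XSD_NS = "http://www.w3.org/2001/XMLSchema"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example document from the message.
EXAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description>
    <dc:creator>
      <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
        <rdfs:label>Corky Crystal</rdfs:label>
        <vCard:FN>Corky Crystal</vCard:FN>
        <vCard:N rdf:parseType="Resource">
          <vCard:Family>Crystal</vCard:Family>
          <vCard:Given>Corky</vCard:Given>
          <vCard:Other>Jacky</vCard:Other>
          <vCard:Prefix>Dr</vCard:Prefix>
        </vCard:N>
        <vCard:BDAY>1980-01-01</vCard:BDAY>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def elements_by_namespace(xml_text):
    """Map each namespace URI to the set of element local names that use it."""
    groups = defaultdict(set)
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if elem.tag.startswith("{"):
            uri, local = elem.tag[1:].split("}", 1)
        else:
            uri, local = "", elem.tag
        groups[uri].add(local)
    return dict(groups)


def main_schema_skeleton(groups, target_ns):
    """Build an xsd:schema with one xsd:import per namespace other than target_ns."""
    ET.register_namespace("xsd", XSD_NS)
    schema = ET.Element(f"{{{XSD_NS}}}schema", {"targetNamespace": target_ns})
    for uri in sorted(groups):
        if uri != target_ns:
            ET.SubElement(schema, f"{{{XSD_NS}}}import", {"namespace": uri})
    return ET.tostring(schema, encoding="unicode")


groups = elements_by_namespace(EXAMPLE)
for uri in sorted(groups):
    print(uri, sorted(groups[uri]))
print(main_schema_skeleton(groups, RDF_NS))
```

The skeleton only carries the import declarations. Each imported namespace would still need its own schema document declaring the elements listed for it, which is the part that would have to be made up.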
Does the subset of XML Schema in Tapir support imports?
Tapir has the intention of combining separate 'concepts' (which implies different namespaces) into a single output model, but do the concepts all have to be mapped into a single namespace to make the thing workable?
I may be missing something obvious with XML Schema or Tapir (or both), in which case I would be grateful if someone could set me right. I will blame it all on an excess of rich food and red wine over the past few days.
Many thanks for your thoughts.
Happy New Year,
Roger
Roger,
I'm only just catching up with some of this myself, but I believe that you really need RDF Schema or OWL to define the constraints on semantically acceptable documents. Plain RDF and XML Schema will only allow you to constrain this kind of document to what is syntactically valid for RDF.
Donald
---------------------------------------------------------------
Donald Hobern (dhobern@gbif.org)
Programme Officer for Data Access and Database Interoperability
Global Biodiversity Information Facility Secretariat
Universitetsparken 15, DK-2100 Copenhagen, Denmark
Tel: +45-35321483 Mobile: +45-28751483 Fax: +45-35321480
---------------------------------------------------------------
-----Original Message-----
From: tdwg-tapir-bounces@lists.tdwg.org [mailto:tdwg-tapir-bounces@lists.tdwg.org] On Behalf Of Roger Hyam
Sent: 30 December 2005 11:52
To: tdwg-tapir@lists.tdwg.org
Subject: [tdwg-tapir] Output model definition and mixed name spaces.
Donald,
Yes, this is what I am thinking for the architecture. We need to trade validation for semantics. Validation moves away from the transfer format and towards the application layer, which I don't have a problem with but which is a change in mindset. These are architecture issues, though; more later.
What I was more worried about from the Tapir perspective is how we define output models with multiple namespaces. Whether or not we actually validate the response, we need to be able to define the model. Does the protocol allow us to ask for data back in arbitrary XML structures (including different namespaces and so potentially RDF/XML serializations), or are we restricting it to a single namespace in initial implementations?
Roger
Hi Roger,
You're right about the way of producing an output with mixed namespaces (defining all XSDs and importing them into the main one). The "xsd:import" tag is not part of what we called the "basic schema language" (the subset of XML Schema that needs to be understood by TAPIR provider implementations if they want to be able to interpret custom output models). But it's an optional feature - and a quite interesting one, I agree.
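Because xsd:import is optional, a provider that implements only the basic schema language may want to detect up front whether an output model relies on it. The Python sketch below (standard library only) shows one way such a check might look. The function name and the sample schema, including its http://example.org/model namespace, are hypothetical and not taken from the TAPIR specification.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical output model schema, made up for this sketch.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/model">
  <xsd:import namespace="http://purl.org/dc/elements/1.1/"/>
  <xsd:element name="record" type="xsd:string"/>
</xsd:schema>"""


def imported_namespaces(schema_text):
    """Return the namespace URIs that a schema pulls in through xsd:import."""
    root = ET.fromstring(schema_text)
    return [imp.get("namespace") for imp in root.iter(f"{{{XSD_NS}}}import")]


needed = imported_namespaces(SCHEMA)
if needed:
    print("Output model needs xsd:import support for:", needed)
else:
    print("Output model does not use xsd:import")
```

This only looks for imports; it does not check the other restrictions of the basic schema language.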
Since mapping uses XPath to identify nodes in the output model, unless I'm missing some point it should be possible, for instance, to map some concept to this node from your example:
/rdf:RDF/rdf:Description/dc:creator/rdf:Description/vCard:FN
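To check that a path like the one above resolves against a document with mixed namespaces, here is a minimal Python sketch using the standard library's ElementTree. It is only an illustration of namespace-aware path lookup, not how a TAPIR provider performs its mapping. The document is a trimmed copy of the example, and the prefix map NS is an assumption that mirrors its xmlns declarations. ElementTree supports only a subset of XPath and rejects absolute paths on an element, so the sketch verifies the first step against the root and resolves the rest relative to it.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace bindings, mirroring the xmlns declarations in the example.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "vCard": "http://www.w3.org/2001/vcard-rdf/3.0#",
}

# Trimmed copy of the example document.
EXAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description>
    <dc:creator>
      <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
        <rdfs:label>Corky Crystal</rdfs:label>
        <vCard:FN>Corky Crystal</vCard:FN>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>"""

XPATH = "/rdf:RDF/rdf:Description/dc:creator/rdf:Description/vCard:FN"


def resolve(xml_text, xpath, namespaces):
    """Return the first element matching an absolute, prefix-qualified path."""
    root = ET.fromstring(xml_text)
    # Split off the first step, which must name the document element.
    root_step, _, relative_path = xpath.lstrip("/").partition("/")
    prefix, local = root_step.split(":")
    if root.tag != f"{{{namespaces[prefix]}}}{local}":
        raise ValueError(f"document element is not {root_step}")
    # The remaining steps are resolved relative to the root element.
    return root.find(relative_path, namespaces)


node = resolve(EXAMPLE, XPATH, NS)
print(node.tag)
print(node.text.strip())
```

The lookup depends only on the prefix bindings, not on any schema, which is why the mapping can work whether or not the output model is defined with imports.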
Whether the output definition language used by TAPIR was a "good" choice or not, that's another issue. I'm no big fan of XML Schema and I do agree that it is quite complex (that's why we needed to specify a subset of it as the "basic schema language"). But DiGIR was already using it to define "record structures", and all output models from the TDWG community have an XML Schema ready to be used, unlike your example.
Anyway, if you have a suggestion for a better or easier way to specify output models, I'm certainly interested to know, regardless of whether it is too late to make such a change in the first version of TAPIR. Personally I've always liked the idea of making TAPIR independent of the output language (capabilities responses would say "I understand this and that output language, defined in these locations"), but that would probably be a drastic change at this time.
Regards,
--
Renato
On 30 Dec 2005 at 10:51, Roger Hyam wrote:
participants (3)
- Donald Hobern
- Renato De Giovanni
- Roger Hyam