Hi Roger,
You're right about the way of producing an output with mixed namespaces (defining all XSDs and importing them into the main one). The "xsd:import" tag is not part of what we called the "basic schema language" (the subset of XML Schema that needs to be understood by TAPIR provider implementations if they want to be able to interpret custom output models). But it's an optional feature - and a quite interesting one, I agree.
Since mapping uses XPath to identify nodes in the output model, unless I'm missing some point it should be possible, for instance, to map some concept to this node from your example:
/rdf:RDF/rdf:Description/dc:creator/rdf:Description/vCard:FN
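As a rough illustration only (this is not taken from any TAPIR implementation), the path above can be resolved against a trimmed copy of your example with Python's standard xml.etree.ElementTree. ElementTree supports only a subset of XPath and does not accept absolute paths on an element, so the helper `select` below, which is a made-up name, checks the root step by hand and evaluates the rest relative to the root. The prefix-to-URI table is copied from your document.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace bindings, copied from the example document.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "vCard": "http://www.w3.org/2001/vcard-rdf/3.0#",
}

# Trimmed copy of the example output: only the branch leading to vCard:FN.
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description>
    <dc:creator>
      <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
        <vCard:FN>Corky Crystal</vCard:FN>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>"""

PATH = "/rdf:RDF/rdf:Description/dc:creator/rdf:Description/vCard:FN"


def select(root, absolute_path, ns):
    """Return the elements matched by a simple absolute path (hypothetical helper)."""
    steps = absolute_path.strip("/").split("/")
    # The first step must name the root element itself.
    prefix, local = steps[0].split(":")
    if root.tag != "{%s}%s" % (ns[prefix], local):
        return []
    if len(steps) == 1:
        return [root]
    # Evaluate the remaining steps relative to the root element.
    return root.findall("/".join(steps[1:]), ns)


root = ET.fromstring(DOC)
matches = select(root, PATH, NS)
print([node.text for node in matches])
```

The point is just that the node is addressable by its qualified names regardless of how many namespaces the document mixes; a mapping would only need the path and the prefix bindings.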
Whether the output definition language used by TAPIR was a "good" choice or not, that's another issue. I'm no big fan of XML Schema and I do agree that it is quite complex (that's why we needed to specify a subset of it as the "basic schema language"). But DiGIR was already using it to define "record structures", and all output models from the TDWG community have an XML Schema ready to be used, unlike your example.
Anyway, if you have any other suggestion for a better or easier way to specify output models, I'm certainly interested to hear it, regardless of whether it is too late to make such a change in the first version of TAPIR. Personally I've always liked the idea of making TAPIR independent of the output language (capabilities responses would say "I understand this and that output languages defined in these locations"), but that would probably be a drastic change at this time.
Regards, -- Renato
On 30 Dec 2005 at 10:51, Roger Hyam wrote:
Hi Everyone,
In the few days between Christmas and New Year I am doing some thinking about architecture and the general problem of validating (controlling) data while using a more RDF-based approach. This is an open world / closed world problem really but relates to Tapir as follows. Suppose I want to specify an output model like this:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#">
  <rdf:Description>
    <dc:creator>
      <rdf:Description rdf:about="http://qqqfoo.com/staff/corky">
        <rdfs:label> Corky Crystal </rdfs:label>
        <vCard:FN> Corky Crystal </vCard:FN>
        <vCard:N rdf:parseType="Resource">
          <vCard:Family> Crystal </vCard:Family>
          <vCard:Given> Corky </vCard:Given>
          <vCard:Other> Jacky </vCard:Other>
          <vCard:Prefix> Dr </vCard:Prefix>
        </vCard:N>
        <vCard:BDAY> 1980-01-01 </vCard:BDAY>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
This is taken from: http://dublincore.org/documents/dcq-rdf-xml/#sec3 and is a good example of using a mix of 'other' vocabularies to describe something. DublinCore and vCard are pretty standard and should really not be re-invented.
I can't see how to define this structure in a simple XML Schema. It only allows a single target namespace so one needs to have at least 4 XSD documents (1 + 3 imported) to define this tightly. Here is an example of how to do this kind of thing:
http://www-128.ibm.com/developerworks/xml/library/x-tipschnm.html
We have the additional problem that there aren't actual public XML Schema definitions of these things so one would have to make up schemas defining the elements used. It all gets quite horrible.
Does the subset of XML Schema in Tapir support imports?
Tapir has the intention of combining separate 'concepts' (which implies different namespaces) into a single output model but do the concepts all have to be mapped into a single namespace to make the thing workable?
I may be missing something obvious with XML Schema or Tapir (or both), in which case I would be grateful if someone could set me right. I will blame it all on an excess of rich food and red wine over the past few days.
Many thanks for your thoughts.
Happy New Year,
Roger