[tdwg-tapir] Mapping to CNS file
Roger Hyam
roger at tdwg.org
Thu Mar 8 17:03:13 CET 2007
Hi Folks,
I have been doing some thinking about mapping to the ontology.
We have a way of defining TAPIR concepts as paths through the
ontology. I have written a script to generate a CNS file from a
particular view onto the ontology, i.e. starting at a root class and
following all the links in the RDF to extinction but not allowing
recursion on any one path. It needs a little work to include the
common and external properties but seems to run OK.
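
For anyone who wants to picture the traversal, here is a minimal sketch in Python. It is not the actual script: the toy graph, the property names p, q, r and the "/"-joined path format are all assumptions made up for illustration. It walks every link from a root class and refuses to visit a class that is already on the current path.

```python
# Hypothetical toy ontology: class -> list of (property, target class).
# The names A, B, C and p, q, r are invented for this sketch.
GRAPH = {
    "A": [("p", "B")],
    "B": [("q", "C")],
    "C": [("r", "A")],  # closes a loop back to the root
}


def concept_paths(graph, root):
    """List every path from root, never repeating a class within one path."""
    paths = []

    def walk(cls, path, seen):
        for prop, target in graph.get(cls, []):
            if target in seen:
                # Following this link would recurse on the current path.
                continue
            new_path = path + [prop, target]
            paths.append("/".join(new_path))
            walk(target, new_path, seen | {target})

    walk(root, [root], {root})
    return paths


for concept in concept_paths(GRAPH, "A"):
    print(concept)
```

The "seen" set is per path rather than global, so the same class can still turn up on two different paths; it just can't turn up twice on one.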
I attach an example CNS if it will make it through the mail server.
How can we use such a CNS file to configure the configurators of the
wrappers, and is this desirable?
I believe the current configurators assume that the full name of the
concepts is resolvable to the documentation for the concept. The
concept paths generated from the ontology are not resolvable in this
way. I could write a script that generates a documentation page when
it is passed one of these paths. Would this be a reasonable thing to
do?
There are lots of concepts in this file (138) and it will get a lot
bigger when I add in the general properties. We really need a way of
organizing the mapping process into a hierarchy browsing process.
Could we have a pre-mapping phase where someone browses the ontology
and creates a list of concepts that they want to map? They could then
use this much shorter list in the configurator. Would this be more
productive? I have started mocking something up along these lines but
will abandon it if there is another way forward.
I was thinking of writing a script to automate the production of
output model structures in much the same way as creating the CNS but
I ran into an interesting problem with preventing recursion.
When building paths for the CNS any one path can't include the same
class twice - which is easy to do when you are only thinking about
paths but when you are building a series of XML Schemas it becomes
more complex.
A --includes--> B --includes--> C --includes--> A
This can be detected, and C can contain a link instead of the include:
A --includes--> B --includes--> C --linksTo--> A
So we avoid recursion when building the concepts in the CNS.
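
A rough sketch of that rule in Python, assuming a made-up graph that simply maps each class to the classes it references (the function name "chains" is hypothetical too). Any reference to a class already on the current path is emitted as a link rather than an include.

```python
# Hypothetical graph: class -> classes it references. Names are invented.
GRAPH = {"A": ["B"], "B": ["C"], "C": ["A"]}


def chains(graph, cls, prefix="", on_path=()):
    """Return include/link chains starting at cls, one string per branch."""
    on_path = on_path + (cls,)
    prefix = prefix + cls
    results = []
    for target in graph.get(cls, []):
        if target in on_path:
            # Target is already on this path: link to it, do not include it.
            results.append(f"{prefix} --linksTo--> {target}")
        else:
            results.extend(
                chains(graph, target, f"{prefix} --includes--> ", on_path)
            )
    # A class with no references ends the chain as it stands.
    return results or [prefix]


for line in chains(GRAPH, "A"):
    print(line)
```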
But if we have to consider multiple paths and possible loops then we
end up in situations where we can't make choices automatically, for
example if B can include C and C can include B.
A --includes--> B --includes--> C --linksTo--> B
A --includes--> C --includes--> B --linksTo--> C
There are two definitions of B here, one with a link to C and one
which includes C. We can't do this in XML Schema because we would need
two definitions of the same thing in the same namespace. When
you reference the element in the schema for A it couldn't know which
one you meant. Please correct me if I am wrong - I'd like to be.
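
To make the clash concrete, here is a small Python sketch, again over an invented graph with a hypothetical function name. It applies the per-path rule above (include unless the class is already on the path, otherwise link) and records every distinct shape each class ends up with. Any class with more than one shape would need more than one definition.

```python
from collections import defaultdict

# Hypothetical graph: A references B and C, and B and C reference each other.
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}


def collect_definitions(graph, root):
    """Map each class to the set of include/link shapes it gets from root."""
    definitions = defaultdict(set)

    def visit(cls, on_path):
        on_path = on_path + (cls,)
        shape = []
        for target in graph.get(cls, []):
            if target in on_path:
                shape.append(("linksTo", target))
            else:
                shape.append(("includes", target))
                visit(target, on_path)
        definitions[cls].add(tuple(shape))

    visit(root, ())
    return definitions


defs = collect_definitions(GRAPH, "A")
# Classes that would need two different definitions in one namespace.
conflicts = sorted(c for c, shapes in defs.items() if len(shapes) > 1)
print(conflicts)
```

In this toy case both B and C come out with two shapes, so the check can flag where a human has to choose but can't make the choice itself.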
This example is likely to arise more often than you might think as
there are lots of circular links in the ontology. Just think of
synonymy and basionyms and hierarchies etc.
So a decision still has to be made - by a human I guess - about which
is the optimum combination of links and includes for a useful output
structure.
It makes me think a convention on namespace substitution is the
easiest way forward for building output models - but that might be
just because we are getting near the end of the day here....
What do you think?
Roger
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cns_example.txt
Url: http://lists.tdwg.org/pipermail/tdwg-tag/attachments/20070308/0c2b8c2b/attachment.txt