[Tdwg-tag] A very simple question stated again.
Roger Hyam
roger at tdwg.org
Mon Mar 27 18:07:13 CEST 2006
Hi Bob,
What tools are you using with SDD and the other UBIF based schemas?
Thanks,
Roger
Bob Morris wrote:
> Ignoring the often observed cumbersomeness of versioning in
> XML-Schema, your example seems a bit of a red-herring and to me serves
> more as supporting a claim---with which I agree---that it is \easier/
> in RDF than XML-Schema to clarify relations and easier in XML-Schema
> to fail to do so. For example, were it the \intent/ to ensure that
> Collector isA Human, it is certainly possible to do so in RDF. This
> then would require an extender to introduce a new superclass, say
> PseudoCollector, along with all the baggage(?) of enforcing the
> properties it shares with the subclass. I don't know enough about DL
> to assert this with any confidence, but allowing arbitrary
> superclassing also sounds to me like it might cause pain to reasoners.
>
> I take the gist of the previously posted reference to the final pages
> of http://www.omg.org/docs/ad/05-08-01.pdf to be that even in a
> modeling tool \more expressive/ than OWL, there is likely to remain
> the embedding of semantics in naming conventions, a cranky, but
> somewhat successful mechanism that is part of what Stan seems to be
> exploring.
>
> To me, the issue of pain to reasoning engines is not small. To my mind,
> machine reasoning is the biggest motivation for considering RDF-based
> representations, and it is well-understood in the research community
> that it is quite easy for this utility to vanish into the thin air of
> exponential time or other complexities. I am reminded of the Feb 15
> posting by Steve Perry which contained the scary (to me) sentence: "We
> mostly use text editors for developing ontologies because we've
> occasionally found Protege to be unstable with large complicated OWL
> models." (*)
>
> Bob
>
> (*)In fairness to Steve, who is way more qualified than I in these
> matters, on Feb 20 he posted a rather detailed analysis which makes me
> question my belief about the main utility of RDF where he writes:
>
> "For the same reason I think the primary use case is not
> inference over OWL-described RDF, but search over flexible RDF-Schema
> described data models. I personally think that RDF might make some use
> cases, especially the merge case, easier to handle. So I'd like to see
> further discussion of the use cases above for both XML Schema and RDF."
>
> Alas, my finding his arguments convincing does nothing to assuage my
> terrors. If biologist-friendly OWL tools are lacking for non-toy
> ontology development, it is likely that biologist-friendly tools for
> RDF/RDFS are not even on the production-quality horizon.
>
> Roger Hyam wrote:
>> Hi Stan,
>>
>> Can I take that as a Yes then? Or is it a No?
>>
>> Concrete example:
>>
>> Take this instance document.
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <ExampleDataSet xmlns="http://example.org/specimens#">
>> <Specimen>
>> <Collector>
>> <Name>John Doe</Name>
>> </Collector>
>> </Specimen>
>> </ExampleDataSet>
>>
>> Is John Doe a person or a research vessel?
>>
>> If the document validates against this schema:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>> elementFormDefault="qualified"
>> targetNamespace="http://example.org/specimens#"
>> xmlns:specimens="http://example.org/specimens#">
>> <xs:element name="ExampleDataSet">
>> <xs:complexType>
>> <xs:sequence>
>> <xs:element ref="specimens:Specimen"/>
>> </xs:sequence>
>> </xs:complexType>
>> </xs:element>
>> <xs:element name="Specimen">
>> <xs:complexType>
>> <xs:sequence>
>> <xs:element name="Collector" type="specimens:personType"/>
>> </xs:sequence>
>> </xs:complexType>
>> </xs:element>
>> <xs:complexType name="personType">
>> <xs:sequence>
>> <xs:element name="Name" type="xs:string"/>
>> </xs:sequence>
>> </xs:complexType>
>> </xs:schema>
>>
>> then it looks like John Doe is a person.
>>
>> Without looking at the schema John Doe could as easily have been a
>> team of people or an expedition or a sampling machine etc. We can
>> change the meaning of the instance document depending on the schema
>> it validates against - if we say meaning is in the schema - and there
>> are many schemas that this document could validate against.
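>>
>> As a rough sketch of that point (not one of the schemas under
>> discussion; the type name vesselType is invented purely for
>> illustration), the same instance document would also validate
>> against a schema like this:
>>
>> ```xml
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>     elementFormDefault="qualified"
>>     targetNamespace="http://example.org/specimens#"
>>     xmlns:specimens="http://example.org/specimens#">
>>   <xs:element name="ExampleDataSet">
>>     <xs:complexType>
>>       <xs:sequence>
>>         <xs:element ref="specimens:Specimen"/>
>>       </xs:sequence>
>>     </xs:complexType>
>>   </xs:element>
>>   <xs:element name="Specimen">
>>     <xs:complexType>
>>       <xs:sequence>
>>         <!-- Same element name, but typed as a (hypothetical) vessel -->
>>         <xs:element name="Collector" type="specimens:vesselType"/>
>>       </xs:sequence>
>>     </xs:complexType>
>>   </xs:element>
>>   <!-- Hypothetical type: structurally identical to personType -->
>>   <xs:complexType name="vesselType">
>>     <xs:sequence>
>>       <xs:element name="Name" type="xs:string"/>
>>     </xs:sequence>
>>   </xs:complexType>
>> </xs:schema>
>> ```
>>
>> The element structure is unchanged; only the type name differs, so
>> nothing in the instance document itself distinguishes the two
>> readings.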
>>
>> This is the reason for my question. If we interpret meaning from the
>> type hierarchy in XML Schema then we are stuck with single
>> inheritance (good in Java but maybe not so hot in data modeling). It
>> also means that we hard link structure to meaning. It is very
>> difficult for someone to come along and extend this schema saying "I
>> have an element and my element represents a collector that isn't a
>> person" because the notion of collector is hard coded to the
>> structure for representing a person. They can't abstractly say "This
>> machine is a type of collector".
>>
>> Another way to imply meaning is through namespaces. The element
>> http://example.org/specimens#Collector could resolve to a document
>> that tells us what we mean by collector. Then we wouldn't have to
>> worry so much about the 'structure' of the document but about the
>> individual meanings of elements. We could still use XML Schema to
>> encode useful structures but the meanings of elements would come from
>> the namespace. (And I didn't even mention that this is how RDF does
>> it - oh bother - now I have...).
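>>
>> A minimal sketch of what such a document might contain, written as
>> RDF Schema in RDF/XML. Only the Collector URI comes from the example
>> above; the SamplingMachine class and the label and comment text are
>> assumptions made up for illustration:
>>
>> ```xml
>> <?xml version="1.0" encoding="UTF-8"?>
>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>>     xml:base="http://example.org/specimens">
>>   <!-- Identifies http://example.org/specimens#Collector -->
>>   <rdfs:Class rdf:ID="Collector">
>>     <rdfs:label>Collector</rdfs:label>
>>     <rdfs:comment>Any agent that collects a specimen.</rdfs:comment>
>>   </rdfs:Class>
>>   <!-- Hypothetical extension: "This machine is a type of collector" -->
>>   <rdfs:Class rdf:ID="SamplingMachine">
>>     <rdfs:subClassOf rdf:resource="#Collector"/>
>>   </rdfs:Class>
>> </rdf:RDF>
>> ```
>>
>> An extender could add the second class without touching whatever
>> structure is used to represent a person.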
>>
>> My central question is how we map between existing and future
>> schemas. If we can't say where the meaning is encoded in our current
>> schemas then we can't even start the process.
>>
>> All the best,
>>
>> Roger
>>
>>
>> Blum, Stan wrote:
>>
>>> "Are the semantics encoded in the XML Schema or in the structure of
>>> the XML instance documents that validate against that schema? Is it
>>> possible to 'understand' an instance document without reference to
>>> the schema?"
>>>
>>> Possible answers are:
>>>
>>>
>>> 1. Yes: you can understand an XML instance document in the
>>> absence of a schema it validates against, i.e. just from the
>>> structure of the elements and the namespaces used.
>>>
>>> 2. No: you require the XML Schema to understand the document.
>>>
>>>
>>> Roger, if you postulate that the instance document is valid against
>>> the schema, and that the element and attribute names are meaningful
>>> to the reader (a human, or software written by a human who
>>> understands their meaning), then the only additional semantics the
>>> schema could provide would be in the annotations/documentation, if
>>> any exist in the schema.
>>>
>>> I'm not entirely sure what you include in [data] "structure", but if
>>> you only mean concepts such as tuples, trees, sets, lists, bags,
>>> etc., then I would disagree that semantics are encoded substantially
>>> in data structure (of the XML instance doc or any other record). It
>>> is true that without proper structure, semantics cannot be encoded,
>>> but I think semantics are encoded predominantly in
>>> class/element-attribute names and any referenced documentation
>>> (i.e., natural language). If you replace meaningful names with
>>> surrogate keys (e.g., integers) and thereby obscure any meaning
>>> conveyed by the names, then the instance document would lose a lot
>>> of its meaning.
>>>
>>> I'm not exactly sure how this relates to the earlier discussion
>>> about XML schema, RDF, and more powerful modeling methodologies like
>>> UML, but I hope it helps.
>>>
>>> Cheers,
>>>
>>> -Stan
>>>
>>
>>
>>
>
--
-------------------------------------
Roger Hyam
Technical Architect
Taxonomic Databases Working Group
-------------------------------------
http://www.tdwg.org
roger at tdwg.org
+44 1578 722782
-------------------------------------