[Tdwg-tag] A very simple question stated again.

Roger Hyam roger at tdwg.org
Mon Mar 27 18:07:13 CEST 2006

Hi Bob,

What tools are you using with SDD and the other UBIF-based schemas?



Bob Morris wrote:
> Ignoring the often observed cumbersomeness of versioning in 
> XML-Schema, your example seems a bit of a red herring and to me serves 
> more as supporting a claim---with which I agree---that it is \easier/ 
> in RDF than XML-Schema to clarify relations and easier in XML-Schema 
> to fail to do so. For example, were it the \intent/ to ensure that 
> Collector isA Human, it is certainly possible to do so in RDF. This 
> then would require an extender to introduce a new superclass, say 
> PseudoCollector, along with all the baggage(?) of enforcing the 
> properties it shares with the subclass. I don't know enough about DL 
> to assert this with any confidence, but allowing arbitrary 
> superclassing also sounds to me like it might cause pain to reasoners.
> I take the gist of the previously posted reference to the final pages 
> of  http://www.omg.org/docs/ad/05-08-01.pdf to be that even in a 
> modeling tool \more expressive/ than OWL, there is likely to remain 
> the embedding of semantics in naming conventions, a cranky, but 
> somewhat successful mechanism that is part of what Stan seems to be 
> exploring.
> To me, the issue of pain to reasoning engines is not small. To my mind, 
> machine reasoning is the biggest motivation for considering RDF-based 
> representations, and it is well-understood in the research community 
> that it is quite easy for this utility to vanish into the thin air of 
> exponential time or other complexities. I am reminded of the Feb 15 
> posting by Steve Perry which contained the scary (to me) sentence: "We 
> mostly use text editors for developing ontologies because we've
> occasionally found Protege to be unstable with large complicated OWL
> models." (*)
> Bob
> (*) In fairness to Steve, who is way more qualified than I in these 
> matters, on Feb 20 he posted a rather detailed analysis which makes me 
> question my belief about the main utility of RDF where he writes:
> "For the same reason I think the primary use case is not
> inference over OWL-described RDF, but search over flexible RDF-Schema
> described data models.  I personally think that RDF might make some use
> cases, especially the merge case, easier to handle.  So I'd like to see
> further discussion of the use cases above for both XML Schema and RDF."
> Alas, my finding his arguments convincing does nothing to assuage my 
> terrors. If biologist-friendly OWL tools are lacking for non-toy 
> ontology development, it is likely that biologist-friendly tools for 
> RDF/RDFS are not even on the production-quality horizon.
> Roger Hyam wrote:
>> Hi Stan,
>> Can I take that as a Yes then? Or is it a No?
>> Concrete example:
>> Take this instance document.
>> <?xml version="1.0" encoding="UTF-8"?>
>> <ExampleDataSet xmlns="http://example.org/specimens#">
>>     <Specimen>
>>         <Collector>
>>             <Name>John Doe</Name>
>>         </Collector>
>>     </Specimen>
>> </ExampleDataSet>
>> Is John Doe a person or a research vessel?
>> If the document validates against this schema:
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>   elementFormDefault="qualified"
>>   targetNamespace="http://example.org/specimens#"
>>   xmlns:specimens="http://example.org/specimens#">
>>   <xs:element name="ExampleDataSet">
>>     <xs:complexType>
>>       <xs:sequence>
>>         <xs:element ref="specimens:Specimen"/>
>>       </xs:sequence>
>>     </xs:complexType>
>>   </xs:element>
>>   <xs:element name="Specimen">
>>     <xs:complexType>
>>       <xs:sequence>
>>         <xs:element name="Collector" type="specimens:personType"/>
>>       </xs:sequence>
>>     </xs:complexType>
>>   </xs:element>
>>   <xs:complexType name="personType">
>>     <xs:sequence>
>>       <xs:element name="Name" type="xs:string"/>
>>     </xs:sequence>
>>   </xs:complexType>
>> </xs:schema>
>> then it looks like John Doe is a person.
>> Without looking at the schema John Doe could as easily have been a 
>> team of people or an expedition or a sampling machine etc. We can 
>> change the meaning of the instance document depending on the schema 
>> it validates against - if we say meaning is in the schema - and there 
>> are many schemas that this document could validate against.
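>> As a minimal sketch of that point, here is a second schema that the
>> same instance document would also validate against. It differs from
>> the schema above only in the name of the type behind Collector. The
>> name vesselType is hypothetical, invented purely for illustration.
>> ```xml
>> <?xml version="1.0" encoding="UTF-8"?>
>> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>   elementFormDefault="qualified"
>>   targetNamespace="http://example.org/specimens#"
>>   xmlns:specimens="http://example.org/specimens#">
>>   <xs:element name="ExampleDataSet">
>>     <xs:complexType>
>>       <xs:sequence>
>>         <xs:element ref="specimens:Specimen"/>
>>       </xs:sequence>
>>     </xs:complexType>
>>   </xs:element>
>>   <xs:element name="Specimen">
>>     <xs:complexType>
>>       <xs:sequence>
>>         <!-- Collector now points at the hypothetical vesselType -->
>>         <xs:element name="Collector" type="specimens:vesselType"/>
>>       </xs:sequence>
>>     </xs:complexType>
>>   </xs:element>
>>   <!-- Structurally identical to personType; only the name differs -->
>>   <xs:complexType name="vesselType">
>>     <xs:sequence>
>>       <xs:element name="Name" type="xs:string"/>
>>     </xs:sequence>
>>   </xs:complexType>
>> </xs:schema>
>> ```
>> Read this way, John Doe would look like a research vessel, although
>> not a single character of the instance document has changed.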
>> This is the reason for my question. If we interpret meaning from the 
>> type hierarchy in XML Schema then we are stuck with single 
>> inheritance (good in Java but maybe not so hot in data modeling). It 
>> also means that we hard link structure to meaning. It is very 
>> difficult for someone to come along and extend this schema saying "I 
>> have an element and my element represents a collector that isn't a 
>> person" because the notion of collector is hard coded to the 
>> structure for representing a person. They can't abstractly say "This 
>> machine is a type of collector".
>> Another way to imply meaning is through namespaces. The element 
>> http://example.org/specimens#Collector could resolve to a document 
>> that tells us what we mean by collector. Then we wouldn't have to 
>> worry so much about the 'structure' of the document but about the 
>> individual meanings of elements. We could still use XML Schema to 
>> encode useful structures but the meanings of elements would come from 
>> the namespace. (And I didn't even mention that this is how RDF does 
>> it - oh bother - now I have...).
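>> What such a document might say can be sketched in RDF Schema. This
>> is only an illustration under assumptions: the Person class and the
>> http://example.org/machines# namespace are hypothetical and appear
>> nowhere in the schema above.
>> ```xml
>> <?xml version="1.0" encoding="UTF-8"?>
>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>          xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
>>   <!-- The abstract notion of a collector, independent of structure -->
>>   <rdfs:Class rdf:about="http://example.org/specimens#Collector"/>
>>   <!-- A person is one kind of collector (hypothetical class) -->
>>   <rdfs:Class rdf:about="http://example.org/specimens#Person">
>>     <rdfs:subClassOf
>>       rdf:resource="http://example.org/specimens#Collector"/>
>>   </rdfs:Class>
>>   <!-- An extender adds a machine in a separate, hypothetical namespace -->
>>   <rdfs:Class rdf:about="http://example.org/machines#SamplingMachine">
>>     <rdfs:subClassOf
>>       rdf:resource="http://example.org/specimens#Collector"/>
>>   </rdfs:Class>
>> </rdf:RDF>
>> ```
>> Here the extender can say "This machine is a type of collector"
>> without touching the structure used to represent a person.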
>> My central question is how we map between existing and future 
>> schemas. If we can't say where the meaning is encoded in our current 
>> schemas then we can't even start the process.
>> All the best,
>> Roger
>> Blum, Stan wrote:
>>> "Are the semantics encoded in the XML Schema or in the structure of
>>> the XML instance documents that validate against that schema? Is it
>>> possible to 'understand' an instance document without reference to
>>> the schema?"
>>> Possible answers are:
>>>     1.    Yes: you can understand an XML instance document in the
>>>           absence of a schema it validates against, i.e. just from
>>>           the structure of the elements and the namespaces used.
>>>     2.    No: you require the XML Schema to understand the document.
>>> Roger, if you postulate that the instance document is valid against
>>> the schema, and that the element and attribute names are meaningful
>>> to the reader (a human, or software written by a human who
>>> understands their meaning), then the only additional semantics the
>>> schema could provide would be in the annotations/documentation, if
>>> any exist in the schema.
>>> I'm not entirely sure what you include in [data] "structure", but if
>>> you only mean concepts such as tuples, trees, sets, lists, bags,
>>> etc., then I would disagree that semantics are encoded substantially
>>> in data structure (of the XML instance doc or any other record). It
>>> is true that without proper structure, semantics cannot be encoded,
>>> but I think semantics are encoded predominantly in
>>> class/element-attribute names and any referenced documentation
>>> (i.e., natural language). If you replace meaningful names with
>>> surrogate keys (e.g., integers) and thereby obscure any meaning
>>> conveyed by the names, then the instance document would lose a lot
>>> of its meaning.
>>> I'm not exactly sure how this relates to the earlier discussion
>>> about XML schema, RDF, and more powerful modeling methodologies like
>>> UML, but I hope it helps.
>>> Cheers,
>>> -Stan


 Roger Hyam
 Technical Architect
 Taxonomic Databases Working Group
 roger at tdwg.org
 +44 1578 722782
