[Tdwg-tag] A very simple question stated again.

Mon Mar 27 16:07:41 CEST 2006

Ignoring the often observed cumbersomeness of versioning in XML-Schema, 
your example seems a bit of a red herring, and to me serves more to 
support a claim---with which I agree---that it is \easier/ in RDF 
than in XML-Schema to clarify relations, and easier in XML-Schema to 
fail to do so. For example, were it the \intent/ to ensure that Collector 
isA Human, it is certainly possible to do so in RDF. This then would require 
an extender to introduce a new superclass, say PseudoCollector, along 
with all the baggage(?) of enforcing the properties it shares with the 
subclass. I don't know enough about DL to assert this with any 
confidence, but allowing arbitrary superclassing also sounds to me like 
it might cause pain to reasoners.
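
To make the superclassing point concrete, here is a minimal sketch in 
plain Python (not RDF tooling, and not anything a real reasoner does 
internally). The subclass statements are held as pairs and the 
superclasses of a class are found by walking them transitively. 
SamplingMachine and the helper name superclasses are hypothetical, 
invented only for illustration; Collector, Human and PseudoCollector 
are the names used above.

```python
from collections import defaultdict

# (subclass, superclass) pairs. "SamplingMachine" is a hypothetical class
# added by an extender; it is not part of the original discussion.
SUBCLASS_OF = [
    ("Collector", "Human"),                  # the original intent: Collector isA Human
    ("Collector", "PseudoCollector"),        # the extender's new superclass
    ("SamplingMachine", "PseudoCollector"),  # a collector that is not a person
]


def superclasses(cls, edges):
    """Return every class reachable from cls by following subclass pairs."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    print(sorted(superclasses("Collector", SUBCLASS_OF)))
    print(sorted(superclasses("SamplingMachine", SUBCLASS_OF)))
```

In this toy model Collector is still a Human, while SamplingMachine is 
only a PseudoCollector. Whatever properties the two are meant to share 
would have to be restated on PseudoCollector, which is the baggage I 
mean.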

I take the gist of the previously posted reference to the final pages of 
  http://www.omg.org/docs/ad/05-08-01.pdf to be that even in a modeling 
tool \more expressive/ than OWL, there is likely to remain the embedding 
of semantics in naming conventions, a cranky, but somewhat successful 
mechanism that is part of what Stan seems to be exploring.

To me, the issue of pain to reasoning engines is not small. To my mind, 
machine reasoning is the biggest motivation for considering RDF-based 
representations, and it is well-understood in the research community 
that it is quite easy for this utility to vanish into the thin air of 
exponential time or other complexities. I am reminded of the Feb 15 
posting by Steve Perry, which contained the scary (to me) sentence: "We 
mostly use text editors for developing ontologies because we've
occasionally found Protege to be unstable with large complicated OWL
models." (*)

Bob

(*) In fairness to Steve, who is way more qualified than I in these 
matters, on Feb 20 he posted a rather detailed analysis that makes me 
question my belief about the main utility of RDF. There he writes:

"For the same reason I think the primary use case is not
inference over OWL-described RDF, but search over flexible RDF-Schema
described data models.  I personally think that RDF might make some use
cases, especially the merge case, easier to handle.  So I'd like to see
further discussion of the use cases above for both XML Schema and RDF."

Alas, my finding his arguments convincing does nothing to assuage my 
terrors. If biologist-friendly OWL tools are lacking for non-toy 
ontology development, it is likely that biologist-friendly tools for 
RDF/RDFS are not even on the production-quality horizon.

Roger Hyam wrote:
> Hi Stan,
> 
> Can I take that as a Yes then? Or is it a No?
> 
> Concrete example:
> 
> Take this instance document.
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <ExampleDataSet xmlns="http://example.org/specimens#">
>     <Specimen>
>         <Collector>
>             <Name>John Doe</Name>
>         </Collector>
>     </Specimen>
> </ExampleDataSet>
> 
> Is John Doe a person or a research vessel?
> 
> If the document validates against this schema:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
> elementFormDefault="qualified"
>   targetNamespace="http://example.org/specimens#" 
> xmlns:specimens="http://example.org/specimens#">
>   <xs:element name="ExampleDataSet">
>     <xs:complexType>
>       <xs:sequence>
>         <xs:element ref="specimens:Specimen"/>
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
>   <xs:element name="Specimen">
>     <xs:complexType>
>       <xs:sequence>
>         <xs:element name="Collector" type="specimens:personType"/>
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
>   <xs:complexType name="personType">
>     <xs:sequence>
>       <xs:element name="Name" type="xs:string"/>
>     </xs:sequence>
>   </xs:complexType>
> </xs:schema>
> 
> then it looks like John Doe is a person.
> 
> Without looking at the schema John Doe could as easily have been a team 
> of people or an expedition or a sampling machine etc. We can change the 
> meaning of the instance document depending on the schema it validates 
> against - if we say meaning is in the schema - and there are many 
> schemas that this document could validate against.
> 
> This is the reason for my question. If we interpret meaning from the 
> type hierarchy in XML Schema then we are stuck with single inheritance 
> (good in Java but maybe not so hot in data modeling). It also means that 
> we hard link structure to meaning. It is very difficult for someone to 
> come along and extend this schema saying "I have an element and my 
> element represents a collector that isn't a person" because the notion 
> of collector is hard coded to the structure for representing a person. 
> They can't abstractly say "This machine is a type of collector".
> 
> Another way to imply meaning is through namespaces. The element 
> http://example.org/specimens#Collector could resolve to a document that 
> tells us what we mean by collector. Then we wouldn't have to worry so 
> much about the 'structure' of the document but about the individual 
> meanings of elements. We could still use XML Schema to encode useful 
> structures but the meanings of elements would come from the namespace. 
> (And I didn't even mention that this is how RDF does it - oh bother - 
> now I have...).
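
A small sketch of the namespace point, using only the Python standard 
library and Roger's instance document from above. It assumes every 
element in the document is namespace-qualified, which is true of this 
example. The function name qualified_names is hypothetical. No schema 
is consulted: the parser alone yields the full names, and whether such 
a name resolves to a document that says what a collector is remains a 
matter of convention, not something this code checks.

```python
import xml.etree.ElementTree as ET

# Roger's instance document, unchanged.
INSTANCE = b"""<?xml version="1.0" encoding="UTF-8"?>
<ExampleDataSet xmlns="http://example.org/specimens#">
    <Specimen>
        <Collector>
            <Name>John Doe</Name>
        </Collector>
    </Specimen>
</ExampleDataSet>
"""


def qualified_names(xml_bytes):
    """List namespace + local name for each element, in document order."""
    root = ET.fromstring(xml_bytes)
    names = []
    for element in root.iter():
        # ElementTree spells a namespaced tag as "{namespace}local".
        namespace, _, local = element.tag[1:].partition("}")
        names.append(namespace + local)
    return names


if __name__ == "__main__":
    for name in qualified_names(INSTANCE):
        print(name)
```

One of the names this yields is http://example.org/specimens#Collector, 
the identifier Roger mentions, obtained without reference to any schema.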
> 
> My central question is how we map between existing and future schemas. 
> If we can't say where the meaning is encoded in our current schemas then 
> we can't even start the process.
> 
> All the best,
> 
> Roger
> 
> 
> Blum, Stan wrote:
> 
>>"Are the semantics encoded in the XML Schema or in the structure of the XML
>>instance documents that validate against that schema? Is it possible to
>>'understand' an instance document without reference to the schema?"
>>
>>Possible answers are:
>>
>>
>>	1.	Yes: you can understand an XML instance document in the
>>absence of a schema it validates against i.e. just from the structure of the
>>elements and the namespaces used.
>>		
>>	2.	No: you require the XML Schema to understand the document. 
>>
>> 
>>
>>Roger, If you postulate that the instance document is valid against the
>>schema, and that the element and attribute names are meaningful to the
>>reader (a human, or software written by a human who understands their
>>meaning), then the only additional semantics the schema could provide would
>>be in the annotations/documentation, if any exist in the schema.  
>>
>>I'm not entirely sure what you include in [data] "structure", but if you only
>>mean concepts such as tuples, trees, sets, lists, bags, etc., then I would
>>disagree that semantics are encoded substantially in data structure (of the
>>XML instance doc or any other record).  It is true that without proper
>>structure, semantics cannot be encoded, but I think semantics are encoded
>>predominantly in class/element-attribute names and any referenced
>>documentation (i.e., natural language).  If you replace meaningful names with
>>surrogate keys (e.g., integers) and thereby obscure any meaning conveyed by
>>the names, then the instance document would lose a lot of its meaning.
>>
>>I'm not exactly sure how this relates to the earlier discussion about XML
>>schema, RDF, and more powerful modeling methodologies like UML, but I hope it
>>helps.
>>
>>Cheers,
>>
>>-Stan 
>>
>>  
> 
> 
> 

-- 
Robert A. Morris
Professor of Computer Science
UMASS-Boston
http://www.cs.umb.edu/~ram
phone (+1)617 287 6466