[Tdwg-tag] A very simple question stated again.
Gregor Hagedorn
G.Hagedorn at BBA.DE
Mon Mar 27 10:46:18 CEST 2006
> *"Are the semantics encoded in the XML Schema or in the structure of the
> XML instance documents that validate against that schema? Is it possible
> to 'understand' an instance document without reference to the schema?"*
I would say:
A little semantics is embedded in the instance document; with intuitive design,
quite a bit more can be guessed.
Much more semantics is embedded in the schema: Type information, complete
knowledge of optional properties, cardinality constraints, extension points.
More semantics is commonly added in the schema annotations.
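To illustrate the difference between what an instance shows and what a schema
adds, here is a minimal sketch using only the Python standard library. The
namespace http://example.org/specimen, the element names, and both helper
functions are made up for illustration; nothing here is taken from a TDWG
schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical instance document: element names and nesting are all a reader gets.
INSTANCE = """<specimen xmlns="http://example.org/specimen">
  <collector>A. Smith</collector>
  <date>2006-03-27</date>
</specimen>"""

# Hypothetical schema for the same namespace.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/specimen"
    elementFormDefault="qualified">
  <xs:element name="specimen">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="collector" type="xs:string" maxOccurs="unbounded"/>
        <xs:element name="date" type="xs:date"/>
        <xs:element name="locality" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def names_in_instance(xml_text):
    """Local names of the root's child elements (namespace part stripped)."""
    root = ET.fromstring(xml_text)
    return [child.tag.split("}")[-1] for child in root]


def declared_children(xsd_text):
    """Map element name -> (type, minOccurs, maxOccurs) for typed declarations."""
    root = ET.fromstring(xsd_text)
    declared = {}
    for el in root.iter(f"{{{XS}}}element"):
        if "type" in el.attrib:
            # minOccurs and maxOccurs both default to 1 in XML Schema.
            declared[el.get("name")] = (
                el.get("type"),
                el.get("minOccurs", "1"),
                el.get("maxOccurs", "1"),
            )
    return declared


print(names_in_instance(INSTANCE))
print(declared_children(SCHEMA))
```

The instance alone never reveals that "locality" exists as an optional
property, that "collector" may repeat, or that "date" is typed as xs:date.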
Certain kinds of semantics, like the semantics of repeated use of the same type
(simple, like string, or complex, self-created types = classes) in different
properties of a class, are not expressed formally in the schema. Nor are they
formally expressed in UML static class modeling or ER modeling.
Other semantics, like relations between classes (XML Schema: "complex types"),
are expressible through identity constraints, but unfortunately most people
skip the work of doing so.
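As a rough sketch of what an identity constraint buys you: the fragment below
declares an xs:key and an xs:keyref, and a few lines of Python check by hand
what a validating parser would enforce for the keyref. The element and
attribute names are hypothetical, the fragment omits the complex type it would
normally accompany, and the check handles only a single attribute field. It is
not a substitute for a real schema validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical constraint declarations, as they would appear inside the
# xs:element declaration of the document root (no target namespace assumed).
CONSTRAINTS = """<xs:element name="dataset" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:key name="taxonKey">
    <xs:selector xpath="taxa/taxon"/>
    <xs:field xpath="@id"/>
  </xs:key>
  <xs:keyref name="specimenTaxonRef" refer="taxonKey">
    <xs:selector xpath="specimens/specimen"/>
    <xs:field xpath="@taxon"/>
  </xs:keyref>
</xs:element>"""

# Hypothetical instance with one dangling reference (t9).
INSTANCE = """<dataset>
  <taxa><taxon id="t1"/><taxon id="t2"/></taxa>
  <specimens>
    <specimen taxon="t1"/>
    <specimen taxon="t9"/>
  </specimens>
</dataset>"""


def field_values(root, constraint):
    """Attribute values selected by one xs:key or xs:keyref declaration."""
    selector = constraint.find(XS + "selector").get("xpath")
    attribute = constraint.find(XS + "field").get("xpath").lstrip("@")
    return [node.get(attribute) for node in root.findall(selector)]


def dangling_references(constraints_xml, instance_xml):
    """Return keyref values that do not match any value of the referred key."""
    decl = ET.fromstring(constraints_xml)
    root = ET.fromstring(instance_xml)
    keys = {k.get("name"): set(field_values(root, k))
            for k in decl.findall(XS + "key")}
    dangling = []
    for ref in decl.findall(XS + "keyref"):
        known = keys[ref.get("refer")]
        dangling += [v for v in field_values(root, ref) if v not in known]
    return dangling


print(dangling_references(CONSTRAINTS, INSTANCE))
```

With the constraint declared in the schema, the relation between specimen and
taxon is stated formally instead of being left to the reader's guess.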
> XML is 'self describing' so you would think this must be true.
Perhaps XML is "self-guessable" :-)
> If the answer in No then we need clear statements about how all
> instances must always bear links to a permanently retrievable schema -
> or they become meaningless.
In XML Schema this is done through the namespace and the schema location. Note
that the schema location is a hint and is not required to be followed. Consumers
may use their own version of a schema on their own responsibility.
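A minimal sketch of how a consumer might treat the location as a hint only: the
xsi:schemaLocation attribute is read as namespace/location pairs, and a local
catalog, if it has an entry, takes precedence. The namespace, the URL, the
local path, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance carrying a schema location hint.
INSTANCE = """<specimen xmlns="http://example.org/specimen"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.org/specimen http://example.org/specimen.xsd"/>"""

# Hypothetical consumer-side catalog: namespace -> locally trusted schema copy.
LOCAL_CATALOG = {"http://example.org/specimen": "/opt/schemas/specimen-local.xsd"}


def schema_hints(xml_text):
    """Parse xsi:schemaLocation into a {namespace: location} mapping."""
    root = ET.fromstring(xml_text)
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    # The attribute value is a whitespace-separated list of pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


def resolve_schema(namespace, hints, catalog):
    """Prefer the consumer's own copy; fall back to the document's hint."""
    return catalog.get(namespace) or hints.get(namespace)


hints = schema_hints(INSTANCE)
print(resolve_schema("http://example.org/specimen", hints, LOCAL_CATALOG))
```

The namespace is the stable identifier here; where the schema text is actually
fetched from remains the consumer's decision.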
> We need very tight version control of
> schemas and a method of linking between the versions so we can track how
> the meaning has changed.
Updating a deployed schema in the same namespace implies a statement that at
least all previous documents remain valid under the new schema. Depending on how
the schema users are managed, the reverse may or may not be required as well.
Guidelines exist on how forward and backward compatibility can be achieved
in XML Schema (provide extension points with xs:any and the self or non-self
namespace options, and provide them within a container element), enabling a
schema to become extensible. If these guidelines are followed, a new schema
version may be produced having the same namespace, but a different version
attribute. The version attribute documents "evolutionary changes".
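A sketch of such an extensible schema, assuming a hypothetical namespace and
element names: the xs:any wildcard sits inside a container element
("extensions") and uses namespace="##other", one of the standard wildcard
options alongside "##targetNamespace" and "##any". The Python part only reads
the declarations back; it does not validate anything.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema; only the version attribute differs between releases.
SCHEMA_TEMPLATE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/specimen" version="{version}">
  <xs:element name="specimen">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="collector" type="xs:string"/>
        <xs:element name="extensions" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:any namespace="##other" processContents="lax"
                      minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def describe(xsd_text):
    """Return (targetNamespace, version, wildcard namespace settings)."""
    root = ET.fromstring(xsd_text)
    wildcards = [w.get("namespace") for w in root.iter(XS + "any")]
    return root.get("targetNamespace"), root.get("version"), wildcards


old = describe(SCHEMA_TEMPLATE.format(version="1.0"))
new = describe(SCHEMA_TEMPLATE.format(version="1.1"))
print(old)
print(new)
```

Both releases share one namespace, so documents do not need to be rewritten;
the version attribute is the only place where the evolution is recorded.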
> We also need clear statements on what happens
> when you can validate a document with multiple schemas? Does this imply
> multiple meanings? Schemas must be archived with any data etc.
Schema is primarily about syntax, not semantics (although some semantics are
implied by syntax). Clearly, additional semantics is desirable, and that makes
this discussion worthwhile.
However, I can see great benefits in syntax. According to Roger's post, RDFS
cares very little about this ("we could use external OWL-based [...]"). Knowing
that the syntax is correct enables you to guarantee that the imported/consumed
data fulfill a number of validation constraints built into your local data
structure. Having these constraints enforced then allows you to write code
that relies on assumptions. If you import unconstrained data, you have to
write super-error-tolerant code that has exceptions for all possible
inconsistencies in the data.
Is this a problem with using RDF/S?
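To make the point about assumptions concrete, here is a small sketch with a
hypothetical "height_cm" field. The first function relies on the data having
been validated; the second has to re-check every assumption itself.

```python
def mean_height_validated(records):
    """Assumes validation already guaranteed a numeric 'height_cm' in every
    record and at least one record."""
    return sum(r["height_cm"] for r in records) / len(records)


def mean_height_unconstrained(records):
    """Same computation for unvalidated input: nothing can be taken for granted."""
    values = []
    for r in records:
        raw = r.get("height_cm") if isinstance(r, dict) else None
        try:
            values.append(float(raw))
        except (TypeError, ValueError):
            continue  # missing field, non-record entry, or non-numeric value
    return sum(values) / len(values) if values else None


print(mean_height_validated([{"height_cm": 10.0}, {"height_cm": 20.0}]))
print(mean_height_unconstrained([{"height_cm": "10"}, {}, None, {"height_cm": "tall"}]))
```

The second function also has to decide silently what to do with bad records
(here: skip them), which is itself a choice the schema would otherwise make
explicit.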
> If you respond to this message please state a preference for either 1 or
> 2. There is no middle road on this one!
I disagree; I believe formal semantic definitions are a question of degree,
not "yes" or "no". Even OWL claims only to do the "doable" things and does not
aim to solve all AI problems.
Gregor
----------------------------------------------------------
Gregor Hagedorn (G.Hagedorn at bba.de)
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Königin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203