TDWG-SDD XML proposals of Kevin Thiele

Robert A. (Bob) Morris ram at CS.UMB.EDU
Mon Nov 6 23:13:56 CET 2000


Steve Shattuck writes:
 > <soap_box_on> and that's partly why
 > this discussion has drawn out for sooooo long - the BioLink team built a
 > complete XML representation of the BioLink database, some 400 fields in 50
 > tables, in 3 weeks and are now using it to move data between BioLink
 > databases and between Platypus and BioLink - and, surprise surprise, we
 > don't have a DTD or Schema because you don't actually need one to make this
 > stuff work <soap_box_off/>).

[Hey Steve, my soapbox is bigger than your soapbox... :-) ]

I reply:

Absolutely right. However, you will almost never see a need for a
Schema (or DTD) when you have a single data model and your
applications know it, as in the case you describe. If, on the other
hand, you want to combine data from multiple data sources, you have an
advantage if there is a schema (small 's', meaning either an XML
Schema or a DTD). You can always infer a schema from an instance of
XML (see Michael Kay's DTD generator at
http://users.iclway.co.uk/mhkay/saxon/saxon5-5-1/dtdgen.html), but if
you have an instance that isn't general, you'll get a schema that
isn't general. In addition, if you are trying to write an application
that manipulates the node tree represented by an XML instance, you also
have an advantage in having a schema or otherwise knowing the data model.
I'll be startled if you wrote your exchange program with no knowledge
of the data model. The fact that such a program works independently of
a schema is part of the requirements of XML, but this says nothing
about what aid such a schema can offer either in writing applications,
or more interestingly, in generating applications.
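
To make the point about inference concrete, here is a minimal Python
sketch. It is not Kay's DTD generator, only a toy that records which
child elements appear under each element. The element names (taxon,
name, author) and both sample instances are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_children(xml_text):
    """Map each element name to the sorted child element names seen in the instance."""
    model = defaultdict(set)
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        model[elem.tag]  # register the element even if it has no children
        for child in elem:
            model[elem.tag].add(child.tag)
    return {tag: sorted(children) for tag, children in model.items()}


# Hypothetical instances: the second is more general than the first.
narrow = "<taxon><name>Eucalyptus</name></taxon>"
broad = "<taxon><name>Eucalyptus</name><author>Smith</author></taxon>"

print(infer_children(narrow))
print(infer_children(broad))
```

The model inferred from the narrow instance has no author element
under taxon, so anything built from it would not expect the broader
instance.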

For those in the camp that believe in publishing static XML---and
there are a lot of them---having a schema also is a big win for using
markup editors like XMLSpy. With a schema, these offer you only valid
tags at every point. An application that can assume it is working
against validated XML will often be more robust, because it will not
miss XML elements that are in the "wrong" place.

Validating against an XML Schema provides strong data typing and so
can help detect data errors.
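
As a rough illustration of what typed validation buys you, here is a
small Python sketch. It is not an XML Schema validator: a hand-written
table of declared types stands in for the Schema's datatypes, and the
element names and the sample record are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical type declarations, standing in for XML Schema datatypes.
DECLARED_TYPES = {"leaf_length_mm": float, "petal_count": int}


def type_errors(xml_text):
    """Return (tag, text) pairs whose content does not parse as the declared type."""
    errors = []
    for elem in ET.fromstring(xml_text).iter():
        convert = DECLARED_TYPES.get(elem.tag)
        if convert is None:
            continue  # no declared type for this element, nothing to check
        try:
            convert((elem.text or "").strip())
        except ValueError:
            errors.append((elem.tag, elem.text))
    return errors


# Hypothetical record with one bad value.
record = (
    "<specimen>"
    "<leaf_length_mm>12.5</leaf_length_mm>"
    "<petal_count>five</petal_count>"
    "</specimen>"
)

print(type_errors(record))
```

Without declared types, "five" is just another string and the error
travels on to whatever consumes the data.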

Finally, XML Schema provides robust inheritance
mechanisms. Consequently if people derive a Schema from some series of
base Schemas, they automatically gain the use of applications written
against the base, even if those applications do nothing interesting
with the extensions. In my opinion, this reusability is highly valuable.

--Bob (Soapy) Morris

p.s.
I'd be interested to see what Kay's DTD generator gives against your
XML.



