Re: Morphological Data Representation

23 Nov 2001


      Thus spoke Steve Shattuck:
[..]
...
I've translated this into the XML file that is attached.  (Even this fairly
simple example is moderately large and I would recommend using an XML
viewer such as Microsoft's XML Notepad when working with it - see
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnxml/html/xmlpaddownload.asp.)
I'm always surprised to see people recommend proprietary stuff when
there are plenty of free-software equivalents:
http://freshmeat.net/search/?site=Freshmeat&q=xml+editor&section=projects
sourceforge.net also provides a complete list.
vim and emacs also have XML support, and are available on many platforms.
...
The basic structure of the attached file is:
<Morphology>
   <Characters>
      <Character> - ANY NUMBER
         <Type>
         <ID>
         <Short_Descriptor>
         <Long_Descriptor>
         <State_Descriptor>
         <Order>
         <State> - ANY NUMBER
            <Value>
            <ID>
            <Order>
   <Items>
      <Item> - ANY NUMBER
         <Type>
         <ID>
         <Descriptor>
         <CodedCharacter> - ANY NUMBER
            <CharacterID>
            <Description>
            <CodedState> - ANY NUMBER
               <StateID>
               <Value>
This is a mixed approach, whereas Gregor's proposal is to separate
character descriptions from items. Moreover, it's closely related to DELTA (as
is xdelta), whereas everyone seems to agree on building something from scratch.
...
Note that I've treated everything as elements and haven't used attributes.
Simplicity is the only reason for this and some elements would be better as
attributes; these can be converted when the dust settles.
No, there is also a performance argument for attributes: as they make files
less verbose, parsing is quicker. See xerces-j/xalan-j FAQ.
However, from a modeling point of view, the rule should be:
use elements for what has real-world meaning
use attributes for modeling artifacts (id, idrefs, etc.)
The default XSLT transformation reinforces this by outputting element content
and ignoring attributes.
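
To make the rule concrete, here is a rough sketch in Python (standard library
only). It encodes one character twice: once with everything as elements, as in
the quoted outline, and once with ID and Order moved to attributes. The values
("Leaf shape", "ovate", "linear", "c1", ...) are made up for illustration, and
treating Order as a modeling artifact is my assumption. itertext() is used only
as a stand-in for the default XSLT behaviour; it is not an XSLT processor.

```python
import xml.etree.ElementTree as ET

# Element-only encoding, following the quoted outline (hypothetical values).
ELEMENTS_ONLY = (
    "<Character>"
    "<ID>c1</ID>"
    "<Order>1</Order>"
    "<Short_Descriptor>Leaf shape</Short_Descriptor>"
    "<State><ID>s1</ID><Order>1</Order><Value>ovate</Value></State>"
    "<State><ID>s2</ID><Order>2</Order><Value>linear</Value></State>"
    "</Character>"
)

# Same data, with modeling artifacts (ID, Order) moved to attributes.
# Treating Order as an artifact is an assumption made for this sketch.
WITH_ATTRIBUTES = (
    '<Character ID="c1" Order="1">'
    "<Short_Descriptor>Leaf shape</Short_Descriptor>"
    '<State ID="s1" Order="1"><Value>ovate</Value></State>'
    '<State ID="s2" Order="2"><Value>linear</Value></State>'
    "</Character>"
)


def text_content(xml_source):
    """Return the non-empty text of all elements in document order.

    Attribute values are never visited, which roughly mimics what a
    stylesheet with no templates would output.
    """
    root = ET.fromstring(xml_source)
    return [t.strip() for t in root.itertext() if t.strip()]


elements_text = text_content(ELEMENTS_ONLY)
attributes_text = text_content(WITH_ATTRIBUTES)

print(elements_text)    # ids and order numbers are mixed in with the descriptors
print(attributes_text)  # only the descriptive content remains
print(len(ELEMENTS_ONLY), len(WITH_ATTRIBUTES))  # attribute form is shorter
```

With the attribute form, only the descriptive text comes out, and the file is
also shorter, which is the verbosity point made above.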
--
Guillaume Rousse <rousse@ccr.jussieu.fr>
GPG key http://lis.snv.jussieu.fr/~rousse/gpgkey.html