Re: Morphological Data Representation
Steve Shattuck wrote: [..]
I've translated this into the XML file that is attached. (Even this fairly simple example is moderately large and I would recommend using an XML viewer such as Microsoft's XML Notepad when working with it - see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnxml/html/xmlpaddownload.asp.)
I'm always surprised to see people recommend proprietary tools when there are plenty of free-software equivalents:
http://freshmeat.net/search/?site=Freshmeat&q=xml+editor%C2%A7ion=projec...
sourceforge.net also provides a complete list. vim and emacs also have XML support, and are available on many platforms.
The basic structure of the attached file is:
<Morphology>
  <Characters>
    <Character> - ANY NUMBER
      <Type>
      <ID>
      <Short_Descriptor>
      <Long_Descriptor>
      <State_Descriptor>
      <Order>
      <State> - ANY NUMBER
        <Value>
        <ID>
        <Order>
  <Items>
    <Item> - ANY NUMBER
      <Type>
      <ID>
      <Descriptor>
      <CodedCharacter> - ANY NUMBER
        <CharacterID>
        <Description>
        <CodedState> - ANY NUMBER
          <StateID>
          <Value>
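To make the outline above concrete, here is a minimal sketch of a document that follows it, read with Python's standard xml.etree.ElementTree. Only the element names come from the outline; the sample values, the assumption that <Items> sits beside <Characters> under <Morphology>, and the helper name describe_items are invented for illustration and are not taken from the attached file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names follow the outline, values are invented.
SAMPLE = """
<Morphology>
  <Characters>
    <Character>
      <Type>Unordered</Type>
      <ID>1</ID>
      <Short_Descriptor>Eye colour</Short_Descriptor>
      <State><Value>red</Value><ID>1</ID><Order>1</Order></State>
      <State><Value>black</Value><ID>2</ID><Order>2</Order></State>
    </Character>
  </Characters>
  <Items>
    <Item>
      <Type>Species</Type>
      <ID>1</ID>
      <Descriptor>Example species</Descriptor>
      <CodedCharacter>
        <CharacterID>1</CharacterID>
        <CodedState><StateID>2</StateID></CodedState>
      </CodedCharacter>
    </Item>
  </Items>
</Morphology>
"""


def describe_items(xml_text):
    """Resolve each item's coded states to (character name, state value) pairs."""
    root = ET.fromstring(xml_text)

    # Index every state by (character ID, state ID).
    states = {}
    for character in root.iter("Character"):
        char_id = character.findtext("ID")
        name = character.findtext("Short_Descriptor")
        for state in character.findall("State"):
            states[(char_id, state.findtext("ID"))] = (name, state.findtext("Value"))

    # Follow the CharacterID/StateID references stored under each item.
    result = {}
    for item in root.iter("Item"):
        coded = []
        for coded_char in item.findall("CodedCharacter"):
            char_id = coded_char.findtext("CharacterID")
            for coded_state in coded_char.findall("CodedState"):
                coded.append(states[(char_id, coded_state.findtext("StateID"))])
        result[item.findtext("Descriptor")] = coded
    return result
```

Calling describe_items(SAMPLE) maps each item descriptor to the states it was coded with, which shows how the ID elements act as cross-references between the two halves of the file.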
This is a mixed approach, whereas Gregor's proposal is to separate character descriptions from items. Moreover, it is closely related to DELTA (as xdelta is), whereas everyone seems to agree on building something from scratch.
Note that I've treated everything as elements and haven't used attributes. Simplicity is the only reason for this and some elements would be better as attributes; these can be converted when the dust settles.
No, there is also a performance argument for attributes: because they make files less verbose, parsing is quicker. See the xerces-j/xalan-j FAQ.
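The verbosity half of that argument can be checked with a small sketch. It moves the ID and Order children of a State into attributes and compares the serialized sizes. The helper names and the lower-cased attribute names are assumptions for illustration, and the sketch measures size only; it says nothing about actual parse time in any given parser.

```python
import xml.etree.ElementTree as ET

# One State written in the elements-only style of the proposed file.
ELEMENT_STYLE = "<State><Value>red</Value><ID>1</ID><Order>1</Order></State>"


def ids_to_attributes(elem, names=("ID", "Order")):
    """Move the named leaf children of every node into attributes, in place."""
    # Snapshot the traversal first so removing children does not disturb it.
    for node in list(elem.iter()):
        for child in list(node):
            if child.tag in names and len(child) == 0:
                node.set(child.tag.lower(), child.text or "")
                node.remove(child)
    return elem


def serialized_sizes(xml_text):
    """Return (size with elements, size with attributes, attribute-style XML)."""
    before = ET.fromstring(xml_text)
    after = ids_to_attributes(ET.fromstring(xml_text))
    return (
        len(ET.tostring(before)),
        len(ET.tostring(after)),
        ET.tostring(after, encoding="unicode"),
    )
```

For this single State the attribute form is a few bytes shorter; across a file with many states and coded characters the saving grows in proportion.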
However, from a modeling point of view, the rule should be:
- use elements for whatever has real-world meaning
- use attributes for modeling artifacts (ids, idrefs, etc.)
The default XSLT transformation enforces this by outputting element contents and ignoring attributes.
-- 
Guillaume Rousse
rousse@ccr.jussieu.fr
GPG key http://lis.snv.jussieu.fr/~rousse/gpgkey.html