Markup of text descriptions Vs. structured data (GEN)
G.Hagedorn at BBA.DE
Tue Nov 30 15:17:38 CET 1999
Jim Croft wrote:
> Can you explain in more detail why they should be fundamentally different?
> Aren't the differences just a matter of degree, different points
> on a continuum of data as it were? And aren't the basic principles
> applicable across the whole?
>>The issues overlap, and it would be beneficial to use as much common
>>syntax as possible, but fundamentally I believe them to be quite
I completely agree with you that the issues overlap and should be treated
together. However, I still see them as distinct, so that they can be
treated by separate groups of commands that use a common system.
But maybe that is only to try to protect my brain from blowing up in
Could you perhaps give examples where you see the continuum?
My differences (see also the example in the other post) are:
o Free text that is marked up may be written in different languages,
while the underlying data should be language independent
o Similarly, on a more subtle level, the concepts used by one author
may only approximately map onto the concepts developed in the
o Data may be more detailed, and may map unambiguously onto a coarse
natural language representation
o Data carry many additional knowledge management attributes: multiple
authoring/editing/revision layers, annotation and assessment,
quality control, status regarding data source and passing
information up and down the taxonomic hierarchy, assumptions about
possible misinterpretations, etc.
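As a rough illustration of the first and third points, here is a minimal
Python sketch. Everything in it is invented for the example and is not part
of any proposal: the record fields, the two vocabularies, and the 50 mm
threshold are all hypothetical. It only shows one language-independent
record being rendered as coarse text in two languages.

```python
# Hypothetical language-independent record: codes and a measured value only.
record = {"structure": "leaf", "character": "length", "value_mm": 42.0}

# Hypothetical per-language vocabularies (assumed for illustration).
LABELS = {
    "en": {"leaf": "leaves", "short": "short", "long": "long"},
    "de": {"leaf": "Blätter", "short": "kurz", "long": "lang"},
}


def coarse_state(value_mm, threshold_mm=50.0):
    """Map a detailed measurement onto a coarse state.

    The threshold is arbitrary; the mapping is unambiguous in this
    direction, but the measurement cannot be recovered from the state.
    """
    return "short" if value_mm < threshold_mm else "long"


def render(record, lang):
    """Render the record as a coarse natural language phrase."""
    labels = LABELS[lang]
    state = coarse_state(record["value_mm"])
    return f"{labels[record['structure']]} {labels[state]}"


print(render(record, "en"))
print(render(record, "de"))
```

The data stay the same whichever language is printed; only the rendering
step knows about words.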
Much information may be used for markup as well as for data, so there
will be overlap. However, for example, for a given character
definition multiple independent hierarchical classifications (by
morphological structure, perhaps according to different schools, by
methods, by function) may be defined; the natural language definition
will however usually only use a single classification. This
classification is superfluous in the item description part of the
data, but it is desirable to add to the markup. Or isn't it? But then
we could not follow the idea of using just a coarse, structure-only
markup as a first step to limit the scope of XML queries, an idea put
forward by Don Kirkup and Bryan Heidorn at the TDWG meeting.
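
To make the contrast concrete, here is a small Python sketch under stated
assumptions: the class, the scheme names, the example terms, and the
"structure" element are all hypothetical and do not represent any agreed
format. It shows a character definition carrying several independent
classifications, while the coarse markup of a text fragment uses only one.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class CharacterDefinition:
    """Hypothetical character definition on the data side."""
    name: str
    # Scheme name -> path from root to leaf within that hierarchy.
    classifications: dict = field(default_factory=dict)


# Invented example: one character, three independent classifications.
spore_length = CharacterDefinition(
    name="spore length",
    classifications={
        "structure": ["organism", "spore"],
        "method": ["microscopy", "light microscopy"],
        "function": ["reproduction", "dispersal"],
    },
)


def coarse_markup(text, definition, scheme="structure"):
    """Wrap a text fragment in one element taken from a single scheme."""
    leaf = definition.classifications[scheme][-1]
    element = ET.Element("structure", {"name": leaf})
    element.text = text
    return ET.tostring(element, encoding="unicode")


print(coarse_markup("spores 10-12 um long", spore_length))
```

The other two classifications never reach the markup here, which is the
trade-off in question.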
Gregor Hagedorn G.Hagedorn at bba.de
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203
Often wrong but never in doubt!