Re: Markup of text descriptions Vs. structured data (GEN)
Jim Croft wrote:
Can you explain in more detail why they should be fundamentally different? Aren't the differences just a matter of degree, different points on a continuum of data as it were? And aren't the basic principles applicable across the whole?
The issues overlap, and it would be beneficial to use as much common syntax as possible, but fundamentally I believe them to be quite different.
I completely agree with you that the issues overlap and should be treated together. However, I still see them as distinct, so that they can be treated by separate groups of commands that use a common system. But maybe that is only to try to protect my brain from blowing up in confusion :-)
Could you perhaps give examples where you see the continuum? My differences (see also the example in the other post) are:
o The marked-up free text may be written in different languages, while the underlying data should be language independent.
o Similarly, on a more subtle level, the concepts used by one author may only approximately map onto the concepts developed in the character definition.
o Data may be more detailed, and may map unambiguously onto a coarser natural language representation.
o Data carry many additional knowledge management attributes: multiple authoring/editing/revision layers, annotation and assessment, quality control, status regarding data source and passing information up and down the taxonomic hierarchy, assumptions about possible misinterpretations, etc.
Much information may be used for markup as well as for data, so there will be overlap. However, for a given character definition, for example, multiple independent hierarchical classifications (by morphological structure, perhaps according to different schools, by method, by function) may be defined; the natural language description will, however, usually use only a single classification. This classification is superfluous in the item description part of the data, but it is desirable to add it to the markup. Or isn't it? But then we could not use the idea of a coarse structure-only markup as a first step to limit the scope of XML queries, an idea put forward by Don Kirkup and Bryan Heidorn at the TDWG meeting.
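To make the coarse structure-only idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the element names `description` and `structure`, the `name` attribute, the sample text, and the helper functions are hypothetical and are not taken from any proposal discussed at TDWG. It only shows how tagging the structure alone could narrow a free-text search to one part of a description.

```python
import xml.etree.ElementTree as ET

# Hypothetical coarse markup: only the morphological structure is tagged;
# the wording inside each element remains untouched free text.
DESCRIPTION = """
<description taxon="Example species">
  <structure name="leaf">Leaves ovate, 3-5 cm long, margin serrate.</structure>
  <structure name="flower">Flowers white, petals 5, about 1 cm long.</structure>
  <structure name="fruit">Fruit a red berry.</structure>
</description>
"""


def text_for_structure(xml_text, structure_name):
    """Return the free-text passages tagged with the given structure name."""
    root = ET.fromstring(xml_text)
    return [
        (element.text or "").strip()
        for element in root.iter("structure")
        if element.get("name") == structure_name
    ]


def search_within(xml_text, structure_name, word):
    """Search for a word, but only inside passages of one structure."""
    return [
        passage
        for passage in text_for_structure(xml_text, structure_name)
        if word.lower() in passage.lower()
    ]
```

With this markup, `search_within(DESCRIPTION, "leaf", "long")` looks only at the leaf passage and ignores the flower passage, although both contain the word "long". Note that the sketch carries a single classification (by structure) only; it says nothing about how several independent classifications would be attached.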
Gregor
----------------------------------------------------------
Gregor Hagedorn                 G.Hagedorn@bba.de
Institute for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Koenigin-Luise-Str. 19          Tel: +49-30-8304-2220
14195 Berlin, Germany           Fax: +49-30-8304-2203
Often wrong but never in doubt!