Markup of text descriptions Vs. structured data (GEN)
G.Hagedorn at BBA.DE
Mon Nov 29 18:09:53 CET 1999
Jean-Marc Varel, Don Kirkup and others have repeatedly touched on the
problem of existing textual descriptions in their contributions to
this list.
On the other hand, I myself belong to the group of DELTA/NEXUS-minded
people, who are concerned mostly with knowledge management of
newly created descriptive data.
These issues should be seen quite separately, and we need both:
1. A markup language for text documents. This includes:
- existing descriptions, captured by OCR or other means, where the
markup would be added manually (or automated/data mining?)
- computer generated natural language descriptions that are
published as electronic documents. A new CSIRO package could have
a ToNat command that automatically adds the necessary, hidden markup.
2. A data language for new observations, including repeated
measurements or repeated observations of categorical data, e.g.
shape of multiple leaves in a single specimen. Further, many
structures for knowledge management (data revision, annotation,
quality control and assessment) need to be implemented here.
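
To make point 2 more concrete, here is a minimal sketch in Python of
what "repeated observations" could look like as a data structure. It
is only an illustration under stated assumptions: the class names
(Observation, Specimen), the field names, the character names and the
specimen and observer identifiers are all hypothetical and are not
taken from DELTA, NEXUS or any CSIRO package.

```python
from collections import Counter
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Union


@dataclass
class Observation:
    """One recorded value; a repeated observation is simply another record."""
    character: str             # hypothetical character name, e.g. "leaf shape"
    value: Union[float, str]   # number for a measurement, string for a categorical state
    observer: str = ""         # who recorded the value (useful for quality control)
    note: str = ""             # free-text annotation or revision remark


@dataclass
class Specimen:
    """A single specimen holding any number of observations per character."""
    specimen_id: str
    observations: List[Observation] = field(default_factory=list)

    def record(self, character: str, value: Union[float, str],
               observer: str = "", note: str = "") -> None:
        # Append rather than overwrite, so repeated observations are kept.
        self.observations.append(Observation(character, value, observer, note))

    def values(self, character: str) -> list:
        # All recorded values for one character, in recording order.
        return [o.value for o in self.observations if o.character == character]


# Hypothetical example: three leaves observed on one specimen.
specimen = Specimen("hypothetical-001")
for shape in ["ovate", "ovate", "lanceolate"]:
    specimen.record("leaf shape", shape, observer="observer-1")
for length_cm in [4.0, 5.0, 6.0]:
    specimen.record("leaf length (cm)", length_cm, observer="observer-1")

# Categorical data are summarised as state frequencies,
# measurements as ordinary statistics.
shape_counts = Counter(specimen.values("leaf shape"))
mean_length = mean(specimen.values("leaf length (cm)"))
```

The design choice illustrated is that every observation is stored as
its own record, so that summaries are derived from the data and
annotations can be attached to individual observations.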
The issues overlap, and it would be beneficial to use as much common
syntax as possible, but fundamentally I believe them to be quite
different.
See also the separate post "Item description data in XML (XML)" for
a discussion of how to achieve this in XML, with XML examples.
Inst. for Plant Virology, Microbiology, and Biosafety
Federal Research Center for Agriculture and Forestry (BBA)
Gregor Hagedorn Net: G.Hagedorn at bba.de
Koenigin-Luise-Str. 19 Tel: +49-30-8304-2220
14195 Berlin, Germany Fax: +49-30-8304-2203
Often wrong but never in doubt!