(RQT) Structure, Matrix, and Text

Jim Croft jrc at ANBG.GOV.AU
Sat Dec 18 09:34:32 CET 1999

>Free comments anywhere, where it is not clear to which data item a
>comment belongs, will not help. Also, in free text the same concept
>will often be expressed with different words. This is a common
>problem known by anybody who tried to capture free conventional
>descriptions in a database.

... and is probably intractable.  One person can often see data in another's
comments, while another may see others' data as hardly worthy of comment.
In many descriptive works the pearls of information are subtle and implicit
rather than explicit and in your face (but we will not open the biological
description as an art versus a science debate).

Discussions about what is and is not data will go on forever and the
criteria will change from genus to genus, family to family, order to order,
and biologist to biologist.

What is needed is a common standard format to store all descriptive data
and information in an understandable and unambiguous way for exchange
between applications so that nothing gets lost or made unavailable in
transactions between applications and up and down the taxonomic hierarchy.
To ignore data and information is probably excusable, but to actually lose
it is sinful and unpardonable.

It is a shame that verbosity is the price we have to pay for this
flexibility, but the technology can handle it.  I did not think we were
considering the 'direct processing' of these tagged XML (or whatever)
files; they were to be just the lingua franca of applications using
descriptive data.  A contemporary example would be the DELTA format: after
Intkey, the DELTA Editor, MEKA or whatever have grabbed the data they
neither know nor care what a DELTA file looks like.  What I think we are
after is a format like this that all applications know about and can use
without any external hacking and preprocessing.

There is a major difference between an effective archival and exchange
format and an efficient arrangement of data for processing and analysis; I
doubt if it will be possible to have both in the same package.  My
understanding was that this list was going to focus on the former, while
application developers and closet hackers were going to do whatever they
needed or wanted with the latter.

A recurring nightmare is the thought of pouring data backwards and forwards
between applications, losing a little at each export/import like the fraying
ends of an aging chromosome at each cell division, until eventually there
is nothing left...
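
To make the worry concrete, here is a rough sketch (Python, standard library
only) of the difference between an application that drops what it does not
understand on export and one that carries it through untouched.  The element
names, the taxon, and the idea of a "known tags" list are all invented for
illustration; none of this is a proposed format or the behaviour of any
existing program.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged description; element and attribute names are invented
# for illustration and do not come from any agreed standard.
SOURCE = """<description taxon="Example species">
  <character name="leaf shape">ovate</character>
  <comment target="leaf shape">rarely lanceolate in shade forms</comment>
</description>"""

# What an imaginary application happens to understand.
KNOWN_TAGS = {"character"}


def lossy_round_trip(xml_text):
    """Import, keep only the elements the application understands, export."""
    root = ET.fromstring(xml_text)
    for child in list(root):
        if child.tag not in KNOWN_TAGS:
            root.remove(child)  # the comment is silently discarded here
    return ET.tostring(root, encoding="unicode")


def preserving_round_trip(xml_text):
    """Import and export, carrying unrecognised elements through untouched."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


def element_count(xml_text):
    """Count every element in the document, including the root."""
    return sum(1 for _ in ET.fromstring(xml_text).iter())


if __name__ == "__main__":
    print("original:  ", element_count(SOURCE))
    print("lossy:     ", element_count(lossy_round_trip(SOURCE)))
    print("preserving:", element_count(preserving_round_trip(SOURCE)))
```

A check of this kind, counting or comparing elements before and after an
export/import cycle, is one simple way an application could test that it
ignores what it does not need without actually losing it.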


Jim Croft ~ jrc at anbg.gov.au ~ http://www.anbg.gov.au/people/croft.jim.html
ph 02-6246-5500 ~ fx 02-6246-5248 ~ GPO Box 1600 Canberra ACT 2601
