Philosophy; Was: Re: Draft Spec mark 2
Robert A. (Bob) Morris
ram at CS.UMB.EDU
Fri Sep 1 09:01:28 CEST 2000
<RAMBLING_PHILOSOPHY flame="occasionally" biology="rarely">
Erik Westlin writes:
> In my mind it's totally wrong to create a language for descriptions
> using XML.
> Instead create a dedicated language for descriptions - then
> construct a XML generator for this language if you like.
One would do this only if one decided that XML and its associated
languages did not have enough expressive power to solve the problem
at hand. If they do, not using them is to abandon LOTS of existing
tools for parsing, transformation, presentation, data exchange and
> I'm in computer science, my interrest in botany is just a small part
> time hobby
> But i hate to see a lot of energy go to waste and i would like to see
> a computer flora whith a deep knowledge i could interact with.
This is a good goal for system architects, but I doubt that treatment
representation will be the enabling technology for it. Rather, it need
only not preclude it. What taxonomists mean by a treatment is too
narrow to be the sole support for what non-taxonomists, especially
non-biologists, need from computer systems about organisms.
Treatment data is only part of the knowledge that an
author of, say, a field guide, must represent in the guide, or that is
needed by projects as ambitious as the Global Biodiversity Information Facility
(www.gbif.org), or many of the numerous efforts with intersecting
Here's an example that can not(?) be represented in what Kevin is
attempting, nor does it belong there: Some years ago on a rural road
in upstate New York, I passed a litter of fox cubs who---at the likely
cost of their own lives---would not leave
their mother as she lay dying in the middle of the road, obviously hit
by a car. There is a lot of important behavioral and ecological
information encompassed in that scene: the age of the cubs; the
rearing habits of the species; the impact of roads on
populations; etc. etc. None of this matters to a taxonomist (well,
maybe a little. Probably one can deduce from the behavior that foxes
Kevin's effort, and TDWG's charter, is foremost about data in the
service of knowledge, rather than knowledge in the service of
users. As such, his effort supports the system builders, especially
those with a desire to exchange or federate data or (one hopes) to
build extensible systems.
> I think you should consult some people in the knowledge representation
<FLAME level="extreme" offense_intended="none">
This is the last thing one would do if one believes---as I do---that
computer scientists should be building tools that /free/ biologists
from computer science specialists rather than indenture biologists to
computer scientists. Were it truly necessary, it would be analogous to
advice to consult with automotive engineers, physicists and organic
chemists before you buy a car.
> Plant descriptive data is much to complex to be captured in XML which is
I don't believe this. XML in particular and semi-structured data in
general can support schema inference (see especially the literature
referenced in "Data on the Web: From Relations to Semistructured Data
and XML" (Morgan Kaufmann Series in Data Management Systems) -- Serge
Abiteboul, et al.; 2000). This alone probably disproves your claim.
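To make "schema inference" concrete, here is a toy sketch in Python. It
only summarizes which child elements and attributes occur under each
element name in a sample document; it is not one of the algorithms from
the literature cited above. The element and attribute names (taxon,
character, habitat, ...) are invented for illustration and do not come
from Kevin's draft spec.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: all names below are invented for illustration.
SAMPLE = """
<treatments>
  <taxon name="Vulpes vulpes">
    <character name="tail" state="bushy"/>
    <habitat>woodland edge</habitat>
  </taxon>
  <taxon name="Urocyon cinereoargenteus">
    <character name="claws" state="semi-retractile"/>
    <character name="tail" state="black-tipped"/>
  </taxon>
</treatments>
"""

def infer_schema(root):
    """Record, per element name, the child element names and attribute
    names actually observed anywhere in the document."""
    children = defaultdict(set)
    attributes = defaultdict(set)
    for elem in root.iter():  # visits the root and every descendant
        attributes[elem.tag].update(elem.attrib)
        children[elem.tag].update(child.tag for child in elem)
    return {
        tag: {
            "children": sorted(children[tag]),
            "attributes": sorted(attributes[tag]),
        }
        for tag in children
    }

schema = infer_schema(ET.fromstring(SAMPLE))
for tag, info in schema.items():
    print(tag, info)
```

Nothing in the sample declares a schema up front; the structure is
recovered from the data itself, which is the point about
semi-structured data.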
> more geared towards presentation than representation.
Ummm.... unless these terms mean something different in the Knowledge
Representation field than in the Document Management field, this is
the opposite of what most people believe:
"Markup encodes a description of the document's storage layout and
logical structure. XML provides a mechanism to impose constraints on
the storage layout and logical structure."
Sec. 1 (Introduction) of the XML 1.0 Recommendation
So strong is the desire to separate presentation that only a small part
of XSL, the Extensible Stylesheet Language, addresses it (Section 6,
"Formatting Objects", http://www.w3.org/TR/xsl/slice6.html#fo-section).
In general, neither Kevin's goals nor XML impose any presentation or
other semantics at all. That's good, because it leaves knowledge
representation to the application builder---who one hopes has lots of
help from biologists, all of whom are smarter than software.
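A small Python sketch of that separation, under the assumption of a
made-up record format (the taxon and character names are hypothetical,
not from the draft spec). The markup carries only structure; each
application function decides for itself what to present and how.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are invented for illustration.
RECORD = '<taxon name="Vulpes vulpes"><character name="tail" state="bushy"/></taxon>'

def as_sentence(taxon):
    """One application's view: a line of prose, e.g. for a field guide."""
    parts = [f'{c.get("name")} {c.get("state")}' for c in taxon.findall("character")]
    return f'{taxon.get("name")}: ' + "; ".join(parts) + "."

def as_rows(taxon):
    """Another application's view: flat tuples, e.g. for loading into a database."""
    return [
        (taxon.get("name"), c.get("name"), c.get("state"))
        for c in taxon.findall("character")
    ]

taxon = ET.fromstring(RECORD)
print(as_sentence(taxon))
print(as_rows(taxon))
```

The same record feeds both functions unchanged; neither view is implied
by the XML itself.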
p.s. After about a year of hanging out with a variety of biologists,
I've come to think of the taxonomists as the grammarians
and etymologists (not /entomologists/) of Life:
"etymology n., pl. The origin and historical development of a
linguistic form as shown by determining its basic elements, earliest
known use, and changes in form and meaning, tracing its transmission
from one language to another, identifying its cognates in other
languages, and reconstructing its ancestral form where possible."
Taxonomists, God bless 'em. You can't live with 'em and you can't
live without 'em.
The American Heritage Dictionary of the English Language, Third
Edition, 1996, as cited after ALT-Click with software installed
from guruNet. This software is incredibly cool. It gives you
information about any word in /any/ application on your Windows or
Palm screen. See http://www.gurunet.com.