Re: TDWG-SDD XML proposals of Kevin Thiele

6 Nov 2000

      Dear TDWG folk - I sent the following note to Bob Morris but have since
thought that maybe I should have sent it to the whole list. So here it is.

-----------------------

Dear Bob

Something in the mail below caught my eye & I thought I should mention it
lest something at the foundation level be incorporated which is incorrect.

What I'm referring to appears just below a row of asterisks in the quoted
message further down, and is copied here:
...
...
<taxonomy>
     <rank="species" value="alfari"/>
     <rank="genus" value="Azteca"/>
     <author value="Emery"/>
What is referred to as rank="species" value="alfari" is actually the specific
epithet. The "species" value in this instance would actually be "Azteca
alfari". You're probably familiar with this convention: the scientific name
of an organism cites the genus name & the specific epithet, which together
comprise the species name, as per the Linnean system of binomial
nomenclature. Should I have the 'wrong end of the stick' of the discourse at
this stage, then please advise me.
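
To make the distinction concrete, here is a minimal Python sketch that keeps the genus and the specific epithet as separate fields and derives the species name from the two. The class and field names are illustrative assumptions only; they are not taken from any TDWG-SDD draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binomial:
    """Hypothetical record; the field names are illustrative only."""
    genus: str
    specific_epithet: str
    author: str = ""

    @property
    def species_name(self) -> str:
        # The species name is the genus name plus the specific epithet.
        return f"{self.genus} {self.specific_epithet}"


ant = Binomial(genus="Azteca", specific_epithet="alfari", author="Emery")
print(ant.specific_epithet)  # the epithet alone
print(ant.species_name)      # genus + epithet
```

Stored this way, "alfari" is never mistaken for the species name, and the binomial is always reconstructed from its two parts.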

Sincerely,
- Leszek Vincent
xoxoxoxoxoxoxoxoxoxoxoxoxoxoxox
P. Leszek D. Vincent Ph.D., FLS
Plant Science Unit, Dept. of Agronomy, 209 Curtis Hall,
University of Missouri-Columbia, Columbia, MO 65211-7020, USA. Ph: (573)
884-3716 (Agronomy), Fax:(573) 884-7850;
Ph/Fax (Home): (573) 441-1228;
Email: Leszek@missouri.edu
Plant Systematist on the Missouri Maize Project - NSF award 9872655 -
(http://www.cafnr.missouri.edu/mmp/ and  http://www.agron.missouri.edu/)
xoxoxoxoxoxoxoxoxoxoxoxoxoxoxox

-----------------------------------------------------
...
-----Original Message-----
From: Robert A. (Bob) Morris [mailto:ram@CS.UMB.EDU]
Sent: Monday, October 30, 2000 12:20 PM
To: TDWG-SDD@USOBI.ORG
Subject: Re: TDWG-SDD XML proposals of Kevin Thiele
Jim Croft writes:
...
Date:         Mon, 30 Oct 2000 23:57:44 +1100
From: Jim Croft <jrc@anbg.gov.au>
To: TDWG-SDD@usobi.org
Subject:      Re: TDWG-SDD XML proposals of Kevin Thiele
...
The material at http://www.cs.umb.edu/efg/ThieleDraft/thiele_0_3.html
illustrates our efforts to draft an XML Schema from Thiele's draft proposal
version 0.3 ("TDD0.3") and illustrate its utility with simple applications
manipulating a taxonomic treatment.
No-one seems to be commenting on this, at least on the list... is it all too
scary?  With Halloween just around the corner, do these guys deserve to be
tricked or treated?
Unless people are commenting directly to Kevin/Bob/Jun, I would imagine they
would be feeling pretty disappointed so far...
I have been through this a number of times, still trying to come to grips
with the various options that have been used from the draft specs of a few
weeks ago... and why...
This is a great attempt at implementation and surely demonstrates that we
are on the right track...  doesn't it?
Yes, the format does seem a bit verbose, but from what I gather Bob and Jun
are saying, let the software, applications and databases worry about that...
Personally I prefer something like:
**************
...
...
<taxonomy>
     <rank="species" value="alfari"/>
     <rank="genus" value="Azteca"/>
     <author value="Emery"/>
     <year value="1893"/>
</taxonomy>
to something like:
<DESCRIPTION name="Taxonomic Information">
    <FEATURE name="Species Name">
       <FEATURE_VALUE>alfari</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="Genus Name">
       <FEATURE_VALUE>Azteca</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="Namer">
       <FEATURE_VALUE>Emery</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="Name Year">
       <FEATURE_VALUE>1893</FEATURE_VALUE>
    </FEATURE>
</DESCRIPTION>
but if we never have to read it in this format, I do not suppose it really
matters...
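
As a rough illustration of letting software deal with the verbosity, the following Python sketch reads the DESCRIPTION/FEATURE form quoted above into a plain name-to-value mapping using the standard library. The function name is a hypothetical helper for this example, and it assumes each FEATURE carries exactly one FEATURE_VALUE.

```python
import xml.etree.ElementTree as ET

VERBOSE = """
<DESCRIPTION name="Taxonomic Information">
    <FEATURE name="Species Name">
       <FEATURE_VALUE>alfari</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="Genus Name">
       <FEATURE_VALUE>Azteca</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="Namer">
       <FEATURE_VALUE>Emery</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="Name Year">
       <FEATURE_VALUE>1893</FEATURE_VALUE>
    </FEATURE>
</DESCRIPTION>
"""


def features_to_dict(xml_text):
    """Map each top-level FEATURE name to its single FEATURE_VALUE text."""
    root = ET.fromstring(xml_text.strip())
    return {
        feature.get("name"): feature.findtext("FEATURE_VALUE")
        for feature in root.findall("FEATURE")
    }


taxonomy = features_to_dict(VERBOSE)
for name, value in taxonomy.items():
    print(f"{name}: {value}")
```

Once the data is in a structure like this, a program can display it in whatever compact form a human reader prefers.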
Anyway, it looks as though we are settling on an architecture that allows
both discursive descriptions, accommodated by 'narrative' elements, and, for
the compulsive obsessives, a nested set of 'features' that can have
'feature_values', 'qualifiers' and associated 'narrative'...  Right?
And that the competing architecture of a list of 'feature_values' that can
be present, absent, unknown, doubtful, present by misinterpretation, absent
by misinterpretation, imaginary, etc. has been given the flick?
Not really. We believe that it is pretty straightforward to translate
between this kind of format and the one more spiritually akin to XML
and will illustrate that Real Soon Now. The biggest thing we need to
think about is exactly how the XML id reference mechanisms should be
used to implement things as in Kevin's examples.
My own view is that there should actually be a single standard
representation with a bunch of ways to transform between it and other
desired representations. The advantage of this is that you need fewer
translators to go between representations A and B if everything has a
bidirectional representation with a common one, X. The disadvantage is
that you change X at the risk of breaking X<->A, thereby breaking
A<->B when neither A nor B changed. That said, I think that this issue
should not have lots of philosophical discussion until we, or somebody
else, demonstrates the conversion between list-like formats and
tree-like formats.
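
One possible shape for such a demonstration, offered only as a sketch: the Python below flattens nested FEATURE elements into a list of (path, value) rows and rebuilds the tree from that list. The row layout and both function names are assumptions made for this example, not part of any draft. It also assumes that all rows share one root feature and that sibling features have distinct names.

```python
import xml.etree.ElementTree as ET

TREE = """
<FEATURE name="Page Author">
    <FEATURE name="name">
       <FEATURE_VALUE>John T. Longino</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="email">
       <FEATURE_VALUE>longinoj@elwha.evergreen.edu</FEATURE_VALUE>
    </FEATURE>
</FEATURE>
"""


def tree_to_rows(feature, path=()):
    """Flatten nested FEATURE elements into (path, value) rows."""
    here = path + (feature.get("name"),)
    rows = [(here, v.text) for v in feature.findall("FEATURE_VALUE")]
    for child in feature.findall("FEATURE"):
        rows.extend(tree_to_rows(child, here))
    return rows


def rows_to_tree(rows):
    """Rebuild a nested FEATURE tree from a non-empty list of rows."""
    root_name = rows[0][0][0]
    root = ET.Element("FEATURE", name=root_name)
    index = {(root_name,): root}  # path tuple -> element already created
    for path, value in rows:
        # Create any missing intermediate FEATURE elements along the path.
        for depth in range(2, len(path) + 1):
            key = path[:depth]
            if key not in index:
                index[key] = ET.SubElement(
                    index[key[:-1]], "FEATURE", name=key[-1]
                )
        ET.SubElement(index[path], "FEATURE_VALUE").text = value
    return root


rows = tree_to_rows(ET.fromstring(TREE.strip()))
rebuilt = rows_to_tree(rows)
for path, value in rows:
    print(" / ".join(path), "=", value)
```

Flattening the rebuilt tree gives back the same rows, which is the round-trip property a translator between the two styles would need to preserve.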
...
Or having coded the data in this manner, is this distinction in data
representation employed by different interactive identification products,
not really relevant or meaningful?
Sorry about the rambling - just trying to come to grips with what has been
done here...  I am pretty sure it is good stuff once I work out what it all
means...
One vague background concern I have is with things like:
<FEATURE name="Page Author">
    <FEATURE name="name">
       <FEATURE_VALUE>John T. Longino</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="efg address">
       <FEATURE_VALUE>The Evergreen State College, Olympia WA 98505 USA</FEATURE_VALUE>
    </FEATURE>
    <FEATURE name="email">
       <FEATURE_VALUE>longinoj@elwha.evergreen.edu</FEATURE_VALUE>
    </FEATURE>
</FEATURE>
which seems on the surface to correspond to ye olde library cataloguing
metadata stuff.  Is there a wheel here we do not need to reinvent?
Absolutely. Per our remarks in the web page, we mainly did this out of
time constraints. For metadata, there is no doubt that the Dublin Core
and its extension the Darwin Core should be looked at. For
bibliographic things, the Association of American Publishers has an SGML
DTD which is likely to be XML by now, etc. I'll write some references
to those for people to consider.
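
As a small illustration of reusing an existing vocabulary, the Python sketch below records the page author from the example above as a Dublin Core creator element. The wrapper element name "metadata" and the decision to map the author's name to dc:creator are assumptions made for this example; nothing in this thread settles that mapping.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)

# "metadata" is a hypothetical wrapper element for this example.
record = ET.Element("metadata")
creator = ET.SubElement(record, f"{{{DC}}}creator")
creator.text = "John T. Longino"

# Serialize to a string so the result can be inspected or saved.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The postal and email addresses are left out here because the basic element set has no obvious slot for them; how to carry those is an open question.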
...
Related to this there are a number of projects around the place using XML to
mark up and display descriptive data and taxonomic treatments, and they seem
to be inventing rafts of 'feature' names for what appear to be the same
things in terms of blocks of information.  At the higher and metadata level,
is it too early to talk about rationalizing these?  Being very wary here, not
wanting to open the Pandora's character-list box of worms again...
Can we start assembling a list?
...
jim
Bob Morris
