Well, I get your point, and it certainly wouldn't be a huge burden on developers to check both the Latin and English names. But I thought the whole point of suggesting a controlled vocabulary was to give people one recommended value for each state they want to represent. Otherwise, why do we have ISO 639-1 language codes? A good developer could just check for English, english, EN, en, eng., Eng., Engl, engl., etc.
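
To make the comparison concrete, here is a minimal Python sketch of what "just check for every spelling" looks like for the language example. The function name normalize_language and the alias table are hypothetical, made up for illustration; the only standard value in it is the ISO 639-1 code "en".

```python
import re

# Hypothetical alias table (an assumption for illustration): the spellings
# listed above, lowercased and with punctuation removed.
_ENGLISH_ALIASES = {"english", "en", "eng", "engl"}


def normalize_language(raw):
    """Return the ISO 639-1 code 'en' for a known English spelling, else None."""
    # Lowercase the value and drop everything that is not a letter,
    # so "Eng." and " engl. " both reduce to a bare alias.
    key = re.sub(r"[^a-z]", "", raw.strip().lower())
    return "en" if key in _ENGLISH_ALIASES else None
```

Every new spelling that turns up in real data means another entry in that table, in every consuming application. One recommended value avoids that.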
Steve

Bob Morris wrote:

On Fri, Nov 19, 2010 at 8:56 AM, Steve Baskauf
<steve.baskauf@vanderbilt.edu> wrote:

[consensus discussion]
...  Why make all software check for two alternatives
when a consensus would fix the problem?  (Consensus... did I say that
word in a tdwg-content email????)


Ummm, because plenty of data will not meet the consensus? Because
robust software checks for things that may occur even if  they violate
expectations, rules, standards, recommendations, conventions, or
consensus?

One model of consensus might be "what most of the real current data
does". Since so much biodiversity data is dirty,  I'm pretty sure that
would bring howls from taxonomists following this list. But it would
also have a shot at exposing more data, if one also believes that most
consuming applications are not robust. Better is for the community to
make recommendations (by consensus or some other mechanism) and let
developers of non-robust applications accept responsibility for their
non-robustness.

My self-serving(*) position is that a huge amount of dirty current
data is being served by organizations/people who have no idea that it
is dirty, and no systematic way to find out that it is. By contrast,
the relatively small number of client developers are likely to have a
good idea of where the dirt is and often can deal with it. The
well-known social problem with that arises in circumstances such as
yours, where domain scientists are writing software out of necessity
or urgency, and rightfully want to get on with their science. They
then have to make choices about where to spend their time: on software
engineering or science. Many have little choice, since they are paid
to be scientists, not software engineers.  Nor are lay software
engineers the only authors of non-robust software. Analogous
time constraints imposed on professionals often result in the same
kinds of problems.  Alas, there is no single solution to this
conundrum. But my (not biologically informed) guess is that for the
problem in this thread, supporting both alternatives does not impose a
big burden on developers.  In which case there is no need for
consensus. :-)

Bob Morris
(*) http://etaxonomy.org/mw/FP2010:_Continuous_Quality_Control

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

http://bioimages.vanderbilt.edu