<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Well, I get your point and it certainly wouldn't be a huge burden on

developers to check both the Latin and English names.&nbsp; But I thought

the whole point of suggesting a controlled vocabulary was so that

people would have one recommended value to use for each state they want

to represent.&nbsp; Why do we have ISO 639-1 language codes?&nbsp; A good

developer could just check for English, english, EN, en, eng., Eng.,

Engl, engl., etc. etc. <br>

Steve<br>

<br>

Bob Morris wrote:

<blockquote

 cite="mid:AANLkTi=ftbt1B9+8P+tO+8efi=NsOhZc3xN5BKnw9W28@mail.gmail.com"

 type="cite">

  <pre wrap="">On Fri, Nov 19, 2010 at 8:56 AM, Steve Baskauf

<a class="moz-txt-link-rfc2396E" href="mailto:steve.baskauf@vanderbilt.edu">&lt;steve.baskauf@vanderbilt.edu&gt;</a> wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">[consensus discussion]

... &nbsp;Why make all software check for two alternatives

when a consensus would fix the problem? &nbsp;(Consensus... did I say that

word in a tdwg-content email????)

    </pre>

  </blockquote>

  <pre wrap="">

Ummm, because plenty of data will not meet the consensus? Because

robust software checks for things that may occur even if they violate

expectations, rules, standards, recommendations, conventions, or

consensus?

One model of consensus might be "what most of the real current data

does". Since so much biodiversity data is dirty, I'm pretty sure that

would bring howls from taxonomists following this list. But it would

also have a shot at exposing more data, if one also believes that most

consuming applications are not robust. Better is for the community to

make recommendations (by consensus or some other mechanism) and let

developers of non-robust applications accept responsibility for their

non-robustness.

My self-serving(*) position is that a huge amount of dirty current

data is being served by organizations/people who have no idea that it

is dirty, and no systematic way to find out that it is. By contrast,

the relatively small number of client developers are likely to have a

good idea of where the dirt is and often can deal with it. The

well-known social problem with that arises in circumstances such as

yours, where domain scientists are writing software out of necessity

or urgency, and rightfully want to get on with their science. They

then have to make choices about where to spend their time: on software

engineering or science. Many have little choice, since they are paid

to be scientists, not software engineers.  Nor are lay software

engineers the only authors of non-robust software. Analogous

time constraints imposed on professionals often result in the same

kinds of problems.  Alas, there is no single solution to this

conundrum. But my (not biologically informed) guess is that for the

problem in this thread, supporting both alternatives does not impose a

big burden on developers.  In which case there is no need for

consensus. :-)

Bob Morris

(*) <a class="moz-txt-link-freetext" href="http://etaxonomy.org/mw/FP2010:_Continuous_Quality_Control">http://etaxonomy.org/mw/FP2010:_Continuous_Quality_Control</a>

  </pre>

</blockquote>

<br>

<pre class="moz-signature" cols="72">-- 

Steven J. Baskauf, Ph.D., Senior Lecturer

Vanderbilt University Dept. of Biological Sciences

postal mail address:

VU Station B 351634

Nashville, TN  37235-1634,  U.S.A.

delivery address:

2125 Stevenson Center

1161 21st Ave., S.

Nashville, TN 37235

office: 2128 Stevenson Center

phone: (615) 343-4582,  fax: (615) 343-6707

<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>

</pre>

</body>

</html>