<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7654.12">

<TITLE>Re: [tdwg-content] canonical name for named hybrid &amp; infrageneric names</TITLE>

</HEAD>

<BODY>

<!-- Converted from text/plain format -->


<P><FONT SIZE=2>From: Bob Morris [<A HREF="mailto:morris.bob@gmail.com">mailto:morris.bob@gmail.com</A>]<BR>

Sent: Thu 9-12-2010 14:52<BR>

<BR>

Thanks. To me what is interesting about this thread is that documents<BR>

whose main(?) audience is authors and publishers do not always<BR>

address the needs of parser writers.<BR>

<BR>

***<BR>

That depends on how you look at it. The ICBN is mostly written so that<BR>

nobody who just browses through will make sense of it. It requires<BR>

any user to read it in some depth, if he is to apply it. Perhaps the<BR>

parser writer should realize that he is no exception?<BR>

<BR>

But, actually, parsers are not going to be the answer to any question<BR>

in biodiversity informatics. This is impossible, as the natural laws<BR>

in a nomenclatural universe are subject to change (almost without<BR>

notice). What was true ten years ago is not necessarily true now:<BR>

it may have been retroactively changed. Anybody doing anything in<BR>

biodiversity informatics should have at least some basic awareness<BR>

of the natural laws that govern nomenclatural universes.<BR>

* * *<BR>

<BR>

It is a rare and happy<BR>

circumstance for a programmer to have the document author to consult!<BR>

<BR>

***<BR>

Not the document, just the recommendation (excluding the Note). The<BR>

ICBN is more or less a wiki (and has been for a hundred years).<BR>

* * *<BR>

<BR>

What I \think/ is implied by your answer is (something that requires<BR>

biological knowledge that I don't have, namely) that there are hybrid<BR>

names which are not necessarily a cross of two things, but in which<BR>

only one is mentioned.<BR>

<BR>

***<BR>

No, numbers are irrelevant, provided there are at least two parents<BR>

involved.<BR>

* * *<BR>

<BR>

The distinction then is that &quot;formula&quot; means at<BR>

least two, but there are uses of the × which do not appear in a formula, right?<BR>

<BR>

***<BR>

No, the distinction is that a name is a name, while a formula is a<BR>

summation of (at least two) names.<BR>

<BR>

&nbsp;&nbsp; ×Agropogon littoralis is a name, and it is the same as<BR>

&nbsp;&nbsp;&nbsp; Agropogon littoralis, for most purposes.<BR>

<BR>

&nbsp;&nbsp; Agrostis stolonifera × Polypogon monspeliensis are two names,<BR>

&nbsp;&nbsp;&nbsp; and the formula indicates their relation, which may be more<BR>

&nbsp;&nbsp;&nbsp; complex than here: see Rec. H.2A.1; so just lifting a formula<BR>

&nbsp;&nbsp;&nbsp; in isolation from the literature is out (Mentha longifolia &gt;<BR>

&nbsp;&nbsp;&nbsp; × rotundifolia is an obsolete form).<BR>
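
<BR>

The distinction can be put as a minimal Python sketch, assuming the spacing<BR>

used in the two examples above: the sign of a name adjoins the text, and the<BR>

sign of a formula stands free between names. Real text need not follow that<BR>

spacing. The function names are hypothetical, and a formula is only<BR>

recognised here, not interpreted.<BR>

```python
HYBRID_SIGN = "\u00d7"  # the multiplication sign used for hybrids


def classify(text):
    """Return 'formula', 'hybrid name' or 'plain name' for a short string.

    Assumption: spacing alone tells a name from a formula.
    """
    tokens = text.split()
    # A free-standing sign with tokens on both sides joins two (or more) names.
    if HYBRID_SIGN in tokens[1:-1]:
        return "formula"
    # A sign that adjoins text marks a single name.
    if any(tok.startswith(HYBRID_SIGN) and tok != HYBRID_SIGN for tok in tokens):
        return "hybrid name"
    return "plain name"


def without_hybrid_sign(name):
    """Drop signs that adjoin text; a free-standing sign is left alone."""
    return " ".join(
        tok if tok == HYBRID_SIGN else tok.lstrip(HYBRID_SIGN)
        for tok in name.split()
    )
```

With this sketch, the first example classifies as a hybrid name and reduces<BR>

to Agropogon littoralis once the sign is dropped, while the second example<BR>

classifies as a formula.<BR>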

<BR>

* * *<BR>

<BR>

So a natural language name extractor should follow this rule:<BR>

&nbsp;&nbsp;&nbsp; - If the × adjoins text, the token to the left of any preceding<BR>

white space is not part of a taxon name, but otherwise it is.<BR>

Example: In the fragment &quot;not unlike ×Agropogon littoralis&quot;, the token<BR>

'unlike' is not part of a name.<BR>
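
<BR>

A minimal sketch of that rule in Python, assuming tokens are separated by<BR>

plain white space and that the sign is the first character of its token. The<BR>

function name is hypothetical. It reports, for each token that starts with the<BR>

sign, the token to its left and whether the rule counts that token as part<BR>

of a name.<BR>

```python
HYBRID_SIGN = "\u00d7"  # the multiplication sign used for hybrids


def left_neighbours(text):
    """List (left_token, is_part_of_name) for every token starting with the sign.

    Assumption: a sign adjoining text opens a name, so the token to its left
    is outside the name; a free-standing sign joins the names on either side.
    """
    tokens = text.split()
    results = []
    for i, tok in enumerate(tokens):
        # Skip tokens without a leading sign, and a sign with nothing before it.
        if not tok.startswith(HYBRID_SIGN) or i == 0:
            continue
        adjoins_text = tok != HYBRID_SIGN
        results.append((tokens[i - 1], not adjoins_text))
    return results
```

For the fragment in the example this reports 'unlike' as not part of a name.<BR>

A free-standing sign, as in a formula, reports its left-hand token as part<BR>

of a name.<BR>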

<BR>

Believe it or not, I am not complaining about ICBN. No programmer<BR>

interpreting a document not written for programmers should complain if<BR>

understanding it assumes knowledge and insight of the intended<BR>

audience. Nor should they complain if they are raising points that are<BR>

addressed in other parts of the document that they haven't read--which<BR>

in this case for me is everything but H.3A.<BR>

<BR>

Robust context-sensitive parsers are marginally more complicated to<BR>

write than those that require no lookahead, but this is surely not the<BR>

only name parsing issue that requires lookahead, so I can't even<BR>

complain on that score. In a vaguely related setting, parser writers<BR>

might see the rather nicely set forth discussion at<BR>

<A HREF="http://stackoverflow.com/questions/1952931/how-to-rewrite-this-nondeterministic-xml-schema-to-deterministic">http://stackoverflow.com/questions/1952931/how-to-rewrite-this-nondeterministic-xml-schema-to-deterministic</A><BR>

<BR>

<BR>

<BR>

Bob Morris<BR>

p.s. Hey, I thought of something to complain about, albeit not about<BR>

ICBN: I sure wish spec writers targeting software would banish<BR>

&quot;should&quot; from their documents in favor of &quot;must&quot;, even if multiple<BR>

choices are accompanied by &quot;... is preferred&quot;.&nbsp; Well, maybe it's a<BR>

little complaint about the nomenclatural codes, because movement<BR>

towards born-digital, semantically marked-up systematics literature<BR>

will bump into it when people try to write semantically enhanced<BR>

applications. It would be far better if publishers followed a set of<BR>

rules with no &quot;should&quot; in them, for which compliance could be tested<BR>

before publication.<BR>

<BR>

***<BR>

There are more distinctions than just &quot;must&quot; and &quot;should&quot; in the ICBN.<BR>

Eliminating the &quot;should&quot; is not going to happen, but sometimes a<BR>

&quot;should&quot; will grow up to become a &quot;must&quot;.<BR>

<BR>

Paul van Rijckevorsel<BR>

<BR>

</FONT>

</P>


</BODY>

</HTML>