[tdwg-content] canonical name for named hybrid & infrageneric names

Peter DeVries pete.devries at gmail.com
Thu Dec 9 18:20:55 CET 2010


I think it makes the most sense to model these names on biology and
informatics, and then generate a code-compliant string as output.

One reason is that the codes can change, so you don't want them to be a
fixed part of the most fundamental units of your information systems.

Another advantage is that you don't need different intermediate
structures and forms for each of the nomenclatural codes.

Do we want to have one entity for species or four+ that have to be
duplicated throughout the entire software stack?

Done this way, if a code changes you only need to alter the output routine.

You also don't always know which code applies to a given string until
the end.
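
To make this concrete, here is a minimal Python sketch of the idea: one
code-neutral record plus one small output function per code. The class,
the field names and both functions are hypothetical, invented for
illustration, and whether each output string is really code compliant is
exactly the question this thread leaves open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaxonName:
    """Code-neutral name parts. All field names are illustrative assumptions."""
    genus: str
    subgenus: Optional[str] = None
    species: Optional[str] = None
    genus_is_hybrid: bool = False
    species_is_hybrid: bool = False


def botanical_string(name: TaxonName) -> str:
    """Render in the botanical style of the examples quoted in this thread."""
    genus = ("×" if name.genus_is_hybrid else "") + name.genus
    if name.species is None:
        if name.subgenus is None:
            return genus
        return f"{genus} subgen. {name.subgenus}"
    species = ("×" if name.species_is_hybrid else "") + name.species
    return f"{genus} {species}"


def zoological_string(name: TaxonName) -> str:
    """Render with the subgenus in parentheses, as in the verbatim zoological examples."""
    parts = [name.genus]
    if name.subgenus is not None:
        parts.append(f"({name.subgenus})")
    if name.species is not None:
        parts.append(name.species)
    return " ".join(parts)


hybrid = TaxonName(genus="Agropogon", species="littoralis", genus_is_hybrid=True)
print(botanical_string(hybrid))
```

If a code changes, only the matching output function needs to change;
the stored record stays the same.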

Respectfully,

- Pete


On Thu, Dec 9, 2010 at 7:52 AM, Bob Morris <morris.bob at gmail.com> wrote:

> Thanks. To me what is interesting about this thread is that documents
> whose main(?) audience is authors and publishers do not always
> address the needs of parser writers. It is a rare and happy
> circumstance for a programmer to have the document author to consult!
>
> What I \think/ is implied by your answer is (something that requires
> biological knowledge that I don't have, namely) that there are hybrid
> names which are not necessarily a cross of two things, but rather only
> one is mentioned.  The distinction then is that "formula" means at
> least two, but there are uses which do not appear in a formula, right?
>  So a natural language name extractor should follow this rule:
>    - If the × adjoins text, the token to the left of any preceding
> white space is not part of a taxon name; otherwise it is.
> Example: In the fragment "not unlike ×Agropogon littoralis"  the token
> 'unlike' is not part of a name.
>
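
A rough Python sketch of the first half of such a rule, telling a
formula sign from a named-hybrid sign by the characters around it. It
assumes the text follows the no-space style for named hybrids described
further down in this thread; since H.3A.1 also allows a space there, real
text will need more context. The function name and the three role labels
are made up for illustration.

```python
import re
from typing import List, Tuple

SIGN = "×"


def classify_signs(text: str) -> List[Tuple[int, str]]:
    """Return (position, role) for every multiplication sign in text.

    Roles (illustrative labels, not terms from the ICBN):
      "named"   - sign adjoins the following text, e.g. "×Agropogon"
      "formula" - sign has white space on both sides
      "unclear" - anything else, e.g. a sign at the very end of the text
    """
    roles = []
    for match in re.finditer(SIGN, text):
        i = match.start()
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if after and not after.isspace():
            role = "named"
        elif before.isspace() and after.isspace():
            role = "formula"
        else:
            role = "unclear"
        roles.append((i, role))
    return roles


print(classify_signs("not unlike ×Agropogon littoralis"))
print(classify_signs("Agrostis stolonifera × Polypogon monspeliensis"))
```

This only labels the sign. It does not decide whether the token to the
left belongs to the name: in "Salix ×capreola" the sign also adjoins the
following text, yet "Salix" is part of the name.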
> Believe it or not, I am not complaining about ICBN. No programmer
> interpreting a document not written for programmers should complain if
> understanding it assumes knowledge and insight of the intended
> audience. Nor should they complain if they are raising points that are
> addressed in other parts of the document that they haven't read--which
> in this case for me is everything but H.3A.
>
> Robust context sensitive parsers are marginally more complicated to
> write than those that require no lookahead, but this is surely not the
> only name parsing issue that requires lookahead, so I can't even
> complain on that score. In a vaguely related setting, parser writers
> might see the rather nicely set forth
>
> http://stackoverflow.com/questions/1952931/how-to-rewrite-this-nondeterministic-xml-schema-to-deterministic
>
>
>
> Bob Morris
> p.s. Hey, I thought of something to complain about, albeit not about
> ICBN: I sure wish spec writers targeting software would banish
> "should" from their documents in favor of "must", even if multiple
> choices are accompanied by "... is preferred".  Well, maybe it's a
> little complaint about the nomenclatural codes, because  movement
> towards born-digital, semantically marked-up systematics literature
> will bump into it when people try to write semantically enhanced
> applications. It would be far better if publishers followed a set of
> rules with no "should" in them, for which compliance could be tested
> before publication.
>
>
>
> Robert A. Morris
> Emeritus Professor  of Computer Science
> UMASS-Boston
> 100 Morrissey Blvd
> Boston, MA 02125-3390
> Associate, Harvard University Herbaria
> email: morris.bob at gmail.com
> web: http://bdei.cs.umb.edu/
> web: http://etaxonomy.org/mw/FilteredPush
> http://www.cs.umb.edu/~ram
> phone (+1) 857 222 7992 (mobile)
>
>
> On Thu, Dec 9, 2010 at 3:39 AM,  <dipteryx at freeler.nl> wrote:
> > Having personally written Rec. H.3A.1, I do not see that it offers
> > scope for being misread: the placement of the multiplication sign
> > is a matter of style (and insight). As background information, the
> > ICBN-preferred style is to put it directly in front of the name or
> > epithet (no space whatsoever: ×Agropogon littoralis): just keep
> > it nice together, so as to give computers no chance to mess it
> > up (after all, at a line break, a computer is likely to separate
> > these over more than one line).
> >
> > Rec. H.3A Note 1 has been put in there (redundantly) for those who
> > are careless readers, just to make sure the matter could not
> > possibly be misunderstood by even the most whimsical. So, in a
> > formula, the parents are separated by: space, multiplication sign,
> > space;
> >      Agrostis stolonifera × Polypogon monspeliensis.
> >
> > Paul van Rijckevorsel
> >
> > * * *
> > -----Original message-----
> > From: tdwg-content-bounces at lists.tdwg.org on behalf of Bob Morris
> > Sent: Wed 8-12-2010 20:12
> > To: Markus Döring (GBIF)
> > CC: tdwg-content at lists.tdwg.org List
> > Subject: Re: [tdwg-content] canonical name for named hybrid &
> > infrageneric names
> >
> > Your placement of the multiplication sign ×  does not seem code
> > compliant. It looks too close. Maybe.  Also there might be a question
> > about whether a TDWG requirement to use the multiplication sign can be
> > easily implemented by all providers.
> >
> > On these subjects The Appendix on Hybrid Names of ICBN seems
> > contradictory in that H.3A.1
> > (http://ibot.sav.sk/icbn/frameset/0071AppendixINoHa003.htm, quoted
> > below)  seems to allow your placement, but Note 1. there seems to
> > require space. Note 1. would, with H.3A.1 imply that there must be
> > more white space to the left than right of the multiplication sign or
> > its surrogate. One spacing that seems to violate all interpretations
> > of H.3A.1 is equal white space around the multiplication sign. My
> > guess is that the overwhelming fraction of printed hybrid names are
> > thereby noncompliant (unless something elsewhere resolves the issue).
> > Making the amount of white space significant in a parsed string  is a
> > horrifying thought.
> >
> > --Bob Morris
> >
> > "Recommendation H.3A
> >
> > H.3A.1. The multiplication sign ×, indicating the hybrid nature of a
> > taxon, should be placed so as to express that it belongs with the name
> > or epithet but is not actually part of it. The exact amount of space,
> > if any, between the multiplication sign and the initial letter of the
> > name or epithet should depend on what best serves readability.
> >
> > Note 1. The multiplication sign × in a hybrid formula is always placed
> > between, and separate from, the names of the parents.
> > H.3A.2. If the multiplication sign is not available it should be
> > approximated by a lower case letter "x" (not italicized)."
> > http://ibot.sav.sk/icbn/frameset/0071AppendixINoHa003.htm
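
For providers who can only send the letter "x" (H.3A.2), a cautious
normalisation step might look like the Python sketch below. It is a
heuristic under stated assumptions, not a rule from the ICBN: it only
rewrites an "x" standing alone between white space, or an "x" directly
in front of an ASCII capital at the start of a token. An "x" glued to a
lower-case epithet is left alone, because it cannot be told apart from
an epithet that really begins with "x". The function name is
hypothetical.

```python
import re


def normalize_hybrid_sign(name: str) -> str:
    """Replace an unambiguous surrogate 'x' with the multiplication sign."""
    # A lone 'x' with white space on both sides, as in a hybrid formula.
    name = re.sub(r"(?<=\s)x(?=\s)", "×", name)
    # A token-initial 'x' directly followed by a capital letter, e.g. "xAgropogon".
    name = re.sub(r"(?<!\S)x(?=[A-Z])", "×", name)
    return name


print(normalize_hybrid_sign("Agrostis stolonifera x Polypogon monspeliensis"))
print(normalize_hybrid_sign("xAgropogon littoralis"))
print(normalize_hybrid_sign("Salix xcapreola"))  # left unchanged on purpose
```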
> >
> >
> > ======================
> >
> >
> >
> > On Wed, Dec 8, 2010 at 1:14 PM, "Markus Döring (GBIF)"
> > <mdoering at gbif.org> wrote:
> >> Talking about canonical names again, I want to use the opportunity to
> >> get rid of another question I have.
> >> What is the code compliant canonical version of named hybrids (not
> >> formulas) and infrageneric names?
> >>
> >>
> >> Are these examples correct?
> >>
> >> Botanical section:
> >> verbatim: Maxillaria sect. Multiflorae Christenson
> >> canonical:  Maxillaria sect. Multiflorae
> >>
> >> Botanical subgenus:
> >> verbatim: Anthemis subgen. Maruta (Cass.) Tzvelev
> >> canonical:  Anthemis subgen. Maruta
> >>
> >> Botanical series:
> >> verbatim: Artemisia ser. Codonocephalae (Pamp.) Y.R.Ling
> >> canonical:  Artemisia ser. Codonocephalae
> >>
> >> Zoological subgenus:
> >> verbatim: Murex (Promurex) Ponder & Vokes, 1988
> >> canonical:  Murex subgen. Promurex
> >> # if we use parentheses to indicate the subgenus we can only guess
> >> whether it's an author or a subgenus name
> >>
> >> Zoological species
> >> verbatim: Leptochilus (Neoleptochilus) beaumonti Giordani Soika 1953
> >> canonical: Leptochilus beaumonti
> >>
> >>
> >>
> >> Botanical named genus hybrid:
> >> verbatim: ×Agropogon littoralis (Sm.) C. E. Hubb.
> >> canonical: ×Agropogon littoralis
> >>
> >> Botanical named infrageneric hybrid:
> >> verbatim: Eryngium nothosect. Alpestria Burdet & Miège
> >> canonical: Eryngium nothosect. Alpestria
> >>
> >> Botanical named species hybrid:
> >> verbatim: Salix ×capreola Andersson (1867)
> >> canonical: Salix ×capreola Andersson (1867)
> >>
> >> Botanical variety, named species hybrid:
> >> verbatim: Populus ×canadensis var. serotina (R. Hartig) Rehder
> >> canonical: Populus ×canadensis var. serotina
> >>
> >> Botanical named infraspecific hybrid:
> >> verbatim: Polypodium vulgare nothosubsp. mantoniae (Rothm.) Schidlay
> >> canonical: Polypodium vulgare nothosubsp. mantoniae
> >>
> >>
> >>
> >
> > _______________________________________________
> > tdwg-content mailing list
> > tdwg-content at lists.tdwg.org
> > http://lists.tdwg.org/mailman/listinfo/tdwg-content
> >
> >



-- 
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------