[tdwg-content] Does a species entail a specific classification or does it have many classifications.

Peter DeVries pete.devries at gmail.com
Mon Nov 29 01:18:56 CET 2010


Hi Gregor,

I have had some more time to think about this issue.

In an ideal world the combination of genus and specific epithet should map
to one concept.

I believe that was Linnaeus's original intent.

Unfortunately the current system seems to have strayed from this original
goal.

Also, it was not anticipated that changes in phylogenetic thinking would
cause so many changes to the "identifier" or name for a given species.

Dima and I came up with the somewhat strange URIs for name strings so we
could look at ways to connect some sort of species concept to the various
names in the GNI.

The GNI itself does not always know what kind of thing a given name string
represents.

It also does not know if a given name string is valid or not.

The GNI needs some additional parts to make sense of these. I will leave it
to others to describe those additional parts.

After looking through a large set of species-related publications, etc., it
is clear that the vast majority of information sources list only the genus
and specific epithet.

To make sense of those and do entity extraction etc. I think the following
service would be useful.

You have a name, let's say Duplictus annoyii.

To get information about what that name might be you reference a semantic
web service using a URL form of the name.

http://example.org/names/Duplictus_annoyii

If you are a human using a web browser, this will return a page that lists
all the things it might be.

Kingdom Animalia Duplictus annoyii Schmoe 1886 => appears in databases X, Y, Z
Kingdom Animalia Duplictus annoyii Schmoe 1886 sec Thomas 1920 => appears in databases A, I, P
Kingdom Plantae  Duplictus annoyii Carefulnot => appears in database M

If there are no exact matches it could return:

Similar Name Strings
Kingdom Animalia Duplictus annoyi Schmoe 1886 => appears in databases X, Y, Z
Kingdom Plantae  Duplictus annous Carefulnot => appears in databases A, I, P
Kingdom Plantae  Duplictus annous Carefulnot => appears in database M
Kingdom Plantae  Duplicta annous (Carefulnot) => appears in databases X, Y, Z

A semantic web program would receive an RDF version of the information above
and could use various heuristics to determine which are the most likely full
names.
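
As a rough sketch of one such heuristic, assuming nothing more than string
similarity, Python's difflib can rank known name strings against the query.
The KNOWN_NAMES list and the similar_names function are made up for
illustration; a real service would draw on the GNI and would probably weigh
authorship, kingdom, and source databases as well.

```python
from difflib import get_close_matches

# Hypothetical name strings (genus + specific epithet only), for illustration.
KNOWN_NAMES = [
    "Duplictus annoyi",
    "Duplictus annous",
    "Duplicta annous",
    "Felis concolor",
]


def similar_names(query, known=KNOWN_NAMES, cutoff=0.7):
    """Return known name strings similar to the query, best match first."""
    return get_close_matches(query, known, n=5, cutoff=cutoff)


if __name__ == "__main__":
    for candidate in similar_names("Duplictus annoyii"):
        print(candidate)
```

The cutoff of 0.7 is arbitrary; it only needs to be loose enough to catch
spelling variants and tight enough to leave out unrelated names.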

Eventually the service could also include opinions about which name variants
are valid and which are not.

Felis concolor => Felis concolor Linnaeus 1771

"Valid" names would include properly formed synonyms so Felis concolor would
return Felis concolor Linnaeus 1771.

The associated URLs for each of these forms would be:

HTML Page http://example.org/names/Duplictus_annoyii.html

RDF Page http://example.org/names/Duplictus_annoyii.rdf
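
A minimal sketch of how a client might build these URLs from a name string,
assuming the example.org placeholder host used in this message. The function
name name_urls is hypothetical.

```python
from urllib.parse import quote

# Placeholder base URL taken from the examples in this message.
BASE = "http://example.org/names/"


def name_urls(name):
    """Return the bare, HTML, and RDF URLs for a genus + epithet string."""
    # Join the words with underscores and percent-encode anything unusual.
    slug = quote("_".join(name.split()))
    return {
        "uri": BASE + slug,
        "html": BASE + slug + ".html",
        "rdf": BASE + slug + ".rdf",
    }


if __name__ == "__main__":
    for kind, url in name_urls("Duplictus annoyii").items():
        print(kind, url)
```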

This service would help map a given combination of genus and specific
epithet to possible full scientific names with authorship.

You would need an additional even more opinionated web service that would
tell you what the most current full name is for a given scientific name with
authorship.

One that recognizes that "Felis concolor Linnaeus 1771" is an older synonym
for "Puma concolor (Linnaeus, 1771)".

In GNUB / GNITE these two URIs could be considered reconciliation groups.

http://example.org/recgroups/Animal/Felis_concolor => PrefName:"Felis concolor Linnaeus 1771"
http://example.org/recgroups/Animal/Puma_concolor  => PrefName:"Puma concolor (Linnaeus, 1771)"

This allows a different service or different groups to make statements about
these.

http://example.org/recgroups/Animal/Felis_concolor olderSynonymFor
http://example.org/recgroups/Animal/Puma_concolor

or

http://example.org/recgroups/Animal/Felis_concolor isBasionymOf
http://example.org/recgroups/Animal/Puma_concolor
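
To illustrate how the "more opinionated" service might use such statements,
here is a small sketch that follows olderSynonymFor links until it reaches a
group with no newer name. The dictionaries and the current_group function are
hypothetical stand-ins for whatever store the service would actually use.

```python
BASE = "http://example.org/recgroups/Animal/"
FELIS = BASE + "Felis_concolor"
PUMA = BASE + "Puma_concolor"

# Hypothetical data: one olderSynonymFor statement and the preferred names.
OLDER_SYNONYM_FOR = {FELIS: PUMA}
PREF_NAME = {
    FELIS: "Felis concolor Linnaeus 1771",
    PUMA: "Puma concolor (Linnaeus, 1771)",
}


def current_group(uri, links=OLDER_SYNONYM_FOR):
    """Follow olderSynonymFor links to the most current group."""
    seen = set()  # guards against accidental cycles in the statements
    while uri in links and uri not in seen:
        seen.add(uri)
        uri = links[uri]
    return uri


if __name__ == "__main__":
    print(PREF_NAME[current_group(FELIS)])
```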

Note that this system allows different groups to make different and
potentially conflicting statements about the different reconciliation
groups.

These different statements come from different groups, so they can be
assigned to different "graphs":

<subject> <predicate> <object> <graph>

For example:

http://example.org/recgroups/Animal/Aedes_triseriatus isPreferedNameVariantOf
<speciesXYZ> <urn:frustrated.medical.entomologists.org>

http://example.org/recgroups/Animal/Ochlerotatus_triseriatus isPreferedNameVariantOf
<speciesXYZ> <urn:frustrated.dipterist.taxonomist.org>
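
A rough sketch of how a consumer could keep these conflicting statements
apart, assuming each one is stored as a plain (subject, predicate, object,
graph) tuple. The preferred_names function is made up for illustration; a
real system would more likely use a quad store and SPARQL.

```python
BASE = "http://example.org/recgroups/Animal/"
ENTOMOLOGISTS = "urn:frustrated.medical.entomologists.org"
TAXONOMISTS = "urn:frustrated.dipterist.taxonomist.org"

# Each statement is a quad: (subject, predicate, object, graph).
QUADS = [
    (BASE + "Aedes_triseriatus", "isPreferedNameVariantOf",
     "speciesXYZ", ENTOMOLOGISTS),
    (BASE + "Ochlerotatus_triseriatus", "isPreferedNameVariantOf",
     "speciesXYZ", TAXONOMISTS),
]


def preferred_names(species, quads, graph=None):
    """List (graph, subject) pairs naming a preferred variant for a species.

    If graph is given, only statements from that graph are considered.
    """
    return [
        (g, s)
        for (s, p, o, g) in quads
        if p == "isPreferedNameVariantOf"
        and o == species
        and (graph is None or g == graph)
    ]


if __name__ == "__main__":
    for graph, subject in preferred_names("speciesXYZ", QUADS):
        print(graph, "prefers", subject)
```

Asking without a graph returns both opinions; asking with a graph returns
only the opinion of the group you have chosen to trust.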

The vast majority of people simply care about mapping these names to the
same "thing". They simply do the following in their own knowledge base.

http://example.org/recgroups/Animal/Aedes_triseriatus sameAs
http://example.org/recgroups/Animal/Ochlerotatus_triseriatus

Now, for them, facts like:

http://example.org/recgroups/Animal/Aedes_triseriatus larvalDevelopmentIn
<stdVocab:TreeHoles>

http://example.org/recgroups/Animal/Ochlerotatus_triseriatus takesBloodMealsFrom
<stdVocab:Birds>

http://example.org/recgroups/Animal/Ochlerotatus_triseriatus knownVectorOf
<stdVocab:LaCrosseEncephalitis>

are interpreted as being statements about the same "thing".
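
As a sketch of what that interpretation amounts to, the snippet below groups
facts under one representative URI for each set of sameAs-linked names. It
uses plain Python tuples and a simple union-find; the names SAME_AS, FACTS,
and merge_facts are hypothetical, and an RDF reasoner would normally do this
for you.

```python
from collections import defaultdict

BASE = "http://example.org/recgroups/Animal/"
AEDES = BASE + "Aedes_triseriatus"
OCHLEROTATUS = BASE + "Ochlerotatus_triseriatus"

# Hypothetical local knowledge base: one sameAs link and three facts.
SAME_AS = [(AEDES, OCHLEROTATUS)]
FACTS = [
    (AEDES, "larvalDevelopmentIn", "stdVocab:TreeHoles"),
    (OCHLEROTATUS, "takesBloodMealsFrom", "stdVocab:Birds"),
    (OCHLEROTATUS, "knownVectorOf", "stdVocab:LaCrosseEncephalitis"),
]


def find(parent, x):
    """Return the representative URI of x's sameAs group."""
    parent.setdefault(x, x)
    while parent[x] != x:
        x = parent[x]
    return x


def merge_facts(same_as, facts):
    """Group (predicate, object) pairs by sameAs-equivalent subject."""
    parent = {}
    for a, b in same_as:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a
    merged = defaultdict(set)
    for s, p, o in facts:
        merged[find(parent, s)].add((p, o))
    return dict(merged)


if __name__ == "__main__":
    for subject, pairs in merge_facts(SAME_AS, FACTS).items():
        print(subject)
        for predicate, obj in sorted(pairs):
            print("   ", predicate, obj)
```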

Respectfully,

- Pete

On Thu, Nov 25, 2010 at 10:06 AM, Gregor Hagedorn <g.m.hagedorn at gmail.com> wrote:

> A side remark, about where I believe the whole discussion is misleading:
>
> > Puma concolor   se:v6n7p
> > That way in the future if the name changes without a change in the
> > concept.
> > Eupuma concolor se:v6n7p
> > The data stays linked.
>
> This always looks nice... However, with such proposals we, the
> computer guys, make the concept-assessment someone else's problem (i.e.
> the taxonomists, ecologists, pathologists, etc.), and, at the same
> time, do not provide them the means to communicate. The assumption is
> that a scientist or applied worker would know whether to add se:v6n7p
> to a given taxon name or not.
>
> With my taxonomer/pathologist hat on: I mostly have no clue which
> concept XXX concolor is - and whether it is changed or not. Puma
> concolor may be a different concept than Puma concolor. We are, of
> course, guilty of communicating in a shamefully loose way (s.str., s.
> lat. etc.), which could and should be improved by citing a secundum,
> but beyond that: mapping concepts is a taxonomic opinion, not an objective
> truth.
>
> So given that any trivial mapping mechanism can map multiple IDs (Puma
> concolor, Eupuma concolor) to a single concept - the proposal saves
> this trivial processing time, but does not contribute to the problem
> of communicating in a way that is suitable to assess taxon concepts.
>
> ----
>
> Aside: Please compare the highly linked, and generally correctly
> linked, Wikipedias with other content management systems for the
> advantage of human-legible IDs [[Puma concolor]] over
> http://x.y.net/node/234872561 - links. My own observation is that in
> the latter case only a fraction of the desirable links are created,
> and that these quite often go to wrong, or perhaps obsolete,
> places.
>
> I therefore think:
> se:Puma_concolor_sec._Smith
> would be a much more useful mechanism than all the
> computer-scientists-only proposals like se:v6n7p or
>
> http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9
>
> Gregor
>



-- 
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------