[tdwg-content] ITIS TSNID to uBio NamebankIDs mapping

Peter DeVries pete.devries at gmail.com
Wed Jun 1 06:42:59 CEST 2011


Hi Kevin,

Sorry, I thought you were mad at me for creating yet another set of IDs. :-\

Yes I agree that it is unlikely that we will get everyone to adopt the same
name.

In some cases, it is unclear what the "right name" is. The mosquito
community seems to be split about *Aedes triseriatus* and *Ochlerotatus
triseriatus*, with the differences largely based on name stability.

The only way to really see this is to watch what names are appearing in
articles.

As a check I did a Google Scholar search to see how many recent publications
use *Felis concolor* rather than *Puma concolor*. The Felis variant count
was *3,920* since the year 2000.

Another advantage of the linking approach you suggest is that it allows for
different interpretations and approaches and makes those visible and
findable.

In order to make it clearer whether something is the same as another thing
we need to encourage people to better document each species with e-type like
LOD pages that contain links to specimens, etc.

For some uses I think this page and its linked resources are good enough
that someone could use them to determine if what they observed on a BioBlitz
was an instance of this concept.

http://lod.taxonconcept.org/ses/CTZ8z.html

For a number of other species you really need links to good photographs that
help clarify characters and the locations of specimens that they could use
for comparison.

Unfortunately, the system does not reward the creation of clearer
descriptions of existing or new species. It tends to reward the
re-categorizing of previously described species.

If you had open, easily citable and trackable species descriptions then
there might be a shift in how different activities are rewarded.

One reason being that the descriptions are accessible to, and findable by,
anyone with an internet connection.

In addition, a master's student might not be able to fully document a species
but they might be able to document the distribution, variability in
morphology and DNA for a species over a specific geographical area like the
North American Midwest.

It is these kinds of resources that are needed by other biologists, citizens
and governments.

Once it is easier to determine what butterfly is visiting what flower, then
it is possible to work out what plants each butterfly species seems to be
dependent on.

Respectfully,

- Pete

On Tue, May 31, 2011 at 8:53 PM, Kevin Richards <
RichardsK at landcareresearch.co.nz> wrote:

>  Pete,
>
> I’m not trying to say what you are doing is a waste of time/impossible.  I
> actually think RDF + semantics are a good way forward, but this really
> implies that we need to rely on the semantics and linkages rather than
> having a SINGLE ID for a taxon name.  (which is what I thought Steve was
> getting at).  Each instance of a taxon name can have its own ID and then all
> these instances are connected via ontology defined semantic links.  This
> seems more appropriate to me than insisting everyone uses the “Global Taxon
> Name ID X”.
>
>
>
> In your example of *Aedes triseriatus* and *Ochlerotatus triseriatus* –
> these are two different names so they need two different IDs, they may be
> linked by a single taxon concept, but they are separate names.  So which of
> these now 3 IDs do you expect people to use, and according to what source??
>
>
>
> For example if we have a name, e.g. the Robin, Erithacus rubecula, mentioned
> in ITIS (TSN: 559964) and also in EOL (www.eol.org/pages/1051567), also
> in GBIF (http://data.gbif.org/species/21266780), also in Avibase (
> http://avibase.bsc-eoc.org/species.jsp?avibaseid=C809B2B90399A43D), which
> ID are you hoping people will use?  Would you put the ITIS ID in your own
> dataset as the ID for that name? Unlikely.  Or would it be better to link
> them up with semantic linkages?
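
The linking idea can be sketched in a few lines of Python. This is only an
illustration: the `LinkStore` class and the relation label
"refersToSameName" are made up for the example and do not come from any
real ontology. The four identifiers are the ones quoted in the paragraph
above, and the links between them are asserted by hand here, not looked up.

```python
from collections import defaultdict, deque

# Identifiers quoted in the message, written as (source, local ID) pairs.
ITIS = ("ITIS", "559964")
EOL = ("EOL", "1051567")
GBIF = ("GBIF", "21266780")
AVIBASE = ("Avibase", "C809B2B90399A43D")


class LinkStore:
    """Undirected, typed links between identifiers from different sources."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b, relation="refersToSameName"):
        # The relation label is a placeholder, not a term from a real ontology.
        self._links[a].add((relation, b))
        self._links[b].add((relation, a))

    def equivalents(self, start):
        """Return every identifier reachable from start, including start."""
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for _relation, other in self._links[current]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        return seen


# Each source keeps its own ID; only the pairwise links are shared.
store = LinkStore()
store.link(ITIS, EOL)
store.link(EOL, GBIF)
store.link(GBIF, AVIBASE)
```

Starting from any one of the four IDs, `store.equivalents(...)` walks the
links and returns all four, so nobody has to give up their own identifier.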
>
>
>
> What I take from recommendation 8 of the GUID applicability guide (as Steve
> puts it, "stop making up new identifiers when somebody else already has one
> for the thing you are talking about") is that if you DON’T already have a
> record in your own database for a taxon name/concept, then reuse an existing
> one.  NOT ditch all your current IDs and adopt someone else’s (especially
> hard considering it is so hard to work out which of the multitude of name
> and concept IDs directly relates to your taxon name).
>
>
>
> I am all for limiting the number of IDs for the “same” thing, but in some
> cases it is more useful to build linkages than force this tight integration
> of data and IDs.  Especially for taxon names and concepts, where it is
> complex to define if you are even talking about the “same” thing or not.
>
>
>
> Kevin
>
>
>
> *From:* Peter DeVries [mailto:pete.devries at gmail.com]
> *Sent:* Wednesday, 1 June 2011 12:38 p.m.
> *To:* Kevin Richards
> *Cc:* Steve Baskauf; tdwg-content at lists.tdwg.org; Gerald Guala; Nicolson,
> David; Alan J Hampson; Orrell, Thomas
>
> *Subject:* Re: [tdwg-content] ITIS TSNID to uBio NamebankIDs mapping
>
>
>
> Hi Kevin,
>
>
>
> I forgot to mention some other things that are different about my project.
>
>
>
> You can write a simple SPARQL query to get a list of all the TaxonConcepts
> that have ITIS IDs, or all those that have ITIS and NCBI IDs etc.
>
>
>
> You can do this on any SPARQL endpoint that hosts the data.
>
>
>
> You can download the entire data set and run the queries on your own
> endpoint.
>
>
>
> You can write a script that runs the query and downloads the ITIS numbers
> and exports them to CSV etc.
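
A minimal sketch of such a script, in Python with only the standard
library. The predicate URI below is a placeholder and not the actual
TaxonConcept vocabulary, and the endpoint URL is left as a parameter
because any endpoint hosting the data would do. It also assumes the
endpoint can return results in the standard SPARQL JSON results format.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Hypothetical predicate URI; the real TaxonConcept vocabulary may differ.
ITIS_PREDICATE = "http://example.org/vocab#hasITISTsn"

QUERY = f"""
SELECT ?concept ?tsn
WHERE {{ ?concept <{ITIS_PREDICATE}> ?tsn . }}
"""


def run_query(endpoint, query):
    """Send the query to a SPARQL endpoint via GET and return parsed JSON."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def results_to_csv(results):
    """Flatten SPARQL JSON results into CSV text, one row per binding."""
    columns = results["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for binding in results["results"]["bindings"]:
        # A variable may be unbound in a given row; emit an empty cell.
        writer.writerow([binding.get(c, {}).get("value", "") for c in columns])
    return buffer.getvalue()
```

Calling `results_to_csv(run_query(endpoint, QUERY))` with the URL of an
endpoint that hosts the data would give CSV text that can be written to a
file. Adding a second triple pattern for an NCBI predicate (also
hypothetical here) would restrict the list to concepts that have both IDs.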
>
>
>
> - Pete
>
>
>
> On Tue, May 31, 2011 at 5:16 PM, Peter DeVries <pete.devries at gmail.com>
> wrote:
>
> Hi Kevin,
>
> On Tue, May 31, 2011 at 3:27 PM, Kevin Richards <
> RichardsK at landcareresearch.co.nz> wrote:
>
> This is exactly why this problem still exists and will be very complex to
> solve - everyone says "we should have a single ID for a specific taxon name,
> there seems to be several IDs 'out there' that refer to the same taxon name,
> so I'm going to create another ID to link them all up" - yet another ID that
> no one will particularly want to follow - you would have to get everyone to
> agree that your combinations/integration of taxon names is the best one and
> hope everyone follows it - unlikely in this domain.
>
>
>
> Isn't this kind of what The Plant List and eBird already do?
>
>
>
> A difference being that they tie these to a specific name and specific
> classification.
>
>
>
> The Plant List is not really even open, so it is difficult for people to
> adopt it en masse.
>
>
>
> For instance, if I manage a herbarium, how do I easily reconcile my species
> list with the entities represented in the Plant List?
>
>
>
> eBird has millions of records which implies that they have been able to
> convince the observers in the field to adopt their system. You are correct
> in that there are probably a lot of taxonomists that don't like their list.
>
> It differs from many of the other classifications, but remember the system
> rewards them for not agreeing. Note the difference between the microbial
> taxonomists and other taxonomists. In the case of the microbial workers,
> the system rewards them for solving problems not debating
> alternatives. Also, if a good idea comes out that will make it easier for
> the microbiologists to solve the problems they are rewarded for solving,
> they are less likely to care whose idea it is.
>
>
>
> Like the microbiologists, there are lots of biologists that work with
> species with the goal of addressing some non-taxonomic problem.
>
>
>
> They don't really care if the name is *Aedes triseriatus* or *Ochlerotatus
> triseriatus, *but they do care that the identifier that they connect their
> data to is stable.
>
>
>
> In regards to the issue of market forces, I suspect (but have no knowledge
> of) that there were probably decisions made in devising these lists that
> have more to do with appeasing certain personalities than creating the best
> list. With the way this system rewards people it is likely that the
> "correct" version will float to the top only after that person has passed
> away. I don't have much faith that the best system will always float to the
> top. That has a lot to do with the personalities and how the system rewards
> are set up. Theoretically, it is possible for one strong personality or group
> to force others to adopt their less than optimal solution - at least this
> seems to happen in other environments.
>
>
>
> Also, there are all sorts of ways that people can use the publication
> record to rewrite history. Simply cite the review paper that cites the
> original paper. Or don't cite it at all.
>
>
>
> I would have used only the ITIS TSN but if the name changes the ID changes.
> This isn't "wrong", it just does not solve my problem.
>
>
>
> * ITIS also should add the spiders from the World Spider Catalog.
>
>
>
> Another issue that I think has inhibited adoption of a common list is that
> people can't agree on a particular name or a particular classification.
>
>
>
> Since you can model a species concept as having many names and many
> classifications why not do so?
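
As a rough illustration of that modelling choice, here is a small Python
sketch. The `SpeciesConcept` class, its field names and the concept ID
"example-concept-1" are all invented for the example and are not how
TaxonConcept.org actually stores its data. The two names are the mosquito
names discussed in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class SpeciesConcept:
    """One stable concept ID carrying many names and many classifications."""

    concept_id: str
    names: set = field(default_factory=set)
    classifications: list = field(default_factory=list)

    def matches(self, name):
        # Data tagged with any of the names resolves to the same concept.
        return name in self.names


# Placeholder ID; it stays the same whichever name is currently preferred.
concept = SpeciesConcept(concept_id="example-concept-1")
concept.names.update({"Aedes triseriatus", "Ochlerotatus triseriatus"})

# Two alternative placements, recorded side by side as (family, genus).
concept.classifications.append(("Culicidae", "Aedes"))
concept.classifications.append(("Culicidae", "Ochlerotatus"))
```

Someone who only needs a stable identifier can attach data to
`concept.concept_id` and ignore which of the two names wins.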
>
>
>
> If this idea was originally accepted, I would not have needed to create
> TaxonConcept.org.
>
>
>
> My plan has always been to get something that works to solve some problems
> and then let some larger group take it over.
>
>
>
> In a sense, I am more like the microbiologists in that I am not being paid
> to solve this or debate this problem.
>
>
>
> I am doing it because I think something like this is needed, and it is an
> interesting and personally rewarding puzzle.
>
>
>
> - Pete
>
>
>
>
>
>
> My thoughts are that the most likely way this will be solved is by standard
> market type pressures - ie the best solution/IDs will be used the most and
> "float" to the top.  It is easy to say that the global taxon name data is a
> mess, but if you think about it 30 years ago taxon name data were very
> disparate, duplicated, unconnected, many with NO IDs at all.  So I believe
> we are making progress and that we will continue to do so albeit at a fairly
> slow rate.
>
> Kevin
>
>
>
> "I agree. This was one of the reasons that I set up TaxonConcept the way I
> did. It attempts to connect both the LOD entities and the foreign key based
> entities."
>
>  Please consider the environment before printing this email
> Warning:  This electronic message together with any attachments is
> confidential. If you receive it in error: (i) you must not read, use,
> disclose, copy or retain it; (ii) please contact the sender immediately by
> reply email and then delete the emails.
> The views expressed in this email may not be those of Landcare Research New
> Zealand Limited. http://www.landcareresearch.co.nz
>
>
>
>
> --
>
> ------------------------------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> Email: pdevries at wisc.edu
> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
> Bases
> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>
> --------------------------------------------------------------------------------------
>
>
>
>
>
>



-- 
------------------------------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries at wisc.edu
TaxonConcept <http://www.taxonconcept.org/>  &
GeoSpecies<http://about.geospecies.org/> Knowledge
Bases
A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
--------------------------------------------------------------------------------------