[tdwg-content] Producing a global taxon register (was: ITIS TSNID to uBio NamebankIDs mapping)

Tony.Rees at csiro.au Tony.Rees at csiro.au
Sat Jun 4 01:04:46 CEST 2011


Hi all (jumping in with some trepidation...)

It's good to hear some ramp-up of activity may be coming in the GNUB space (congratulations, Rich et al.). My main concern, however, is that it does not solve my particular problem - which is, in a nutshell: given "any" cited taxonomic name, what can we tell about it - with regard to its classification, nomenclatural and taxonomic/synonym status, and certain attributes (initially for my use case, simple geologic time - is it extant or not - and simple habitat classification - is it marine or not - though of course infinitely expandable from there).

To me the vision of GNUB is too grand - to index all usages of all names in all sources - and the vision of GNI is too limited - to index the names but not actually record/harmonise/verify/manage (in a structured way) any associated information. I'm after something in between - what I have tentatively previously called HCAL - a hierarchical catalogue of all life (presuming that at least one "management" hierarchy is incorporated) - or maybe just a GTR - global taxon register. In effect, I am waiting for the Catalogue of Life and/or ITIS to be complete, for both extant and fossil taxa, and also to incorporate selected "taxon attributes" as above. (This is the space into which my IRMNG database is cast as a preliminary/"working for now" solution, but obviously without the significant resourcing / community cooperation required to build and sustain the thing for the long term).

So my question is, how can such a product emerge from ongoing developments in GN* space, or other...

Over to the experts,

Best - Tony

________________________________________
From: tdwg-content-bounces at lists.tdwg.org [tdwg-content-bounces at lists.tdwg.org] On Behalf Of Richard Pyle [deepreef at bishopmuseum.org]
Sent: Saturday, 4 June 2011 8:48 AM
To: tdwg-content at lists.tdwg.org
Subject: Re: [tdwg-content] ITIS TSNID to uBio NamebankIDs mapping

Working backwards through this thread...

I hadn't read Dima's post until just now, and I see that at least a couple of his points (i.e., #2, #5, #6) apply to exposing the UUIDs externally. However, I think that a simple protocol (such as replacing spaces with "_", and avoiding characters that look the same but are different -- such as the Cyrillic 'a') could go a long way to mitigating those problems.
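
A minimal Python sketch of such a protocol might look like the following. It is only an illustration: the function names (to_url_token, suspicious_characters) are hypothetical, not anything GNI provides, and "any letter that is not Latin-script" is a crude stand-in for a proper confusable-character check.

```python
import unicodedata


def to_url_token(name_string):
    """Collapse runs of whitespace and join the parts with underscores."""
    return "_".join(name_string.split())


def suspicious_characters(name_string):
    """Return the letters whose Unicode name does not mention LATIN.

    This is a rough heuristic for look-alike characters such as the
    Cyrillic 'a'; it would also flag legitimate non-Latin text.
    """
    flagged = []
    for ch in name_string:
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
            flagged.append(ch)
    return flagged


latin = "Betula"
mixed = "Betul\u0430"  # last letter is CYRILLIC SMALL LETTER A

print(to_url_token("Danaus plexippus (Linnaeus 1758)"))
print(suspicious_characters(latin))
print(suspicious_characters(mixed))
```

A name string that returns a non-empty list could be rejected or queued for review before it is ever exposed as part of a URL.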

On the other hand, it really depends on what the identifier is for.  The string "Danaus_plexippus_(Linnaeus_1758)" may be more friendly to our eyes, but "A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523" is definitely more friendly to a computer (Dima's points 1, 3 & 4, among others).  My feeling is that the push for GUIDs is more about enabling computer-computer conversations than it is about enabling human-human or human-computer interactions; and therefore we should not get bogged down in the "ugliness" of the identifiers.  In the context of electronic data services, the "ugliness" potential of the "Danaus_plexippus_(Linnaeus_1758)" approach to identifiers is far greater than the ugliness potential of "A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523", when it comes to interlinking electronic biodiversity data.  It is nothing for a computer to render relevant metadata of the object identified by "A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523" into "Danaus plexippus (Linnaeus 1758)" on a computer screen or piece of paper for human-eyeball consumption.  But there are many pitfalls (some noted by Dima) for a computer to unambiguously resolve "Danaus_plexippus_(Linnaeus_1758)" back to a meaningful data object.

I guess my revised point is:  GNI (and uBio/NameBank) are essentially the only taxonomic databases out there where a human-friendly persistent/actionable identifier of the sort being discussed is even plausible as an option.  It may not even be wise in this context (as per Dima's points), but it *might* be, depending on the need for a human-friendly identifier.

Maybe the simplest thing to do would be to not regard "http://gni.globalnames.org/name_strings/Danaus_plexippus_(Linnaeus_1758)" as an identifier per se, but rather as a protocol for a web service.  In other words, if you append a text string to the root URL "http://gni.globalnames.org/name_strings/", GNI would run that text string against its index and return whatever metadata based on a text-string match.  This is not mutually exclusive with an "identifier" in the form of "http://gni.globalnames.org/name_strings/A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523", that would less ambiguously resolve a known record in GNI.  At this point, the line between "identifier" and "service" gets fuzzy, of course.  But the analogy is true in ZooBank:

The persistent "Identifier" looks like this:
A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523

One way that this identifier can be represented as an *actionable* identifier is this:
urn:lsid:zoobank.org:act:A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523

Another "actionable" form of the identifier might be this:
http://zoobank.org/urn:lsid:zoobank.org:act:A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523

or this:
http://zoobank.org/A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523

or even this(?):
http://lsid.tdwg.org/urn:lsid:zoobank.org:act:A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523

(all of which work, by the way)
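
To show that these are all mechanical wrappers around the same bare identifier, here is a small Python sketch that derives each form listed above from the UUID alone. The function name and dictionary keys are made up for illustration; only the URL patterns come from the examples above.

```python
import uuid


def actionable_forms(bare_id):
    """Build the actionable forms listed above from a bare ZooBank UUID."""
    # uuid.UUID() validates the string; str() yields lowercase, so
    # upper-case it again to match the style used in this message.
    canonical = str(uuid.UUID(bare_id)).upper()
    lsid = "urn:lsid:zoobank.org:act:" + canonical
    return {
        "bare": canonical,
        "lsid": lsid,
        "zoobank_lsid_url": "http://zoobank.org/" + lsid,
        "zoobank_uuid_url": "http://zoobank.org/" + canonical,
        "tdwg_proxy_url": "http://lsid.tdwg.org/" + lsid,
    }


for label, value in actionable_forms("A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523").items():
    print(label, value)
```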

However, the following are examples of what I would think of as *services*:
http://www.google.com/search?q=Danaus+plexippus+(Linnaeus+1758)
http://lsid.tdwg.org/summary/urn:lsid:zoobank.org:act:A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523
http://darwin.zoology.gla.ac.uk/~rpage/lsid/tester/?q=urn:lsid:zoobank.org:act:A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523&submit=Go

But really, from the perspective of the end-user, does it matter if it's an identifier or a service?  Ultimately, they ask the questions, and the answers appear on their computer screens.
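
As a rough sketch of how one root URL could serve both roles, the Python below treats the trailing part of the URL as an identifier when it parses as a UUID, and as a text-string search otherwise. This is an assumption about how such a dispatcher might be written, not a description of how GNI actually behaves; classify_request is a hypothetical name.

```python
import uuid
from urllib.parse import unquote

ROOT = "http://gni.globalnames.org/name_strings/"


def classify_request(url):
    """Return ("identifier", uuid) or ("search", name string) for a URL."""
    if not url.startswith(ROOT):
        raise ValueError("URL is not under the name_strings root")
    tail = unquote(url[len(ROOT):])
    try:
        # uuid.UUID raises ValueError for anything that is not a UUID.
        return ("identifier", str(uuid.UUID(tail)))
    except ValueError:
        # Fall back to a text match, undoing the space-to-underscore rule.
        return ("search", tail.replace("_", " "))


print(classify_request(ROOT + "A9F435E0-8ED7-46DD-BAB4-EA8E5BF41523"))
print(classify_request(ROOT + "Danaus_plexippus_(Linnaeus_1758)"))
```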

Aloha,
Rich





> -----Original Message-----
> From: tdwg-content-bounces at lists.tdwg.org [mailto:tdwg-content-
> bounces at lists.tdwg.org] On Behalf Of Dmitry Mozzherin
> Sent: Friday, June 03, 2011 4:34 AM
> To: David Remsen (GBIF)
> Cc: tdwg-content at lists.tdwg.org; Dmitry Mozzherin; Orrell, Thomas; Alan J
> Hampson; Nicolson, David; Gerald Guala
> Subject: Re: [tdwg-content] ITIS TSNID to uBio NamebankIDs mapping
>
> In my opinion UUIDs have a few advantages over strings --
>
> 1. It is uuid, so it will work with uuid tools (current and future ones)
> 2. It is less  ambiguous -- For example -- what is the difference between Betulа and
> Betula for your eyes? (one of them has a Cyrillic 'a')
> 3. Database wise it is faster to search because it is just a 128bit number, while
> a name is at least 245 byte varchar -- it makes searching much faster because
> in relational databases the size of keys directly affects the search
> speed
> 4. UUID v. 5
> (http://en.wikipedia.org/wiki/Universally_unique_identifier)
> allows one to generate a UUID algorithmically without looking up a database (no
> need for a network connection)
> 5. Links like http://gni.globalnames.org/name_strings/Danaus_plexippus_(Linnaeus_1758) might be ambiguous -- I can think of several ways I can represent the name string
> part in the url and they will all resolve to the same thing in GNI.
> 6. Unescaped unicode characters in url containing literal name strings (people
> will forget to escape them) will depend on an implementation of a url
> resolver
>
> That said, links like
> http://gni.globalnames.org/name_strings/Danaus_plexippus_(Linnaeus_1758)
> are definitely attractive, and it is good to have them as another way to access
> a name!
> My personal preference would be not to use them as the main identifier, because
> of reasons 1, 2, 3 and 5.
>
> Dima
>
>
>
>
> On Fri, Jun 3, 2011 at 7:59 AM, David Remsen (GBIF) <dremsen at gbif.org>
> wrote:
> > Why not use the name as the basis for the resolvable identifier
> > instead of a uuid. Isn't there a 1:1 cardinality between the name and
> > the uuid in the GNI?  Doesn't that mean that
> >
> > http://gni.globalnames.org/name_strings/4ef223c4-0c3e-5e84-ace9-755c34c601ec
> > and
> >
> > http://gni.globalnames.org/name_strings/Danaus_plexippus_(Linnaeus_1758)
> >
> > are equally unique?  The latter is certainly more readable.  In those
> > cases where the namestring is a homonym like
> >
> > http://gni.globalnames.org/name_strings/Oenanthe
> >
> > couldn't you just return the addresses of the two globally unique
> > forms of the name when you resolve it?
> >
> > http://gni.globalnames.org/name_strings/Oenanthe_Smith_1899
> >
> > http://gni.globalnames.org/name_strings/Oenanthe_Jones_1900
> >
> > Wouldn't those be as globally unique and easier to read and adjust to?
> > Or am I missing something.  I always wanted to do that with ubio IDs
> > after a back and forth with Gregor Hagedorn and wished we hadn't
> > exposed those integers.
> >
> > DR
> >
> >> Hi Steve,
> >>
> >> I don't have time to go through this in detail, and I can't speak for
> >> the GNI, but I can tell you about how the GNI URI's work at least for now.
> >>
> >> A while back Dima Mozzherin and I were looking into how triples etc.
> >> might be of use to the GNI.
> >>
> >> We needed a way to generate unique URI's for each name.
> >>
> >> We wanted to avoid having to keep these in sync and not require
> >> everyone to look each ID up through some service.
> >>
> >> Dima came up with the following plan. We use the namestring as seed
> >> to generate a unique UUID.
> >>
> >> Basically this is a shared algorithm which the GNI and TaxonConcept
> >> both use. But it could be used by anyone.
> >>
> >> You feed the name string to the algorithm and it spits out a UUID. We
> >> then append that to a URI and web service so it is resolvable.
> >>
> >> So the name Danaus plexippus (Linnaeus 1758) =>
> >> 4ef223c4-0c3e-5e84-ace9-755c34c601ec
> >>
> >> So if the GNI and another group have the same namestring they
> >> have the same UUID.
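
A minimal Python sketch of this kind of name-based (version 5) UUID generation follows. The namespace used here, derived from the DNS name "globalnames.org", is an assumption and is not confirmed anywhere in this thread, so the output should not be expected to reproduce the UUID quoted above unless the real algorithm uses the same namespace and the exact same name string. The function names are hypothetical.

```python
import uuid

# ASSUMPTION: a namespace UUID derived from the DNS name "globalnames.org".
# The thread does not state which namespace the shared algorithm uses.
GN_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "globalnames.org")

BASE_URI = "http://gni.globalnames.org/name_strings/"


def name_uuid(name_string):
    """Derive a deterministic version-5 UUID from a name string."""
    return uuid.uuid5(GN_NAMESPACE, name_string)


def name_uri(name_string):
    """Append the derived UUID to the base URI so it can be resolved."""
    return BASE_URI + str(name_uuid(name_string))


first = name_uuid("Danaus plexippus (Linnaeus 1758)")
second = name_uuid("Danaus plexippus (Linnaeus 1758)")
print(first == second)   # same input, same UUID, no database lookup
print(first.version)     # name-based SHA-1 UUIDs report version 5
print(name_uri("Danaus plexippus (Linnaeus 1758)"))
```

Because the UUID depends only on the namespace and the exact characters of the name string, two groups get the same identifier only if they agree on both; a stray space or a look-alike character yields a different UUID.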
> >>
> >> People can then link their data set to the GNI with the following
> >> URI
> >>
> >> http://gni.globalnames.org/name_strings/4ef223c4-0c3e-5e84-ace9-755c34c601ec
> >>
> >> RDF
> >> http://gni.globalnames.org/name_strings/4ef223c4-0c3e-5e84-ace9-755c34c601ec.rdf
> >>
> >> If you think of your data set as one table and the GNI
> >> as another, this URI serves as the foreign key that connects them
> >> together.
> >>
> >> Some on the list don't like how these look, but there is a tremendous
> >> advantage in not having to worry about syncing two large data sets
> >> and determining if a given integer is already in use.
> >>
> >> Also, Rod Page has written recently about UUIDs.
> >> http://iphylo.blogspot.com/2011/05/zoobank-on-couchdb-uuids-replication.html
> >>
> >> There may be a way to do something similar with bit.ly-like
> >> identifiers that are shorter (mCcSp), but I think the general idea
> >> is a good one.
> >>
> >> If you recall from my talk at TDWG, I was able to use these to make
> >> statements that one namestring was a synonym etc. of another etc.
> >>
> >> The algorithm we use is written in Ruby but it could be ported to many
> >> different languages since UUIDs are widely supported.
> >>
> >> Respectfully,
> >>
> >> - Pete
> >>
> >>
> >>
> >> On Thu, Jun 2, 2011 at 11:41 PM, Steven J. Baskauf <
> >> steve.baskauf at vanderbilt.edu> wrote:
> >>
> >>>  My email access has been sporadic since this thread developed, so
> >>> at this point I'll respond to points made in several of the
> >>> messages.
> >>>
> >>> First, I should note that there has been previous discussion on this
> >>> list on a similar topic from
> >>> http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002231.html
> >>> through
> >>> http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002231.html.
> >>> One can review what was said at that time rather quickly by starting
> >>> on the first linked message and clicking on the "Next Message" link
> >>> until you get to the end of the range I gave above.
> >>>
> >>> My reason for the request for information that started this thread
> >>> was that I wanted to link to a URI that would anchor the name
> >>> portion of a name/sensu pair (TNU or Taxon Concept a la TCS if you
> >>> prefer) as in this RDF
> >>> snippet:
> >>>
> >>>    <tc:nameString>Quercus rubra L.</tc:nameString>
> >>>    <tc:hasName rdf:about="http://www.ubio.org/authority/metadata.php?lsid=urn:lsid:ubio.org:namebank:448439"/>
> >>>
> >>>
> >>> At this point in the discussion, I'm not actually talking about
> >>> creating a link to a taxon concept but rather to a taxon name, so
> >>> some of the issues Pete raised don't apply here (e.g. what's the
> >>> "right" name for a concept - the question here is simply what's a
> >>> stable identifier for the name).  In principle, I could probably
> >>> just provide the name string and be done with it.  However, having
> >>> some degree of faith that Smart, Computer Savvy People might some
> >>> day be able to use the metadata returned by the URI (or perhaps
> >>> metadata which they already have in a triple store onsite) to do
> >>> cool things like knowing that my name is the same as an
> >>> orthographic variant or that "Quercus rubra  L." is basically the
> >>> same thing as "Quercus rubra", I would like to also provide a
> >>> functional URI.
> >>>
> >>> As an end-user who isn't very interested in the technical issues
> >>> involving names, I don't really care what URI I use.  I would prefer
> >>> for it to be widely recognized and for it to "work" (i.e. be
> >>> resolvable).  In the earlier (January) thread, there was discussion
> >>> about existing identifiers.  There were a number of posts, but in
> >>> particular
> >>> http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002258.html
> >>> and
> >>> http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002259.html
> >>> discussed the relative merits of ITIS and uBio ID numbers.  My
> >>> take-home message from this was that uBio represented the largest
> >>> single set of names with assigned identifiers (see
> >>> http://gni.globalnames.org/data_sources cited in Pete's email) and
> >>> that uBio metadata provides useful references.
> >>> Hence my interest in referencing uBio ids as a URI.  However, as a
> >>> practical matter, the organizations that I share images with either
> >>> want ITIS TSNs (EOL and Morphbank) or just names (Discover Life).
> >>> Nobody is asking for uBio identifiers or any other identifier.
> >>>
> >>> I found Kevin's comment at
> >>> http://lists.tdwg.org/pipermail/tdwg-content/2011-May/002486.html
> >>> very thought-provoking: "My thoughts are that the most likely way this
> >>> will be solved is by standard market type pressures - ie the best
> >>> solution/IDs will be used the most and 'float' to the top."  I'm not
> >>> going to make a judgment about what is the "best" solution or ID.
> >>> But I would say that in "computer"
> >>> history, being the "best" doesn't necessarily mean that something
> >>> will be used.  Take for example, the FOAF vocabulary.  What the heck
> >>> is Friend of a Friend?  I would venture to say that most of the
> >>> people using the FOAF vocabulary don't know or care.  The FOAF
> >>> vocabulary was the one that people started to use and once that
> >>> happened, people didn't switch even if there was something better.
> >>> I'm not familiar with the history of other stuff like YouTube and
> >>> Craig's List, but I would guess that they weren't necessarily "the
> >>> best" systems - they were just the one that the most people started
> >>> using first and once that happened, people didn't switch.  I'm using
> >>> ITIS IDs because they are easy to get and the people I communicate
> >>> with want them.  Whether they are the "best" or "done correctly"
> >>> doesn't matter to me as much as the fact that they are widely
> >>> recognized and stable (and that thus far every name that I've looked
> >>> for has been in their database).
> >>>
> >>> I think that one reason why this question has been on my mind is
> >>> that I've been waiting for GNUB (Global Name Use Bank) to come out.
> >>> I'm not really up on how it is going to work, but my impression is
> >>> that it was going to be based on the Global Name Index (GNI) which
> >>> was mentioned in that earlier January thread.  At that point, the
> >>> GNI names didn't have any identifiers that were exposed to the
> >>> public as permanent GUIDs.  I'm assuming that if GNUB refers to GNI
> >>> names, they will have some kind of identifiers.  So if that happens
> >>> how is the GUID recommendation 8 going to be followed?  As Kevin
> >>> said in
> >>> http://lists.tdwg.org/pipermail/tdwg-content/2011-June/002499.html
> >>> "What I take from recommendation 8 of the GUID applicability guide
> >>> ... is that if you DON'T already have a record in your own database
> >>> for a taxon name/concept, then reuse an existing one.  "  What we
> >>> have here with GNI is a situation where none of the records have
> >>> identifiers.  In my mind, the "best practice" according to
> >>> recommendation 8 would be for the GNI to reuse existing identifiers
> >>> where they exist and NOT make up new ones.  This is a bit more
> >>> complicated because the ITIS identifiers (which are in common use)
> >>> don't have an http URI version that is resolvable, and while the
> >>> uBio identifiers have a resolvable http URI, it's in the form of a
> >>> proxied LSID, which I've already complained is very ugly.  So I'd
> >>> like to hear some ideas about how to have "reused" identifiers in
> >>> the GNI.
> >>>
> >>> One thing that comes to my mind would be to have a "domain name"
> >>> like "http://purl.org/gni/" or "http://purl.org/tn/" ("tn" for
> >>> "taxon name") and to follow it with a namespace/id combination
> >>> similar to what is done with lsids.  So for example "itis/19408"
> >>> and "ubio/448439" could be appended, creating
> >>> http://purl.org/gni/itis/19408 and
> >>> http://purl.org/gni/ubio/448439 for "Quercus rubra  L."  Both URIs
> >>> could point to the same RDF and that RDF could indicate that the two
> >>> identifiers are owl:sameAs .  I realize from what Bob Morris has
> >>> cautioned in the past that there are problems with owl:sameAs when
> >>> the two things aren't actually the same thing (e.g. if the uBio ID
> >>> refers to a name string only but the ITIS TSN refers to the name
> >>> plus an "accepted" status and a relationship to parent taxa).
> >>> However, if there were an understanding that the GNI only refers to
> >>> name strings, then one could still refer to
> >>> http://purl.org/gni/itis/19408 as an identifier for the name string
> >>> of the thing (whatever it is) that is referred to by an ITIS TSN of
> >>> 19408.  I don't think there would be a problem saying that and the
> >>> ubio ID were "owl:sameAs".  Some kind of solution like this would
> >>> allow people to easily generate a resolvable URI for a name if they
> >>> were using ITIS TSNs or uBio IDs.  If the name that one wanted to
> >>> use was so obscure that it was one of the 9.5 million names that
> >>> uBio has that ITIS doesn't have, then that name would only have the
> >>> ubio version.  I have no idea whether this would be a good idea or
> >>> not, but I was really cringing to think about 19 million newly
> >>> minted UUIDs appended to "http://gni.globalnames.org/" and
> >>> figuring out how to connect those horrid things to the names and
> >>> ITIS TSNs that I'm already using.  I think that I said this before,
> >>> but using the purl.org domain rather than one like
> >>> http://gni.globalnames.org/ would in the future allow somebody else
> >>> to take over management of providing the metadata when the GUIDs are
> >>> resolved without having to deal with issues of who "owns" the domain
> >>> name.
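
The proposal above can be sketched in a few lines of Python. This is only an illustration of the idea: the http://purl.org/gni/ base is the hypothetical domain suggested in the message (it is not known to resolve), the function names are made up, and the owl:sameAs statement carries the caveats already noted about whether the two identifiers really denote the same thing.

```python
# Hypothetical base URI taken from the proposal; not a known live service.
BASE = "http://purl.org/gni/"
KNOWN_NAMESPACES = {"itis", "ubio"}
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def reused_uri(namespace, local_id):
    """Build a URI of the form BASE + namespace + "/" + existing ID."""
    if namespace not in KNOWN_NAMESPACES:
        raise ValueError("unknown namespace: " + namespace)
    return BASE + namespace + "/" + str(local_id)


def same_as_triple(uri_a, uri_b):
    """Return an N-Triples line asserting owl:sameAs between two URIs."""
    return "<" + uri_a + "> <" + OWL_SAME_AS + "> <" + uri_b + "> ."


itis_uri = reused_uri("itis", 19408)    # ITIS TSN for "Quercus rubra  L."
ubio_uri = reused_uri("ubio", 448439)   # uBio NameBank ID for the same name
print(itis_uri)
print(ubio_uri)
print(same_as_triple(itis_uri, ubio_uri))
```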
> >>>
> >>> Steve
> >>>
> >>>
> >>>
> >>> Kevin Richards wrote:
> >>>
> >>>  Pete,
> >>>
> >>> I'm not trying to say what you are doing is a waste of time/impossible.
> >>> I actually think RDF + semantics are a good way forward, but this
> >>> really implies that we need to rely on the semantics and linkages
> >>> rather than having a SINGLE ID for a taxon name.  (which is what I
> >>> thought Steve was getting at).  Each instance of a taxon name can
> >>> have its own ID and then all these instances are connected via
> >>> ontology defined semantic links.  This seems more appropriate to me
> >>> than insisting everyone uses the "Global Taxon Name ID X".
> >>>
> >>>
> >>>
> >>> In your example of *Aedes triseriatus* and *Ochlerotatus
> >>> triseriatus* - these are two different names so they need two
> >>> different IDs, they may be linked by a single taxon concept, but
> >>> they are separate names.  So which of these now 3 IDs do you expect
> >>> people to use, and according to what source??
> >>>
> >>>
> >>>
> >>> For example, if we have a name, e.g. the Robin, Erithacus rubecula,
> >>> mentioned in ITIS (TSN: 559964) and also in EOL
> >>> (www.eol.org/pages/1051567), also in GBIF
> >>> (http://data.gbif.org/species/21266780), also in avibase
> >>> (http://avibase.bsc-eoc.org/species.jsp?avibaseid=C809B2B90399A43D),
> >>> which ID are you hoping people will use??  Would you put the ITIS ID
> >>> in your own dataset as the ID for that name - unlikely.  Or would it
> >>> be better to link them up with semantic linkages.
> >>>
> >>>
> >>>
> >>> What I take from recommendation 8 of the GUID applicability guide (as
> >>> Steve puts it, "stop making up new identifiers when somebody else
> >>> already has one for the thing you are talking about") is that if you
> >>> DON'T already have a record in your own database for a taxon
> >>> name/concept, then reuse an existing one.  NOT ditch all your current
> >>> IDs and adopt someone else's (especially hard considering it is so
> >>> hard to work out which of the multitude of names and concept IDs
> >>> directly relates to your taxon name).
> >>>
> >>>
> >>>
> >>> I am all for limiting the number of IDs for the "same" thing, but in
> >>> some cases it is more useful to build linkages than force this tight
> >>> integration of data and IDs.  Especially for taxon names and concepts,
> >>> where it is complex to define if you are even talking about the "same"
> >>> thing or not.
> >>>
> >>>
> >>>
> >>> Kevin
> >>>
> >>>
> >>>
> >>> *From:* Peter DeVries [mailto:pete.devries at gmail.com]
> >>>
> >>> *Sent:* Wednesday, 1 June 2011 12:38 p.m.
> >>> *To:* Kevin Richards
> >>> *Cc:* Steve Baskauf; tdwg-content at lists.tdwg.org; Gerald Guala;
> >>> Nicolson,
> >>> David; Alan J Hampson; Orrell, Thomas
> >>> *Subject:* Re: [tdwg-content] ITIS TSNID to uBio NamebankIDs mapping
> >>>
> >>>
> >>>
> >>> Hi Kevin,
> >>>
> >>>
> >>>
> >>> I forgot to mention some other things that are different about my
> >>> project.
> >>>
> >>>
> >>>
> >>> You can write a simple SPARQL query to get a list of all the
> >>> TaxonConcepts that have ITIS IDs, or all those that have ITIS and
> >>> NCBI IDs, etc.
> >>>
> >>> You can do this on any SPARQL endpoint that hosts the data.
> >>>
> >>> You can download the entire data set and run the queries on your own
> >>> endpoint.
> >>>
> >>> You can write a script that runs the query and downloads the ITIS
> >>> numbers and exports them to CSV etc.
> >>>
> >>>
> >>>
> >>> - Pete
> >>>
> >>>
> >>>
> >>> On Tue, May 31, 2011 at 5:16 PM, Peter DeVries <pete.devries at gmail.com>
> >>> wrote:
> >>>
> >>> Hi Kevin,
> >>>
> >>> On Tue, May 31, 2011 at 3:27 PM, Kevin Richards <
> >>> RichardsK at landcareresearch.co.nz> wrote:
> >>>
> >>> This is exactly why this problem still exists and will be very complex
> >>> to solve - everyone says "we should have a single ID for a specific
> >>> taxon name, there seems to be several IDs 'out there' that refer to
> >>> the same taxon name, so I'm going to create another ID to link them
> >>> all up" - yet another ID that no one will particularly want to follow
> >>> - you would have to get everyone to agree that your
> >>> combinations/integration of taxon names is the best one and hope
> >>> everyone follows it - unlikely in this domain.
> >>>
> >>>
> >>>
> >>> Isn't this kind of what The Plant List and eBird already do?
> >>>
> >>>
> >>>
> >>> A difference being that they tie these to a specific name and specific
> >>> classification.
> >>>
> >>>
> >>>
> >>> The Plant List is not really even open, so it is difficult for people to
> >>> adopt it en masse.
> >>>
> >>>
> >>>
> >>> For instance, if I manage a herbarium, how do I easily reconcile my
> >>> species list with the entities represented in the Plant List?
> >>>
> >>>
> >>>
> >>> eBird has millions of records which implies that they have been able to
> >>> convince the observers in the field to adopt their system. You are
> >>> correct in that there are probably a lot of taxonomists that don't like
> >>> their list.
> >>>
> >>> It differs from many of the other classifications, but remember the
> >>> system rewards them for not agreeing. Note the difference between the
> >>> microbial taxonomists and other taxonomists. In the case of the
> >>> microbial workers, the system rewards them for solving problems not
> >>> debating alternatives. Also, if a good idea comes out that will make it
> >>> easier for the microbiologists to solve the problems they are rewarded
> >>> for solving, they are less likely to care whose idea it is.
> >>>
> >>>
> >>>
> >>> Like the microbiologists, there are lots of biologists that work with
> >>> species with the goal of addressing some non-taxonomic problem.
> >>>
> >>>
> >>>
> >>> They don't really care if the name is *Aedes triseriatus* or
> >>> *Ochlerotatus triseriatus*, but they do care that the identifier that
> >>> they connect their data to is stable.
> >>>
> >>>
> >>>
> >>> In regards to the issue of market forces, I suspect (but have no
> >>> knowledge of) that there were probably decisions made in devising these
> >>> lists that have more to do with appeasing certain personalities than
> >>> creating the best list. With the way this system rewards people it is
> >>> likely that the "correct" version will float to the top only after that
> >>> person has passed away. I don't have much faith that the best system
> >>> will always float to the top. That has a lot to do with the
> >>> personalities and how the system rewards are set up. Theoretically, it
> >>> is possible for one strong personality or group to force others to adopt
> >>> their less than optimal solution - at least this seems to happen in
> >>> other environments.
> >>>
> >>>
> >>>
> >>> Also, there are all sorts of ways that people can use the publication
> >>> record to rewrite history. Simply cite the review paper that cites the
> >>> original paper. Or don't cite it at all.
> >>>
> >>>
> >>>
> >>> I would have used only the ITIS TSN but if the name changes the ID
> >>> changes. This isn't "wrong", it just does not solve my problem.
> >>>
> >>>
> >>>
> >>> * ITIS also should add the spiders from the World Spider Catalog.
> >>>
> >>>
> >>>
> >>> Another issue that I think has inhibited adoption of a common list is
> >>> that people can't agree on a particular name or a particular
> >>> classification.
> >>>
> >>>
> >>>
> >>> Since you can model a species concept as having many names and many
> >>> classifications why not do so?
> >>>
> >>>
> >>>
> >>> If this idea was originally accepted, I would not have needed to create
> >>> TaxonConcept.org.
> >>>
> >>>
> >>>
> >>> My plan has always been to get something that works to solve some
> >>> problems and then let some larger group take it over.
> >>>
> >>>
> >>>
> >>> In a sense, I am more like the microbiologists in that I am not being
> >>> paid to solve this or debate this problem.
> >>>
> >>> I am doing it because I think something like this is needed, and it is
> >>> an interesting and personally rewarding puzzle.
> >>>
> >>>
> >>>
> >>> - Pete
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> My thoughts are that the most likely way this will be solved is by
> >>> standard market type pressures - ie the best solution/IDs will be used
> >>> the most and "float" to the top.  It is easy to say that the global
> >>> taxon name data is a mess, but if you think about it 30 years ago taxon
> >>> name data were very disparate, duplicated, unconnected, many with NO
> >>> IDs at all.  So I believe we are making progress and that we will
> >>> continue to do so albeit at a fairly slow rate.
> >>>
> >>> Kevin
> >>>
> >>>
> >>>
> >>> "I agree. This was one of the reasons that I set up TaxonConcept the way
> >>> I did. It attempts to connect both the LOD entities and the foreign key
> >>> based entities."
> >>>
> >>>  Please consider the environment before printing this email
> >>> Warning:  This electronic message together with any attachments is
> >>> confidential. If you receive it in error: (i) you must not read, use,
> >>> disclose, copy or retain it; (ii) please contact the sender immediately
> >>> by reply email and then delete the emails.
> >>> The views expressed in this email may not be those of Landcare Research
> >>> New Zealand Limited. http://www.landcareresearch.co.nz
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> ------------------------------------------------------------------------------------
> >>> Pete DeVries
> >>> Department of Entomology
> >>> University of Wisconsin - Madison
> >>> 445 Russell Laboratories
> >>> 1630 Linden Drive
> >>> Madison, WI 53706
> >>> Email: pdevries at wisc.edu
> >>> TaxonConcept <http://www.taxonconcept.org/>  &
> >>> GeoSpecies <http://about.geospecies.org/> Knowledge Bases
> >>> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
> >>>
> >>> --------------------------------------------------------------------------------------
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Steven J. Baskauf, Ph.D., Senior Lecturer
> >>> Vanderbilt University Dept. of Biological Sciences
> >>>
> >>> postal mail address:
> >>> VU Station B 351634
> >>> Nashville, TN  37235-1634,  U.S.A.
> >>>
> >>> delivery address:
> >>> 2125 Stevenson Center
> >>> 1161 21st Ave., S.
> >>> Nashville, TN 37235
> >>>
> >>> office: 2128 Stevenson Center
> >>> phone: (615) 343-4582,  fax: (615) 343-6707
> >>> http://bioimages.vanderbilt.edu
> >>>
> >>>
> >>
> >>
> >> _______________________________________________
> >> tdwg-content mailing list
> >> tdwg-content at lists.tdwg.org
> >> http://lists.tdwg.org/mailman/listinfo/tdwg-content
> >>
> >
> >
> >
> > ----------------------------------------------------------------------------
> > David Remsen, Senior Programme Officer
> > Electronic Catalog of Names of Known Organisms
> > Global Biodiversity Information Facility Secretariat
> > Universitetsparken 15, DK-2100 Copenhagen, Denmark
> > Tel: +45-35321472   Fax: +45-35321480
> > Skype: dremsen
> > ----------------------------------------------------------------------------
> >
> >
> >

