Different reasons for different GUIDs

Tue Sep 13 21:57:21 CEST 2005

> Hmmm, I'm struggling to see how adding name-usage instances (of which
> there will be an awful lot more than just names)

Absolutely!  Sometimes hundreds or thousands of usages per name.  But if
computers are good for anything, they're good for managing large volumes of
information.

> will make this diagram
> any less complex. In fact, the names in the diagram could be regarded
> as name-usages (if I just tack on the literature source), omitting
> subsequent usages of a name.

It's not less complex in a visual representation sense, just in a
GUID-management sense.

> I take it you are limiting the mapping of Taxon Name GUIDs to be
> between GUIDs for the "same" name in different databases?

I believe that was the point that Peter was on about, yes.  True "duplicate"
GUIDs.

> I say "same"
> because, leaving taxon-concepts aside, the presence of homonyms between
> the codes makes mapping names as strings problematic

Indeed!  One of the main reasons for using GUIDs instead of text strings in
the first place!!
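
To make the homonym problem concrete, here is a minimal Python sketch.  The
TaxonName record, its field names, and the randomly generated GUIDs are all
assumptions made up for illustration, not part of any existing schema.  It
uses the genus name "Ficus", which exists under both the botanical Code
(figs) and the zoological Code (a gastropod genus), to show that a lookup by
text string is ambiguous while a lookup by GUID is not.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonName:
    """Hypothetical name record; the field names are illustrative only."""
    guid: str          # opaque identifier (here just a random UUID)
    name_string: str   # the name as a text string
    code: str          # nomenclatural Code the name falls under


def index_by_string(names):
    """Group name records by their text string."""
    by_string = {}
    for n in names:
        by_string.setdefault(n.name_string, []).append(n)
    return by_string


# Two distinct name objects that share the same string across Codes.
names = [
    TaxonName(str(uuid.uuid4()), "Ficus", "botanical"),
    TaxonName(str(uuid.uuid4()), "Ficus", "zoological"),
]

by_guid = {n.guid: n for n in names}   # one entry per name object
by_string = index_by_string(names)     # one entry per text string

# Strings that resolve to more than one name object are ambiguous.
ambiguous = {s: ns for s, ns in by_string.items() if len(ns) > 1}

for s, ns in ambiguous.items():
    print(s, "->", [n.code for n in ns])
```

The string index collapses both records under one key, whereas the GUID
index keeps them apart, which is the whole reason for mapping GUID to GUID
rather than string to string.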

> I should point out that the picture I attached is a particularly messy
> example where the plant species Oxylobium lineare has been moved to a
> new genus (Gastrolobium), but the resulting binomial (G. lineare)
> already exists (although it is treated as a synonym of G.
> callistachys). Consequently, a new species epithet was created,
> resulting in G. ebrachteolatum. I found this out the hard way when
> looking at TreeBASE names, trying to figure out how come a sequence for
> Gastrolobium ebrachteolatum was listed in GenBank as being from
> Oxylobium lineare. Furthermore, these are just the links represented in
> MOBOT. There should be additional links (for example, Oxylobium
> linearis and Callistachys linearis must be synonyms).

I'd imagine hundreds or thousands of cross-links via usage instances.  But
the point is that the Name-usage instance is the currency by which taxon
names are dynamic at all.

Some people get fidgety when they try to contemplate all the name-usages of
all the name objects (sort of like contemplating the billions and billions
of stars within each of the billions and billions of galaxies).  But the
system doesn't need to be complete to be useful.  It would grow gradually
over time as people needed to reference Name-usage instances.  In the
simplest form, people might start by plugging in only those name-usage
instances that constitute Code-compliant original descriptions for the taxa
in their own area of interest.  The botanists might further plug in those
name-usage instances that constitute new combinations.  When someone works
out a complete revision of a whole group, with robust synonymies, they might
go ahead and plug in the Name-usage instances covered by their work.
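
As a rough illustration of how such a store could grow gradually, here is a
minimal Python sketch.  Everything in it is hypothetical: the NameUsage and
UsageRegistry classes, the "kind" labels, and the placeholder name GUID and
reference strings are assumptions for the sake of the example, not a
proposed standard.  The idea it shows is that a registry holding only an
original description is already queryable, and later usages are simply
appended without restructuring anything.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NameUsage:
    """Hypothetical name-usage instance: one name as used in one reference."""
    name_guid: str                   # GUID of the name object being used
    reference: str                   # placeholder for a literature citation
    kind: str = "subsequent usage"   # e.g. "original description"
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


class UsageRegistry:
    """Open-ended store of usages; it never needs to be complete."""

    def __init__(self) -> None:
        self._by_name: Dict[str, List[NameUsage]] = {}

    def add(self, usage: NameUsage) -> NameUsage:
        self._by_name.setdefault(usage.name_guid, []).append(usage)
        return usage

    def usages_of(self, name_guid: str,
                  kind: Optional[str] = None) -> List[NameUsage]:
        """Return known usages of a name, optionally filtered by kind."""
        found = self._by_name.get(name_guid, [])
        return [u for u in found if kind is None or u.kind == kind]


registry = UsageRegistry()
name = "example-name-guid"   # placeholder, not a real identifier

# Step 1: start with only the Code-compliant original description.
registry.add(NameUsage(name, "placeholder reference A",
                       "original description"))

# Step 2 (perhaps years later): someone plugs in a new combination.
registry.add(NameUsage(name, "placeholder reference B", "new combination"))

print(len(registry.usages_of(name)))
print([u.reference for u in registry.usages_of(name, "new combination")])
```

Each usage gets its own GUID, so cross-links between names can be recorded
against the usage instance rather than against the name itself.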

The point is (getting back to my original post to this list), it would be
designed in an open-ended way, such that it can be scaled up over the years,
decades (centuries?) to come, as all of this information is eventually
captured in a structured digital form. If done well, allowing for such
long-term scalability should impact today's needs only trivially.  If
today's needs are simply to get all the original descriptions, then fine --
but structure them as name-usage instances that can be expanded as research
effort allows/provides.  Also, don't think of it as one person or group of
database nerds doing all the work.  Rather, an open-access system would
encourage the taxonomic world to participate.  If done *really* well, the
tools associated with plugging the data in could be wrapped up into desktop
software tools that actually make a taxonomist's job a LOT easier and more
efficient -- thereby providing a strong incentive to participate.

Aloha,
Rich