Hmmm, I'm struggling to see how adding name-usage instances (of which there will be an awful lot more than just names)
Absolutely! Sometimes hundreds or thousands of usages per name. But if computers are good for anything, they're good for managing large volumes of information.
will make this diagram any less complex. In fact, the names in the diagram could be regarded as name-usages (if I just tack on the literature source), omitting subsequent usages of a name.
It's not less complex in a visual representation sense, just in a GUID-management sense.
I take it you are limiting the mapping of Taxon Name GUIDs to be between GUIDs for the "same" name in different databases?
I believe that was the point that Peter was on about, yes. True "duplicate" GUIDs.
I say "same" because, leaving taxon-concepts aside, the presence of homonyms between the codes makes mapping names as strings problematic
Indeed! One of the main reasons for using GUIDs instead of text strings in the first place!!
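A minimal sketch of the problem, assuming a made-up record layout and made-up GUID values (only the homonym itself is real: Pieris is a butterfly genus under the zoological Code and a plant genus under the botanical Code). Indexing by string collapses the two names together, while indexing by GUID keeps them apart:

```python
from collections import defaultdict

# Hypothetical records: (guid, name string, nomenclatural code).
# The GUID values are invented for illustration only.
NAMES = [
    ("guid:example-0001", "Pieris", "ICZN"),  # butterfly genus
    ("guid:example-0002", "Pieris", "ICN"),   # plant genus
]

by_string = defaultdict(list)  # name string -> every GUID that uses it
by_guid = {}                   # GUID -> (name string, code)
for guid, name, code in NAMES:
    by_string[name].append(guid)
    by_guid[guid] = (name, code)

# Any string that maps to more than one GUID cannot be resolved by text alone.
ambiguous = {s: g for s, g in by_string.items() if len(g) > 1}
```

Here `ambiguous` flags "Pieris" as a string that needs more than text matching, whereas a lookup in `by_guid` is never ambiguous.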
I should point out that the picture I attached is a particularly messy example where the plant species Oxylobium lineare has been moved to a new genus (Gastrolobium), but the resulting binomial (G. lineare) already exists (although it is treated as a synonym of G. callistachys). Consequently, a new species epithet was created, resulting in G. ebrachteolatum. I found this out the hard way when looking at TreeBASE names, trying to figure out why a sequence for Gastrolobium ebrachteolatum was listed in GenBank as being from Oxylobium lineare. Furthermore, these are just the links represented in MOBOT. There should be additional links (for example, Oxylobium linearis and Callistachys linearis must be synonyms).
I'd imagine hundreds or thousands of cross links via usage instances. But the point is that the currency by which taxon names are dynamic at all is the name-usage instance.
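To make that concrete, here is a rough sketch of the Gastrolobium example expressed as name-usage instances. Everything about the layout is an assumption for illustration: the `NameUsage` fields, the usage IDs, the relation labels, and the "source-A" placeholders standing in for real literature citations. Only the names and the relationships between them come from the example above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NameUsage:
    usage_id: str                   # hypothetical ID for this usage instance
    name: str                       # the name string as used
    source: str                     # placeholder for the literature source
    relation: Optional[str] = None  # how this usage relates to another usage
    target: Optional[str] = None    # usage_id of the related usage, if any

USAGES = [
    NameUsage("u1", "Oxylobium lineare", "source-A"),
    NameUsage("u2", "Gastrolobium callistachys", "source-B"),
    NameUsage("u3", "Gastrolobium lineare", "source-C", "synonym_of", "u2"),
    NameUsage("u4", "Gastrolobium ebrachteolatum", "source-D",
              "replacement_name_for", "u1"),
]

def linked_names(start_name, usages):
    """Return every name reachable from start_name by following usage links."""
    by_id = {u.usage_id: u for u in usages}
    # Treat each link as two-way so the walk works from either end.
    neighbours = {u.usage_id: set() for u in usages}
    for u in usages:
        if u.target is not None:
            neighbours[u.usage_id].add(u.target)
            neighbours[u.target].add(u.usage_id)
    seen = {u.usage_id for u in usages if u.name == start_name}
    stack = list(seen)
    while stack:
        current = stack.pop()
        for other in neighbours[current] - seen:
            seen.add(other)
            stack.append(other)
    return sorted(by_id[i].name for i in seen)
```

Starting from "Gastrolobium ebrachteolatum", the walk reaches "Oxylobium lineare" but not "Gastrolobium lineare", which is linked only to "Gastrolobium callistachys". The links live on the usage instances rather than on the name strings, so the set can start with four records and grow as more usages are plugged in.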
Some people get fidgety when they try to contemplate all the name-usages of all the name objects (sort of like contemplating the billions and billions of stars within each of the billions and billions of galaxies). But the system doesn't need to be complete to be useful. It would grow gradually over time as people needed to reference name-usage instances. In the simplest form, people might start by plugging in only those name-usage instances that constitute Code-compliant original descriptions for the taxa in their own area of interest. The botanists might further plug in those name-usage instances that constitute new combinations. When someone works out a complete revision of a whole group with robust synonymies, they might go ahead and plug in the name-usage instances covered by their work.
The point is (getting back to my original post to this list), it would be designed in an open-ended way, such that it can be scaled up over the years, decades, (centuries?) to come, as all of this information is, eventually, captured in a structured digital form. If done well, allowing for such long-term scalability should impact today's needs only trivially. If today's needs are simply to get all the original descriptions, then fine, but structure them as name-usage instances that can be expanded as research effort allows/provides. Also, don't think of it as one person or group of database nerds doing all the work. Rather, an open-access system would encourage the taxonomic world to participate. If done *really* well, the tools associated with plugging the data in could be wrapped up into desktop software tools that actually make a taxonomist's job a LOT easier and more efficient, thereby providing a strong incentive to participate.
Aloha, Rich