By "Namestring", do you include author, or just taxonomic nomenclatural elements? In either case, why even bother establishing a GUID for taxonomic names at all? Why not just use the namestring itself?
Oh come on, you can't be serious ;-) Names aren't unique, GUIDs are.
You didn't answer my question: By "Namestring", do you include author, or just taxonomic nomenclatural elements? If you do not include author details (as I suspect from your comment above), then you are right: namestrings are non-unique because of homonyms. If you do include author elements, then namestrings can approach uniqueness.
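To make the homonym point concrete, here is a minimal sketch in Python. The records are hypothetical ("Aus bus" and "Aus cus" are placeholder names, and the authors and years are invented for illustration); the helper `duplicated` is likewise an assumption of this sketch, not part of any existing system.

```python
from collections import Counter

# Hypothetical records: (nomenclatural elements, authorship).
# "Aus bus" / "Aus cus" are placeholders, not real taxa.
records = [
    ("Aus bus", "Smith, 1900"),
    ("Aus bus", "Jones, 1950"),  # homonym of the record above
    ("Aus cus", "Smith, 1900"),
]


def duplicated(strings):
    """Return the strings that occur more than once, sorted."""
    counts = Counter(strings)
    return sorted(s for s, n in counts.items() if n > 1)


# Namestrings built without and with the author elements.
without_author = [name for name, _ in records]
with_author = [f"{name} {author}" for name, author in records]

ambiguous_without_author = duplicated(without_author)
ambiguous_with_author = duplicated(with_author)
```

In this toy data the author-free namestrings collide on the homonym, while the author-qualified namestrings do not. Real data would still contain cases (for example, inconsistent author abbreviations) where even the qualified string only approaches uniqueness.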
So, let me ask you a seemingly simple question: If you feel that any namestring (with the possible exception of orthographic variants) should receive a GUID, then does the GUID represent the namestring per se? Or, does it represent the name "object" (with implied secondary data such as authorship, type specimen, etc.)? If the namestring per se, then again I ask why do we need GUIDs at all? If the answer is "to disambiguate homonyms", then it seems the GUID represents the abstract name "object", rather than the sequence of text characters.
So, if a "Name GUID" is intended to represent a name "object", then what is the definition and scope of such objects that would be candidates for GUIDs? All names? Scientific names only? What sort of attributes would these name objects have?
Moreover, I don't think anyone has suggested that GUIDs be assigned "solely" to basionyms.
Er, I quote from your earlier post.
I completely agree -- but again, what gets a "Name" GUID? (as opposed to a "usage" GUID or a "concept" GUID) Only basionyms? (I hope!) Or also different combinations? (I hope not!) Or also spelling variants? (I *really* hope not!!)
I'll say it again: I don't think anyone has suggested that GUIDs be assigned "solely" to basionyms. As you said yourself: "The idea that only the resolution system needs to be able to distinguish between specimens, taxon names, etc., seems unfortunate." If I understand this sentence correctly, you are saying that it's important to think of distinct "domains" of GUIDs. For example, publications are a different domain from specimens, and taxon names are a different domain still. Please note in my quote above my qualification of "Name" GUID. I was talking about GUIDs within the Taxon name "domain" only. To say that I hope that "Name" GUIDs are only applied to basionyms is *NOT* the same as saying that non-basionyms should not have *any* GUID -- just that I don't think that non-basionyms fall within the scope of what I believe should be treated as a name-object (i.e., candidate for a "Name GUID").
Stated another way, if you dissect the informational components of nomenclature, basionyms represent a (mostly) unambiguous and (mostly) objectively-definable unit object. Nomenclature beyond basionyms (combinations, orthographic variants, etc.) *can* be thought of as additional instances of name objects, but they can *also* be thought of as attributes of name usage instances (a different domain from name objects). My preference is the latter (not because I'm a zoologist, but because I see it as more reflective of how nomenclatural information is really structured).
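The distinction between the two domains can be sketched as a small data model. This is only one possible reading of the preference stated above, not an agreed schema: the class names `NameObject` and `NameUsage`, their fields, the use of random UUIDs as GUIDs, and the placeholder names are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


def new_guid():
    """Generate a random identifier (assumption: any GUID scheme would do)."""
    return str(uuid.uuid4())


@dataclass(frozen=True)
class NameObject:
    """Name domain: one object per basionym."""
    basionym: str
    authorship: str
    guid: str = field(default_factory=new_guid)


@dataclass(frozen=True)
class NameUsage:
    """Usage domain: combinations and spelling variants live here."""
    name_guid: str       # points back to the basionym's "Name" GUID
    string_as_used: str  # the literal text in the source
    kind: str
    guid: str = field(default_factory=new_guid)


# Hypothetical placeholder names, not real taxa.
name = NameObject("Aus bus", "Smith, 1900")
usages = [
    NameUsage(name.guid, "Aus bus", "original description"),
    NameUsage(name.guid, "Cus bus", "later combination"),
    NameUsage(name.guid, "Aus buss", "orthographic variant"),
]

distinct_name_guids = {u.name_guid for u in usages}
distinct_usage_guids = {u.guid for u in usages}
```

Under this model every non-basionym still has a GUID of its own, but in the usage domain; all three usages resolve to a single "Name" GUID.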
For me the point is that if you have a GUID, I can map to you and add value to my projects. Without a GUID, it's a pain.
But if the units to which we assign GUIDs are non-comparable, how can we cross-map them (without it being a pain)?
If it hurts don't do it! Put another way, why not make the mapping purely nomenclatural -- this name string also occurs in ITIS, and don't make any claims that whatever ITIS means by that name is what you mean.
I can do that just fine without GUIDs. My hope is that GUIDs will enable us to exchange and harness information MUCH more effectively than we currently can. In reference to your idea of how GUIDs should be assigned, you said it yourself in your previous post: "we're pretty close to this already". If building an infrastructure of GUIDs only allows us to do incrementally more than we already can, then I wonder whether we need to bother with GUIDs at all. I would like to see the infrastructure established that allows us to do much more than we can right now -- and that largely hinges on the reusability aspect of GUIDs.
Alternatively, let users decide what mappings they want to make between you and ITIS. Expertise is rare and widely distributed. I think you're making things harder for yourself than they need to be.
No, I'm trying to help foster a foundation for information exchange that will make things easier for future generations of biologists (and non-biologists). With a few simple standards and conventions and definitions, I think we can develop a system of GUIDs that lets us harness biological information much more effectively than is possible now.
If we think of the scientific literature, many papers have at least two GUIDs (DOI and PubMed), both of which are useful, and which serve different purposes.
What is the value of expanding that to potentially hundreds, or thousands of GUIDs for each paper -- one GUID from each database that has a table of literature records? Wouldn't our information exchange be much more efficient if we all adopted (or at least mapped to) either the DOI or the PubMed GUID (or both)? If so, why doesn't this same logic apply to taxon names?
I guess my point was that this is a case where more than one GUID exists for the "same" thing (a publication), and we manage to cope. Yes, "one GUID to rule them all" might be nice, but I just don't buy that it's going to happen any time soon. Just look at the discussion this topic has generated...
Well, maybe we need two different discussion lists: 1) How to use GUIDs to make what we already do a bit easier to manage (low cost, fast implementation, incremental improvement); and 2) How to use GUIDs to dramatically change the way biological data is exchanged (higher cost, slower implementation, fundamental improvements).
Again, I think we make a mistake if we equate GUIDs with solving all our problems. Rather, I suggest they give us an infrastructure on which to build tools for mapping names/concepts/whatever.
I think we are in full agreement on this. I don't think that GUIDs can solve all of our problems any more than you do. However, I think they offer the potential of solving a lot more problems than you seem to be reaching for.
In a later post, you wrote:
GUIDs enable you to explicitly identify objects, that's all.
We agree on this point -- but if I understand your position correctly, the "objects" are the records in a database. In my view, the "objects" are definable, abstract nomenclatural objects that are shared across many, many databases. I also agree with your prior comment that converting LUIDs to GUIDs using the format "<my database server>:<my data type>:<my id>" is very easy. I just don't think it allows us to do much more than we already can do (as magnificently demonstrated by the taxonomic web portals that you and the uBio folks have developed).
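A rough sketch of the difference, assuming two hypothetical databases that each hold a record for the same name. The server names, local ids, the `shared_name_guid` field and the function `luid_to_guid` are all invented for illustration; only the "<my database server>:<my data type>:<my id>" pattern comes from the discussion above.

```python
def luid_to_guid(server, data_type, local_id):
    """Wrap a local identifier using the server:type:id pattern."""
    return f"{server}:{data_type}:{local_id}"


# Two hypothetical databases, each with its own record of the same name.
record_a = {
    "server": "db-a.example.org",
    "type": "name",
    "id": 42,
    "shared_name_guid": "name-0001",  # hypothetical shared "Name" GUID
}
record_b = {
    "server": "db-b.example.org",
    "type": "name",
    "id": 977,
    "shared_name_guid": "name-0001",
}

guid_a = luid_to_guid(record_a["server"], record_a["type"], record_a["id"])
guid_b = luid_to_guid(record_b["server"], record_b["type"], record_b["id"])

# Record-level GUIDs differ, so they cannot link the two records by themselves.
same_record_guid = guid_a == guid_b

# A GUID for the abstract name object is shared across both databases.
same_name_object = record_a["shared_name_guid"] == record_b["shared_name_guid"]
```

The wrapped local ids identify database records unambiguously, but linking the two records still needs an external mapping; a shared name-object GUID carries that link with it.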
Aloha, Rich