Re: Different reasons for different GUIDs
Roderic wrote: "I say 'same' because, leaving taxon-concepts aside, the presence of homonyms between the codes makes mapping names as strings problematic"
Richard replied: "Indeed! One of the main reasons for using GUIDs instead of text strings in the first place!!"
There are two things involved in getting rid of homonymy. The first is entity identification, which is indeed, as Richard said, what the assignment of unique identifiers is about. The second is name resolution, i.e. mapping the name to the correctly identified entity. And as I said, the latter might need either user intervention (a question raised by the resolution system) or context-dependent resolution. Here's a simple example that has nothing to do with taxa:
Suppose there are three people called "Roderic Page" on this planet, one in the UK, one in Hawaii and one in Belgium. Let's assume we can take their social security number (SSN) as a globally unique identifier, so each of the three has been assigned a different SSN. If we have their SSN, then their identity is guaranteed to be unique. If we look up their identity by means of their name (knowing that there is homonymy), then we will come up with three possible people. So we will need additional information to choose between the options. Either we can ask about their country of origin, which would be sufficient in this case for resolution, or if the name appears in a text we could extract enough contextual information to narrow the name down to one of the possible entities. This is what the human brain does to resolve names to concepts it knows, and this is exactly what computers do when performing context-sensitive resolution.
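To make the example concrete, here is a minimal Python sketch of the lookup. Everything in it is an assumption made up for illustration: the `Person` record, the `REGISTRY` list, the `resolve` function and the SSN values are hypothetical and not part of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    ssn: str       # plays the role of the GUID (made-up values)
    name: str
    location: str  # the contextual information used for resolution


# Hypothetical registry: three homonyms, each with its own identifier.
REGISTRY = [
    Person("SSN-001", "Roderic Page", "UK"),
    Person("SSN-002", "Roderic Page", "Hawaii"),
    Person("SSN-003", "Roderic Page", "Belgium"),
]


def resolve(name, location=None):
    """Return every registry entry matching the name and, if given, the location."""
    candidates = [p for p in REGISTRY if p.name == name]
    if location is not None:
        candidates = [p for p in candidates if p.location == location]
    return candidates


# The name alone is ambiguous; adding context leaves a single entity.
by_name = resolve("Roderic Page")
by_context = resolve("Roderic Page", location="Belgium")
print(len(by_name), [p.ssn for p in by_context])
```

Resolution succeeds when exactly one candidate is left. If more than one remains, the system has to ask the user or look for more context.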
In the case of taxonomic names, apart from the name itself, the original publication, the authors of the publication, or whatever other information is available could be considered contextual information that helps resolve the name. What context information would be sufficient (in the end there should be exactly one entity left after resolution) and necessary (we do not want to require too much context information, otherwise the resolution might become impractical) for resolving taxonomic names, I leave to the experts in the field.
As was pointed out before by someone: the GUIDs help to 'remember' the outcome of the resolution, so that each instance of the name needs to be resolved only once and is replaced (at least through the eyes of the information system) by the corresponding GUID.
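The 'remembering' can be sketched as a small cache keyed by name occurrence. This is only an illustration under stated assumptions: `slow_resolver`, `AnnotatedText`, the occurrence key and the GUID value are all hypothetical names invented here, and the resolver stands in for whatever user intervention or context analysis a real system would need.

```python
calls = []  # records every time the costly resolution step actually runs


def slow_resolver(name, context):
    """Hypothetical resolver standing in for user intervention or context analysis."""
    calls.append((name, context))
    known = {("Roderic Page", "Belgium"): "GUID-003"}  # made-up outcome
    return known[(name, context)]


class AnnotatedText:
    """Stores the GUID next to each name occurrence once it has been resolved."""

    def __init__(self):
        self._guid_by_occurrence = {}

    def guid_for(self, occurrence_id, name, context):
        # Resolve only the first time; afterwards the stored GUID is reused.
        if occurrence_id not in self._guid_by_occurrence:
            self._guid_by_occurrence[occurrence_id] = slow_resolver(name, context)
        return self._guid_by_occurrence[occurrence_id]


doc = AnnotatedText()
first = doc.guid_for("doc1:line4", "Roderic Page", "Belgium")
second = doc.guid_for("doc1:line4", "Roderic Page", "Belgium")
print(first, second, len(calls))
```

The second lookup never reaches the resolver: as far as the information system is concerned, the occurrence has been replaced by its GUID.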
One also has to take into account that once there is a clearly established system of globally unique identifiers that is accepted by a community, then all of these problems are resolved, provided the assignment of identifiers to new entities is carefully organized. All we are stuck with is the legacy of the past: assigning identifiers to the existing occurrences of names that carried no identifiers with them.
Peter Dawyndt