Re: Different reasons for different GUIDs
Rod,
Of course, an electronic version of a paper and its abstract stored in another database are not the same thing (at least, not electronically). This is an example where a simple cross reference can link the two GUIDs (in many cases, the PubMed record provides this cross reference).
There's a thin line between entities and instances, and depending on the application some 'things' that were previously regarded as instances might be raised to the entity level. The Object Management Group (OMG) has created a nice standard for that [Cattell R & Barry D (1997). The Object Database Standard: ODMG 2.0. Morgan Kaufmann Publishers], which is regarded as the 'standard model' for object-oriented design. They use a slightly different terminology: a 'first-class object' corresponds to an 'entity' (an object carrying identity; in other words, one that has a unique identifier assigned to it), whereas a 'literal' is an object without identity (an instance that does not need identity, because tracking and tracing occurrences of the object is not regarded as necessary). However, at some point in time some applications might need tracking and tracing of literals, so they are turned into first-class objects by assigning identity to them and finding instances of their occurrence. This is exactly the task ahead of us: to turn life science literals (e.g. pure names that had no previous tracking capabilities) into first-class objects (with a GUID assigned to them).
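To make the distinction concrete, here is a minimal Python sketch of a literal versus a first-class object. The class names, the promote() helper and the sample documents are all hypothetical and are not taken from the ODMG standard; they only illustrate that a literal is compared by value, while a first-class object carries its own identifier and can be traced.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NameLiteral:
    """A literal: compared by value only, it has no identity of its own."""
    text: str


@dataclass
class NameObject:
    """A first-class object: it carries a GUID, so occurrences can be tracked."""
    text: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurrences: list = field(default_factory=list)


def promote(literal, sources):
    """Assign identity to a literal and record the sources that mention it.

    `sources` is a hypothetical mapping from a source id to its text.
    """
    obj = NameObject(text=literal.text)
    obj.occurrences = [sid for sid, text in sources.items() if literal.text in text]
    return obj


# Hypothetical sample data, for illustration only.
docs = {
    "doc1": "Bacillus subtilis was isolated from soil.",
    "doc2": "Escherichia coli is a model organism.",
}
name = NameLiteral("Bacillus subtilis")
tracked = promote(name, docs)
```

Two literals with the same text are simply equal, whereas every promoted object gets its own GUID, which is what makes tracking and tracing possible.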
In the example you mention, my intuition says that in this particular case there is no need to assign two different GUIDs to the full text version and the abstract version of the paper. What we need here, I think, is what is now commonly called multiple resolution, where a single identifier (in this case a GUID referencing a 'publication' object) is made actionable in such a way that several services can be attached to it. In the case you mentioned, this would be one service that directs the user to the full paper version of the article (if he is authorized to access it) and a second service that directs the user to the abstract version of the paper.
Only at the application level should it then be decided how these services are used. One possibility would be to give the user a list of all possible services (e.g. do you want the full version or the abstract only?). Another could be to check whether the user has authorization to see the full paper version. If so, use the full paper version service; otherwise use the abstract-only service.
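The two application-level strategies can be sketched roughly as follows in Python. This is only an illustration under my own assumptions: the Service and Identifier classes, the identifier string and the example.org URLs are hypothetical, and nothing here reflects how DOI or LSID resolution is actually implemented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Service:
    name: str
    url: str
    requires_authorization: bool = False


@dataclass
class Identifier:
    """A single GUID with several services attached (multiple resolution)."""
    guid: str
    services: list = field(default_factory=list)


def list_services(identifier):
    """Strategy 1: show the user every service attached to the identifier."""
    return [s.name for s in identifier.services]


def resolve(identifier, user_is_authorized) -> Optional[Service]:
    """Strategy 2: pick a restricted service if the user is authorized,
    otherwise fall back to the first openly accessible one."""
    if user_is_authorized:
        for s in identifier.services:
            if s.requires_authorization:
                return s
    for s in identifier.services:
        if not s.requires_authorization:
            return s
    return None


# Hypothetical publication identifier and service URLs.
paper = Identifier(
    guid="example:publication:1",
    services=[
        Service("full text", "https://example.org/fulltext/1", requires_authorization=True),
        Service("abstract", "https://example.org/abstract/1"),
    ],
)
```

Note that the identifier itself does not change: the same GUID resolves to the full text for an authorized user and to the abstract for everybody else.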
This issue is termed 'making an identifier actionable' in the DOI handbook, which I recommend as mandatory reading for everyone in this discussion group. It not only deals with technical issues around the introduction of GUIDs, but also tackles social and legal (IPR) implications. The business plan of DOI is also well documented.
Regarding mapping between GUIDs, I suspect the mapping may get very complicated, and a simple tree might not be appropriate. Attached is an example of relationships between names in MOBOT, based on RDF I generate for LSIDs for MOBOT names.
This is a nice example of an entity-relationship diagram, where indeed the relationships between entities can be much more complicated than a tree-like structure. What I was trying to explain in a previous mail was merely a system to manage the identity (uniqueness) of entities alone. The example you have shown reminds me of the work done by George Garrity on the taxonomy of bacteria, for which he has made a nicely animated presentation that can be downloaded from
http://www.cpdr.ucl.ac.be/bioinf/papers/bioinf/BrusselsGarrity0710052b.ppt
This also shows the complex interrelationships that have been created during the establishment of new taxon concepts and names.
Peter Dawyndt