Dear colleagues
I have now managed (despite a lot of "real life" work, as James called it) to catch up on the many mails posted lately.
I fully agree with Kevin's statement below that IDs should have no meaning. Anything meaningful put into an ID can change over time for various reasons, and would in any case be confusing and misleading for those wanting "human-readable" IDs. It would also go against the rule that an ID should be stable over time for the "identification of an object".
As far as has been reported to me, the meaningless Accession Number used in GenBank/EMBL works fine technically. They have some problems with being sure which species a sequence really relates to, and whether a voucher specimen is available or not, but it seems to me that this has more to do with the content provided to their system than with the chosen ID system.
Best regards
Pat
Kevin Richards RichardsK@LANDCARERESEARCH.CO.NZ wrote: I agree with the idea of providing most of the information attached to a GUID in the metadata. After talking to the guys at IBM who wrote the LSID implementation, the data for an LSID was mainly intended for quite static data objects, such as documents and images. Most data, and especially dynamic data, is best returned as metadata. And RDF
As for breaking opacity and having "meaning" within the GUID itself, this goes against a fundamental software principle that an ID should have no meaning. The namespaces of LSIDs are reasonably uncontrolled, and hence reading meaning into them will no doubt result in erroneous assumptions. Anyway, in some cases you will know what type of object a GUID is pointing to from the context it resides in - e.g. in the example Rod gave, <isMisspellingOf rdf:resource="urn:lsid:authority:namespace:123" />, you will know that in your database you store links from misspellings to other taxon names.
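To make the point about opacity concrete, here is a minimal sketch in Python. It assumes only the general LSID layout urn:lsid:authority:namespace:object with an optional revision; the names Lsid, parse_lsid, EXPECTED_TARGET_TYPE and target_type are hypothetical and not part of any LSID library. The parser splits the identifier for resolution purposes only, and the object type is looked up from the relationship the GUID appears in, never guessed from the namespace string.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Components of an LSID; used for resolution only, not for meaning."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> Lsid:
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"empty LSID component in {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)


# Assumption for illustration: our own database knows which kind of object
# each relationship points to. The type comes from this context, not the ID.
EXPECTED_TARGET_TYPE = {"isMisspellingOf": "TaxonName"}


def target_type(relationship: str) -> str:
    """Return the object type implied by the relationship holding the GUID."""
    return EXPECTED_TARGET_TYPE.get(relationship, "unknown")


if __name__ == "__main__":
    lsid = parse_lsid("urn:lsid:authority:namespace:123")
    print(lsid.object_id, target_type("isMisspellingOf"))
```

Note that nothing here inspects the namespace to decide what the object is; two providers could use the same namespace word for different things without breaking this code.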
Kevin
deepreef@BISHOPMUSEUM.ORG 17/11/2005 9:47:43 a.m. >>>
Before we have a flurry of name flavours (yuck), I'd suggest that many of these "flavours" are really relationships between names,
Absolutely! -- I was using "flavours" in the sense that they would be defined in terms of relationships, not in terms of the data objects themselves. Sorry if it seemed I meant it that way. Perhaps there is value in establishing a metadata attribute for "IsBasionym" or something like that, but I think these things should really be defined by relationships among GUID-represented data objects, rather than as attributes of the data objects themselves.
Hence, I think before we embark on adding complexity, why not have GUIDs for names and then specify relationships between these names? This is what I've been doing with the LSIDs we serve here.
Why not have GUIDs for NameUsages, and then specify relationships between these NameUsages? (simply because "names" are much more difficult to define than "NameUsages")
We could do with an ontology for these relationships so that, for example, if somebody asks for the synonyms of name "x", and two names are linked by the relationship "isBasionymOf", we recover that relationship (in other words, our tools know that "isBasionymOf" is a kind of synonym).
Sounds good to me!
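A rough sketch of what such a relationship ontology could look like, in plain Python rather than RDF tooling. Everything named here is an assumption for illustration: the "isSynonymOf" parent relationship, the "isPublishedIn" relationship, the example.org LSIDs, and the choice to treat synonymy as symmetric when answering a query. The only thing taken from the discussion is that "isBasionymOf" counts as a kind of synonym.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject GUID, relationship, object GUID)

# Hypothetical hierarchy: each relationship maps to its more general parent.
PARENT: Dict[str, str] = {"isBasionymOf": "isSynonymOf"}


def is_kind_of(relationship: str, ancestor: str) -> bool:
    """True if relationship equals ancestor or specialises it via PARENT."""
    current = relationship
    seen: Set[str] = set()
    while current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        if current not in PARENT:
            return False
        current = PARENT[current]
    return False  # a cycle in PARENT; stop rather than loop forever


def related(name: str, kind: str, triples: List[Triple]) -> Set[str]:
    """GUIDs linked to name by any relationship that is a kind of `kind`.

    Assumption: links are followed in both directions, i.e. the general
    relationship is treated as symmetric for query purposes.
    """
    found: Set[str] = set()
    for subject, relationship, obj in triples:
        if not is_kind_of(relationship, kind):
            continue
        if subject == name:
            found.add(obj)
        elif obj == name:
            found.add(subject)
    return found


# Made-up example data.
TRIPLES: List[Triple] = [
    ("urn:lsid:example.org:names:1", "isBasionymOf", "urn:lsid:example.org:names:2"),
    ("urn:lsid:example.org:names:1", "isPublishedIn", "urn:lsid:example.org:refs:9"),
]

if __name__ == "__main__":
    print(related("urn:lsid:example.org:names:2", "isSynonymOf", TRIPLES))
```

Asking for synonyms of names:2 recovers names:1 through the basionym link, while the publication link is ignored because it is not declared as a kind of synonym.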
Again, let's try and keep things simple.
Agreed! And simple not so much for the sake of simplicity, but more for the sake of flexibility.
This leads to a much more general topic, but I think if we adopt RDF for our metadata we gain a lot of query and inference tools "for free." GUIDs by themselves are boring, it's the metadata that matters.
Agreed!
My general feeling is that most of the useful information is metadata, the only candidates for data (in my mind) being the literal text string (NameString), a ref pointer to a defined Documentation instance (if any), and perhaps position information (e.g. page) within the Documentation instance. All of these might be initially recorded in error, but could be corrected via the LSID revision id (version).
I think everything about a name probably is metadata, and doing this opens up the possibility of doing useful things with that metadata (see above). In the LSID stuff we've been doing at Glasgow, the only time we provide data is if we are serving images, in which case it makes sense to serve a stream of bytes (e.g., a TIFF image). In one sense, data isn't that interesting (or, put another way, in many cases it requires a person to make sense of it, whereas computers can handle metadata).
After having just spent the past couple hours brushing up my LSID "geek-speak", I agree completely with Rod on this.
Aloha, Rich