[tdwg-guid] RE: tdwg-guid Digest, Vol 5, Issue 5
Thanks for all the responses and discussions. First I'll quickly answer some questions, then ask some more.
Richard, when I say "collection" I mean "collection event". I'll try to be more precise in the future but, just as a warning, I'm bound to slip.
Someone said something about incorporating the accession number into the LSID. You could do that, but it violates the "semantically opaque" aspect of the LSID identifier.
Paul said: "Anybody got any views (strong, otherwise or proxy for others views) on whether the LSID should refer to data+metadata or just metadata?"
I think Roger's response covers the issue, but I'll add my perspective. Ultimately, you need both (and this is my STRONG view). In our current situation, we have no need to resolve an LSID to data, only metadata for records that have no bytestream representation. That will change once we implement LSIDs for our images, documents and GIS files.
Roger, I think your explanation is getting me closer to understanding how we should implement this, but I'll have to delve into RDF a bit more. Right now I'm thinking in database schemas rather than RDF. I think the schema we have will allow the scenario you describe, but I'd like to spell it out a little more to make sure I understand it. I'm assuming that the "thing that changes" will have a single LSID that does not change, while everything else about it will change based upon the current version. So this is what I understand, in pseudo-RDF ("Specimen_metadata" just represents the changing content; I use different numbers to indicate each change):
Specimen X Record
  LSID: Specimen_X_LSID
  versionedAs: Version_3_LSID
  Specimen_metadata: [567]

Specimen X Version 1
  LSID: Version_1_LSID
  isReplacedBy: Version_2_LSID
  Replaces: NULL
  isVersionOf: Specimen_X_LSID
  Specimen_metadata: [123]

Specimen X Version 2
  LSID: Version_2_LSID
  isReplacedBy: Version_3_LSID
  Replaces: Version_1_LSID
  isVersionOf: Specimen_X_LSID
  Specimen_metadata: [345]

Specimen X Version 3
  LSID: Version_3_LSID
  isReplacedBy: NULL
  Replaces: Version_2_LSID
  isVersionOf: Specimen_X_LSID
  Specimen_metadata: [567]
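For anyone thinking in database or object terms rather than RDF, the pseudo-RDF above can be sketched as a small Python structure. This is only an illustration of the linking pattern: the `Version` class, the `record` dictionary, and the helpers `resolve_current` and `history` are hypothetical names, not part of any TDWG or LSID library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One immutable snapshot of the specimen metadata (hypothetical type)."""
    lsid: str
    is_version_of: str             # points back to the unchanging record LSID
    replaces: Optional[str]        # previous version, None for the first
    is_replaced_by: Optional[str]  # next version, None for the current one
    metadata: str                  # stand-in for the changing content


versions = {
    "Version_1_LSID": Version("Version_1_LSID", "Specimen_X_LSID", None, "Version_2_LSID", "123"),
    "Version_2_LSID": Version("Version_2_LSID", "Specimen_X_LSID", "Version_1_LSID", "Version_3_LSID", "345"),
    "Version_3_LSID": Version("Version_3_LSID", "Specimen_X_LSID", "Version_2_LSID", None, "567"),
}

# The record LSID never changes; only its versionedAs pointer moves.
record = {"lsid": "Specimen_X_LSID", "versionedAs": "Version_3_LSID"}


def resolve_current(record, versions):
    """Return the version that the record currently points at."""
    return versions[record["versionedAs"]]


def history(record, versions):
    """Follow Replaces links from the current version back to the first."""
    chain = []
    lsid = record["versionedAs"]
    while lsid is not None:
        chain.append(lsid)
        lsid = versions[lsid].replaces
    return list(reversed(chain))
```

In this sketch the record carries a single versionedAs pointer, and the older versions are reached by walking the Replaces links rather than by listing them on the record.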
Your description of "versionedAs" in your email seems pretty clear, so that is what I based the above on, but from the TDWG spec it is not so clear to me if I got it right. From what you say, it sounds like versionedAs would only point to one version of the metadata (the current version) but the description below makes it sound like a way to access older versions and would therefore have to be a listing of all the older versions. Or, does versionedAs go in the non-current versions and point to the unchanging LSID of the "thing that changes"?
http://rs.tdwg.org/ontology/voc/Common#versionedAs "When the metadata linked to by this LSID is liable to change the LSID authority may choose to make an older version of the metadata available via another LSID. For some clients retrieving the version of the data current at a particular time (rather than the current metadata) may be important. By storing the value of the versionedAs LSID that is supplied with the current metadata they can be sure that they will always be able to refer to a particular version of the metadata. It is recommended that the DublinCore term isVersionOf is used to point from the versioned metadata to the principle LSID."
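One way to read the description quoted above is that a client "pins" a version by storing the versionedAs value it received with the current metadata. The following is a minimal sketch of that reading only; the in-memory `store`, `get_metadata`, and `publish_new_version` are made-up stand-ins for an LSID authority, not a real resolver API.

```python
# Hypothetical in-memory authority: maps an LSID to its metadata document.
store = {
    "Specimen_X_LSID": {"versionedAs": "Version_2_LSID", "metadata": "345"},
    "Version_1_LSID": {"isVersionOf": "Specimen_X_LSID", "metadata": "123"},
    "Version_2_LSID": {"isVersionOf": "Specimen_X_LSID", "metadata": "345"},
}


def get_metadata(lsid):
    """Stand-in for resolving an LSID to its metadata."""
    return store[lsid]


def publish_new_version(record_lsid, version_lsid, metadata):
    """Add a new immutable version and move the record's pointer to it."""
    store[version_lsid] = {"isVersionOf": record_lsid, "metadata": metadata}
    store[record_lsid] = {"versionedAs": version_lsid, "metadata": metadata}


# A client cites the specimen today and keeps the versionedAs LSID it was given.
pinned = get_metadata("Specimen_X_LSID")["versionedAs"]

# Later the authority updates the record.
publish_new_version("Specimen_X_LSID", "Version_3_LSID", "567")

# The pinned LSID still returns what the client saw; the record LSID has moved on.
metadata_then = get_metadata(pinned)["metadata"]
metadata_now = get_metadata("Specimen_X_LSID")["metadata"]
```

Under this reading versionedAs holds exactly one value at any time, and isVersionOf lets a client get from a pinned version back to the principal LSID.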
Thanks for all the help!
Jason
Jason,
I am glad things are getting clearer for you. The examples you give are correct.
There is nothing wrong with including the accession number when you mint the LSID. I imagine that is what 90% of people will do. The bad thing is if a client parses the LSID, says "Ah ha! The last part is the accession number!" and goes on to use it in a calculation, because they will be wrong in 10% of cases. The client must treat the LSID as opaque. The server doesn't treat it as opaque because it has to do something with it to serve the associated meta/data.
[It is probably better to use an accession or barcode number than a primary key of the database record, because when moving between implementations internally (restoring backups etc.) it is easy to suddenly find that you have accidentally recreated the primary keys in a new sequence. They can also be a pain to maintain if you want to refactor your internal schema. But this is just an implementation point - you may do things differently.]
The description of the versionedAs property is not set in stone yet. Can you suggest a clearer wording on the basis of your understanding? If you can, I will change it and you will have contributed to the onward march of the ontology.
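The split between a server that may build the LSID from an accession number and a client that must not read anything into it could look roughly like the sketch below. The authority `example.org`, the namespace `specimens`, the accession number `ACC-0001`, and the `Server` and `Client` classes are all invented for illustration; only the general `urn:lsid:authority:namespace:object` shape comes from the LSID syntax.

```python
def mint_lsid(accession_number: str) -> str:
    """Server-side minting from a stable accession number (hypothetical authority)."""
    return f"urn:lsid:example.org:specimens:{accession_number}"


class Server:
    """The authority is free to know how its own LSIDs are built."""

    def __init__(self):
        self._by_lsid = {}

    def register(self, accession_number, metadata):
        lsid = mint_lsid(accession_number)
        self._by_lsid[lsid] = metadata
        return lsid

    def resolve(self, lsid):
        return self._by_lsid[lsid]


class Client:
    """The client uses the LSID only as a whole-string key."""

    def __init__(self, server):
        self._server = server
        self._cache = {}

    def fetch(self, lsid):
        if lsid not in self._cache:
            self._cache[lsid] = self._server.resolve(lsid)
        return self._cache[lsid]

    def accession_number(self, lsid):
        # Read the value from the metadata; never split or parse the LSID.
        return self.fetch(lsid)["accessionNumber"]


server = Server()
lsid = server.register("ACC-0001", {"accessionNumber": "ACC-0001"})
client = Client(server)
accession = client.accession_number(lsid)
```

If the authority later mints LSIDs some other way, the client code keeps working because it never depended on the string's internal structure.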
All the best,
Roger
On 5 Jun 2007, at 01:58, Jason Best wrote:
[snip]