Thanks for the additional information, Paul.
Yes Rich, our plan is to apply the LSID to the accession 'number' (actually an accession 'code', as we have a historical legacy of suffixes 'a', 'b', etc. for subdivisions of the original collection, which in many cases is a collection of objects rather than one physical object - a bag of leaves, for example). And yes, there are some possible problems with errors associated with the metadata but ... in the context of a DBMS where the accession number is set to unique values only, duplications are in reality impossible, and yes there are far more important challenges to address than this ... ;-)
O.K., thanks -- and I agree!
I assume you are correct about the
001100010011001000110011001101000011010100110110
... I'm a systematist leaning towards nomenclature rather than an IT person.
You and me both. I was able to create the binary conversion not because I'm a techno-whiz, but because I know just enough about how to use Google to be dangerous (i.e., http://www.theskull.com/javascript/ascii-binary.html). But the main question was about exactly how one would convert the text "12345" into a binary data blob for the LSID "data" (as opposed to metadata).
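For what it's worth, here is a minimal sketch of that conversion in Python, assuming the data blob is simply the ASCII bytes of the accession text. That assumption is mine; nothing in this thread or (as far as I know) in the LSID spec dictates it. The helper names `text_to_bits` and `bits_to_text` are made up for illustration.

```python
def text_to_bits(text: str, encoding: str = "ascii") -> str:
    """Return the encoded bytes of `text` as a string of 0s and 1s (8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode(encoding))


def bits_to_text(bits: str, encoding: str = "ascii") -> str:
    """Inverse of text_to_bits: turn a string of 0s and 1s back into text."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


# The "blob" itself is just the raw bytes; the 0/1 string is only a
# human-readable rendering of those bytes.
blob = "12345".encode("ascii")
bits = text_to_bits("12345")
```

Under that assumption the blob for "12345" is five bytes long, and `bits_to_text(bits)` gives back the original text, so a human (or a script) can always recover the accession code from the data.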
I guess the 'change of ownership' comment was directed at the importance of retaining the accession number as this is cited in the literature, and the utility of keeping this as a resolvable LSID.
Ah! That makes sense. But still, I'm a little uneasy about "committing" an LSID to an accession number by branding it with data. The main advantage I see is that no matter how much manipulation happens with the metadata, there will still be "something" permanently included with the LSID as data that a human might be able to use to sort things out if the metadata get changed too much. Even still, though, I think I would want to also include an institutional and/or collection prefix, so that the embedded number is (potentially) more interpretable to an outside observer.
A rather complex model is required for 'managing' the objects of a collecting event and what subsequently happens to those objects; others have more experience of this and valid opinions on it. I refer, for example, to a pit trap for insects, where multiple objects are assigned an initial accession number, the objects are subsequently divided again and again, and finally a few may end up on pins as name-bearing types.
Right -- that's a common occurrence in natural history collections. The number "12345" is assigned to a multi-species lot, and then later that lot is split up into constituent parts. I don't see this as a major problem, though. In cases where institutions typically retain the original number for one of the original parts, and simply assign new numbers to the bits that were "removed", then the only thing that changes on the original accession number/LSID is the metadata for its contents, and the new accession numbers/LSIDs would, presumably, include pointers back to the original number/LSID as a "removed from" indicator -- but again, it only affects metadata. Conversely, institutions that assign new numbers to all components of a split-up lot (effectively deprecating the original number and retaining its meaning as the original multi-part lot) will also only need to manage metadata changes.
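To make the two policies concrete, here is a rough sketch in Python. Everything in it is hypothetical: the `AccessionRecord` class, its field names (`contents`, `removed_from`, `deprecated`) and the two functions are invented for illustration and do not reflect any real collection database or TDWG schema. The point is only that the accession code, and therefore the data behind the LSID, never changes in either case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AccessionRecord:
    code: str                                            # the fixed part (LSID data)
    contents: List[str] = field(default_factory=list)    # metadata
    removed_from: Optional[str] = None                   # metadata: pointer to the parent lot
    deprecated: bool = False                             # metadata


def split_retaining_original(original: AccessionRecord,
                             removed: List[str],
                             new_code: str) -> AccessionRecord:
    """Policy 1: the original code stays with what is left of the lot."""
    original.contents = [c for c in original.contents if c not in removed]
    return AccessionRecord(code=new_code, contents=list(removed),
                           removed_from=original.code)


def split_renumbering_all(original: AccessionRecord,
                          groups: Dict[str, List[str]]) -> List[AccessionRecord]:
    """Policy 2: every part gets a new code; the original is deprecated
    but keeps its meaning as the original multi-part lot."""
    original.deprecated = True
    return [AccessionRecord(code=code, contents=list(items),
                            removed_from=original.code)
            for code, items in groups.items()]


lot = AccessionRecord("12345", ["beetle", "ant", "spider"])
ants = split_retaining_original(lot, ["ant"], "12346")
```

After the split, `lot.code` is still "12345" and only its `contents` have changed, while `ants.removed_from` points back to "12345".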
The more I think about it, the more I like your approach of branding the LSID to the accession event, rather than some conceptual notion of a "specimen" (which is actually more dynamic than I think most people realize). But I'd like to see TDWG create some sort of standard that we can all collectively follow in how to actually do this. Personally, I'd like to see that standard include institution code and collection code within the binary data blob.
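Purely as a strawman, such a blob might look like the sketch below. The colon-separated layout, the ASCII encoding, the function names `make_blob` and `parse_blob`, and the codes "XYZ" and "ENT" are all assumptions of mine for the sake of illustration; no such standard exists yet.

```python
from typing import Tuple


def make_blob(institution_code: str, collection_code: str,
              accession_code: str, sep: str = ":") -> bytes:
    """Join the three codes with a separator and return them as ASCII bytes."""
    parts = (institution_code, collection_code, accession_code)
    for part in parts:
        # An empty part, or one containing the separator, would make the
        # blob ambiguous to anyone trying to read it back.
        if not part or sep in part:
            raise ValueError(f"invalid component: {part!r}")
    return sep.join(parts).encode("ascii")


def parse_blob(blob: bytes, sep: str = ":") -> Tuple[str, str, str]:
    """Recover (institution, collection, accession) from a blob made by make_blob."""
    parts = blob.decode("ascii").split(sep)
    if len(parts) != 3:
        raise ValueError("expected exactly three components")
    return parts[0], parts[1], parts[2]


# Hypothetical institution "XYZ", collection "ENT", accession code "12345a".
blob = make_blob("XYZ", "ENT", "12345a")
```

An outside observer who ran `parse_blob(blob)` would get the three codes back, which is the interpretability I am after; the actual layout would be for the standard to decide.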
Aloha, Rich