Hi Gabi,

This has been raised before, but I’d also like to discourage anyone from assembling record identifiers algorithmically.

Should a collection code change over time, then one of two things happens, neither of which is desirable:
  a) the record ID changes and all links and caches are broken
  b) the record ID stays the same, but is confusing to any observer as it contradicts the record content
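
To make the two outcomes concrete, here is a minimal Python sketch. The helper name make_triplet_id, the ":" delimiter and the sample values are all assumptions made up for illustration; they are not what GBIF or GGBN actually use.

```python
def make_triplet_id(institution_code, collection_code, catalog_number):
    """Assemble an ID from the triplet (hypothetical helper, ':' delimiter assumed)."""
    return ":".join([institution_code, collection_code, catalog_number])

# Record as first published (sample values are made up).
record = {"institutionCode": "BGBM", "collectionCode": "DNA", "catalogNumber": "12345"}
published_id = make_triplet_id(
    record["institutionCode"], record["collectionCode"], record["catalogNumber"]
)

# Later the collection code changes.
record["collectionCode"] = "DNABank"
recomputed_id = make_triplet_id(
    record["institutionCode"], record["collectionCode"], record["catalogNumber"]
)

# Case a) the ID is recomputed: anything that stored the old ID no longer matches.
links_broken = recomputed_id != published_id

# Case b) the ID is frozen: it no longer contains the record's current collection code.
id_contradicts_record = record["collectionCode"] not in published_id.split(":")

print(published_id, recomputed_id, links_broken, id_contradicts_record)
```

Either flag being true is the problem described above: the assembled ID cannot stay both stable and consistent with the record.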

We see this happen very regularly as GBIF indexes datasets, and it results in the kind of blog posts Rod offers regularly.
I would encourage using well-curated primary keys, and then offering search services such as http://gabi.de/specimen?catalogNumber=XXX&collectionCode=YYY to aid discovery.
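
As a rough sketch of that idea in Python: records are stored under an opaque primary key, and the triplet fields are only used as search terms. The use of UUIDs, the in-memory dictionary and the function names add_record and search are assumptions for illustration; the URL is only parsed, never fetched.

```python
import uuid
from urllib.parse import urlparse, parse_qs

# Records keyed by an opaque primary key (UUIDs are an assumption here).
records = {}

def add_record(catalog_number, collection_code):
    """Store a record under a new opaque key and return that key."""
    key = str(uuid.uuid4())
    records[key] = {"catalogNumber": catalog_number, "collectionCode": collection_code}
    return key

def search(url):
    """Return the keys of records matching every query parameter in the URL."""
    params = {name: values[0] for name, values in parse_qs(urlparse(url).query).items()}
    return [
        key
        for key, rec in records.items()
        if all(rec.get(field) == value for field, value in params.items())
    ]

key = add_record("12345", "DNA")
hits = search("http://gabi.de/specimen?catalogNumber=12345&collectionCode=DNA")

# A changed collection code changes the search terms, not the key.
records[key]["collectionCode"] = "DNABank"
hits_after = search("http://gabi.de/specimen?catalogNumber=12345&collectionCode=DNABank")
```

Both searches return the same key, so links and caches that hold the key keep working after the collection code changes.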

Cheers,
Tim

On 06 May 2014, at 15:45, Dröge, Gabriele <g.droege@BGBM.ORG> wrote:

Thanks Gregor! I must say I really like this idea for our short term solution! Will discuss this with my team.
 
Best,
Gabi
 
From: Gregor Hagedorn [mailto:gregor.hagedorn@mfn-berlin.de]
Sent: Tuesday, 6 May 2014 13:57
To: Dröge, Gabriele
Cc: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] delimiter characters for concatenated IDs
 
On the original question: my preference when creating an identifier for the triplet would be to be transparent, and create the identifier with the dwc terms embedded,
 
i.e. use 
institutionCode=BGBM&collectionCode=xxx&catalogNumber=289742874239872
 
as the "self-coined" identifier string. While "&" on its own may not be unique, the delimiter
&collectionCode=
is very likely to be. It is also transparent and self-documenting (and I do not mean that people need to parse it, although they could in either case).
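
A minimal Python sketch of building and reading such a string, using the standard library's urllib.parse. The function names make_identifier and parse_identifier are hypothetical, and percent-encoding the values is an added assumption that goes beyond the suggestion above; it is one way to keep a stray "&" or "=" inside a value from being mistaken for a delimiter.

```python
from urllib.parse import urlencode, parse_qsl

def make_identifier(institution_code, collection_code, catalog_number):
    """Build the self-describing string; values are percent-encoded (an assumption)."""
    return urlencode({
        "institutionCode": institution_code,
        "collectionCode": collection_code,
        "catalogNumber": catalog_number,
    })

def parse_identifier(identifier):
    """Split the string back into its Darwin Core terms and decoded values."""
    return dict(parse_qsl(identifier))

identifier = make_identifier("BGBM", "xxx", "289742874239872")
parts = parse_identifier(identifier)

# A value containing "&" (made-up example) survives the round trip.
tricky = make_identifier("BGBM", "A&B", "1")
tricky_parts = parse_identifier(tricky)
```

For plain values the result is exactly the string shown above; only values that contain reserved characters look different once encoded.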
 
Gregor

 

On 5 May 2014 15:24, "Dröge, Gabriele" <g.droege@bgbm.org> wrote:
Hi everyone,
 
I guess there might have been some discussions about proper delimiter characters in the past that I have missed.
 
In several projects, first of all in GGBN (Global Genome Biodiversity Network, http://www.ggbn.org), a decision is needed now. We need to cross-reference records in different databases, and within Darwin Core we want to use relatedResourceID to do so.
 
During our GGBN workshop at TDWG last year we agreed on concatenating the traditional triple ID (Catalogue Number, Collection Code, Institution Code) and adding further parameters if required (e.g. GUID, access point). We have checked those parameters and definitely cannot use a single character as the delimiter.
 
So my question to you is whether there are already suggestions for using two characters together as a delimiter. It would be great if we could find a solution that more than one community could agree on.
 
Otherwise I would like to open the discussion and suggest "\\", "||", "\|", "§|", "§§", or "\§".
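
One way to compare these candidates against real data is to test which of them never occur inside any field value. The Python sketch below assumes the values are available as plain strings; the function name safe_delimiters and the sample values are made up for illustration.

```python
# The six candidates from the message above, written as Python string literals.
CANDIDATES = ["\\\\", "||", "\\|", "§|", "§§", "\\§"]

def safe_delimiters(values, candidates=CANDIDATES):
    """Return the candidates that do not occur inside any of the field values."""
    return [d for d in candidates if not any(d in value for value in values)]

# Made-up sample values; a real check would run over all parameters in use.
values = ["BGBM", "DNA|Bank", "B 10 0012345", "http://example.org/a||b"]
usable = safe_delimiters(values)
```

With these sample values only "||" is ruled out. The check is necessary rather than sufficient: a value that merely ends with the first character of the delimiter (for example "a|" followed by "||") can still be split in the wrong place.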
 
Best wishes,
Gabi
-----------------------------------------------------------------
Gabriele Droege
Coordinator - DNA Bank Network
Global Genome Biodiversity Network (GGBN)
Berlin-Dahlem DNA Bank
Women's Officer ZE BGBM
 
Botanic Garden and Botanical Museum Berlin-Dahlem
Freie Universität Berlin
Koenigin-Luise-Str. 6-8
14195 Berlin
Germany
 
 


_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content



 
--
---------------------------
Dr. Gregor Hagedorn
Head of Digital World and Information Science
Museum für Naturkunde Berlin
Leibniz-Institut für Evolutions- und Biodiversitätsforschung
Invalidenstrasse 43, 10115 Berlin 
+49 (0)30 2093 8576 (work)
+49-(0)30-831 5785 (private)
gregor.hagedorn@mfn-berlin.de
http://www.naturkundemuseum-berlin.de
http://linkedin.com/in/gregorhagedorn

This communication, together with any attachments, is intended only for the person(s) to whom it is addressed. Redistributing or publishing it without permission may be a violation of copyright or privacy rights.