[tdwg-content] delimiter characters for concatenated IDs

Chuck Miller Chuck.Miller at mobot.org
Tue May 6 16:52:45 CEST 2014

Does all of this scale to millions or hundreds of millions of records?  “DataCite also has a dashboard that shows which members have how many DOIs for which they fail resolution.”  If the number of failed occurrence-record DOIs for a member is, say, 100,000 out of 10 million, who has the resources to sort through 100,000 records and fix them?  It is one thing to have a journal article with a DOI; it is a totally different thing to have a million records with DOIs.
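
To put rough numbers on that question, here is a back-of-envelope sketch in Python. The 100,000 and 10 million figures are the ones above; the two minutes per fix is purely an assumed value for illustration, not something anyone has measured.

```python
# Back-of-envelope estimate of the clean-up effort for failed DOIs.
# minutes_per_fix is an assumed figure, not one taken from this thread.

def failure_rate(failed: int, total: int) -> float:
    """Fraction of a member's DOIs that fail resolution."""
    return failed / total


def cleanup_hours(failed: int, minutes_per_fix: float) -> float:
    """Person-hours needed if every failed record takes minutes_per_fix."""
    return failed * minutes_per_fix / 60


failed, total = 100_000, 10_000_000
rate = failure_rate(failed, total)
hours = cleanup_hours(failed, minutes_per_fix=2)
print(f"{rate:.1%} of records fail; about {hours:,.0f} person-hours at 2 min each")
```

Even a failure rate that looks small as a percentage turns into a large manual workload at this scale, whatever per-record figure one plugs in.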

And the key question of all: who pays for the hundreds of millions of DOIs if DataCite or CrossRef mint them?  If GBIF could be set up as an RA paying a single annual fee to DataCite, then the DOI cost problem could possibly be resolved.


From: tdwg-content-bounces at lists.tdwg.org On Behalf Of Hilmar Lapp
Sent: Tuesday, May 06, 2014 9:27 AM
To: Tim Robertson [GBIF]
Cc: tdwg-content at lists.tdwg.org; Markus Döring (GBIF); tomc at cs.uoregon.edu
Subject: Re: [tdwg-content] delimiter characters for concatenated IDs

On Tue, May 6, 2014 at 10:10 AM, Tim Robertson [GBIF] <trobertson at gbif.org> wrote:

  a) client has a specimen record they wish to stamp with an identifier
  b) client requests DOI (or other format) from the issuing service and provides the minimum metadata in a DwC-esque profile, potentially with a preferred suffix
  c) service provides identifier, and client stores this along with their digital record
  d) from this point on, the DOI identifies the record

Pretty much, I think.
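
Steps (a) to (d) above can be sketched as a toy in-memory issuing service in Python. Everything named here is hypothetical: the `IssuingService` class, the `10.5555` placeholder prefix, the choice of `institutionCode` and `catalogNumber` as the minimum profile, and the made-up specimen record. It is an illustration of the flow, not a description of any existing service.

```python
from dataclasses import dataclass, field
from typing import Optional

PREFIX = "10.5555"  # placeholder prefix, assumed for illustration
REQUIRED_TERMS = ("institutionCode", "catalogNumber")  # assumed minimum profile


@dataclass
class IssuingService:
    """Hypothetical issuing service that hands out identifiers on request."""
    issued: dict = field(default_factory=dict)  # identifier -> metadata copy
    counter: int = 0

    def request_identifier(self, metadata: dict,
                           preferred_suffix: Optional[str] = None) -> str:
        missing = [t for t in REQUIRED_TERMS if t not in metadata]
        if missing:
            raise ValueError(f"missing required terms: {missing}")
        # Honour the preferred suffix if it is free, else generate one.
        candidate = preferred_suffix
        while candidate is None or f"{PREFIX}/{candidate}" in self.issued:
            self.counter += 1
            candidate = f"rec{self.counter}"
        identifier = f"{PREFIX}/{candidate}"
        self.issued[identifier] = dict(metadata)
        return identifier


service = IssuingService()
# (a) client has a specimen record (made-up values)
record = {"institutionCode": "XYZ", "catalogNumber": "12345"}
# (b) client requests an identifier, with minimum metadata and a preferred suffix
doi = service.request_identifier(record, preferred_suffix="XYZ.12345")
# (c) client stores the identifier along with its digital record
record["doi"] = doi
# (d) from this point on, the identifier stands for the record
print(doi)
```

If the preferred suffix is already taken, this sketch falls back to a generated one; a real service might instead reject the request.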

If so, what would happen on resolution?  Does the client provide the target URL during minting which will be the redirection target on resolution?


Does the service have to monitor the availability and return the cached copy on outage?

No. This is why you sometimes get an error when resolving a DOI for an article.

The thing with CrossRef is that by agreeing to the terms of service publishers are obligated to maintain the DOI metadata record, including a properly resolving landing page. Repeat-offending publishers get dinged. DataCite also has a dashboard that shows which members have how many DOIs for which they fail resolution.

What seems most important to me when I think this through is that the identifier needs to be minted as early as possible in the record's life, before it is shared with others.  Which leads us back to the question of whether we envisage people adopting a model where they effectively submit their record data in order to get an identifier.  If not, then as long as we get stable IDs on records in whatever form, we can manage the resolvability bit later and identify duplicates.

Note that you can mint the DOI before registering it. It won't resolve until it's registered, but there is no requirement that publication must precede deciding on the DOI.

In other words, the client decides on what the DOI is to be, not the registration agency.
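
A minimal Python sketch of that distinction between minting and registering, under stated assumptions: the `Resolver` class is a toy stand-in rather than any real resolver API, and the `10.5555` prefix, the suffix, and the example.org landing page are placeholders.

```python
from typing import Optional


class Resolver:
    """Toy stand-in for a DOI resolver: maps registered DOIs to landing pages."""

    def __init__(self) -> None:
        self._targets = {}

    def register(self, doi: str, target_url: str) -> None:
        """Registration is what makes a DOI resolvable."""
        self._targets[doi] = target_url

    def resolve(self, doi: str) -> Optional[str]:
        """Return the landing page, or None if the DOI is not registered."""
        return self._targets.get(doi)


def mint(prefix: str, suffix: str) -> str:
    """Minting here is just the client deciding on the string."""
    return f"{prefix}/{suffix}"


resolver = Resolver()
doi = mint("10.5555", "specimen-0001")  # chosen by the client, nothing sent yet
before = resolver.resolve(doi)          # not registered, so no target
resolver.register(doi, "https://example.org/specimens/0001")
after = resolver.resolve(doi)           # now resolves to the landing page
print(before, after)
```

The client can stamp its record with the identifier as soon as `mint` returns and register it later.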


Hilmar Lapp -:- informatics.nescent.org/wiki -:- lappland.io
