Chuck, the scaling issue is one of the reasons BiSciCol started working with ARKs. But key to any solution is having a process for maintaining the identifiers, and that is best done by a 3rd party and costs $. Good idea on sharing that burden!
On May 6, 2014, at 7:52 AM, Chuck Miller Chuck.Miller@mobot.org wrote:
Does all of this scale to millions or 100s of millions of records? “DataCite also has a dashboard that shows which members have how many DOIs for which they fail resolution.” If the number of failed occurrence record DOIs for a member is, say, 100,000 out of 10 million, who has the resources to sort through 100,000 records and fix them? It’s one thing to have a journal article with a DOI; it’s a totally different one to have a million records with DOIs.
And the key question of them all – Who pays for the 100s of millions of DOIs if DataCite or CrossRef mint them? If GBIF could be set up as an RA paying a single annual fee to DataCite, then the DOI cost problem could possibly be resolved.
Chuck
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Hilmar Lapp
Sent: Tuesday, May 06, 2014 9:27 AM
To: Tim Robertson [GBIF]
Cc: tdwg-content@lists.tdwg.org; Markus Döring (GBIF); tomc@cs.uoregon.edu
Subject: Re: [tdwg-content] delimiter characters for concatenated IDs
On Tue, May 6, 2014 at 10:10 AM, Tim Robertson [GBIF] trobertson@gbif.org wrote:
a) client has a specimen record they wish to stamp with an identifier
b) client requests DOI (or other format) from the issuing service and provides the minimum metadata in a DwC-esque profile, potentially with a preferred suffix
c) service provides identifier, and client stores this along with their digital record
d) from this point on, the DOI identifies the record
Pretty much, I think.
If so, what would happen on resolution? Does the client provide the target URL during minting which will be the redirection target on resolution?
Yes.
Does the service have to monitor the availability and return the cached copy on outage?
No. This is why you sometimes get an error when resolving a DOI for an article.
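To make steps a) through d) and the resolution behaviour concrete, here is a minimal Python sketch. Everything in it is hypothetical: `IdentifierService`, `mint`, `resolve`, the placeholder prefix `10.5555`, and the example.org URL are illustrative names, not the API of DataCite, CrossRef, or any other agency. It only shows the shape of the exchange: the client supplies metadata, a target URL, and optionally a preferred suffix; resolution hands back the stored target and nothing more.

```python
from dataclasses import dataclass, field


@dataclass
class IdentifierService:
    """Toy in-memory stand-in for an identifier-issuing service (hypothetical)."""
    prefix: str
    records: dict = field(default_factory=dict)
    counter: int = 0

    def mint(self, metadata, target_url, preferred_suffix=None):
        """Steps b) and c): take minimum metadata plus a target URL, return a DOI."""
        if preferred_suffix is None:
            # No preference given: fall back to a service-generated suffix.
            self.counter += 1
            preferred_suffix = f"rec{self.counter}"
        doi = f"{self.prefix}/{preferred_suffix}"
        if doi in self.records:
            raise ValueError(f"{doi} is already assigned")
        self.records[doi] = {"metadata": metadata, "target_url": target_url}
        return doi

    def resolve(self, doi):
        """Return the redirect target only: no availability check, no cached copy."""
        return self.records[doi]["target_url"]


service = IdentifierService(prefix="10.5555")
doi = service.mint(
    {"catalogNumber": "12345"},              # minimum metadata (assumed field name)
    "https://example.org/specimen/12345",    # redirection target supplied at minting
    preferred_suffix="herb-12345",
)
specimen = {"catalogNumber": "12345", "doi": doi}  # step c): client stores the DOI
print(service.resolve(doi))
```

Because `resolve` only returns the stored URL, an outage at the client's landing page surfaces as an error to whoever follows the redirect, which matches the behaviour described above for article DOIs.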
The thing with CrossRef is that by agreeing to the terms of service publishers are obligated to maintain the DOI metadata record, including a properly resolving landing page. Repeat-offending publishers get dinged. DataCite also has a dashboard that shows which members have how many DOIs for which they fail resolution.
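As a rough illustration of what a per-member failure tally involves, here is a small Python sketch. It is not how DataCite's dashboard is implemented; the function name `failures_by_member`, the sample DOIs, and the idea of using the DOI prefix as a proxy for the member are all assumptions made for the example. The link checking itself is assumed to happen elsewhere and to yield `(doi, resolved_ok)` pairs.

```python
from collections import Counter


def failures_by_member(check_results):
    """Count failed resolutions per DOI prefix.

    check_results: iterable of (doi, resolved_ok) pairs produced by some
    separate link-checking step that is not shown here.
    """
    failures = Counter()
    for doi, resolved_ok in check_results:
        if not resolved_ok:
            # Assumption: the part before the first "/" identifies the member.
            prefix = doi.split("/", 1)[0]
            failures[prefix] += 1
    return failures


# Made-up sample data, purely for illustration.
results = [
    ("10.5555/a1", True),
    ("10.5555/a2", False),
    ("10.7777/b1", False),
    ("10.5555/a3", False),
]
report = failures_by_member(results)
print(report.most_common())
```

A tally like this tells a member how many records fail, but not why, so the work of fixing each landing page still falls on the member.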
What seems most important to me when I think this through is that the identifier needs to be minted as early as possible in the record's life, before it is shared with others. Which leads us back to the question of whether we envisage people adopting a model where they effectively submit their record data in order to get an identifier. If not, at least if we got stable IDs on records in whatever form, we can manage the resolvability bit later, and identify duplicates.
Note that you can mint the DOI before registering it. It won't resolve until it's registered, but there is no requirement that publication must precede deciding on the DOI.
In other words, the client decides on what the DOI is to be, not the registration agency.
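A minimal Python sketch of that ordering, under stated assumptions: `choose_doi`, `register`, `resolves`, the placeholder prefix `10.5555`, and the example.org URL are hypothetical names for illustration, and the dictionary stands in for a registration agency's registry. The point is only that the DOI string exists, and can be attached to the record, before anything has been registered.

```python
import uuid


def choose_doi(prefix, local_id=None):
    """Build a DOI string locally, before any contact with a registration agency."""
    suffix = local_id if local_id is not None else uuid.uuid4().hex
    return f"{prefix}/{suffix}"


registered = {}  # stand-in for the agency's registry (hypothetical)


def register(doi, target_url):
    """Record the DOI and its landing page; only now can it resolve."""
    registered[doi] = target_url


def resolves(doi):
    return doi in registered


doi = choose_doi("10.5555", "specimen-000123")
record = {"catalogNumber": "000123", "doi": doi}  # the record carries its DOI from the start
print(resolves(doi))  # False: decided on, but not yet registered
register(doi, "https://example.org/specimen/000123")
print(resolves(doi))  # True
```

When no local identifier is passed, the sketch falls back to a random UUID as the suffix, which is one way to keep client-chosen suffixes from colliding.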
-hilmar
-- Hilmar Lapp -:- informatics.nescent.org/wiki -:- lappland.io
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content