[Tdwg-guid] DOIs and persistence -- when DOIs go bad

Sally Hinchcliffe S.Hinchcliffe at kew.org
Fri May 26 10:12:05 CEST 2006


Rod wrote:
...
> The DOI example above is particularly annoying for me, because it  
> relates to IPNI's LSIDs. One of my favourite plant taxa is Poissonia  
> heterantha (lsidres:urn:lsid:ipni.org:names:20012728-1). The IPNI  
> metadata for this name include the following:
> 
> <tn:publishedIn>Syst. Bot. 28(2): 401 (2003). </tn:publishedIn>
> 
> Wouldn't it be nice to have something like:
> 
> <tn:publishedIn  
> rdf:resource="doi:10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2" />
> 
> Given a DOI, users can locate the article quickly and, via CrossRef,  
> extract metadata. The more taxonomic literature is associated with  
> GUIDs such as DOIs, Handles, and LSIDs, the better.

This relates somewhat to a side conversation (sorry, it fell off the 
mailing list) we've been having with Neil Thomson about his BHL 
proposal, and it points up another social problem, this time with 
legacy data.

Even if the DOIs were working as advertised, what would it take to 
get to the point where we could supplement 'Syst. Bot. 28(2): 401 
(2003).' with "doi:10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2" ?

This example is actually quite close to being doable 
programmatically. 'Syst. Bot.' is a standardised publication 
abbreviation in IPNI, and the collation is in a standard form too, so 
if we had a source of DOIs tied to articles in Syst. Bot. we could 
imagine writing a routine to programmatically import the relevant DOI 
into the IPNI record.
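
As a rough illustration of the routine imagined above, here is a 
minimal Python sketch. It is not IPNI code. It assumes that citations 
follow the 'Abbrev. vol(issue): page (year).' pattern, that DOIs follow 
the SICI-like layout of the example in this thread (ISSN, year, volume, 
start page), and that a table mapping standard abbreviations to ISSNs 
exists. The names parse_citation, index_dois, find_doi and JOURNAL_ISSN 
are all hypothetical. Because IPNI cites the page where the name 
appears (401) rather than the first page of the article (0387), the 
sketch picks the article with the nearest start page at or before the 
cited page, which is only safe if the DOI list for the volume is 
complete.

```python
import re
from bisect import bisect_right

# Assumed standard collation: "<abbrev> <vol>(<issue>): <page> (<year>)."
CITATION_RE = re.compile(
    r"^(?P<journal>.+?)\s+(?P<volume>\d+)(?:\((?P<issue>[^)]+)\))?:\s*"
    r"(?P<page>\d+)\s*\((?P<year>\d{4})\)\.?\s*$"
)

# Assumed SICI-like DOI layout: doi:10.NNNN/ISSN(year)volume[startpage:CODE]...
SICI_DOI_RE = re.compile(
    r"^doi:10\.\d+/(?P<issn>\d{4}-\d{3}[\dX])\((?P<year>\d{4})\)"
    r"(?P<volume>\d+)\[(?P<start>\d+):"
)

# Hypothetical lookup table; the ISSN is read off the DOI quoted in the thread.
JOURNAL_ISSN = {"Syst. Bot.": "0363-6445"}


def parse_citation(text):
    """Return the parts of a standard-form citation, or None if it does not fit."""
    m = CITATION_RE.match(text.strip())
    if m is None:
        return None
    return {
        "journal": m["journal"],
        "volume": int(m["volume"]),
        "issue": m["issue"],
        "page": int(m["page"]),
        "year": int(m["year"]),
    }


def index_dois(dois):
    """Group DOIs by (ISSN, volume, year), sorted by article start page."""
    index = {}
    for doi in dois:
        m = SICI_DOI_RE.match(doi)
        if m is None:
            continue  # DOI does not follow the assumed layout; skip it
        key = (m["issn"], int(m["volume"]), int(m["year"]))
        index.setdefault(key, []).append((int(m["start"]), doi))
    for articles in index.values():
        articles.sort()
    return index


def find_doi(citation_text, index):
    """Find the DOI of the article whose start page most closely precedes the cited page."""
    cit = parse_citation(citation_text)
    if cit is None:
        return None
    issn = JOURNAL_ISSN.get(cit["journal"])
    if issn is None:
        return None
    articles = index.get((issn, cit["volume"], cit["year"]), [])
    starts = [start for start, _ in articles]
    pos = bisect_right(starts, cit["page"])
    return articles[pos - 1][1] if pos else None


EXAMPLE_DOI = "doi:10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2"
doi_index = index_dois([EXAMPLE_DOI])
print(find_doi("Syst. Bot. 28(2): 401 (2003). ", doi_index))
```

Citations that do not fit the assumed pattern make parse_citation 
return None, so such records would simply be left untouched for manual 
attention rather than being given a guessed DOI.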

We're chugging away with standardisation in IPNI and we're doing 
updates at the moment that are cleaning up aspects of 10,000, 15,000, 
up to 50,000 records at a time. But there are 1.5 million records to 
do and we can still find (just from within Poissonia) citations like 
'Adansonia, ix. (1870) 295. ' or 'Bol. Mus. Hist. Nat. Tucuman no. 6: 
8. 1925 Hauman, in Kew Bull. 1925: 279. 1925 ' which look much less 
tractable.

60% of publications are not standardised in IPNI, and many of the 
standardised ones predate DOIs (are there any moves among journals 
to retrofit DOIs to older series?). Plus I've noticed among the IPNI 
editors and other editors at Kew a strange reluctance to type in 
strings that look like 
'10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2'
- I don't know whether barcode reading technology could help here...

I think that any solution to standardising citations of literature 
must inevitably include (if not be confined to) some use of the 
infrastructure that DOIs represent. But we will also have to come up 
with a plan that will help us deal with the vast weight of legacy 
data that databases like IPNI carry.

Sally
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk




