[Tdwg-guid] DOIs and persistence -- when DOIs go bad
Sally Hinchcliffe
S.Hinchcliffe at kew.org
Fri May 26 10:12:05 CEST 2006
Rod wrote:
...
> The DOI example above is particularly annoying for me, because it
> relates to IPNI's LSIDs. One of my favourite plant taxa is Poissonia
> heterantha (lsidres:urn:lsid:ipni.org:names:20012728-1). The IPNI
> metadata for this name include the following:
>
> <tn:publishedIn>Syst. Bot. 28(2): 401 (2003). </tn:publishedIn>
>
> Wouldn't it be nice to have something like:
>
> <tn:publishedIn
> rdf:resource="doi:10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2" />
>
> Given a DOI, users can locate the article quickly and, via CrossRef,
> extract metadata. The more taxonomic literature is associated with
> GUIDs such as DOIs, Handles, and LSIDs, the better.
This relates somewhat to a side conversation (sorry, it fell off the
mailing list) that we've been having with Neil Thomson about his BHL
proposal, and it points up another social problem, this time with
legacy data.
Even if the DOIs were working as advertised, what would it take to
get to the point where we could supplement 'Syst. Bot. 28(2): 401
(2003).' with "doi:10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2" ?
This example is actually quite close to being doable
programmatically. 'Syst. Bot.' is a standardised publication
abbreviation in IPNI, and the collation is in a standard form too, so
if we had a source of DOIs tied to articles in Syst. Bot. we could
imagine writing a routine to programmatically import the relevant DOI
into the IPNI record.
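A minimal sketch of what such a routine might look like, assuming we
had a table of articles with page ranges and DOIs. The table below,
the names parse_collation and find_doi, and the last page of the
article are all illustrative assumptions, not anything IPNI or
CrossRef actually provides in this form:

```python
import re
from typing import NamedTuple, Optional


class Collation(NamedTuple):
    publication: str
    volume: int
    issue: Optional[int]
    page: int
    year: int


# Matches the standard form 'Abbrev. volume(issue): page (year).'
COLLATION_RE = re.compile(
    r"^(?P<pub>.+?)\s+(?P<vol>\d+)(?:\((?P<issue>\d+)\))?:\s*"
    r"(?P<page>\d+)\s*\((?P<year>\d{4})\)\.?$"
)

# Hypothetical article table; a real one would have to come from the
# publisher or a DOI registry. The last page is assumed for illustration.
ARTICLES = [
    {
        "publication": "Syst. Bot.",
        "volume": 28,
        "first_page": 387,
        "last_page": 409,
        "doi": "10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2",
    },
]


def parse_collation(text: str) -> Optional[Collation]:
    """Split a standardised collation into its parts, or return None."""
    m = COLLATION_RE.match(text.strip())
    if m is None:
        return None
    issue = m.group("issue")
    return Collation(
        publication=m.group("pub"),
        volume=int(m.group("vol")),
        issue=int(issue) if issue else None,
        page=int(m.group("page")),
        year=int(m.group("year")),
    )


def find_doi(c: Collation, articles=ARTICLES) -> Optional[str]:
    """Return the DOI of the article whose page range contains the cited page."""
    for a in articles:
        if (
            a["publication"] == c.publication
            and a["volume"] == c.volume
            and a["first_page"] <= c.page <= a["last_page"]
        ):
            return a["doi"]
    return None


if __name__ == "__main__":
    cited = parse_collation("Syst. Bot. 28(2): 401 (2003). ")
    print(find_doi(cited))
```

The match is on page range rather than first page because the name is
published on p. 401, inside an article that starts on p. 387 (the
start page is the '0387' in the DOI itself).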
We're chugging away with standardisation in IPNI and we're doing
updates at the moment that are cleaning up aspects of 10,000, 15,000,
up to 50,000 records at a time. But there are 1.5 million records to
do and we can still find (just from within Poissonia) citations like
'Adansonia, ix. (1870) 295. ' or 'Bol. Mus. Hist. Nat. Tucuman no. 6:
8. 1925 Hauman, in Kew Bull. 1925: 279. 1925 ' which look much less
tractable.
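To illustrate the gap, here is a rough sketch that sorts citations
into those that fit the standard 'Abbrev. volume(issue): page (year).'
form and those that would need manual attention. The pattern and the
function name triage are assumptions for illustration only; the real
IPNI standardisation rules are certainly richer than one regular
expression:

```python
import re

# Assumed pattern for the standard form 'Abbrev. volume(issue): page (year).'
STANDARD_RE = re.compile(r"^.+?\s+\d+(?:\(\d+\))?:\s*\d+\s*\(\d{4}\)\.?$")


def triage(citations):
    """Split citations into (standard, needs_manual_review) lists."""
    standard, manual = [], []
    for citation in citations:
        if STANDARD_RE.match(citation.strip()):
            standard.append(citation)
        else:
            manual.append(citation)
    return standard, manual


if __name__ == "__main__":
    samples = [
        "Syst. Bot. 28(2): 401 (2003). ",
        "Adansonia, ix. (1870) 295. ",
        "Bol. Mus. Hist. Nat. Tucuman no. 6: 8. 1925 Hauman, "
        "in Kew Bull. 1925: 279. 1925 ",
    ]
    standard, manual = triage(samples)
    print(len(standard), "standard;", len(manual), "for manual review")
```

Only the Syst. Bot. citation fits the assumed pattern; the Adansonia
and Tucuman ones fall through to the manual pile.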
60% of publications are not standardised in IPNI, and many of the
standardised ones predate DOIs (are there any moves among journals to
retrofit DOIs to older series?). Plus I've noticed among the IPNI and
other editors at Kew a strange reluctance to type in strings that
look like
'10.1600/0363-6445(2003)028[0387:PORLFR]2.0.CO;2'
- I don't know whether barcode reading technology could help here...
I think that any solution to standardising citations of literature
must inevitably include (if not be confined to) some use of the
infrastructure that DOIs represent. But we will also have to come up
with a plan that will help us deal with the vast weight of legacy
data that databases like IPNI carry.
Sally
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk