Dear Anna,
Thank you for making the connection
between these two groups. I think it would help if I explained
(particularly for the TDWG-LIT group) what questions are being addressed by the
TDWG-GUID work under the general heading of “PublicationBank”.
During the first GUID workshop, we recognized
that different classes of information require (for want of a better term)
different strengths of GUIDs. It
is a great help for us to be able to recognise that two references are to the
same piece of data because they use the same GUID to reference it. Let me
give some examples.
If I state that my taxon concept includes
a specimen with LSID urn:lsid:my.org:specimen:123 and someone else also
includes the same LSID in the list of specimens examined as part of their
revision, it helps us to make some firm deductions about shared material. It
seems reasonable that we will be able to associate identifiers with specimens
in a way that ensures that the vast majority of specimens can receive a single
identifier, meaning that all references to that identifier refer to that
specimen and that all references to that specimen use that identifier. This
second part is what I mean when I speak of a strong identifier.
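As a rough illustration of what "strong" means here, the following Python sketch checks a list of (identifier, object) references for both properties: each identifier points at one object, and each object is reached through one identifier. The function name is_strong and the sample data (including the second specimen LSID and the "sheet" labels) are hypothetical, invented only for this example.

```python
from collections import defaultdict


def is_strong(references):
    """Return True if the (identifier, object) pairs form a one-to-one mapping.

    `references` is any iterable of (identifier, object) tuples, e.g. every
    citation of a specimen found across a set of revisions.
    """
    objects_by_id = defaultdict(set)
    ids_by_object = defaultdict(set)
    for identifier, obj in references:
        objects_by_id[identifier].add(obj)
        ids_by_object[obj].add(identifier)
    # First part: every identifier refers to exactly one object.
    unambiguous = all(len(objs) == 1 for objs in objects_by_id.values())
    # Second part: every object is referred to by exactly one identifier.
    unique = all(len(ids) == 1 for ids in ids_by_object.values())
    return unambiguous and unique


# Hypothetical sample data: two revisions citing the same specimen LSID.
specimen_refs = [
    ("urn:lsid:my.org:specimen:123", "sheet A"),
    ("urn:lsid:my.org:specimen:123", "sheet A"),
    ("urn:lsid:my.org:specimen:124", "sheet B"),
]

# Hypothetical sample data: one name carrying two different LSIDs.
name_refs = [
    ("urn:lsid:my.org:names:xyz", "Aus bus Jones, 2004"),
    ("urn:lsid:another.org:names:abc", "Aus bus Jones, 2004"),
]
```

With this sample data the specimen references pass the check, while the name references fail the second part because one name is reached through two identifiers.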
Now consider the situation with taxon
names. Many people are going to wish to refer to the same names (or
nomenclatural acts). It will clearly be really valuable if we can work
towards having a single GUID for each validly published name, so that we can maximize
the interconnectedness of our data. If I say that I refer to the name with
the LSID urn:lsid:my.org:names:xyz and that LSID has data or metadata
indicating that it relates to Aus bus
Jones, 2004, and you use the LSID urn:lsid:another.org:names:abc to refer to
the same Aus bus Jones, 2004,
then we are still left with the same string matching problems we have right now
with names. It therefore seems sensible to work with the nomenclators as
the “preferred” issuers of LSIDs for taxon names (recognising the
gaps we have today for zoological names) and to encourage a move to using those
identifiers whenever we wish to provide a secure reference to each name. (Of
course this implies an urgent need for tools and services to make this easy.)
Turning to taxon concepts, we had a long
debate as to whether it was plausible to try to enforce the same degree of preferred
issuers for LSIDs for taxon concepts. If I publish the first LSID-enabled
revision of a group, I may need to assign LSIDs to refer to many different
taxon concepts. Someone else databasing the taxonomy of the group will
have a similar task. Unless we manage a central easy-to-search registry
for people quickly to find out whether someone has already assigned an LSID to Aus bus Jones, 2004 sensu Smith, 2006, we
will never be able to make any assumptions based on the fact that I have used
urn:lsid:my.org:concepts:123.1 and you have used urn:lsid:my.org:concepts:abc.001.
Even though the two identifiers are different there is still a good chance
that they may refer to the same concept (expressed as name-according-to-publication).
It seems much more reasonable instead to tackle the problem of getting
really strong LSIDs for names (through the nomenclators) and doing the same
thing for the taxonomic literature (through someone for whom we used the
placeholder name “PublicationBank”). Any concept LSID can
resolve through its metadata to two LSIDs, one for a name and one for a
publication. Comparing concept LSIDs can therefore be based on the
comparisons between these two more fundamental objects.
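A minimal sketch of that comparison in Python, assuming the metadata for each concept LSID has already been resolved into its name LSID and publication LSID. The Concept class, the same_concept function, and the example.org LSIDs used for the nomenclator and for "PublicationBank" are hypothetical placeholders, not part of any existing service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A taxon concept reduced to the two LSIDs its metadata resolves to."""
    concept_lsid: str      # issued by whoever databased the concept
    name_lsid: str         # assumed to be issued by a nomenclator
    publication_lsid: str  # assumed to be issued by "PublicationBank"


def same_concept(a: Concept, b: Concept) -> bool:
    """Compare concepts by their name and publication LSIDs only.

    The concept LSIDs themselves are deliberately ignored, since different
    issuers may assign different identifiers to the same concept.
    """
    return (a.name_lsid, a.publication_lsid) == (b.name_lsid, b.publication_lsid)


# Hypothetical strong identifiers for a name and for two publications.
NAME = "urn:lsid:nomenclator.example.org:names:xyz"
SMITH_2006 = "urn:lsid:publicationbank.example.org:pubs:1"
OTHER_PUB = "urn:lsid:publicationbank.example.org:pubs:2"

mine = Concept("urn:lsid:my.org:concepts:123.1", NAME, SMITH_2006)
yours = Concept("urn:lsid:another.org:concepts:abc.001", NAME, SMITH_2006)
theirs = Concept("urn:lsid:third.org:concepts:77", NAME, OTHER_PUB)
```

Here mine and yours carry different concept LSIDs but compare as the same concept, while theirs shares the name but cites a different publication and so does not.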
So, from the standpoint of the GUID group,
the requirement here is a very specific one. We need to find a way to
manage assigning LSIDs to the publications that make up the taxonomic
literature, so that we can all have what would amount to a master list of
relevant publications. From this angle, “all” that is needed
is a secure registry into which the bibliographic data can be stored, cleaned
and assigned identifiers. Of course such a resource could also be an
excellent place to register the location of online digital versions of each publication.
At that point it becomes something even more valuable. On the other
hand, considering it this way suggests that it may already naturally be
addressed as part of the BHL or a similar effort, and part of what we would
like to do is to identify any existing initiatives which may serve as a part or
all of what is required for the LSID work.
As I see it, the TDWG-LIT work gives a
framework for the exchange of these bibliographic data, but we also need to
understand the best way to get the kind of integrated biodiversity bibliography
we would like to have.
Does that all make sense?
Best wishes,
Donald
---------------------------------------------------------------
Donald Hobern (dhobern@gbif.org)
Programme Officer for Data Access and Database Interoperability
Global Biodiversity Information Facility Secretariat
Universitetsparken 15, DK-2100
Tel: +45-35321483
---------------------------------------------------------------
From:
Sent: 21 March 2006 20:10
To: TDWG-GUID@LISTSERV.NHM.KU.EDU
Subject: Re: PublicationBank -
requirements evaluation
Dear Robert,
You may not be aware that TDWG has a list devoted to taxonomic literature
standards. It would be great if you (and anyone else interested) would
join in that discussion (TDWG Literature standards mailing list, tdwg-lit@lists.tdwg.org; sign up
at http://lists.tdwg.org/mailman/listinfo/tdwg-lit_lists.tdwg.org/general)
and add your expertise. The list has only been active since early
February, and the complete correspondence is in the archives (http://lists.tdwg.org/pipermail/tdwg-lit_lists.tdwg.org/).
Anna L. Weitzman, Ph.D.
Informatics Branch Chief,
Smithsonian Institution,
phone: (202) 633-0846
fax: (202) 786-3180
email: weitzman@si.edu
INOTAXA - http://www.sil.si.edu/digitalcollections/bca/status.cfm
electronic Biologia Centrali-Americana - http://www.sil.si.edu/digitalcollections/bca/
>>> rhuber@WDC-MARE.ORG 21-Mar-2006 5:06:30 AM >>>
Dear all,
Below is a short 'survey' which will hopefully help to give an overview
of how bibliographic information is currently stored in your databases.
If you don't like filling in such forms, any other information on your current
literature db is also welcome; just send it to me by email!
The list may be incomplete. If you think important questions are missing,
just let me and the others know.
I will try to summarize the results on the wiki later.
best regards, Robert
1) How is your literature database/module organised?
- [ ]Database structure completely normalized
- [ ]Database structure not or incompletely normalized
2) How do you hold your bibliographic information?
- [ ]Complete set of Bib info (Author, Title, Source, Volume, Pages)
- [ ]Incomplete set of Bib info
- [ ]Abbreviations (e.g. Stafleu & Cowan)
- [ ]Bib Info and Abbreviations
- Specify which bibliographic fields you hold in your db:
--[ ]Author(s)
--[ ]Title
--[ ]Source (Journal/Book)
--[ ]Pages
--[ ]Date(s)
--[ ]Volume
--[ ]Issue
--[ ]Series
--[ ]URL/GUID
--[ ]Source Editors
--[ ]Series Editors
--[ ]Other:
3) How do you store author names:
- [ ]Abbreviations (e.g. Brummitt & Powell)
- [ ]Complete Name as String, one author per string
- [ ]Complete Name as String, all authors in one string
- [ ]Last Name, First Name separated
4) How do you store journal names/ other sources
- [ ]Complete Name
- [ ]Abbreviation
- [ ]Both
- [ ]If you hold abbreviations, according to which standard?
Dr. Robert Huber
WDC-MARE / PANGAEA - www.pangaea.de, www.wdc-mare.org
Stratigraphy.net - www.stratigraphy.net
_____________________________________________
MARUM - Institute for Marine Environmental Sciences (location)
University
Leobener Strasse
POP 330 440
28359
Phone ++49 421 218-65593, Fax ++49 421 218-65505
e-mail rhuber@wdc-mare.org, robert.huber@stratigraphy.net