Topic 3: GUIDs for Taxon Names and Taxon Concepts
S.Hinchcliffe at KEW.ORG
Fri Nov 4 09:56:02 CET 2005
Rod Page wrote:
> Perhaps to try and clarify my previous post (not a good idea to write
> before the coffee has kicked in), as a straw man what's wrong with the
> following approach:
> 1. Data providers serve up their own records, each identified by a GUID
Agreed, although for some areas might there need to be some control
over _who_ is a data provider, to prevent overlap?
> 2. At a minimum a "record" is metadata about that object, using
> existing vocabulary as much as possible. This means Dublin Core for
> basic stuff (title, creator, etc.), Prism for bibliographic details,
> Basic Geo (WGS84) for geographic co-ordinates, Darwin Core for other
> specimen stuff, etc.
> 3. Such records may refer to other objects (e.g., a sequence may refer
> to a publication and a museum voucher specimen), all by way of GUIDs.
> 4. Each record carries a Creative Commons license specifying what you
> can do with the data. The present system where museums have some text
> saying what can and can't be done is too clumsy, and putting the GBIF
> data usage agreement in the way of searching GBIF is way too
> If we do this (and we're pretty close to this already), then I think
> we've got the core of a useful resource. Partly because all of this is
> technically easy to do, and is already being done in other areas. So,
> why not just do it?
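
A rough sketch of what points 1-4 might look like as a data structure,
purely for illustration. The Record class, the new_guid helper, the
urn:uuid: form of the GUID and every metadata value are assumptions made
for this example and are not taken from any provider's schema. The key
prefixes (dc:, prism:, geo:, dwc:) stand for the vocabularies named in
point 2.

```python
from dataclasses import dataclass, field
import uuid

# Illustrative licence URL; any Creative Commons licence would do here.
CC_BY = "https://creativecommons.org/licenses/by/4.0/"


def new_guid() -> str:
    """Mint a random GUID (hypothetical helper, one of many possible schemes)."""
    return "urn:uuid:" + str(uuid.uuid4())


@dataclass
class Record:
    guid: str        # point 1: every record is identified by a GUID
    metadata: dict   # point 2: keys borrowed from existing vocabularies
    license: str     # point 4: machine-readable licence carried by the record
    refers_to: list = field(default_factory=list)  # point 3: GUIDs of related records


# All values below are placeholders, not real data.
specimen = Record(
    guid=new_guid(),
    metadata={"dwc:catalogNumber": "PLACEHOLDER-1", "geo:lat": "0.0", "geo:long": "0.0"},
    license=CC_BY,
)
publication = Record(
    guid=new_guid(),
    metadata={"dc:title": "Placeholder title", "prism:publicationName": "Placeholder journal"},
    license=CC_BY,
)
# A sequence record points at its publication and voucher specimen by GUID only.
sequence = Record(
    guid=new_guid(),
    metadata={"dc:title": "Placeholder sequence"},
    license=CC_BY,
    refers_to=[publication.guid, specimen.guid],
)
```

The point of the sketch is that the sequence record holds no copy of the
publication or specimen data, only their GUIDs, so each provider keeps
serving its own records.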
> I think the role of GUIDs is to (a) unambiguously identify digital
> objects, and (b) tell us where to get that digital record.
> Many of the issues that crop up here are, in my opinion, not really
> about GUIDs. Regarding the issue of taxonomic concepts, I think this is
> something of a red herring. If concepts are essentially a pairing of
> name and use, then given a GUID for a name and a publication, there
> will be huge numbers of these, and they will occur in all sorts of
> domains (taxonomy, ecology, development, conservation, medicine, etc.).
> They will also exist in databases (e.g., GenBank, TreeBASE, etc.). But
> if they are pairings of names and usage (e.g., publication, database),
> then given a name GUID and a usage GUID, do we need anything else,
> really? Names, publications, and specimens are primary, concepts are
Interesting - you could end up with a sort of hashed or concatenated
GUID (because people don't like dealing with composite keys). This
might help with specimens which are duplicates of other specimens
from the same collecting event - no need to unambiguously ID the
event itself centrally if you can ID the collector and the ID he/she
gave it ... although IDs for collectors seem like they will be a long
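
One possible way to get a hashed GUID out of a composite key, as a
minimal sketch only. It assumes name-based (version 5) UUIDs are an
acceptable GUID form, which nobody in this thread has proposed. The
composite_guid function and all the urn:example: identifiers are made up
for illustration.

```python
import json
import uuid


def composite_guid(*parts: str) -> str:
    """Derive one deterministic GUID from an ordered tuple of identifiers.

    The parts are JSON-encoded before hashing so that ("a", "bc") and
    ("ab", "c") cannot collapse into the same key.
    """
    key = json.dumps(list(parts))
    return "urn:uuid:" + str(uuid.uuid5(uuid.NAMESPACE_URL, key))


# A concept as a pairing of a name GUID and a usage GUID (placeholder IDs).
name_guid = "urn:example:name:1"
usage_guid = "urn:example:publication:7"
concept = composite_guid(name_guid, usage_guid)

# Duplicate specimens: collector ID plus the number the collector gave it.
dup_a = composite_guid("urn:example:collector:42", "1234")
dup_b = composite_guid("urn:example:collector:42", "1234")
```

Because the hash is deterministic, two herbaria holding duplicates would
arrive at the same GUID independently (dup_a equals dup_b here), without
anyone registering the collecting event centrally. It does still depend
on both agreeing on the collector's ID.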
> It seems to me the core notion of a concept is knowing what somebody
> meant when using a name. This is (a) a problem of inference, and (b)
> probably intractable in most cases (who knows what ecologist "x" meant
> by species "y" in 1910, regardless of what classification he or she may
> have claimed to be using).
> One example where this inference is more straightforward, especially
> given 1-4 above, is if observations are linked to specimens. For
> example, in TreeBASE there are various occurrences of "Apomys datae" in
> different studies. Given that these names are linked to sequences,
> which are linked to specimens, we can infer that these different
> occurrences are the same thing. This kind of inference is made much
> easier when data is openly available. I also think that inference of
> concepts will be domain-specific, localised to particular questions,
> and not necessarily something taxonomic databases need to themselves
> support. Do we seriously think we can build a database of the usage of
> all names and what they meant? I suggest this knowledge will emerge
> over time on the foundations of 1-4 above. Since taxonomists don't
> control how names are used, and probably use names a lot less than
> other biologists (after all, we provide names so biologists can
> communicate), I don't think it is the role of taxonomists to document
> every concept.
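
The specimen-based inference described above can be sketched as a walk
over GUID links. This is an illustration under stated assumptions: the
links table, the identifiers in it and the function names are all
hypothetical, and none of it reflects how TreeBASE actually stores its
data.

```python
def reachable_specimens(start, links, specimens):
    """Follow 'refers to' links from start and collect every specimen reached."""
    seen, stack, found = set(), [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in specimens:
            found.add(node)
        stack.extend(links.get(node, ()))
    return found


def same_thing(a, b, links, specimens):
    """Two name occurrences are inferred equal if they share a voucher specimen."""
    return bool(
        reachable_specimens(a, links, specimens)
        & reachable_specimens(b, links, specimens)
    )


# Placeholder identifiers: name occurrence -> sequence -> specimen.
links = {
    "study1:Apomys datae": ["seq:A"],
    "study2:Apomys datae": ["seq:B"],
    "study3:Apomys datae": ["seq:C"],
    "seq:A": ["specimen:X"],
    "seq:B": ["specimen:X"],
    "seq:C": ["specimen:Y"],
}
specimens = {"specimen:X", "specimen:Y"}

linked = same_thing("study1:Apomys datae", "study2:Apomys datae", links, specimens)
unlinked = same_thing("study1:Apomys datae", "study3:Apomys datae", links, specimens)
```

In this toy data the first two occurrences share specimen:X and so are
inferred to be the same thing. The third shares only the name string, so
nothing can be inferred about it from the links alone.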
> The KISS principle (Keep It Simple Stupid) would seem to apply here.
> Why not focus on something that is achievable, has value, and can be
> linked to what people in other areas are doing (such as bioinformatics,
> the Semantic Web, etc.)?
> In other words, maybe it's time to stop thinking like taxonomists...
> Professor Roderic D. M. Page
> Editor, Systematic Biology
> DEEB, IBLS
> Graham Kerr Building
> University of Glasgow
> Glasgow G12 8QP
> United Kingdom
> Phone: +44 141 330 4778
> Fax: +44 141 330 2792
> email: r.page at bio.gla.ac.uk
> web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
> reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
> Subscribe to Systematic Biology through the Society of Systematic
> Biologists Website: http://systematicbiology.org
> Search for taxon names at http://darwin.zoology.gla.ac.uk/~rpage/portal/
> Find out what we know about a species at http://ispecies.org
*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe at rbgkew.org.uk