Re: Topic 3: GUIDs for Taxon Names and Taxon Concepts
To try to clarify my previous post (not a good idea to write before the coffee has kicked in), here is a straw man. What's wrong with the following approach?
1. Data providers serve up their own records, each identified by a GUID
2. At a minimum a "record" is metadata about that object, using existing vocabulary as much as possible. This means Dublin Core for basic stuff (title, creator, etc.), PRISM for bibliographic details, Basic Geo (WGS84) for geographic co-ordinates, Darwin Core for other specimen stuff, etc.
3. Such records may refer to other objects (e.g., a sequence may refer to a publication and a museum voucher specimen), all by way of GUIDs.
4. Each record carries a Creative Commons license specifying what you can do with the data. The present system where museums have some text saying what can and can't be done is too clumsy, and putting the GBIF data usage agreement in the way of searching GBIF is way too cumbersome.
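As a rough illustration of points 1-4, a record might look something like the following Python sketch. Everything here is made up for the purpose: the `Record` class, the `urn:example:` GUIDs, the choice of metadata keys, and the particular license URL are assumptions, not anything an existing provider serves.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    guid: str                                   # point 1: the record's own GUID
    metadata: dict                              # point 2: keys borrowed from existing vocabularies
    links: dict = field(default_factory=dict)   # point 3: role -> GUID of another object
    license: str = ""                           # point 4: URL of a Creative Commons license


# A hypothetical sequence record that points at a publication and a voucher specimen.
sequence = Record(
    guid="urn:example:sequence:0001",
    metadata={
        "dc:title": "Example sequence record",
        "dc:creator": "Example Provider",
    },
    links={
        "publication": "urn:example:publication:0042",
        "voucher": "urn:example:specimen:0007",
    },
    license="https://creativecommons.org/licenses/by/4.0/",
)


def linked_guids(record):
    """Return the GUIDs of every object this record refers to, sorted."""
    return sorted(record.links.values())
```

The point of the sketch is only that each of the four items maps onto one plain field; nothing more elaborate seems to be needed to get started.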
If we do this (and we're pretty close to this already), then I think we've got the core of a useful resource, partly because all of this is technically easy to do and is already being done in other areas. So, why not just do it?
----
I think the role of GUIDs is to (a) unambiguously identify digital objects, and (b) tell us where to get that digital record.
Many of the issues that crop up here are, in my opinion, not really about GUIDs. Regarding the issue of taxonomic concepts, I think this is something of a red herring. If concepts are essentially a pairing of name and use, then given a GUID for a name and a publication, there will be huge numbers of these, and they will occur in all sorts of domains (taxonomy, ecology, development, conservation, medicine, etc.). They will also exist in databases (e.g., GenBank, TreeBASE, etc.). But if they are pairings of names and usage (e.g., publication, database), then given a name GUID and a usage GUID, do we need anything else, really? Names, publications, and specimens are primary, concepts are secondary.
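To make the "pairing" idea concrete, here is a minimal Python sketch in which a concept is nothing more than a (name GUID, usage GUID) pair. The `Usage` type, the helper `sources_using`, and all the `urn:example:` identifiers are hypothetical, invented only for this illustration.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    name_guid: str     # GUID of the taxon name
    source_guid: str   # GUID of the publication or database that uses the name


# Hypothetical usages: one name used in two sources, another name used in one.
usages = {
    Usage("urn:example:name:1", "urn:example:publication:10"),
    Usage("urn:example:name:1", "urn:example:database:20"),
    Usage("urn:example:name:2", "urn:example:publication:10"),
}


def sources_using(name_guid, usages):
    """Return the GUIDs of every source that uses the given name, sorted."""
    return sorted(u.source_guid for u in usages if u.name_guid == name_guid)
```

Under this view no separate "concept" identifier is minted; the pair itself does the job, which is why names and publications can stay primary.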
It seems to me the core notion of a concept is knowing what somebody meant when using a name. This is (a) a problem of inference, and (b) probably intractable in most cases (who knows what ecologist "x" meant by species "y" in 1910, regardless of what classification he or she may have claimed to be using).
One example where this inference is more straightforward, especially given 1-4 above, is if observations are linked to specimens. For example, in TreeBASE there are various occurrences of "Apomys datae" in different studies. Given that these names are linked to sequences, which are linked to specimens, we can infer that these different occurrences are the same thing. This kind of inference is made much easier when data is openly available. I also think that inference of concepts will be domain-specific, localised to particular questions, and not necessarily something taxonomic databases need to themselves support. Do we seriously think we can build a database of the usage of all names and what they meant? I suggest this knowledge will emerge over time on the foundations of 1-4 above. Since taxonomists don't control how names are used, and probably use names a lot less than other biologists (after all, we provide names so biologists can communicate), I don't think it is the role of taxonomists to document every concept.
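The kind of inference described above can be sketched in a few lines of Python. The data below is fabricated for illustration (it is not what TreeBASE actually holds), and the identifiers and the function `group_by_specimen` are assumptions: name occurrences are followed through their sequences to specimens, and occurrences that land on the same specimen are taken to refer to the same thing.

```python
from collections import defaultdict

# Hypothetical occurrences of a name: (study, name, sequence GUID).
occurrences = [
    ("study:A", "Apomys datae", "seq:1"),
    ("study:B", "Apomys datae", "seq:2"),
    ("study:C", "Apomys datae", "seq:3"),
]

# Hypothetical links from sequences to voucher specimens.
sequence_to_specimen = {
    "seq:1": "specimen:X",
    "seq:2": "specimen:X",
    "seq:3": "specimen:Y",
}


def group_by_specimen(occurrences, sequence_to_specimen):
    """Group name occurrences by the specimen their sequence was taken from."""
    groups = defaultdict(list)
    for study, name, seq in occurrences:
        specimen = sequence_to_specimen.get(seq)
        if specimen is not None:   # skip sequences with no known voucher
            groups[specimen].append((study, name))
    return dict(groups)
```

In this made-up data, studies A and B end up in the same group because their sequences trace back to one specimen, while study C stays apart. Note that this only works if the sequence-to-specimen links are openly available in the first place.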
The KISS principle (Keep It Simple, Stupid) would seem to apply here. Why not focus on something that is achievable, has value, and can be linked to what people in other areas are doing (such as bioinformatics, the Semantic Web, etc.)?
In other words, maybe it's time to stop thinking like taxonomists...
----
Professor Roderic D. M. Page
Editor, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone: +44 141 330 4778
Fax: +44 141 330 2792
email: r.page@bio.gla.ac.uk
web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org
Search for taxon names at http://darwin.zoology.gla.ac.uk/~rpage/portal/
Find out what we know about a species at http://ispecies.org