Jessie,
Many thanks for this valuable response. For reference, I agree with all of
Jessie’s judgments below.
Donald
---------------------------------------------------------------
Donald Hobern (dhobern@gbif.org)
Programme Officer for Data Access and Database Interoperability
Global Biodiversity Information Facility Secretariat
Universitetsparken 15, DK-2100
Tel: +45-35321483
---------------------------------------------------------------
From:
Sent: 04 November 2005 14:18
To:
Subject: Re: Topic 3: GUIDs for Taxon Names and Taxon Concepts - Maybe ...
Dear All
Sorry for coming so late into the discussion on this list; I have a bit of
catching up to do. I have been reading the postings on this Topic with great
interest and agree with much of what has been said by everyone, but I still see
misunderstandings or disagreements about what we mean and what we should do.
That is inevitable, given that people have different reasons for wanting GUIDs.
To summarise my take on what’s been said (if I misquote anyone, please forgive
me and feel free to correct me):
I agree with Donald (and Rich) about needing
Taxon_Name_GUIDs and Taxon_Concept_GUIDs.
I agree with Donald that a shorthand for referring to a Taxon_Concept is the
combination of Taxon_Name and Taxon_Publication. This is slightly but
importantly different from Rich’s position, in that the publication should be
seen as a taxonomic publication (a matter for another discussion elsewhere)
rather than any publication, i.e. not simply any usage of a name, which could
include an observation or identification. That broader interpretation opens up
Taxonomic_Concepts too widely, covering potential taxa rather than those which
have been defined, described and published in some scientific manner.
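As a rough illustration of the shorthand above, a concept reference can be
modelled as the pair (name, taxonomic publication). This is only a sketch: the
class name, field names and GUID strings are invented for the example, and the
"sec." rendering is just one common way of writing "name according to
publication".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConceptRef:
    """Shorthand for a concept: a name as used in one taxonomic publication."""
    name_guid: str         # GUID of the Taxon_Name (made-up value below)
    publication_guid: str  # GUID of the taxonomic publication (made-up value below)

    def shorthand(self) -> str:
        # Render as "<name> sec. <publication>"
        return f"{self.name_guid} sec. {self.publication_guid}"


a = TaxonConceptRef("name:0001", "pub:1990-smith")
b = TaxonConceptRef("name:0001", "pub:2005-jones")

# Same name, different taxonomic publications: two distinct concepts.
print(a == b)
print(a.shorthand())
```

The point of the sketch is only that equality of names does not imply equality
of concepts; the publication is part of the identity.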
I agree with the principle that GUIDs do not have to
be issued from a central repository BUT….
I do not agree that we should be issuing GUIDs for
every taxonomic record – be that for Taxonomic_Name or Taxonomic_Concept.
There needs to be some control – even if the control is self-imposed.
I agree with Sally that we should have nomenclators provide GUIDs for
Taxon_Names, but I would be more strict and say that each name should have only
one GUID and that the GUID should always resolve to exactly the same name. So
IPNI would cover plant names, Index Fungorum fungal names, ZooBank (when
available) animal names, etc. We should share the resources for fixing this
rather than duplicating effort (if I had any resource to give ;-) ).
IPNI and IF wouldn’t overlap in names.
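The "one name, one GUID, always resolving to the same name" rule can be
sketched as a small registry. This is a toy model under stated assumptions:
`NameRegistry`, its methods and the GUID format are hypothetical and do not
describe how IPNI, Index Fungorum or ZooBank actually work.

```python
class NameRegistry:
    """Toy nomenclator: each name gets exactly one GUID, and a GUID never changes meaning."""

    def __init__(self, prefix):
        self.prefix = prefix          # hypothetical namespace for this nomenclator
        self._guid_by_name = {}
        self._name_by_guid = {}

    def guid_for(self, name):
        # Reuse the existing GUID instead of minting a second one for the same name.
        if name not in self._guid_by_name:
            guid = f"{self.prefix}:{len(self._guid_by_name) + 1}"
            self._guid_by_name[name] = guid
            self._name_by_guid[guid] = name
        return self._guid_by_name[name]

    def resolve(self, guid):
        # Raises KeyError for a GUID this registry never issued.
        return self._name_by_guid[guid]


plants = NameRegistry("plant-names")   # stands in for a plant nomenclator
fungi = NameRegistry("fungus-names")   # stands in for a fungal nomenclator

g1 = plants.guid_for("Bellis perennis L.")
g2 = plants.guid_for("Bellis perennis L.")
print(g1 == g2)
print(plants.resolve(g1))
```

Because each registry issues GUIDs under its own prefix, two nomenclators
covering disjoint groups cannot collide, which is the non-overlap assumed above.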
I agree that anyone should be able to assign GUIDs, but only for their “own”
Taxonomic_Concepts. For example, a provider such as Mammal Species of the World
(or Bergey’s Manual for bacteria), which has well-described concepts, could
issue GUIDs for all of the concepts it recognises, and other users of those
concepts could use its GUIDs in their databases to record observations etc. if
they agreed with the concept. If they didn’t agree with the MSW concept, they
could publish their own Taxonomic_Concept and GUID and relate it to some MSW
Taxonomic_Concept GUID. BUT….
I would argue that we should try to avoid generating GUIDs multiple times for
legacy concepts. For example, as mentioned by Rich, the original descriptions
of names are original concepts. If possible, we wouldn’t want ITIS, GBIF,
Species 2000, SEEK and whoever else each creating different GUIDs of their own
to represent the same concept, or we would never move towards knowing we’re
talking about the same thing (how this might be managed is probably for another
discussion topic).
I agree with Rod to the extent that moving forward
with a distributed GUID publication system is easy where providers publish
GUIDs for their own digital objects BUT….
I don’t think it’s helpful, or in the spirit of GUIDs, to take a digital object
and republish it with another GUID without managing the replication of GUIDs
for that object. That is effectively what could happen if every data provider
just issues GUIDs for the data they hold, when that data is effectively the
same as what someone else has already published. I think we do want to move
towards reuse of GUIDs as much as possible, and therefore towards sharing of
work and easier, more accurate data integration. Also, what’s the point of
having millions of unique GUIDs if I don’t know which one to put in “my”
database to represent the Taxon_Name or Taxonomic_Concept I want to refer to?
Rod would suggest this be self-selecting (use your favourite provider, or the
one you’re told to use), but I don’t think it’s that simple, and it doesn’t get
us towards a good solution unless the provider you’re told to use does it
properly!
I don’t agree with Rod that we should do only purely nomenclatural mappings.
For most of the projects I’m involved with, and the people I’ve spoken to, this
will not solve their problems: we do need to know what the name means. However,
nomenclatural mapping is useful and is part of solving the overall problem.
I agree with Rod that GUIDs will not solve all our
problems but if managed and planned in a sensible way they will help us on our
way to solving the problem.
What providers of GUIDs need to decide is what they are providing a GUID for.
In our area, is it a Taxon_Name, a Taxonomic_Concept, a Taxonomic_Observation,
a Specimen, a Description (of a specimen or Taxonomic_Concept), a
Taxonomic_Publication or whatever? Each of these things can probably refer to
some of the other things, and these references should use GUIDs.
For example, if you’re an ecologist you might record survey data and issue:
  a GUID for a Taxonomic_Observation, which contains
    a GUID to a Taxonomic_Concept, which contains
      a GUID to a Taxonomic_Name, which in turn will contain
        a GUID to a Specimen (the type specimen)
        a GUID to a Taxonomic_Name (maybe the basionym)
      a Description, which in turn will contain
        GUIDs to the Specimen(s) or Taxonomic_Concept described
The ecologist may only want to know about their observation of the
Taxonomic_Concept and not concern themselves with the embedded detail. So
hopefully you’ll see that we really want as much commonality in the usage of
GUIDs as possible, and I think we need to think a bit about how we manage this.
That is possibly another topic for discussion, and certainly one that GBIF
could lead.
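The ecologist example can be sketched as records that point at each other by
GUID, with the consumer choosing how far to follow the references. Everything
here is an assumption made for illustration: the in-memory `STORE`, the field
names, the GUID strings and the `resolve` helper are invented, and the nesting
follows my reading of the example rather than any agreed schema.

```python
# Hypothetical in-memory store: GUID -> record. All GUIDs are made up.
STORE = {
    "obs:1": {"type": "Taxonomic_Observation", "concept": "concept:1"},
    "concept:1": {"type": "Taxonomic_Concept", "name": "name:1",
                  "description": "desc:1"},
    "name:1": {"type": "Taxon_Name", "type_specimen": "spec:1",
               "basionym": "name:0"},
    "name:0": {"type": "Taxon_Name"},
    "spec:1": {"type": "Specimen"},
    "desc:1": {"type": "Description", "describes": ["spec:1"]},
}


def resolve(guid, depth=0):
    """Return the record for a GUID, expanding embedded GUIDs `depth` levels down."""
    record = STORE[guid]
    if depth == 0:
        return dict(record)  # references stay as plain GUID strings
    expanded = {}
    for key, value in record.items():
        if key == "type":
            expanded[key] = value
        elif isinstance(value, list):
            expanded[key] = [resolve(v, depth - 1) for v in value]
        else:
            expanded[key] = resolve(value, depth - 1)
    return expanded


# The ecologist only needs the top level: which concept was observed.
print(resolve("obs:1")["concept"])

# A taxonomist can follow the same GUIDs further down to the name.
print(resolve("obs:1", depth=2)["concept"]["name"]["type"])
```

The embedded detail costs the ecologist nothing, but it is only reachable if
every layer uses GUIDs that others can also resolve, which is why commonality
of usage matters.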
I agree with Rod that the idea of a local GUID is nonsensical, GUID meaning
Globally Unique Identifier. The issue is about global uniqueness and
resolvability, which are separate things but related in this discussion, and,
as Rod says, about whether we mean the same thing. Rod says sorting out the
name part is a huge step forward; I think sorting out the Concept part is a
bigger step forward. It won’t EVER solve the issue of whether or not someone
identified something properly, or whether they’re thinking the same as me, but
if we have Taxonomic_Concepts (with GUIDs) we’re nearer to knowing what we
mean. Isn’t this what the semantic web people are saying: you can’t assume that
because the name is the same we’re talking about the same thing? So I do
believe we need Taxonomic_Concepts and Taxonomic_Names, and all we’re trying to
do is sort out the terminology and semantics for what concepts of organisms
exist or have existed in the world.
Jessie
Prof. J B Kennedy
School of Computing
EH10 5DT
Tel: +44 (0)131 455 2772
Fax: +44 (0)131 455 2727
Email: j.kennedy@napier.ac.uk
WWW: http://www.soc.napier.ac.uk/jessie