Taxon debate synthesis?

Thu Nov 17 11:12:41 CET 2005

Another quick reply re this thread...

One GUID system for Name Usages - why stop there - why not one GUID
system for anything? We are drawing arbitrary lines in trying to scope
what should or should not be encompassed in this GUID system. 
Why are we doing this? I would say to make it easier to use and refer to
taxon names, concepts, observations, general usages etc. consistently, so
that the biologists can do better science. So what do biologists need to
know? The main thing (IMO) is what they are talking about, so that when
they communicate electronically their references are more meaningful and
accurate.

I would've thought that they would want to know that if they get a GUID
for a concept then they aren't using a name or vice versa. 

One GUID system, I think, will encourage people to duplicate effort and
publish GUIDs for the sake of showing they've done something, because
that's the easy bit. We should encourage one GUID for one thing, be that
a concept or name or whatever, rather than a GUID for every
representation of every electronic version of a record representing one
thing. (I know this will be hard, but we should work towards that,
otherwise we're wasting people's time and money.)

I agree that from an implementation point of view one GUID system is
easier, but from the point of view of a user of the system I'm not so
convinced, nor from that of an implementer of a system trying to deal
with all of the different types of things you might get returned when
all you want are names or whatever.
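
To make that concern concrete, here is a minimal sketch of what a client
might have to do if a single GUID system hands back every kind of thing
at once. The Record class, the "kind" field and the guid values are all
made up for illustration; they are not part of any actual proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    guid: str   # hypothetical identifier, not a real LSID
    kind: str   # assumed type tag: "name", "concept", "observation", ...
    label: str


def only_kind(records, kind):
    """Keep just the records whose type tag matches the requested kind."""
    return [r for r in records if r.kind == kind]


# What one undifferentiated GUID system might return for a single search.
results = [
    Record("guid:1", "name", "Physeter catodon"),
    Record("guid:2", "concept", "a concept using the name Physeter catodon"),
    Record("guid:3", "observation", "an observation record"),
    Record("guid:4", "name", "Physeter catadon"),
]

# The client only wanted names, so it has to filter everything else out.
names = only_kind(results, "name")
print([r.label for r in names])
```

The filtering only works if every record carries a reliable type tag, so
the question of what kind of thing a GUID points at does not go away.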

> 
> >
> > I would point out that within the "TaxonName" flavor, there may be
> > "subflavors" like "Basionym", "New Combination", and "Orthographic
> > Variant" -- just as within the TaxonConcept flavor there are
> > "Original", "Revision", "Nominal", etc. subflavors. (Apologies for my
> > Amerikanized spelling...)
> >
> 
> Before we have a flurry of name flavours (yuck), I'd suggest that many
> of these "flavours" are really relationships between names, e.g.
> "Physeter catadon" is a misspelling of "Physeter catodon". Hence, I'd
> argue for having GUIDs for these two name strings, and in the metadata
> for one name have something like
> 
> <isMisspellingOf rdf:resource="urn:lsid:authority:namespace:123" />
> 
> Hence, I think before we embark on adding complexity, why not have
> GUIDs for names and then specify relationships between these names?
> This is what I've been doing with the LSIDs we serve here.
> 
And of course this is what TCS has done in modelling the TaxonNames
element. There are explicit relationships you can have between names,
such as "misspelling of" or "basionym of". These relationships can't
exist between taxon concepts, just as a name being included in another
name can't exist between taxon names.

This is one of the reasons we were asked in TCS to split Taxon Names
from Taxon Concepts.
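
As a rough illustration of why the split helps, the rule that some
relationships only make sense between names and others only between
concepts can be written as a simple check. The relation names and the
"kind" labels below are placeholders chosen for this sketch, not the
actual TCS vocabulary.

```python
# Assumed relation vocabularies, for illustration only (not the TCS schema).
NAME_RELATIONS = {"isMisspellingOf", "isBasionymOf"}
CONCEPT_RELATIONS = {"isIncludedIn"}

ALLOWED = {
    "name": NAME_RELATIONS,
    "concept": CONCEPT_RELATIONS,
}


def check_relation(subject_kind, relation, object_kind):
    """Return True if the relation may link two objects of these kinds."""
    # In this sketch a relation never crosses between names and concepts.
    if subject_kind != object_kind:
        return False
    return relation in ALLOWED.get(subject_kind, set())


print(check_relation("name", "isMisspellingOf", "name"))
print(check_relation("concept", "isBasionymOf", "concept"))
print(check_relation("name", "isIncludedIn", "name"))
```

With names and concepts typed separately, a bad relationship can be
rejected when it is recorded rather than discovered later by a user.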

> We could do with an ontology for these relationships so that, for
> example, if somebody asks for the synonyms of name "x", and two names
> are linked by the relationship "isBasionymOf" we recover that
> relationship (in other words, our tools know that "isBasionymOf" is a
> kind of synonym).
> 
> Again, let's try and keep things simple.

I'm not convinced at the end of the day this would be simple but....
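
For a toy case, the inference described in the quoted text might look
something like the sketch below. It assumes a tiny hand-written
hierarchy in which "isBasionymOf" is declared a kind of "isSynonymOf"
(as the quoted text suggests), treats synonymy as symmetric, and uses
made-up urn:example identifiers. It says nothing about how well this
scales to real data.

```python
# Assumed mini-ontology: each relation maps to its parent relation.
SUBPROPERTY_OF = {
    "isBasionymOf": "isSynonymOf",
}

# Made-up statements about made-up names (subject, relation, object).
TRIPLES = [
    ("urn:example:name:1", "isBasionymOf", "urn:example:name:2"),
    ("urn:example:name:3", "isMisspellingOf", "urn:example:name:2"),
]


def is_a(relation, target):
    """Walk up the hierarchy to see whether relation is a kind of target."""
    while relation is not None:
        if relation == target:
            return True
        relation = SUBPROPERTY_OF.get(relation)
    return False


def synonyms_of(name):
    """Names linked to `name` by any relation that is a kind of synonymy."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if is_a(relation, "isSynonymOf"):
            if subject == name:
                found.add(obj)
            elif obj == name:
                found.add(subject)
    return sorted(found)


print(synonyms_of("urn:example:name:2"))
```

Note that the misspelling is not returned, because nothing in this
sketch declares "isMisspellingOf" to be a kind of synonym. Somebody
still has to decide and agree on every such declaration.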

> 
> This leads to a much more general topic, but I think if we adopt RDF
> for our metadata we gain a lot of query and inference tools "for
> free." GUIDs by themselves are boring, it's the metadata that matters.
> 
> >
> > My general feeling is that most of the useful information is
> > metadata, the only candidates for data (in my mind) being the
> > literal text string (NameString), a ref pointer to a defined
> > Documentation instance (if any), and perhaps position information
> > (e.g. page) within the Documentation instance.  All of these might
> > be initially recorded in error, but could be corrected via the LSID
> > revision id (version).
> 
> I think everything about a name probably is metadata, and doing this
> opens up the possibility of doing useful things with that metadata
> (see above). In the LSID stuff we've been doing at Glasgow the only
> time we provide data is if we are serving images, in which case it
> makes sense to serve a stream of bytes (e.g., a TIFF image). In one
> sense, data isn't that interesting (or, put another way, in many cases
> it requires a person to make sense of it, whereas computers can handle
> metadata).

Am I right in thinking that the proposal means that all of the data for
taxon names or concepts or observations or whatever might be construed
as name usages will be described as meta-data for each GUID?
To me this meta-data is what we've been describing as data when
modelling TCS.
I agree GUIDs aren't interesting in themselves, but I'm not convinced
that computers can handle meta-data of the complexity we'd need to
model names, concepts, observations etc. I'd be interested in seeing
evidence of this. Again, I could be wrong...

Jessie
