Another quick reply re this thread...
One GUID system for Name Usages - why stop there - why not one GUID system for anything? We are drawing arbitrary lines in trying to scope what should or should not be encompassed in this GUID system. Why are we doing this? I would say to make it easier to use and refer to taxon names, concepts, observations, general usages etc. consistently, so that biologists can do better science. So what do biologists need to know? The main thing (IMO) is what they are talking about, so that when they communicate electronically their references are more meaningful and accurate.
I would've thought that they would want to know that if they get a GUID for a concept then they aren't using a name or vice versa.
I think one GUID system will encourage people to duplicate effort and publish GUIDs for the sake of showing they've done something, because that's the easy bit. We should encourage one GUID for one thing, be that a concept or name or whatever, rather than a GUID for every representation of every electronic version of a record representing one thing. (I know this will be hard, but we should work towards that, otherwise we're wasting people's time and money.)
I agree that from an implementation point of view one GUID system is easier, but as a user of the system I'm not so convinced. Nor am I convinced as an implementer of a system that has to deal with all of the different types of things you might get returned when all you want are names or whatever.
I would point out that within the "TaxonName" flavor, there may be "subflavors" like "Basionym", "New Combination", and "Orthographic Variant" -- just as within the TaxonConcept flavor there are "Original", "Revision", "Nominal", etc. subflavors. (Apologies for my Amerikanized spelling...)
Before we have a flurry of name flavours (yuck), I'd suggest that many of these "flavours" are really relationships between names, e.g. "Physeter catadon" is a misspelling of "Physeter catodon". Hence, I'd argue for having GUIDs for these two name strings, and in the metadata for one name have something like
<isMisspellingOf rdf:resource="urn:lsid:authority:namespace:123" />
Hence, I think before we embark on adding complexity, why not have GUIDs for names and then specify relationships between these names? This is what I've been doing with the LSIDs we serve here.
And of course this is what TCS has done in modelling the TaxonNames element. There are explicit relationships you can have between names, such as "misspelling of" or "basionym of". These relationships can't exist between taxon concepts, just as one name being included in another can't exist as a relationship between taxon names.
This is one of the reasons we were asked in TCS to split Taxon Names from Taxon Concepts.
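To make the point about relationship scope concrete, here is a minimal sketch of checking that a relationship only links records of the right kind. The `Record` class, the `relate` helper, the `isIncludedIn` relationship name and the example.org LSIDs are all made up for illustration; none of them come from the TCS schema itself.

```python
from dataclasses import dataclass

# Hypothetical record kinds and relationship names, for illustration only;
# they are not taken from the TCS schema.
NAME, CONCEPT = "TaxonName", "TaxonConcept"

# Each relationship is only valid between two records of one kind.
RELATIONSHIP_DOMAIN = {
    "isMisspellingOf": NAME,
    "isBasionymOf": NAME,
    "isIncludedIn": CONCEPT,
}


@dataclass(frozen=True)
class Record:
    guid: str
    kind: str  # NAME or CONCEPT


def relate(subject, predicate, obj):
    """Return a (subject, predicate, object) triple of GUIDs.

    Raises ValueError if either record is of the wrong kind for the predicate.
    """
    required = RELATIONSHIP_DOMAIN[predicate]
    if subject.kind != required or obj.kind != required:
        raise ValueError(f"{predicate} only links {required} records")
    return (subject.guid, predicate, obj.guid)


# Hypothetical LSIDs under a placeholder authority.
catadon = Record("urn:lsid:example.org:names:1", NAME)
catodon = Record("urn:lsid:example.org:names:2", NAME)
concept = Record("urn:lsid:example.org:concepts:1", CONCEPT)

print(relate(catadon, "isMisspellingOf", catodon))
```

Calling `relate(concept, "isMisspellingOf", catodon)` raises a `ValueError`, which is the kind of mix-up that separating names from concepts is meant to rule out.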
We could do with an ontology for these relationships so that, for example, if somebody asks for the synonyms of name "x", and two names are linked by the relationship "isBasionymOf", we recover that relationship (in other words, our tools know that "isBasionymOf" is a kind of synonym).
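As a rough sketch of what such an ontology buys you, the lookup below treats "isBasionymOf" as a kind of "isSynonymOf" and answers a synonym query accordingly. The hierarchy table, the `isSynonymOf` and `isPublishedIn` names, the example.org identifiers and the choice to treat synonymy as symmetric are all assumptions made for illustration. In RDF Schema the same "is a kind of" link between relationships would be stated with `rdfs:subPropertyOf`.

```python
# Hypothetical relationship hierarchy: each relationship maps to its parent.
SUBPROPERTY_OF = {
    "isBasionymOf": "isSynonymOf",
}


def is_a(predicate, ancestor):
    """True if predicate equals ancestor or descends from it in the hierarchy."""
    while predicate is not None:
        if predicate == ancestor:
            return True
        predicate = SUBPROPERTY_OF.get(predicate)
    return False


def synonyms_of(name, triples):
    """Collect names linked to `name` by any kind of synonym relationship.

    Assumption: synonymy is read in both directions, so the basionym and
    the name based on it each report the other.
    """
    found = set()
    for subject, predicate, obj in triples:
        if not is_a(predicate, "isSynonymOf"):
            continue
        if subject == name:
            found.add(obj)
        elif obj == name:
            found.add(subject)
    return found


# Hypothetical identifiers under a placeholder authority.
TRIPLES = [
    ("urn:lsid:example.org:names:10", "isBasionymOf", "urn:lsid:example.org:names:11"),
    ("urn:lsid:example.org:names:10", "isPublishedIn", "urn:lsid:example.org:pubs:5"),
]

print(sorted(synonyms_of("urn:lsid:example.org:names:11", TRIPLES)))
```

The query never mentions "isBasionymOf"; the hierarchy is what lets the tool recover it, and unrelated links such as the publication pointer are ignored.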
Again, let's try and keep things simple.
I'm not convinced at the end of the day this would be simple but....
This leads to a much more general topic, but I think if we adopt RDF for our metadata we gain a lot of query and inference tools "for free."
GUIDs by themselves are boring, it's the metadata that matters.
My general feeling is that most of the useful information is metadata, the only candidates for data (in my mind) being the literal text string (NameString), a ref pointer to a defined Documentation instance (if any), and perhaps position information (e.g. page) within the Documentation instance. All of these might be initially recorded in error, but could be corrected via the LSID revision id (version).
I think everything about a name probably is metadata, and doing this opens up the possibility of doing useful things with that metadata (see above). In the LSID stuff we've been doing at Glasgow the only time we provide data is if we are serving images, in which case it makes sense to serve a stream of bytes (e.g., a TIFF image). In one sense, data isn't that interesting (or, put another way in many cases it requires a person to make sense of it, whereas computers can handle metadata).
Am I right in thinking that the proposal means that all of the data for taxon names or concepts or observations or whatever might be construed as name usages will be described as meta-data for each GUID? To me this meta-data is what we've been describing as data when modelling TCS. I agree GUIDs aren't interesting in themselves, but I'm not convinced that computers can handle meta-data of the complexity we'd need to model names, concepts, observations etc. I'd be interested in seeing evidence of this - again, I could be wrong...
Jessie
participants (1)
-
Kennedy, Jessie