Hi All
Following up on Rich and Roger's discussion.....
Really?? Which two? I assume you mean "one kind of GUID" for taxon objects, vs. "two kinds of GUIDs". But earlier Rod Page had proposed three different "levels", with GUIDs for all three. Others seemed to like the three-level approach, but some slightly re-defined what the three levels ought to be. I pointed out at least 8 different "units" of a name, not all of which could be unambiguously pigeonholed into either a "TaxonName" object or a "TaxonConcept" object. You advocate two kinds (levels? types? domains?) of GUID for taxon objects (i.e., Name & Concept). Personally, I think the ultimate goal should be as you describe: separate GUID domains/kinds/types for TaxonNames vs. TaxonConcepts. But Rod Page and others contributing to this thread have convinced me that this might be reaching for too much at this stage of the game. Thus, I am comfortable with the idea of establishing one domain/kind/type of GUID within the taxonomy realm, that is the most flexible and unambiguous of them all, and which can serve as a "stepping-stone" and/or surrogate to a future informatics world where we have an unambiguous distinction between a "TaxonName" object and a "TaxonConcept" object. Importantly, establishing GUIDs for usage instances now does not carry the risk that they will be rendered useless in the future, because I think there will always be value in identifying unique NameUsage instances.
Perhaps what we should do is each put together a page on the
wiki expounding either of the approaches. It will be easier
for some one coming along later to get up to speed on
the arguments and add to them. We then have both cases clearly
stated and we can see which one a consensus builds around.
I agree with this, and took a hasty first stab at it with "Scenario 5" on the page you created:
http://wiki.gbif.org/guidwiki/wikka.php?wakka=SeparateNamesAndConcepts
If I can find time, I'd be happy to expand it on a dedicated page, using a real-world example (maybe a PDF excerpt of a representative documentation instance). I'd be happy to provide the PDF example, if you wish.
I find this all very interesting....
What Rich is proposing here is what the TCS originally proposed to the TDWG community in terms of a model for Taxon concepts and names. However as Rich pointed out I was of the view that if this were to be implemented then we should focus on "original concepts" i.e. the first usage of the name as a result of some taxonomic work and "revision concepts" where a name was redefined as a result of some taxonomic work.
What I was doing was to try and restrict the number of "potential concepts" that we had to deal with. As everyone on this list is probably aware as soon as we talk about concepts there are those who will take us into the realm of every time a concept is thought of in someone's minds we have another potential concept which is clearly not practical for our purposes. So using intent to define new concepts limited the number of concepts to be managed (and IMHO would link the taxon concept work to better science).
My argument for dealing only with original concepts and revision concepts and not names was that the nomenclators were really capturing information on original concepts (but without explicitly storing all the definitional information in their databases). So couldn't we somehow leverage this to get progress on names and concepts.
Where the original TCS approach differs from name usage is that I didn't want just any usage of a name to be confused with well-defined taxon concepts, or at least (as Roger explained) with concepts where there was an intent to define a concept rather than, say, identify an instance to some existing taxon concept or even just refer to one in a text. General name usages might be useful in the long run for helping us decide what people have recorded information about out there, but I'm not so sure they will help us, at first anyway, as a basis for, say, data integration and resolution.
However this idea was rejected by the community - people didn't want to confuse names with concepts, as much as I tried to convince them otherwise. One of my reasons for proposing it was, as Rich points out, that you would have a single GUID "type". So we changed TCS so that we could have TaxonNames and TaxonConcepts as separate top level elements (implying separate GUIDs). I think one of the main reasons for this was that people seemed to think that we could sort the names problem more easily and then we could use this "agreed" list of names as the basis on which to build all other systems including taxon concepts. I was of the view that if this was what was needed to progress things then I was happy to separate them out, knowing that we would now need at least 2 GUIDs for what came from the same publication in the case of original concepts: the name details denoting the publishing of the name for the first time, and the concept as defined in the same publication (probably the same page etc.), and possibly a nominal concept so that I could identify something as this "name" but using a taxon concept GUID.
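As a rough illustration of what "at least 2 GUIDs from the same publication" could look like in practice, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the use of random UUIDs as GUIDs, the placeholder name "Aus bus" and the hypothetical publication string. None of it comes from TCS itself.

```python
import uuid
from dataclasses import dataclass, field


def new_guid() -> str:
    # Assumption: a random UUID stands in for whatever GUID scheme is adopted.
    return str(uuid.uuid4())


@dataclass(frozen=True)
class TaxonName:
    """The nomenclatural act: a name published for the first time."""
    name_string: str
    published_in: str
    guid: str = field(default_factory=new_guid)


@dataclass(frozen=True)
class TaxonConcept:
    """A definition attached to a name; points at the name by its GUID."""
    name_guid: str
    defined_in: str
    kind: str  # hypothetical labels: "original", "revision" or "nominal"
    guid: str = field(default_factory=new_guid)


# One (hypothetical) publication yields a name record and concept records,
# each with its own GUID.
publication = "Hypothetical Author 1900, p. 12"
name = TaxonName("Aus bus", publication)
original = TaxonConcept(name.guid, publication, "original")
nominal = TaxonConcept(name.guid, publication, "nominal")
```

The point of the sketch is only that the name and the concepts are separate records with separate identifiers, even though they trace back to the same page of the same publication.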
So now we were at two "types" of GUIDS. Then Rod suggested a 3 level approach - the two just mentioned and name strings. The discussion between Rod and Rich on why give a name string a GUID was interesting. I could see Rich's point that the GUID is a name string so what's the difference really, you could match on name strings just as easily as GUIDs (maybe). Of course what came out of that was that Rod's name strings weren't simply name strings as Rich (and I and maybe others) had originally thought but that they were name usages, hence I guess Rich's renewed support for the name usage approach (still not clear if name usages here are any name usages or for intended taxon concepts as Rich suggested in a previous email on the topic).
So it seems we've come full circle again and are still asking the same questions.
Do we want a list of name strings without any attribution? What good would this be? I guess as an index into other sources, as others have said, possibly to help with spelling mistakes etc. which could be mapped to possible names or concepts, but at that point I'd see them turning into name usages because people would want to know where it was used so they could get a better idea of what was meant. Trouble is, as soon as we start attributing names to publications or any other source we get this enormous explosion of name usages and then we need to deal with the equivalences between them. Which is really the taxon concept problem but multiplied to the power ?
So I vote for 2 GUID systems for the time being, with the proviso that TaxonName GUIDs not be allowed to refer to concepts, i.e. they should not be used for identifying or describing things; TaxonConcept GUIDs should be used for that. But if this was to happen, what would a name usage be - would it refer to TaxonName GUIDs, or TaxonConcept GUIDs (possibly Nominal), or to neither? I guess it could be either of the first two but probably will be the last, until it becomes practice to refer to TaxonNames and TaxonConcepts using their GUIDs.
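To make the proviso concrete, here is a minimal Python sketch of how it might be enforced. It is only an illustration under stated assumptions: the `Guid`, `NameUsage` and `identify` names, the tagged-GUID representation and the placeholder values are all hypothetical and not part of any agreed standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class GuidKind(Enum):
    TAXON_NAME = "TaxonName"
    TAXON_CONCEPT = "TaxonConcept"


@dataclass(frozen=True)
class Guid:
    """A GUID tagged with the kind of object it identifies (an assumption)."""
    kind: GuidKind
    value: str


@dataclass(frozen=True)
class NameUsage:
    """A name as it appears in some source.

    refers_to may hold a TaxonName GUID, a TaxonConcept GUID, or None
    (the "neither" case, likely the common one at first).
    """
    name_string: str
    source: str
    refers_to: Optional[Guid] = None


def identify(specimen_id: str, taxon: Guid) -> Tuple[str, str]:
    """Record an identification; only TaxonConcept GUIDs are accepted."""
    if taxon.kind is not GuidKind.TAXON_CONCEPT:
        raise ValueError("identifications must cite a TaxonConcept GUID")
    return (specimen_id, taxon.value)
```

Under this sketch, passing a TaxonName GUID to `identify` raises a `ValueError`, while a `NameUsage` can be created without citing any GUID at all.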
I think we really need to support different groups sorting out the problem in different taxonomic groups. I guess in a way much like the Catalogue of Life has been doing. However I'd still see IPNI doing plants, Index Fungorum doing fungi and ZooBank doing animal TaxonNames etc., and then whoever is working in TaxonConcepts, i.e. producing descriptions or definitions for things we really think are out there, publishing TaxonConcepts with GUIDs from their perspective while referring to IPNI or ZooBank TaxonNames. Slowly the TaxonConcepts will become more self referencing, but in the meantime what we'll need is the aggregators to start relating all the TaxonConcepts which have been defined for resolution services. Currently I see the Catalogue of Life as a group producing concepts which maybe aren't always well tied down to a clear definition or description, but that could easily be addressed. So they give a view of the concepts in the world and could act as a good starting point for linking other concepts to.
Jessie