Re: GUIDs for Taxon Names and Taxon Concepts
Rich,
OK - so I didn't get all the stuff about a NameString occurring many times within a multi-page document (perhaps spanning chapters and/or contributions from other authors, or not, as the case may be) and the nature of documentation etc. I'd still like you to sharpen up the definition though.
"Any occurrence of a NameString as it appears or is explicitly implied within some form of static documentation."
"explicitly implied" is actually a contradiction in terms:
explicitly = "Fully and clearly expressed; leaving nothing implied." (http://dictionary.reference.com/search?q=explicitly)
implied = "not directly or specifically made known" (http://dictionary.reference.com/search?q=implied)
My formal logic is self-taught and very shaky but I believe that from a statement containing a contradiction one can argue almost anything. Check out Wikipedia (http://en.wikipedia.org/wiki/Contradiction#A_paradox_involving_contradiction). This may explain why you can logic things out of your definition that I don't see as following.
"Unfortunately, the level of disagreement (not just from me) as expressed in this discussion makes it clear that we have not yet nailed it down."
I don't actually see a great deal of disagreement here. We are debating between only two alternatives. It could be much worse. There could be 5 alternatives and 12 people chipping in and disagreeing about what they mean by each alternative - then it would be much harder to reach consensus.
Perhaps what we should do is each put together a page on the wiki expounding one of the two approaches. It will be easier for someone coming along later to get up to speed on the arguments and add to them. We then have both cases clearly stated and we can see which one a consensus builds around. If we do this we will then be able to move forward with other issues, and for each issue we will be able to look at the implications of going with either of our two ways forward. I guess we will soon get bored of one choice and it will die a natural death, or something will come along to replace them both.
This way we can move on to other stuff without having to make a decision on this one right now.
If all else fails we can flip a coin in February next year (just kidding on that one).
I'll probably not get my page finished till later in the week as I have other commitments - but certainly by Friday.
All the best,
Roger
Richard Pyle wrote:
Hi Roger,
Sorry, I wasn't clear enough. I think the simplest way to clarify is to answer your questions directly:
So we have a 'static' document and it is chock-full of NameStrings. I have circled some of them.
We have Ditrichum cornubicum (a red data book moss). It is mentioned, with a picture, on the previous page, and at the top of this page it is mentioned a few more times. Further down this page we have Buxbaumia aphylla which is also mentioned twice. There is a picture of it on the next page.
So how many name usages do we have here?
One. A "NameString" is defined as a string of textual characters meant to represent a name of biological organisms. That there may be many replicate copies of the *same* NameString within one documentation instance does not mean there are many NameStrings (keeping in mind the "or is explicitly implied" part of the definition).
Does each mention of the name on the page count as a usage?
No, because they are replicate copies of the same NameString; not different NameStrings.
- would seem to be a silly thing to do.
Yes, it would.
Does mentioning the name on different pages mean different usages?
Not if those different pages are within the same scope of the documentation instance. Sometimes, however, what seems like a single documentation instance may be subdivided into several documentation instances (e.g., chapters in a book), so it depends on what the scope of the documentation instance is. Please note that this realm of ambiguity exists completely independently of the problems we're discussing about GUID "units" in taxonomy, and thus applies equally to all solutions to the question of what object(s) should receive a taxon GUID.
- would also be silly but we don't have any way to judge (different pages within a journal or combined work for example?)
We judge based on the defined scope of the documentation instance (hence my parenthetical comment, "assuming sufficient metadata for identifying a documentation instance").
How about same page but different context? The picture may be of a different moss to the one that they mention in the text.
A picture is not a NameString. Multiple replicates of the same literal or explicitly implied NameString within the same defined documentation instance do not constitute multiple NameStrings, and hence do not represent multiple usage instances.
If a subspecies is mentioned does that count as a usage of the specific name (it has been used)
Yes, of course. True even for autonyms, such as "Pseudanthias ventralis ventralis", being a separate NameString from "Pseudanthias ventralis" (the former consisting of 32 textual characters, including the spaces, and the latter consisting of 22 textual characters, including the space).
and likewise a binomial implies a usage of the genus name.
Yes, that is correct. Just as a trinomial implies usage of a species name as well as a genus name. Part of "explicitly implied".
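As a rough illustration of the "explicitly implied" rule in the answers above, the sketch below expands a NameString into the shorter NameStrings it implies and prints the character count of each. The function name implied_namestrings is hypothetical, and the sketch assumes a NameString is nothing more than space-separated name elements (no authorship, rank markers, or parenthetical subgenera), which real data will not always satisfy.

```python
def implied_namestrings(namestring):
    """Return the NameString followed by the shorter NameStrings it implies.

    Assumption (not from the thread): the NameString is a plain sequence of
    space-separated elements, e.g. genus, species epithet, subspecies epithet.
    """
    parts = namestring.split()
    # Longest first: the trinomial itself, then the binomial, then the genus.
    return [" ".join(parts[:n]) for n in range(len(parts), 0, -1)]


names = implied_namestrings("Pseudanthias ventralis ventralis")
for name in names:
    # The length includes the spaces, as in the character counts quoted above.
    print(len(name), name)
```

Under those assumptions, a single trinomial yields three NameStrings: the trinomial, the species name, and the genus name.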
There are around 1100 species mentioned in this publication. They are probably mentioned on average 3 times each (a guess) so that is 3300 new name usages.
No, it is 1100 NameString usages that include spaces as part of the NameString (i.e., species-group names), plus an additional 300 NameString usages at the genus level (a guess), plus whatever NameStrings at higher ranks.
Again, a NameString is identified as a string of textual characters meant to represent a name of biological organisms. Multiple copies of the same NameString meant to represent the same name of the same biological organisms within a defined documentation instance do not constitute multiple NameStrings, and hence do not represent multiple NameUsage instances. The "meant to represent a name of biological organisms" part of the definition is necessary to address the fact that homonyms are meant to refer to different names, even though they share the same NameString. Thus, the appearance of two separate homonymous NameStrings intended to represent two separate names constitutes two separate NameUsage instances (in need of two separate GUIDs).
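The counting rule in the paragraph above can be sketched in a few lines, purely as an illustration. It assumes each occurrence has already been reduced to a triple of (documentation instance, NameString, intended name); the function name count_name_usages, the labels "doc-A" and "doc-B", and the identifiers "name-1" to "name-4" are all hypothetical placeholders, and deciding which name an author intended is exactly the part that code cannot do.

```python
def count_name_usages(occurrences):
    """Count NameUsage instances from occurrence triples.

    Each occurrence is (documentation_instance, namestring, intended_name).
    All three fields are assumed to have been assigned by a person beforehand.
    """
    # A set collapses replicate copies: the same NameString, meant as the same
    # name, within the same documentation instance is counted only once.
    return len(set(occurrences))


occurrences = [
    # Replicate copies within one documentation instance: one usage each.
    ("doc-A", "Ditrichum cornubicum", "name-1"),
    ("doc-A", "Ditrichum cornubicum", "name-1"),
    ("doc-A", "Ditrichum cornubicum", "name-1"),
    ("doc-A", "Buxbaumia aphylla", "name-2"),
    ("doc-A", "Buxbaumia aphylla", "name-2"),
    # Homonyms: the same NameString meant as two different names, so two usages.
    ("doc-B", "Aus bus", "name-3"),
    ("doc-B", "Aus bus", "name-4"),
]

print(count_name_usages(occurrences))  # 4
```

Seven occurrences collapse to four usages here: two in the first documentation instance, and two in the second because the homonyms are kept apart by their intended names.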
I really can't see how one would apply your definition.
How about now, after I have clarified the definition?
Perhaps if you restricted it to taxonomic works
That is a question of the scope of documentation instances; not the scope of NameUsage instances. I would suggest leaving the definition flexible and broad.
It certainly isn't clear to me.
Is it any more clear now? If not, please let me know what parts don't make sense.
We can easily define what a TaxonConcept is because it implies intent. If I want to create an object that I want you to refer to as a definition of a taxon then I am creating a TaxonConcept and should issue a GUID to make it easy for you to refer to it. If not then I shouldn't bother. If I want to use the services of a nomenclator to define the publication and typification of the name I am using then I can use a TaxonName GUID within my definition - but I don't have to. I can't see how that can be any simpler than that.
Evidently we have different ideas about the definition of the word "simpler"... ;-)
Porley & Hodgetts (2005) have no intention whatsoever of 'committing' nomenclatural acts or of defining any taxa that people will later refer to. They are simply referring to existing concepts.
Really? Who makes that decision? You? Porley & Hodgetts (2005)? SEEK? GBIF? There is an almost perfectly smooth transition gradient in published literature (let alone unpublished documentation sources) between name-only usages and full-blown highly-defined TaxonConcepts. If you advocate two "kinds" of GUIDs (one for TaxonNames, and one for TaxonConcepts), then an arbitrary line needs to be drawn between those that are TaxonName objects, and those that should be treated as TaxonConcept objects. This discussion thread has made clear that this arbitrary line would be drawn in different places by different people. Why not start a GUID system from the simplest and most flexible approach (which also happens to be the most objective in terms of identifying objects to which GUIDs should be assigned), and condense as much of the subjectivity as possible into the second-tier informatics task of establishing links among GUID-represented objects? That second-tier task is already rife with subjectivity and ambiguity no matter how many "kinds" of taxon GUIDs are established.
Yet by your definition they have created over 6k name usages that a diligent publisher might issue GUIDs for.
Well....probably more like 1.5K Name usages. And who is the "publisher" in this case? The publisher of the book? Is that who we expect will assign the GUIDs? Or do you mean the publisher more generically -- as a publisher of GUID assignments (like a data provider)?
Have I completely misinterpreted your definition?
Well...."completely" might be too strong of a word. The word "mostly" seems like a more appropriate adverb in this case.
If so could you define it a little tighter?
I thought I had....let me know if this message does not clarify the definition.
If you imply that the author has to have meant to describe something then you are just creating the TaxonConcept definition I am working with here. How else can you subset all the times names appear in print?
Well....there minimally has to be sufficient evidence that the author intended a given NameString to represent biological organisms -- so that we can at least classify the NameString as a "name" (as distinguished from the large quantities of other non-name words that might appear within a documentation instance). But whether or not the described "something" rises to the level of a TaxonConcept, or is merely intended as a reference to a pre-existing TaxonConcept (or something else entirely), is an ambiguity that I do not think should clutter the discussion about GUIDs for taxonomic objects.
This is all great fun but we do need to nail it down and move on.
I agree 100%. Unfortunately, the level of disagreement (not just from me) as expressed in this discussion makes it clear that we have not yet nailed it down.
Finally, let me know if you want me to clarify what I mean by "explicitly implied" in my one-sentence definition. This is where I expected some argument, because it is the one area where there is ambiguity in my approach (other than the documentation instance thing -- the ambiguity of which applies equally to all of the proposed taxon GUID solutions). I don't believe that any solution is completely free of ambiguity -- but I believe that NameString Usage instances are (by far) the least ambiguous of objects, and can flexibly serve as "handles" to both TaxonName objects (by any definition) and TaxonConcept objects (in the sense that we all seem to accept that "Aus bus Smith 1995 SEC Jones 2000" can serve as a "handle" to a TaxonConcept instance).
Aloha, Rich
Richard L. Pyle, PhD Database Coordinator for Natural Sciences and Associate Zoologist in Ichthyology Department of Natural Sciences, Bishop Museum 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
--
------------------------------------- Roger Hyam Technical Architect Taxonomic Databases Working Group ------------------------------------- http://www.tdwg.org roger@tdwg.org +44 1578 722782 -------------------------------------