I am back in the land of Internet access again after several weeks of glorious lack of it.
There have been a number of posts regarding John Wieczorek's proposed resolution of the "class Individual" proposal (http://code.google.com/p/darwincore/issues/detail?id=69) through the creation of the "BiologicalEntity" class (http://lists.tdwg.org/pipermail/tdwg-content/2011-July/002575.html). In these posts two general issues were raised:
I. Competency questions, i.e. how will the creation of this class help us do something useful?
II. What sorts of resources should be instances of this class, i.e. how should the class be defined?
These general questions were previously discussed at length between October 2009 and February 2011, concurrently with general questions about the meaning of "Event", "Occurrence", "Taxon" and other existing DwC classes. Since it requires many hours to review the many tdwg-content posts from that period, I will refer anyone who is interested in skimming the main points of that discussion to a summary at http://code.google.com/p/darwin-sw/wiki/TdwgContentEmailSummary .
When I proposed the Individual class, I wanted to explicitly recognize a class of resources that would facilitate three things, which I suppose are "competency questions" of a sort:

1. To allow for linking multiple Occurrence records that involved the same organism at different times and/or places (i.e. the intended purpose of the existing dwc:individualID term, which is to facilitate resampling). Examples would be mark/recapture, radio tracking, photo-identification of whales, tracking the status of a sessile organism over time, etc.

2. To allow for the linking or grouping of multiple forms of evidence associated with the same organism, which may have been collected at the same time and place (multiple forms of documentation for a single Occurrence) or during several Occurrences. Examples would be collections of several images, several specimens, or both images and specimens from the same organism.

3. To link multiple Identifications of the same individual organism, particularly when these Identifications were based on different pieces of evidence arising from the same individual. For example, if "duplicate" specimens from the same organism ended up in different museums they might be assigned different Identifications. An Identification asserted for one specimen would apply to other specimens from the same individual organism even if that Identification wasn't explicitly assigned to the other specimens. I have referred to this function as "inferring duplicates" - it could also be called "tracking duplicates" if the two samples were known to have arisen from the same individual organism before they were identified rather than after.

(Since I'm not really up on the technical definition of "competency question", please excuse me if I'm misapplying the term to mean "what we want something to be able to do".)
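To make functions 1 and 2 concrete, here is a minimal sketch in Python of what "linking Occurrences through a shared individual identifier" amounts to. The Occurrence record, its field names, and the sample identifiers are all hypothetical and chosen only for illustration; only the idea of a shared identifier playing the role of dwc:individualID comes from the discussion above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical, simplified Occurrence record (not a DwC schema)."""
    occurrence_id: str
    individual_id: str  # plays the role of dwc:individualID
    event_date: str     # ISO 8601 date, so string order is date order
    evidence: str       # e.g. "image" or "specimen"


def group_by_individual(occurrences):
    """Group Occurrences by individual and order each group by date."""
    groups = defaultdict(list)
    for occ in occurrences:
        groups[occ.individual_id].append(occ)
    for occs in groups.values():
        occs.sort(key=lambda o: o.event_date)
    return dict(groups)


# Made-up records: one whale photographed twice, one tree sampled once.
records = [
    Occurrence("occ-3", "whale-A", "2011-06-02", "image"),
    Occurrence("occ-1", "whale-A", "2010-08-15", "image"),
    Occurrence("occ-2", "tree-B", "2010-09-01", "specimen"),
]
resampled = group_by_individual(records)
```

In this sketch the individual identifier does nothing except join records, which is the "joining table" role rather than a description of a real-world object.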
As the discussion progressed, it became clear that others wanted the proposed Individual class to do other things as well. In particular, Rich Pyle articulated a desire to provide a mechanism for grouping in a hierarchical manner pieces of physical evidence that were derived from living organisms. These could include aggregates of organisms, organisms, and pieces of organisms. After some rather heated discussion, Kevin Richards put his finger on the critical distinction between what I wanted the Individual class to do and what Rich wanted: "it seems like Steve's idea for the Individual more closely resembles a many-to-many joining table in a database (ie doesn't serve much use other than connecting two tables/classes together - and doesn't normally relate to a 'real world' type of object). Whereas it seems Rich's idea is to relate it more to 'real-world' objects, such as samples, re-samples, etc, to allow tracking and connectability of the observed/collected/processed individuals..." (http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001956.html).

After thinking about this for a very long time, I've become convinced that Kevin was on the right track. It seems to me that what we have here are two different sets of "competency questions" which define sets of entities that overlap but which are not congruent. An actual single, live organism can serve both as a unit for resampling and "attachment" of Identifications, AND as an organizational unit that is part of, and which has parts that are, biological samples. Some other entities, such as cohesive packs/herds and clonal organisms, can also serve both purposes. Other entities cannot: it doesn't make sense to resample dead organisms or pieces of organisms, and an Identification applied to part of a taxonomically heterogeneous unit (e.g. a mixed flock of birds) cannot be reliably inferred to apply to another part of the same unit.
Neither intended purpose/definition of "Individual" (the many-to-many database join or the mechanism for hierarchical grouping of physical evidence) is intrinsically "wrong"; they simply facilitate different competency questions.
So I believe that question II (how do you define the class?) is better answered by saying "instances of the class are resources that facilitate the competency questions" than by basing the definition on a philosophical discussion of what people imagine an "Individual" or "BiologicalEntity" to be (I think I'm agreeing with the point Bob was trying to make). I think that at a minimum, functions 1 and 2 (defining an entity to which multiple Occurrences can be linked and to which multiple forms of evidence can be linked) must be accommodated by the Individual/BiologicalEntity class; the utility of those two functions is certainly implied by the already-existing term dwc:individualID.
Personally, I would like the third function ("inferring duplicates"; linking multiple Identifications to the same entity and being assured that all Identifications of the same Individual would apply to all artifacts associated with that Individual) to be accommodated by the definition, but the current state of the discussion is unclear on this point. The sticking point is the definition of "taxonomic homogeneity". When I used that term, I intended it to mean that the entity is believed to be homogeneous to the lowest possible level, in the way that one knows that two branches from the same tree or two parts of the same clonal organism are guaranteed to have the same taxonomic identity at every level. This is NOT the way Rich was using the term - he explained that he intended that a "taxonomically homogeneous" biological entity could be an aggregate of organisms that are known to be heterogeneous at a lower taxonomic level but that for whatever reason were identified at a higher taxonomic level common to all organisms (e.g. we know that the five fish in this jar are different species of fish, but we are going to identify the lot as class=Actinopterygii for the time being). However, Cam and I have previously defined such entities as "taxonomically heterogeneous" (see http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity ). I would prefer that "taxonomic homogeneity" be restricted to the definition that I intended, simply because I don't know of any other thing to call something that is believed to be taxonomically homogeneous to the lowest possible level.
The reason why this is important is that if an Individual/BiologicalEntity is allowed to be taxonomically heterogeneous (sensu darwin-sw) then one cannot infer that discovered "duplicates" share all Identifications given to any duplicate pieces of evidence, whereas if an Individual/BiologicalEntity is required to be taxonomically homogeneous (sensu darwin-sw, NOT sensu Pyle) then one can make that inference. This is a circumstance that occurs fairly regularly when comparing specimens in different herbaria - one discovers two "duplicate" specimens that have been assigned different Identifications and one infers that both Identifications apply to the source tree, clump of moss, small population of herbs, etc. (i.e. Individual/BiologicalEntity). On the other hand, if a taxonomically heterogeneous (sensu darwin-sw) marine trawl sample is subdivided into two samples that are sent to two museums, one could not safely infer that an Identification made based on one sample could be applied to the other sample. The two subsamples may contain sets of fish that can be identified to taxa that are at a lower level than the initial Identification of the conglomerate sample and which are not the same.
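The difference between the two cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Individual record, the taxonomically_homogeneous flag, the function name, and the placeholder taxon names are all hypothetical, and "homogeneous" is meant sensu darwin-sw.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """Hypothetical Individual/BiologicalEntity record (illustration only)."""
    individual_id: str
    taxonomically_homogeneous: bool  # sensu darwin-sw
    # Maps a specimen identifier to the taxon names asserted for it.
    identifications: dict = field(default_factory=dict)


def inferred_identifications(individual, specimen_id):
    """Return the Identifications that can be applied to one specimen."""
    if specimen_id not in individual.identifications:
        raise KeyError(specimen_id)
    if not individual.taxonomically_homogeneous:
        # Heterogeneous entity: only the specimen's own Identifications hold.
        return sorted(set(individual.identifications[specimen_id]))
    # Homogeneous entity: every Identification of any duplicate applies.
    pooled = set()
    for names in individual.identifications.values():
        pooled.update(names)
    return sorted(pooled)


# Made-up examples with placeholder taxon names.
tree = Individual("tree-1", True, {
    "herbarium-A": ["Taxon A"],
    "herbarium-B": ["Taxon B"],
})
trawl = Individual("trawl-1", False, {
    "museum-1": ["Taxon C"],
    "museum-2": ["Taxon D"],
})
```

Here inferred_identifications(tree, "herbarium-A") pools both Identifications, while inferred_identifications(trawl, "museum-1") returns only the one made on that subsample.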
So I'm not saying that the definition of Individual/BiologicalEntity MUST facilitate my competency question 3. What I'm saying is that we MUST make it clear whether or not we intend for the proposed class to be restricted to taxonomically homogeneous entities (sensu darwin-sw) because that will determine whether the class will facilitate competency question 3 or not. It would be a bad thing for different people to have different understandings about the restrictiveness of the term. If there is a consensus that an Individual/BiologicalEntity should be taxonomically homogeneous, then to some extent that provides a practical functional definition of what kinds of things should qualify as Individuals/BiologicalEntities. A herd of caribou, a clump of moss, a tree, a small uniform patch of herbaceous plants, a coral head, a tissue culture sample, and a network of slime mold would be Individuals/BiologicalEntities because there is a reasonable expectation that they are taxonomically homogeneous. If there is a consensus that an Individual/BiologicalEntity need NOT be taxonomically homogeneous, then pretty much any kind of thing involving life would qualify: all of the animals in Yellowstone Park, all the jars sitting on shelves in the Smithsonian Institution, the Great Barrier Reef, etc. If the definition is broadened to that level, then I'm left wondering what competency questions the proposed class could still serve.
This email has now reached or exceeded the length that many people will take the time to read. So I will draw it to a close and post a separate email on the topic of competency questions for John's proposed class "CollectionObject", which I believe addresses Rich's desire to track "real-world" objects (samples, re-samples, etc.).
Steve
Bob Morris wrote:
There is a series of jokes, and an entire TV quiz show, essentially starting from the meme "What is the question to which the answer is <X>". Now, I am not a biologist (surprise!), so it is likely that domain ignorance leaves me unable to understand whether all the postings in the thread about new DwC term resolution are arguing from the same set of questions their authors hope to have answered by a resolution of the term "Individual". (It's even a little unclear to me whether everybody has the same notion of "resolution of a term", but that's a whole different discussion, which would contain a lot of uses of "rdf:type" and the contentious "rdfs:domain").
I speculate that lengthy term definition debates would be shorter if they started with agreement on competency questions for the term. Competency questions are sort of usage scenarios cast as questions. See http://marinemetadata.org/references/competencyquestionsoverview .
Bob Morris
On Thu, Jul 14, 2011 at 2:41 AM, Richard Pyle deepreef@bishopmuseum.org wrote:
My turn to disagree (strongly, in this case). It's not an instance of a taxon, it's an instance of an Organism. A taxon is merely a non-factual (i.e., opinion-based) attribute of an organism, secondarily associated via an Identification instance.
I could probably be comfortable with "OrganismInstance"; but in that case, why not just "Organism" as Paul suggested? Isn't "Instance" sort of implied by all the classes?
I am certainly open to debate about where the "upper boundary" of an instance of this class lies, and I agree that "population" could be interpreted more as a low level of "taxon", rather than a high level of "organism". But I certainly don't think that instances of this class should be limited to a singular organism. Would a coral head then constitute thousands of instances of this class? Surely such colonies could be collapsed into a single instance of this class. And the same would likely also be useful for colonies of insects (ants, termites, bees, etc.), as well as small groups (pack of wolves, pod of whales, etc.); not to mention a specimen "lot" in a Museum collection.
I agree it should have only *one* taxon, but that there should be no upper limit on the rank of this taxon. If more than one taxon is identified, then there needs to be a separate instance of this class for each identified taxon. But this only applies when multiple taxa are acknowledged -- it does NOT restrict multiple taxa being linked to the same instance via multiple identifications when there is a difference of opinion about what the correct taxon identity should be. In other words, an instance of this class may be identified as "A" *or* "B", but could not legitimately be identified as "A" *and* "B" simultaneously (except, perhaps in the case of hybrids, but that's another situation altogether).
More later.
Aloha, Rich
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Gregor Hagedorn
Sent: Tuesday, July 12, 2011 10:09 AM
To: Steven J. Baskauf
Cc: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] New terms need resolution: "Individual"
represent a single taxon. I think that Individual is probably not a good name due to confusion with the technical use of that term elsewhere.
TaxonInstance seems to me to be perhaps most precise. Personally I have a problem merging individual with population, since population -> metapopulation -> subspecies form a continuum in my understanding. But I am quite willing to be pragmatical :-)
Gregor _______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content