Hi Gregor,
Again, I think we need to pin down the concepts before we spend too much time thinking about what the terms should be. I understand that it's difficult to discuss the concepts without the terms, but focusing on the terms first is putting the cart before the horse. Perhaps we should follow Chuck's advice and use Latin terms or some other kind of label that does not come with so much baggage -- so we don't attach too many preconceived notions. The only terms in this space that have "standard" definitions are "Individual" (as inferred from dwc:individualID) and "CollectionObject" (from historical definitions in the MVZ/ASC models). The DSW terms are also useful and well-defined, but are not as well established as standards.
All of the terms you propose ("Sample", "Unit", etc., as well as a number of others such as "item" and "object") were very much under consideration when we were thinking about these things, but each has weaknesses comparable to those of the alternatives (most people think of a "unit" as a unit of measure, and "Sample" implies the collection or removal of something from nature). We eventually went with "Individual" because it is already defined in DwC (though not well defined).
But let's see if we can have a discussion first about the different concepts we are talking about. Can we nest all of these ideas under a single superclass? Or do we need different top-level classes to define/distinguish:
- Biological things from non-biological things;
- Collected/extracted things from things that are observed/documented in-situ;
- Things that are taxonomically homogeneous from things that include taxonomic heterogeneity?
Also, to what extent are the things we want to discuss best represented as a nested hierarchy? For example, if an aggregate (school, colony, herd, etc.) has multiple specimens collected at the same time, and then some of those specimens have parts/tissues removed from them, and then some of those tissues have subsamples taken or destroyed for analysis... are those all instances of the same class of thing, related via a parent/child hierarchy? Or is each step a separate class of thing? If the latter, how many classes of such things do we need?
I used the word "thing" as much as I could to avoid the term-baggage issue, but another term that I have found useful in these conversations is "Object". As I wrote to Chuck off-list, the top-level superclass might be thought of as "physicalObject", under which some of these other concepts (e.g., "biologicalObject" and others) can be represented either as subclasses or as key properties of instances within the superclass.
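To make the "single superclass with properties" option concrete, here is a rough sketch, nothing more. Every name in it (PhysicalObject, derive, lineage, and the individual properties) is a placeholder I made up for illustration and is not a proposed term. It assumes that each step in the herd/specimen/tissue/subsample chain is an instance of the same class, linked to its source by a parent/child relationship, and that the three distinctions listed above are properties of instances and not separate classes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class PhysicalObject:
    """Hypothetical top-level class; all names here are illustrative only."""

    label: str
    is_biological: bool = True
    collected: bool = True  # False means observed/documented in situ
    taxa: frozenset[str] = frozenset()
    parent: Optional[PhysicalObject] = field(default=None, repr=False)
    children: list[PhysicalObject] = field(default_factory=list, repr=False)

    def derive(self, label: str, **overrides) -> PhysicalObject:
        """Create a child object taken from this one (specimen, tissue, ...)."""
        attrs = {
            "is_biological": self.is_biological,
            "collected": True,  # assumption: anything removed from a parent is "collected"
            "taxa": self.taxa,
        }
        attrs.update(overrides)
        child = PhysicalObject(label=label, parent=self, **attrs)
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        """Labels from the top-most ancestor down to this object."""
        node: Optional[PhysicalObject] = self
        path: list[str] = []
        while node is not None:
            path.append(node.label)
            node = node.parent
        return path[::-1]

    @property
    def taxonomically_homogeneous(self) -> bool:
        return len(self.taxa) <= 1


# One aggregate observed in situ, then three derivation steps.
herd = PhysicalObject("herd", collected=False, taxa=frozenset({"Taxon A"}))
specimen = herd.derive("specimen")
tissue = specimen.derive("tissue")
subsample = tissue.derive("subsample")
print(subsample.lineage())
```

The alternative (each step as its own class) would replace the boolean properties with subclasses; the sketch only shows that one class plus a parent link is enough to carry the whole chain.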
Aloha, Rich
-----Original Message-----
From: Gregor Hagedorn [mailto:g.m.hagedorn@gmail.com]
Sent: Sunday, May 26, 2013 7:37 PM
To: Richard Pyle
Cc: Steve Baskauf; John Deck; TDWG Content Mailing List; Robert Whitton; Walter G. Berendsohn; Anton Güntsch
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to material samples
"population" or "taxon" -- things get very messy. During the previous discussion, it seemed that everyone agreed that "taxon" was too broad and not useful for our purposes, whereas "colony", "school", "herd", etc. were absolutely necessary (lest we need to treat every polyp on a coral head as a separate database entry, and many use cases involve treating herd/pod/flock as an important unit to be able to track in exactly the same way that "individuals" are tracked -- which is why the definition of dwc:individualID is as it is). The borderline term is "population" -- which is clearly below the realm of taxon, but perhaps a bit too vague and poorly defined to be regarded as the same class as an individual.
Yes, and I am also thinking of metapopulations, just for fun...
I understand that we need an operationally defined term with a particular use case. Primarily, I agree with Chuck's observation that an important point is that we believe we understand what an individual is, in biology as well as in informatics and philosophy, so I just want to warn about the use of this term. Secondarily, I do believe we need to define it in a way that makes the upper limit of the set we want to refer to clear, at least operationally. I was missing that.
With respect to the operational usefulness of a term that does not distinguish between part, individual, and set of taxonomically homogeneous OR heterogeneous samples (you excluded taxonomically heterogeneous ones, which would exclude lichens, most plants (mycorrhiza!), or any other symbiosis, like us humans with our skin and intestinal microbes), I have two proposals:
Broadly, the term "sample" does not necessarily imply physical sampling; you can sample observations or data. A sample is a subset of a (statistical) population, selected under operational rules, and it encompasses both the individual and the part. Think of sampling music.
The other term is "unit". What you describe has been discussed extensively in the CDEFG and later ABCD papers and standard, and the term "Unit" was chosen for, as far as I understand, exactly the concept you are looking for. Perhaps that can be followed, if we accept that these are not "natural" units, but operationally defined units (think of OTUs, operational taxonomic units). "Unit" was chosen in ABCD because it applies in a sampling context, in an analysis context, and in a collection curation context.
Apologies if I am repeating something already discussed.
Gregor