If we go down the road we seem to be traveling now, I think we must allow a wolf pack to be an Individual. It is a definable aggregation of biological individuals which can be assigned an identifier, can be the subject of repeated Occurrences, and can reliably be identified to some taxon. The definition we have doesn't (at this point) forbid that or say that the number of biological individuals in an Individual must stay the same over time. It is very convenient for the use case that got this all started (having a small patch of plants of the same species) that we don't require the number to stay fixed. A major feature of Individual is to allow sampling over time (a la the definition of individualID), and if the number of individuals changes between Occurrences I shouldn't have to define a new Individual. It is quite likely that I won't know or won't care whether the number has changed.
Fair enough. Obviously, if we're going to allow for a population to be an Individual, and we're going to want to track that population over time, then the composition of the population is going to change over time. It just seems weird from a specimen perspective to associate the individualCount with the Occurrence, rather than with the Individual. But I have to agree with your points above.
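To make the arrangement above concrete, here is a minimal Python sketch, assuming individualCount is recorded per Occurrence while the Individual keeps one stable identifier. The class names, field names, and the "urn:example:pack-1" identifier are hypothetical illustrations, not part of Darwin Core beyond the two terms they mirror (individualID and individualCount).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Occurrence:
    # One record of the Individual at a particular time.
    event_date: str
    # The count belongs to the Occurrence; None means "not counted / not known".
    individual_count: Optional[int] = None


@dataclass
class Individual:
    # The identifier stays the same across repeated Occurrences.
    individual_id: str
    taxon: str
    occurrences: List[Occurrence] = field(default_factory=list)

    def record(self, event_date: str, individual_count: Optional[int] = None) -> Occurrence:
        # Attach a new Occurrence without redefining the Individual.
        occurrence = Occurrence(event_date, individual_count)
        self.occurrences.append(occurrence)
        return occurrence


# Hypothetical wolf pack sampled three times; the count changes or is unknown.
pack = Individual("urn:example:pack-1", "Canis lupus")
pack.record("2010-06-01", 7)
pack.record("2010-09-15", 9)
pack.record("2011-01-20")  # nobody counted this time

counts = [o.individual_count for o in pack.occurrences]
print(pack.individual_id, counts)
```

The pack keeps a single identifier across all three records, and a missing count is simply left empty rather than forcing a new Individual.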
previousIdentifications
Hmm. I suppose yes, but better to just have another instance of Identification. Why not?
When the data are structured that way at the source, yes. But a number of DwC terms exist because many content sources have not parsed/normalized all their data to the full extent of the DwC classes. Therefore, I think previousIdentifications should be kept, and if so, it should be part of the Individual class.
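The difference between the two positions can be sketched as follows. This is only an illustration, assuming a source that delivers previous identifications as one delimited string; the "|" separator, the Identification class, and the function name are assumptions made for the example, not something defined in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Identification:
    # Hypothetical normalized record: one instance per identification.
    individual_id: str
    identification_text: str


def parse_previous_identifications(
    individual_id: str, value: str, separator: str = "|"
) -> List[Identification]:
    # Split the flat previousIdentifications string and drop empty pieces.
    parts = [part.strip() for part in value.split(separator)]
    return [Identification(individual_id, part) for part in parts if part]


# A provider that has not normalized its data supplies one flat string.
flat_value = "Canis sp. | Canis lupus"
records = parse_previous_identifications("ind-1", flat_value)
for record in records:
    print(record)
```

A provider with normalized data would publish the Identification instances directly; a provider without them can still publish the flat string, and a consumer can split it when needed.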
associatedSequences
I suppose you won't agree on this, but I don't see sequences as any different than other tokens/evidence types that I think we should allow to document Occurrences. I would like this term to eventually go away, at least for people using RDF who will explicitly create resources for tokens and then type them.
OK, well I guess "Sequences" per se are functionally equivalent to images, in that they are not the organism itself, but rather a representation of some aspect of the organism (in this case, a representation of the molecular structure of the DNA molecules contained within the cells of the organism, rather than a representation of light waves reflected off the exterior of an organism in the case of an image, or of x-ray waves transmitted through an organism in the case of a radiograph image). I was thinking more in terms of tissue samples -- which I will much more stubbornly defend as being in the Individual class -- but I guess more in terms of "individualScope".
Here's one thing I'm not so certain about, though. An in-situ image of an organism is clearly a token of an Occurrence, because it is evidence of the organism at the place/time. An image of the preserved specimen in a Museum, or an x-ray, etc., is not really a token of an Occurrence, because it's not evidence of the organism at the place/time of its capture. Same goes for Sequences -- they are a token of the Individual organism, not of the occurrence of the organism at a place and time. This is why I have a hard time thinking of such things as tokens of an Occurrence, when they are really more tokens of the Individual.
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers are re-assigned to Record-level terms. Is there some reason this isn't appropriate? I think it is appropriate because they should be usable with at least two classes: Individual (for living specimens) and Occurrence (e.g. preserved specimens, images).
Hmmm...on that basis, should individualCount and the various tokens also be Record-level terms -- on the basis that they can apply either to an Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and such, isn't it possible to also represent a DNA Sequence as an attribute of a Taxon as well? If the purpose of Record-level terms is to aggregate terms that apply to more than one class, then perhaps that is the solution for a number of these things (including disposition, and maybe even preparation -- depending on how broadly those things are defined)?
I haven't said this before, but are we allowing Individuals to be dead?
Errr....fossils? Preserved specimens? Are they not Individuals? I know you think of them in terms of tracking a living organism over time. But that's only one of the reasons why I support an Individual class (not even the main reason). To me, the main reason is that an "Individual" represents the actual organism(s), separate from an Occurrence, which represents the presence of an organism at a particular place and time.
If we put it in a jar of alcohol and cut it into many separately-cataloged pieces, are all of the pieces still some of the Individual?
This is why we need two things for an Individual:
individualScope (which can range anywhere from the aggregates of multiple individuals, all the way down to the smallest parts of individuals)
And, a mechanism to track series of "derived from" Individuals. The ASC model covered this, I think (right, Stan?)
I think Pete might have been suggesting modeling things that way with "partOf". What if we cut a branch from a tree, glue part of it to a page, and turn part of it into a DNA sample that gets sequenced? Are those all a part of an Individual?
It seems to me that each unit could represent a separate instance of Individual, but the "parts" need to be clearly aggregated around the single-organism parent Individual, which itself may be a part of another Individual instance that is an aggregate lot of specimens, which itself could be a subsampling of another Individual instance that represents a population in nature.
In my mind, we parse all of these things as separate instances of Individuals, but join them via a hierarchical (parent/child) relationship. If I'm not mistaken, this is how the ASC model managed instances of BiologicalObject (again....right, Stan?)
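The parent/child arrangement described above can be sketched roughly like this. It is a minimal illustration, assuming each unit is its own Individual instance with a single parent; the scope values, identifiers, and method names are hypothetical, and nothing here is taken from the ASC model itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Individual:
    individual_id: str
    scope: str  # hypothetical individualScope value, e.g. "population" or "tissue sample"
    # Exclude the parent link from repr/equality to avoid walking the cycle.
    parent: Optional["Individual"] = field(default=None, repr=False, compare=False)
    children: List["Individual"] = field(default_factory=list)

    def derive(self, individual_id: str, scope: str) -> "Individual":
        # Create a child Individual that is "derived from" this one.
        child = Individual(individual_id, scope, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        # Walk upward from this Individual to the top-most parent.
        chain = []
        node: Optional["Individual"] = self
        while node is not None:
            chain.append(node.individual_id)
            node = node.parent
        return chain


# Population in nature -> aggregate lot -> single organism -> tissue sample.
population = Individual("pop-1", "population")
lot = population.derive("lot-1", "lot")
organism = lot.derive("org-1", "organism")
tissue = organism.derive("tissue-1", "tissue sample")

path = tissue.lineage()
print(path)  # ['tissue-1', 'org-1', 'lot-1', 'pop-1']
```

Each piece keeps its own identifier and scope, while the chain of parents records what it was derived from.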
I don't really want them to be, but maybe I must? Somehow we need to be able to handle road-kill, which will be dead when we make the observation/collection. If we cut a branch from a tree (an Individual), root it, and grow it in a botanical garden, do we call the resulting tree in the garden the same Individual? I would assign it a new identifier and call it a new Individual. I guess my point is that I would only apply the term Individual to dead stuff, pieces of dead stuff, and living pieces of things with extreme caution.
Why extreme caution? What are the risks that we are cautioning ourselves against?
These are some principles that I always try to keep in mind when discussing these things:
- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data actually exist in content-provider databases, and how data *should* be represented in a normalised world.
- When in doubt, DwC should be accommodating, rather than restrictive -- especially when more restrictive needs can be met via associated data filtering.
There are other principles as well, but these are the ones I keep having to remind myself of.
I think maybe so. Maybe the appropriate course of action here as well is to let people try different approaches out and if they turn out to work and be needed, then we talk about applying them to Darwin Core.
Ultimately, I think people will use it in accordance with what terms are nested within it -- which is why I think it's important to have this conversation we're having now.
Aloha, Rich