Re: [tdwg-content] tdwg-content Digest, Vol 20, Issue 17

4 Nov 2010

      ...
If we go down the road we seem to be traveling now, 
I think we must allow a wolf pack to be an Individual.  
It is a definable aggregation of biological individuals 
which can be assigned an identifier that can be the 
subject of repeated Occurrences and it can reliably be 
identified to some taxon.  The definition we have doesn't 
(at this point) forbid that or say that the number of 
biological individuals in an Individual must stay the 
same over time.  It is very convenient for the use case 
that got this all started (having a small patch of plants 
of the same species) that we don't require that the 
number stay the same.  A major feature of Individual is to 
allow sampling over time (a la the definition of 
individualID) and if the number of individuals changes 
between Occurrences I shouldn't have to redefine a 
new individual.  It is quite likely that I won't know 
or don't care whether the number has changed.
Fair enough.  Obviously, if we're going to allow for a population to be an
Individual, and we're going to want to track that population over time, then
the composition of the population is going to change over time.  It just
seems weird from a specimen perspective to associate the individualCount
with the Occurrence, rather than with the Individual.  But I have to agree
with your points above.
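
A minimal Python sketch of that arrangement, assuming hypothetical class and
field names (Individual, Occurrence, individual_id, individual_count) that
only loosely mirror the DwC terms and are not a proposed schema.  The pack
keeps one identifier while the count is recorded on each Occurrence, and may
be left unknown:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Occurrence:
    # Hypothetical fields loosely mirroring dwc:eventDate and dwc:individualCount.
    event_date: str                          # ISO 8601 date string
    individual_count: Optional[int] = None   # None when the count is unknown


@dataclass
class Individual:
    # Hypothetical stand-in for something identified by dwc:individualID.
    individual_id: str
    taxon: str
    occurrences: List[Occurrence] = field(default_factory=list)

    def counts_over_time(self) -> List[Tuple[str, Optional[int]]]:
        """Return (date, count) pairs ordered by date."""
        ordered = sorted(self.occurrences, key=lambda o: o.event_date)
        return [(o.event_date, o.individual_count) for o in ordered]


# Illustrative data only: one pack, three Occurrences, one with no count.
pack = Individual("urn:example:pack-1", "Canis lupus")
pack.occurrences.append(Occurrence("2010-06-01", 7))
pack.occurrences.append(Occurrence("2009-06-01", 5))
pack.occurrences.append(Occurrence("2010-11-04"))
```

The Individual is never redefined when the count changes; only the
Occurrence records differ.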
...
...
previousIdentifications
Hmm.  I suppose yes, but better to just have
another instance of Identification.  Why not?
When the data are structured that way at the source, yes.  But a number of
DwC terms exist because many content sources have not parsed/normalized all
their data to the full extent of the DwC classes.  Therefore, I think
previousIdentifications should be kept, and if so, it should be part of the
Individual class.
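
As a rough sketch of how the two representations relate, assuming the flat
previousIdentifications value is a "|"-delimited string and using a
hypothetical, minimal Identification record (real Identification data carry
many more terms), a consumer could split the flat value into separate
instances when it wants the normalized form:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Identification:
    # Hypothetical minimal record tied to an Individual by its identifier.
    individual_id: str
    verbatim: str


def split_previous_identifications(
    individual_id: str, value: str, sep: str = "|"
) -> List[Identification]:
    """Split a flat, delimited string into one Identification per entry.

    The separator is an assumption; empty entries are dropped.
    """
    parts = [p.strip() for p in value.split(sep)]
    return [Identification(individual_id, p) for p in parts if p]


# Illustrative input only.
previous = split_previous_identifications(
    "urn:example:ind-1", "Anthus sp. | Anthus correndera"
)
```

A provider that has not parsed its data can still ship the flat string, and
nothing is lost for a consumer that prefers separate instances.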
...
...
associatedSequences
I suppose you won't agree on this, but I don't see sequences 
as any different than other tokens/evidence types that I 
think we should allow to document Occurrences.  I would like 
this term to eventually go away, at least for people using 
RDF who will explicitly create resources for tokens and then 
type them.
OK, well I guess "Sequences" per se are functionally equivalent to images,
in that they are not the organisms themselves, but rather a representation of
some aspect of the organisms (in this case, a representation of the
molecular structure of the DNA molecules contained within the cells of the
organism, rather than a representation of light waves reflected off the
exterior of an organism in the case of an image, or of x-ray waves
transmitted through an organism in the case of a radiograph image).  I was
thinking more in terms of tissue samples -- which I will much more
stubbornly defend as being in the Individual class -- but I guess more in
terms of "individualScope".

Here's one thing I'm not so certain about, though.  An in-situ image of an
organism is clearly a token of an Occurrence, because it is evidence of the
organism at the place/time.  An image of the preserved specimen in a Museum,
or an x-ray, etc., is not really a token of an Occurrence, because it's not
evidence of the organism at the place/time of its capture.  Same goes for
Sequences -- they are a token of the Individual organism, not of the
occurrence of the organism at a place and time.  This is why I have a hard
time thinking of such things as tokens of an Occurrence, when they are
really more tokens of the Individual.
...
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be
re-assigned to Record-level terms. Was there some reason this isn't
appropriate?
I think it is appropriate because they should be usable with at 
least two classes: Individual (for living specimens) and 
Occurrences (e.g. preserved specimens, images)
Hmmm...on that basis, should individualCount and the various tokens also be
Record-level terms -- on the basis that they can apply either to an
Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and
such, isn't it possible to also represent a DNA Sequence as an attribute of
a Taxon as well?  If the purpose of Record-level terms is to aggregate terms
that apply to more than one class, then perhaps that is the solution for a
number of these things (including disposition, and maybe even preparation --
depending on how broadly those things are defined)?
...
I haven't said this before, but are we allowing Individuals 
to be dead?
Errr....fossils? Preserved specimens? Are they not Individuals?  I know you
think of them in terms of tracking a living organism over time.  But that's
only one of the reasons why I support an Individual class (not even the main
reason).  To me, the main reason is that an "Individual" represents the
actual organism(s), separate from an Occurrence, which represents the
presence of an organism at a particular place and time.
...
If we put it in a jar of alcohol 
and cut it into many separately-cataloged pieces, are 
all of the pieces still some of the Individual?
This is why we need two things for an Individual: 

individualScope (which can range anywhere from the aggregates of multiple
individuals, all the way down to the smallest parts of individuals)

And, a mechanism to track series of "derived from" Individuals.  The ASC
model covered this, I think (right, Stan?)
...
I think Pete might have been suggesting modeling 
things that way with "partOf".  What if we cut a 
branch from a tree, glue part of it to a page 
and turn part of it into a DNA sample that 
get sequenced.  Are those all a part of an 
Individual?
It seems to me that each unit could represent a separate instance of
Individual, but the "parts" need to be clearly aggregated around the
single-organism parent Individual, which itself may be a part of another
Individual instance that is an aggregate lot of specimens, which itself
could be a subsampling of another Individual instance that represents a
population in nature.

In my mind, we parse all of these things as separate instances of
Individuals, but join them via a hierarchical (parent/child) relationship.
If I'm not mistaken, this is how the ASC model managed instances of
BiologicalObject (again....right, Stan?)
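
A minimal Python sketch of that parent/child idea, assuming a hypothetical
Individual record with a free-text scope (standing in for "individualScope")
and a single parent link.  The names and scope values are illustrative only
and are not taken from the ASC model:

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Individual:
    individual_id: str
    scope: str                               # hypothetical individualScope value
    parent: Optional["Individual"] = None    # the Individual this one derives from

    def lineage(self) -> Iterator["Individual"]:
        """Yield this Individual, then each ancestor up to the root."""
        node: Optional["Individual"] = self
        while node is not None:
            yield node
            node = node.parent


# Illustrative chain: population -> lot -> single organism -> its parts.
population = Individual("urn:example:pop-1", "population")
lot = Individual("urn:example:lot-1", "lot", parent=population)
tree = Individual("urn:example:org-1", "organism", parent=lot)
sheet = Individual("urn:example:part-1", "herbarium sheet", parent=tree)
dna = Individual("urn:example:part-2", "DNA sample", parent=tree)
```

Both the sheet and the DNA sample resolve to the same single-organism
parent, which in turn resolves to the lot and the population.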
...
I don't really want them to be, but maybe I must?  
Somehow we need to be able to handle road-kill, 
which will be dead when we make the 
observation/collection.  If we cut a branch from 
a tree (an Individual), root it, and grow it in 
a botanical garden, do we call the resulting tree 
in the garden the same Individual?  I would assign 
it a new identifier and call it a new Individual.  
I guess my point is that I would only apply the 
term Individual to dead stuff, pieces of dead stuff, 
and living pieces of things with extreme caution.
Why extreme caution?  What are the risks that we are cautioning ourselves
against?

These are some principles that I always try to keep in mind when discussing
these things:

- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data
actually exist in content-provider databases, and how data *should* be
represented in a normalised world.
- When in doubt, DwC should be accommodating, rather than restrictive --
especially when more restrictive needs can be met via associated data
filtering.

There are other principles as well, but these are the ones I keep having to
remind myself of.
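
The third principle can be sketched in a few lines of Python, assuming
records are plain dictionaries and that "individualScope" carries
hypothetical values such as "organism" or "population".  The exchange format
stays permissive, and a consumer with narrower needs filters on its side:

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def only_scope(records: Iterable[Record], wanted: str) -> List[Record]:
    """Keep only records whose (hypothetical) individualScope matches."""
    return [r for r in records if r.get("individualScope") == wanted]


# Illustrative records only; the identifiers and scope values are made up.
records = [
    {"individualID": "urn:example:pack-1", "individualScope": "population"},
    {"individualID": "urn:example:org-1", "individualScope": "organism"},
    {"individualID": "urn:example:part-1", "individualScope": "tissue sample"},
]
organisms = only_scope(records, "organism")
```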
...
I think maybe so.  Maybe the appropriate course 
of action here as well is to let people try 
different approaches out and if they turn out 
to work and be needed, then we talk about 
applying them to Darwin Core.
Ultimately, I think people will use it in accordance with what terms are
nested within it -- which is why I think it's important to have this
conversation we're having now.

Aloha,
Rich

Richard Pyle