[tdwg-content] tdwg-content Digest, Vol 20, Issue 17

Thu Nov 4 19:18:01 CET 2010

> If we go down the road we seem to be traveling now, 
> I think we must allow a wolf pack to be an Individual.  
> It is a definable aggregation of biological individuals 
> which can be assigned an identifier that can be the 
> subject of repeated Occurrences and it can reliably be 
> identified to some taxon.  The definition we have doesn't 
> (at this point) forbid that or say that the number of 
> biological individuals in an Individual must stay the 
> same over time.  It is very convenient for the use case 
> that got this all started (having a small patch of plants 
> of the same species) that we don't require that the 
> number change.  A major feature of Individual is to 
> allow sampling over time (a la the definition of 
> individualID) and if the number of individuals changes 
> between Occurrences I shouldn't have to redefine a 
> new individual.  It is quite likely that I won't know 
> or don't care whether the number has changed.

Fair enough.  Obviously, if we're going to allow for a population to be an
Individual, and we're going to want to track that population over time, then
the composition of the population is going to change over time.  It just
seems weird from a specimen perspective to associate the individualCount
with the Occurrence, rather than with the Individual.  But I have to agree
with your points above.
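
To make the consequence concrete, here is a minimal sketch (Python, purely
illustrative) of an Individual that is sampled repeatedly, with the count
carried by each Occurrence rather than by the Individual.  Only individualID,
eventDate and individualCount correspond to DwC term names; the class layout,
the method name and the example values are my own assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Occurrence:
    # Field names mirror dwc:eventDate and dwc:individualCount;
    # the class itself is a hypothetical layout, not part of DwC.
    event_date: str
    individual_count: Optional[int] = None  # None when the count is unknown


@dataclass
class Individual:
    individual_id: str
    occurrences: List[Occurrence] = field(default_factory=list)

    def counts_over_time(self) -> List[Tuple[str, Optional[int]]]:
        """Return (event_date, individual_count) pairs in date order."""
        ordered = sorted(self.occurrences, key=lambda o: o.event_date)
        return [(o.event_date, o.individual_count) for o in ordered]


# Invented example: one pack, three Occurrences, one with no count recorded.
pack = Individual("wolf-pack-1")
pack.occurrences.append(Occurrence("2010-06-01", 9))
pack.occurrences.append(Occurrence("2009-06-01", 7))
pack.occurrences.append(Occurrence("2011-06-01"))
```

The Individual keeps one identifier throughout, and a changed or unknown
count never forces a new Individual to be defined.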

> > previousIdentifications
>
> Hmm.  I suppose yes, but better to just have
> another instance of Identification.  Why not?

When the data are structured that way at the source, yes.  But a number of
DwC terms exist because many content sources have not parsed/normalized all
their data to the full extent of the DwC classes.  Therefore, I think
previousIdentifications should be kept, and if so, it should be part of the
Individual class.
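
As a rough illustration of the two shapes of data, the sketch below (Python)
takes a flattened previousIdentifications string, as an un-normalized source
might supply it, and splits it into one record per earlier determination.
The "|" separator, the Identification layout, the function name and the
example values are all assumptions made for the example; sources that flatten
their data differently would need a different parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Identification:
    individual_id: str
    verbatim: str  # unparsed text of one earlier determination


def split_previous_identifications(
    individual_id: str, value: str, sep: str = "|"
) -> List[Identification]:
    """Split a flattened previousIdentifications value into separate records.

    The separator is an assumption; empty entries are dropped.
    """
    parts = [p.strip() for p in value.split(sep)]
    return [Identification(individual_id, p) for p in parts if p]


# Invented example value, as it might arrive from a flat source.
flat = "Chaetodon miliaris | Chaetodon sp."
records = split_previous_identifications("specimen-123", flat)
```

A source that already holds separate Identification instances can skip the
flat term entirely; one that does not can still exchange what it has.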

> > associatedSequences
>
> I suppose you won't agree on this, but I don't see sequences 
> as any different than other tokens/evidence types that I 
> think we should allow to document Occurrences.  I would like 
> this term to eventually go away, at least for people using 
> RDF who will explicitly create resources for tokens and then 
> type them.

OK, well I guess "Sequences" per se are functionally equivalent to images,
in that they are not the organisms themselves, but rather a representation of
some aspect of the organisms (in this case, a representation of the
molecular structure of the DNA molecules contained within the cells of the
organism, rather than a representation of light waves reflected off the
exterior of an organism in the case of an image, or of x-ray waves
transmitted through an organism in the case of a radiograph image).  I was
thinking more in terms of tissue samples -- which I will much more
stubbornly defend as being in the Individual class -- but I guess more in
terms of "individualScope".

Here's one thing I'm not so certain about, though.  An in-situ image of an
organism is clearly a token of an Occurrence, because it is evidence of the
organism at the place/time.  An image of the preserved specimen in a Museum,
or an x-ray, etc., is not really a token of an Occurrence, because it's not
evidence of the organism at the place/time of its capture.  Same goes for
Sequences -- they are a token of the Individual organism, not of the
occurrence of the organism at a place and time.  This is why I have a hard
time thinking of such things as tokens of an Occurrence, when they are
really more tokens of the Individual.

> This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be
> re-assigned to Record-level terms. Was there some reason this isn't
> appropriate?
>
> I think it is appropriate because they should be usable with at 
> least two classes: Individual (for living specimens) and 
> Occurrences (e.g. preserved specimens, images)

Hmmm...on that basis, should individualCount and the various tokens also be
Record-level terms -- on the basis that they can apply either to an
Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and
such, isn't it possible to also represent a DNA Sequence as an attribute of
a Taxon as well?  If the purpose of Record-level terms is to aggregate terms
that apply to more than one class, then perhaps that is the solution for a
number of these things (including disposition, and maybe even preparation --
depending on how broadly those things are defined)?

> I haven't said this before, but are we allowing Individuals 
> to be dead?  

Errr....fossils? Preserved specimens? Are they not Individuals?  I know you
think of them in terms of tracking a living organism over time.  But that's
only one of the reasons why I support an Individual class (not even the main
reason).  To me, the main reason is that an "Individual" represents the
actual organism(s), separate from an Occurrence, which represents the
presence of an organism at a particular place and time.

> If we put it in a jar of alcohol 
> and cut it into many separately-cataloged pieces, are 
> all of the pieces still some of the Individual? 

This is why we need two things for an Individual: 

individualScope (which can range anywhere from the aggregates of multiple
individuals, all the way down to the smallest parts of individuals)

And, a mechanism to track series of "derived from" Individuals.  The ASC
model covered this, I think (right, Stan?)

> I think Pete might have been suggesting modeling 
> things that way with "partOf".  What if we cut a 
> branch from a tree, glue part of it to a page 
> and turn part of it into a DNA sample that 
> get sequenced.  Are those all a part of an 
> Individual?  

It seems to me that each unit could represent a separate instance of
Individual, but the "parts" need to be clearly aggregated around the
single-organism parent Individual, which itself may be a part of another
Individual instance that is an aggregate lot of specimens, which itself
could be a subsampling of another Individual instance that represents a
population in nature.

In my mind, we parse all of these things as separate instances of
Individuals, but join them via a hierarchical (parent/child) relationship.
If I'm not mistaken, this is how the ASC model managed instances of
BiologicalObject (again....right, Stan?)
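
A minimal sketch of that parent/child arrangement, in Python and under my own
assumptions: the parent_id field, the scope values and the lineage() helper
are hypothetical names for illustration, and nothing here is taken from the
ASC model itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Individual:
    individual_id: str
    scope: str                       # hypothetical individualScope value
    parent_id: Optional[str] = None  # Individual this one is part of or derived from


def lineage(individuals: Dict[str, Individual], individual_id: str) -> List[str]:
    """Walk parent links from one Individual up to the top of its hierarchy."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = individual_id
    while current is not None:
        if current in seen:
            # Guard against malformed data with circular parent links.
            raise ValueError(f"cycle in parent links at {current}")
        seen.add(current)
        chain.append(current)
        current = individuals[current].parent_id
    return chain


# Invented example: population -> lot -> single organism -> tissue sample.
records = [
    Individual("population-1", "population"),
    Individual("lot-1", "lot", parent_id="population-1"),
    Individual("organism-1", "organism", parent_id="lot-1"),
    Individual("tissue-1", "tissue sample", parent_id="organism-1"),
]
by_id = {r.individual_id: r for r in records}
```

Calling lineage(by_id, "tissue-1") walks from the tissue sample up through
the organism and the lot to the population, so every part stays aggregated
around its parent while still carrying its own identifier.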

> I don't really want them to be, but maybe I must?  
> Somehow we need to be able to handle road-kill, 
> which will be dead when we make the 
> observation/collection.  If we cut a branch from 
> a tree (an Individual), root it, and grow it in 
> a botanical garden, do we call the resulting tree 
> in the garden the same Individual?  I would assign 
> it a new identifier and call it a new Individual.  
> I guess my point is that I would only apply the 
> term Individual to dead stuff, pieces of dead stuff, 
> and living pieces of things with extreme caution.

Why extreme caution?  What are the risks that we are cautioning ourselves
against?

These are some principles that I always try to keep in mind when discussing
these things:

- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data
actually exist in content-provider databases, and how data *should* be
represented in a normalised world.
- When in doubt, DwC should be accommodating, rather than restrictive --
especially when more restrictive needs can be met via associated data
filtering.

There are other principles as well, but these are the ones I keep having to
remind myself of.
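
As a small illustration of the filtering point, here is a sketch (Python) in
which the shared records stay permissive and a consumer with stricter needs
narrows them on its own side.  The individualScope values, the isLiving flag
and the filter_records() helper are hypothetical; none of them is an existing
DwC term.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def filter_records(records: List[Record], **required: Any) -> List[Record]:
    """Keep only the records whose fields equal every required value."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in required.items())
    ]


# Invented example data: an accommodating set of Individual records.
shared = [
    {"individualID": "pack-1", "individualScope": "group", "isLiving": True},
    {"individualID": "tree-1", "individualScope": "organism", "isLiving": True},
    {"individualID": "fossil-1", "individualScope": "organism", "isLiving": False},
    {"individualID": "tissue-1", "individualScope": "part", "isLiving": False},
]

# A consumer that only wants living single organisms filters on its side.
living_organisms = filter_records(shared, individualScope="organism", isLiving=True)
```

The provider is not forced to withhold the fossil or the tissue sample, and
the consumer who cares only about living organisms is not burdened by them.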

> I think maybe so.  Maybe the appropriate course 
> of action here as well is to let people try 
> different approaches out and if they turn out 
> to work and be needed, then we talk about 
> applying them to Darwin Core.

Ultimately, I think people will use it in accordance with what terms are
nested within it -- which is why I think it's important to have this
conversation we're having now.

Aloha,
Rich