[tdwg-content] Why it matters what kind of things we include in the definition of Individual

Richard Pyle deepreef at bishopmuseum.org
Sun Nov 7 07:25:22 CET 2010


> I have done a couple read-throughs of your posts and I had two 
> immediate comments.  The first is that I think we want to 
> accomplish many of the same things here and that the problem 
> really is what we want to call things, not what we want to 
> accomplish.  So I'm encouraged by that.  I think that what I 
> need to do is to get a piece of paper out and try to map out 
> what you are saying and what I'm saying.  I think they will 
> turn out to be mostly congruent but with different labels.  

Yes, I have the same sense. Maybe if we come to a consensus, we can create a
summary description of everything, for the benefit of those who do not have
time to read and digest all these posts.
	
> The second thing is that the point of recounting that story 
> had nothing to do with the rank of the tree (species, 
> subspecies, or whatever).  

OK, sorry.  The parts that threw me on that were:

> If we call mixtures of biological individuals of different 
> lower-level taxa Individuals, then we lose the certainty 
> that all instances of Occurrences arising from that 
> Individual are the same taxon.

and

> I have accepted the broadening of the definition 
> to include "taxa" at any level, but I'm thinking 
> that may have been a mistake.

I (mis)interpreted this to mean that we shouldn't try to regard organisms
identified to infra-specific ranks as "Individuals".  Sorry that I
misunderstood.

> The point I was trying to make with the story was that 
> on the scale we've been talking about from the entire 
> biosphere to populations to individual organisms to 
> parts of organisms to molecules, the individual organism 
> is the point at which we no longer have to worry that 
> further subdivisions might not share a common Identification.  

OK, I understand that, and despite my seemingly passionate pleas to
maintain the scope of "Individual" to include sub-WholeOrganism units, I'm
keeping an open mind on that.  I (maybe) could be persuaded that the
"Individual" ends with a single WholeOrganism, and parts may be dealt with
in some other way ("associatedParts"? "associatedSubunits"? As members of
the Individual class?) 

I guess part of the passion of my fight stems from my hope that the pendulum
doesn't swing so far that biological collections objects can no longer be
represented as records unto themselves through DwC (as opposed to only
represented as attributes of some sort of semi-abstract unit of "Individual"
or "Occurrence" that isn't directly represented in many/most real-world
databases).

> That is what I'm saying is "special" about the whole organism 
> level (vs. parts).  If you know that pieces came from the 
> same whole organism, then you can be confident that an 
> identification that is assigned to any of the pieces down to 
> any level of further subdivision will be the same as an 
> identification assigned to any other piece.  

I agree that along the continuum from "population" down to "single molecule
extracted from an organism", there is a pretty clear (though not perfect)
inflection point at the level of single whole organism; and I can see
recognizing that in some way in DwC.  But I just have this sense that the
demarcation should be at the level of our controlled vocabulary for
something like "individualScope", rather than what is considered "in" vs
"out" of scope for the class "Individual".

> Thus it is superfluous to assign separate identifications 
> to every piece when you can simply assign a single 
> identification to the whole organism and infer that that 
> identification applies to all of the pieces.   

Yes, but DwC is, by its nature, denormalized.  There is unnecessary
repetition of information built into it.  As long as Identification
instances have proper GUIDs (or even LUIDs within a defined dataset), which
DwC encourages via "identificationID", then I see no reason why an
identification instance cannot be simultaneously shared by instances of
"Individual" at the scope of "WholeOrganism" and below.  In fact, it
logically works above the scope of "WholeOrganism".  For example, our fish
collection assigns catalog numbers to "lots", which contain 1...n
WholeSpecimens (or sometimes parts of a whole organism).  The
Identifications apply to the lots.  So if there are 15 whole specimens
(dwc:individualCount=15) within a Lot identified as "Aus bus", then the
identification is implied for each of the 15 individual whole organisms.
Sometimes we have reason to recognize attributes of individual whole
organisms (e.g., individual lengths, or other morphological characters); in
such cases, I would imagine establishing child "Individual" instances (each
with dwc:individualCount=1), but I don't see why I would have to replicate
the Identification 15 times (each with a separate identificationID).  I
would rather have the 15 WholeSpecimens inherit the single Identification
instance through an appropriate relationship link between parent "Lot" and
child "WholeSpecimen".
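
A minimal sketch of that Lot-to-WholeSpecimen inheritance, in Python.  The
class names, the "scope" field and the effective_identifications() helper
are all hypothetical; only individualID, identificationID and
individualCount come from DwC.  It is one possible reading of the idea, not
a proposal for how DwC should serialize it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identification:
    identification_id: str   # corresponds to dwc:identificationID
    scientific_name: str


@dataclass
class Individual:
    individual_id: str       # corresponds to dwc:individualID
    scope: str               # hypothetical "individualScope" value
    individual_count: int = 1
    parent: Optional["Individual"] = None
    identifications: List[Identification] = field(default_factory=list)

    def effective_identifications(self) -> List[Identification]:
        """Own Identifications followed by those inherited from ancestors."""
        inherited = (
            self.parent.effective_identifications()
            if self.parent is not None
            else []
        )
        return self.identifications + inherited


# One Identification instance, stored once on the Lot.
shared = Identification("ident-1", "Aus bus")
lot = Individual("lot-1", "Lot", individual_count=15, identifications=[shared])

# Fifteen child Individuals, none of which stores its own Identification.
specimens = [
    Individual(f"lot-1/spec-{n}", "WholeSpecimen", parent=lot)
    for n in range(1, 16)
]
```

Each of the 15 children resolves to the same Identification object through
the parent link, so there is still only one identificationID in the dataset.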

I think there is a difference, though.  In the case of WholeOrganism
and its derived parts, the inheritance of Identification instances is
bidirectional.  That is, if a tissue sample is sequenced, and evidence from
that sequence leads to an Identification, then surely the WholeOrganism
Individual instance would inherit this Identification.

However, at scopes of Individual broader than "WholeOrganism", the
inheritance of Identification is unidirectional.  That is, a child can
inherit the Identifications of the Parent, but the parent cannot necessarily
inherit the Identifications of the child.  This is the point that nags at
the back of my brain and tries to persuade me to throw in the towel and
agree with you that "Individual" does not extend below the level of
"WholeOrganism".
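
The bidirectional vs. unidirectional rule can be sketched as follows.  This
is only an illustration under assumptions: the three-value scope vocabulary
("Lot", "WholeOrganism", "Part"), the Node class and the function names are
all hypothetical, and names are plain strings rather than Identification
instances.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical scope vocabulary, ordered from broadest to narrowest.
SCOPE_RANK = {"Lot": 0, "WholeOrganism": 1, "Part": 2}
WHOLE = SCOPE_RANK["WholeOrganism"]


@dataclass(eq=False)
class Node:
    node_id: str
    scope: str
    own_names: List[str] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def names_from_ancestors(node: Node) -> List[str]:
    """Names assigned directly to any ancestor (parent-to-child inheritance)."""
    names: List[str] = []
    current = node.parent
    while current is not None:
        names.extend(current.own_names)
        current = current.parent
    return names


def subtree_names(node: Node) -> List[str]:
    """Names assigned to a node or to anything derived from it."""
    names = list(node.own_names)
    for child in node.children:
        names.extend(subtree_names(child))
    return names


def inferred_names(node: Node) -> List[str]:
    if SCOPE_RANK[node.scope] < WHOLE:
        # Broader than one organism: inherit downward only.
        return list(node.own_names) + names_from_ancestors(node)
    # At or below WholeOrganism: climb to the organism, then share every
    # name found anywhere among the organism and its parts.
    root = node
    while root.parent is not None and SCOPE_RANK[root.parent.scope] >= WHOLE:
        root = root.parent
    return subtree_names(root) + names_from_ancestors(root)


lot = Node("lot-1", "Lot", ["Aus"])
fish = lot.add_child(Node("fish-1", "WholeOrganism"))
tissue = fish.add_child(Node("tissue-1", "Part", ["Aus bus"]))
other_fish = lot.add_child(Node("fish-2", "WholeOrganism"))
```

Here the sequence-based name on the tissue sample flows up to fish-1, but
not up to the Lot, and therefore not across to fish-2.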

> This is assuming that you have all of the pieces from that 
> organism.  If you have some of the pieces and somebody 
> else has some of them, of course the two of you would 
> assign separate identifications to your sets of pieces 
> (unless you had synchronized databases that "knew" that 
> you were both talking about the same organism - one of 
> the points of having an identifier for individual 
> organisms is so you can do that).  

The problem is, at least for the billions of specimen records already
extant, this information is not known beforehand.  Obviously, going forward
we want to capture this information at the outset (ideally, in the field at
the time of specimen acquisition).  Indeed, this is one of the *key* goals
of the NSF BiSciCol grant -- to facilitate exactly this.

So, what I think it all boils down to is, when we discover multiple
specimens that each represent a part of a single "WholeOrganism" individual,
how do we map the pre-existing dwc:individualID values assigned to each of
the multiple parts to each other?

In my world-view, those parts are themselves individuals, so we would
generate a new "parent" instance of Individual, with its own
dwc:individualID and scope of "WholeOrganism", linked to all the parts via
the appropriate parent/child semantics.

If I understand your world-view on this, all of those existing
dwc:individualID values would have been implicitly established *for* the
WholeOrganism (of which only a part is represented as "evidence" in a
herbarium), and thus they would all be aggregated via some sort of "sameAs"
semantic relationship.

Is that a fair distinction between our respective perspectives?
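
To make the contrast concrete, here is a rough sketch of the two ways of
linking pre-existing dwc:individualID values once two specimens turn out to
be parts of one organism.  The function names, the example IDs and the
dictionary layout are hypothetical, and "parent" and "sameAs" stand in for
whatever relationship terms would actually be used.

```python
from typing import Dict, List, Optional


def link_as_parent_child(
    part_ids: List[str], parent_id: str
) -> Dict[str, Optional[str]]:
    """Parts stay Individuals; a new WholeOrganism Individual is their parent.

    Returns individualID -> parent individualID (None for the new parent).
    """
    links: Dict[str, Optional[str]] = {parent_id: None}
    for part_id in part_ids:
        links[part_id] = parent_id
    return links


def link_as_same_as(existing_ids: List[str]) -> Dict[str, str]:
    """Every existing ID already denotes the whole organism; none is minted.

    Returns individualID -> canonical individualID, arbitrarily treating
    the first ID in the list as canonical.
    """
    canonical = existing_ids[0]
    return {existing_id: canonical for existing_id in existing_ids}


# Two existing records later found to come from the same organism.
existing = ["herbarium-A:123", "herbarium-B:456"]

parent_child = link_as_parent_child(existing, "organism-1")
same_as = link_as_same_as(existing)
```

The first approach ends up with three Individual instances (one new parent,
two children); the second keeps two identifiers that both resolve to the
same organism.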

> I'm going to digest what you wrote for a while before 
> I make further comments.

Likewise for me on your earlier posts.

Thanks for keeping this discussion interesting and thought-provoking!

And, apologies to everyone for the volume of email on this topic...not that
any of those people have got this far into this message.....

Aloha,
Rich



