comments inline

Richard Pyle wrote:
function.  I think the only issue on scoping Individual boils down to this:

A whole specimen is collected. Two tissue samples are taken from it.  The
(remainder of the) whole specimen is preserved as a voucher, and the two
tissue samples are sent to two different institutions that specialize in DNA
sequencing and analysis.  A morphologist establishes an identification
instance for the whole specimen, and each of the molecular-based
institutions assign their own Identification instances to their respective
tissue samples, based on DNA sequences of different genes. (In this example,
it doesn't matter whether the Identifications are all the same and
reinforcing; or different and competing).

To me, this is the crux of the issue:  Given that each of the three
institutions will generate a record in their respective database to
represent the physical object that they are managing, should we [Option A]
treat these as three separate instances of the "Individual" class, which
link to each other (and hence share Identification instances) via
parent/child relationships (e.g., the two tissue-sample Individuals as
"child" instances of the whole-specimen Individual)?

Or, [Option B] do we establish an abstract "Individual" instance to which
all Identification instances are linked, and for which these three objects
are representative "tokens"?

In my mind, both allow us to represent the information we want to represent.
I think Option B might be better in terms of representing the "reality" we
wish to model; but I think Option A is overwhelmingly more practical in the
"reality" of how data of this sort are actually managed by the people who
manage them.
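
The two options above can be sketched side by side. This is only an illustration under stated assumptions, not anyone's actual schema: the class names IndividualA, IndividualB and Token, the record identifiers, and the placeholder taxa are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identification:
    taxon: str
    basis: str  # e.g. "morphology" or "DNA, gene 1"


# Option A: each managed object is its own Individual record,
# linked to the whole specimen by a parent/child relationship.
@dataclass
class IndividualA:
    record_id: str
    parent: Optional["IndividualA"] = None
    identifications: List[Identification] = field(default_factory=list)

    def root(self) -> "IndividualA":
        """Follow parent links up to the whole-specimen record."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


def shared_identifications_a(records: List[IndividualA],
                             member: IndividualA) -> List[Identification]:
    """Gather Identifications from every record sharing member's root."""
    root = member.root()
    return [i for r in records if r.root() is root for i in r.identifications]


# Option B: one abstract Individual holds all Identifications;
# the three physical objects are "tokens" that point to it.
@dataclass
class IndividualB:
    individual_id: str
    identifications: List[Identification] = field(default_factory=list)


@dataclass
class Token:
    record_id: str
    individual: IndividualB


# The voucher and two tissue samples from the scenario (hypothetical data).
voucher = IndividualA("voucher", identifications=[Identification("Taxon X", "morphology")])
tissue1 = IndividualA("tissue-1", parent=voucher,
                      identifications=[Identification("Taxon X", "DNA, gene 1")])
tissue2 = IndividualA("tissue-2", parent=voucher,
                      identifications=[Identification("Taxon Y", "DNA, gene 2")])
records_a = [voucher, tissue1, tissue2]

abstract = IndividualB("individual-1")
tokens_b = [Token(name, abstract) for name in ("voucher", "tissue-1", "tissue-2")]
abstract.identifications.extend(i for r in records_a for i in r.identifications)
```

In both sketches all three Identifications are reachable from any of the three objects. The difference is where the work happens: Option A walks parent/child links across records, while Option B needs someone to mint and maintain the abstract instance.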
  
You are denormalizing a more general model.  What if the specimen is from a tree?  I collect flowers in May and return to collect fruit in September.  I have a hard time making those two Occurrences be one.  According to the definition on the table for a vote, an Individual is to permit resampling over time.  Read the definition of dwc:IndividualID.
  
I think we're both in agreement on this. The only real question is what are
If I think about all of the kinds of things that I would like
to put into the spot on the ASC diagram labeled "Collecting
Unit" (including things like the Bicentennial Oak that was
never "collected" by anybody), the one thing that they all
seem to have in common is this aspect of being "accessioned".
 So I would assert that in a general model, "AccessionedUnit"
would be a better name than "CollectingUnit".  Some of the
terms that I think should come out of Occurrence (such as
preparations and disposition) could apply to any AccessionedUnit.
    

I don't think that's quite right.  Certainly, the ASC model was designed
specifically for DeadSpecimen data management; but I think conceptually a
CollectingUnit is not intrinsically defined as an Accessioned object
(definitely don't use the word "Accession" as part of the term -- this is
too loaded of a term).  I think of a CollectingUnit as what I called
"BiologicalObject".  That is, a physical object consisting of biological
material (or mineralized representations of biological material, in the case
of fossils).  Whether it is alive or dead, accessioned or not, captive or in
the wild ... are all attributes of a BiologicalObject, but do not *define*
what a BiologicalObject is.  The definition of a BiologicalObject is that it
is an object, primarily consisting of biological stuff (or mineralized
biological stuff).

  
The decision on this should
not be made based on what we "think" an Individual should be,
but rather on what we need it to be to fulfill the role that
we have assigned it in our model.
    

I completely agree!!  But the problem is that there are two, sometimes
competing, sets of needs: the needs of the data consumers, who want to use
this information for answering biological questions; and the needs of the
data providers, who are constrained by the nature of the data they manage.
The "art" in DwC is in reconciling gaps between these needs in an elegant
and simple way.

  
With that in mind, it
might be better for the moment to change the name
dwc:Individual to dwc:ResamplingUnitHavingDetermination
because that is what it needs to do according to its current
definition and location in the model diagram (I'm considering
resampling to be the documentation of multiple Occurrences).
    

As we've discussed before, I think "Resampling" is only one benefit of an
Individual class, and should not be the defining metric of an Individual.
I'd be more comfortable following along with
dwc:SampledUnitHavingDetermination; because some of these Units that need to
have Determinations are resampled multiple times, and some are sampled only
once.
  
You are denormalizing a more general model.  We both confessed to this sin in an earlier series of emails.  If you don't want to read the first post in the series, just click on the links in order and look at what happens to the diagram.  I want (no, NEED) the fully normalized model for what I do and so do others.  You may not need it, but I do and that's why I made the proposal.  If I'm the only person who ever needs to resample anything then vote the proposal down and I'll just use the definition of Individual that I have in RDF at Bioimages.  I'm already doing that.  I would just prefer it to be a "well known" Darwin Core term, not an ad hoc one that I made up.  If it was important enough to put individualID into the Darwin Core standard to facilitate resampling, then why is it suddenly not very important to make that term usable in RDF?
  
The question then becomes: should AccessionedUnit be
considered the same as ResamplingUnitHavingDetermination
because they share the same properties (i.e. are described by
the same terms)?  To me the answer is clearly "no".
    

I would agree.  But I don't think that's the right question.  To me, the
right question is:

Should the scope of SampledUnitHavingDetermination include all instances of
BiologicalObject; or only a subset of them? (i.e., Should
SampledUnitHavingDetermination *include* BiologicalObject, or *overlap with*
BiologicalObject?)

  
It is
very likely that an AccessionedUnit will never be associated
with more than one Occurrence (i.e. be resampled),
    

Agreed! But the vast, vast majority of things that have Determinations and
are associated with Occurrences are only sampled once.  Even if they
represented the vast, vast minority (but non-trivial), there would still be
solid rationale for not restricting the scope of Individual to only those
things that are Sampled more than once.
  
I will repeat my charge of denormalization again.
  
Having made a decision about this based on functional need
and shared properties, it is still helpful for me to try to
develop a mental image of what these two things are.  In my
mind, I imagine the ResamplingUnitHavingDetermination (which
I will henceforth return to calling dwc:Individual) to be an
entity having a homogeneous taxonomic identity.  It has some
moment when it came into existence as a living thing (by
being born, planted, or founded) although we will never know
when that moment was unless an Occurrence happens that allows
us to document that Event.  The Individual remains an entity
as long as it has the potential to be documented as an
Occurrence.  That doesn't necessarily mean that it must be
alive.  But if it decomposes, or is preserved and put into a
collection, it no longer is capable of being resampled (i.e.
documented by an Occurrence).  Thus a fossil that is dead for
a million years and is sitting in some stratum still fits my
mental image of an Individual.  If it gets chipped out of the
rock and put in a museum, there would no longer be any point
in documenting another Occurrence for it since there would be
no useful Location or GeologicalContext information to be
gained from that.  A roadside population of herbaceous plants
having homogeneous taxonomic identity would be an Individual
from the first time it was capable of being sampled (when it
was founded) and would end being an Individual when it was
extirpated by some road construction crew and was no longer
capable of being documented by an Occurrence.  A wolf pack
would be a similar case.

My mental image of AccessionedUnit is an entity that comes
into existence when some human person or institution takes
control of it, assigns it an identifier, and keeps records of
it.  I think I would never see it as coming to an end.  Even
if it is lost or destroyed, it would continue to exist as
long as the person or institution maintains its record.  It
would just have dwc:disposition "lost" or "destroyed".
It could be a dead, preserved specimen in a jar or glued to a
sheet of paper, a living wildebeest calf in a zoo, or even a
field sampling plot in a park as long as the park exerts
control and ownership over it and maintains records about it.
 It could not be any wild, free-ranging animal or plant.  It
could not be roadkill left on the side of the road to
decompose.  It could not be a photograph of a wildebeest calf
in the zoo, or the sound recording of the wildebeest calf's
grunt. It COULD be a tissue sample from the wildebeest calf
or from the roadkill.  The critical thing is that it is a
physical artifact originating from a living thing that has
been cataloged and placed under human control.  I think this
is the kind of thing that Rich wanted to be able to define
when he wanted to broaden the definition of Individual.

For any entity having an origin as a living thing (in my
mental image), its status as an Individual is independent of
its status as an AccessionedUnit.  If the entity is removed
and preserved in its entirety (fish killed and put in a jar
of formaldehyde), it ceases to exist as a dwc:Individual and
begins to exist as an AccessionedUnit.  If a branch is
removed from a tree or one plant pulled from a roadside
population to become specimens, the removed part becomes an
AccessionedUnit while the dwc:Individual continues to exist.
In the case of the Bicentennial Oak or a permanent sampling
plot, the entity simultaneously exists as both an
AccessionedUnit and a dwc:Individual.  In terms of metadata
records, the establishment of any AccessionedUnit is an
Occurrence (grouped under the Individual) having a property
of recordedBy.  Whether or not subsequent Occurrences are
possible for the Individual depends on whether the act of
creating the AccessionedUnit has rendered subsequent sampling
irrelevant.

I agree with the point that was made previously that no
specific taxonomic level should be placed in the definition
of Individual.  That would allow for the possibility that
Individuals could contain several different lower level taxa
as long as the Individual is homogeneous at the taxonomic
level at which the determination is applied.  I am open to
suggestion for how this could be accomplished.  Somehow there
needs to be a value for a term like "individualScope" that
allows one to make the kind of inferences about duplicates
that I described previously.  Maybe one controlled value for
"individualScope" should be "DuplicateLevel"
meaning that the Individual is homogeneous in taxonomic
identity to the level at which a taxonomist would collect
multiple specimens and call them duplicates.  That would get
us out of the problem of deciding whether the several grass
stems we collect and send off to different herbaria are
actually the same biological individual or clones connected
by underground stems.  Other possible levels could be
"BiologicalIndividual" for things known to be single
biological individuals, and "Heterogeneous" for things that
are known or suspected to be mixtures of lower level taxa but
for which it is convenient to assign a determination at a
higher taxonomic level at which we know the mixture to be homogeneous.
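
One way to picture a controlled vocabulary for "individualScope" is a small enumeration with a validating parser. This is a minimal sketch only: the three values are the ones suggested above, but the Python names (IndividualScope, parse_individual_scope) are hypothetical and nothing here is part of Darwin Core.

```python
from enum import Enum


class IndividualScope(Enum):
    # Homogeneous to the level at which a taxonomist would call specimens duplicates.
    DUPLICATE_LEVEL = "DuplicateLevel"
    # Known to be a single biological individual.
    BIOLOGICAL_INDIVIDUAL = "BiologicalIndividual"
    # Known or suspected mixture of lower level taxa, determined at a higher level.
    HETEROGENEOUS = "Heterogeneous"


def parse_individual_scope(value: str) -> IndividualScope:
    """Return the matching scope, or raise ValueError for an uncontrolled value."""
    try:
        return IndividualScope(value)
    except ValueError:
        allowed = ", ".join(scope.value for scope in IndividualScope)
        raise ValueError(
            f"unknown individualScope {value!r}; expected one of: {allowed}"
        ) from None
```

A record carrying individualScope "DuplicateLevel" would parse cleanly, while a free-text value such as "clump" would be rejected rather than silently accepted.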

For AccessionedUnit, I think there should also be an
accessionedUnitScope term.  I defer to the museum people on
this, but the boxes in the ASC diagram (unsorted lot, lot
(presumably homogeneous), specimen (presumably one biological
individual), and specimen component) could be a starting
point.  The "partOf" and "hasPart" properties could be used
to relate AccessionedUnits that are related to each other.
Relating these various levels of AccessionedUnits to levels
of Individual above "DuplicateLevel" is going to be tricky,
but if people want to do this, I'm sure there is a way to
represent the relationships in RDF.
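
As a rough illustration of how "partOf" and "hasPart" could tie AccessionedUnits together, here is a sketch in plain Python rather than RDF. The class, the field names, and the identifiers are assumptions made for the example; the scope labels are just the ASC diagram boxes listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Scope labels taken from the ASC diagram boxes mentioned above.
SCOPES = ("unsorted lot", "lot", "specimen", "specimen component")


@dataclass(eq=False)  # compare units by identity; the links are cyclic
class AccessionedUnit:
    identifier: str
    scope: str  # hypothetical accessionedUnitScope value
    part_of: Optional["AccessionedUnit"] = None  # "partOf"
    has_part: List["AccessionedUnit"] = field(default_factory=list)  # "hasPart"

    def __post_init__(self) -> None:
        if self.scope not in SCOPES:
            raise ValueError(f"unknown scope: {self.scope!r}")

    def add_part(self, part: "AccessionedUnit") -> None:
        """Record both directions of the relationship at once."""
        part.part_of = self
        self.has_part.append(part)


def ancestors(unit: AccessionedUnit) -> List[AccessionedUnit]:
    """Walk partOf links from a unit up to the outermost container."""
    chain = []
    while unit.part_of is not None:
        unit = unit.part_of
        chain.append(unit)
    return chain


lot = AccessionedUnit("lot-1", "lot")
specimen = AccessionedUnit("spec-1", "specimen")
tissue = AccessionedUnit("tissue-1", "specimen component")
lot.add_part(specimen)
specimen.add_part(tissue)
```

Setting both directions in one call keeps the two properties from drifting apart, which is the main thing an RDF representation would also have to guarantee.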

THE BOTTOM LINE
I believe that the proposed definition for the DwC class
Individual should stand as it is (i.e. as a node to connect
multiple Occurrences to multiple Identifications).  To allow
Identifications for Individuals that are homogeneous at
higher taxonomic levels, we also need a term like
dwc:individualScope.  I believe that there needs to be a
separate class that represents what I've described here as
"AccessionedUnit"
which also has some kind of scope property.  I am not going
to propose a name for this thing or propose what properties
belong with it.  Rich and the herbarium/museum/botanical
garden/zoo people need to decide and propose that.
AccessionedUnit then becomes one of several types of evidence
that can be used to support an Occurrence, with
dctype:StillImage, dctype:Sound, dctype:Text as other possibilities.
Darwin Core does not need to define their properties and
types since others (MRTG, DCMI) have already done so.  We
then need two more terms:
one to relate the evidence to the Occurrence and one to
relate the Occurrence to the evidence (I would suggest
"hasEvidence" and "isEvidenceFor" as possibilities).  If we
can do these things, I think we could say that a general
(i.e. denormalized enough to satisfy everyone who is
dissatisfied at the present moment) Darwin Core model is
"complete" to the "left" of Identification on the
http://bioimages.vanderbilt.edu/pages/full-model.jpg diagram.
 I'm not going to touch the Taxon side right now.
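
To make the "hasEvidence" / "isEvidenceFor" pair concrete, here is a small sketch that stores the two suggested terms as inverse statements in a set of subject-predicate-object triples. It uses plain Python tuples instead of an RDF library, and the identifiers (occurrence-1, image-1, accessioned-unit-1) are invented for the example.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# The two suggested property names; they are proposals, not existing terms.
HAS_EVIDENCE = "hasEvidence"
IS_EVIDENCE_FOR = "isEvidenceFor"


def link_evidence(graph: Set[Triple], occurrence: str, evidence: str) -> None:
    """Add the statement in both directions so either end can be queried."""
    graph.add((occurrence, HAS_EVIDENCE, evidence))
    graph.add((evidence, IS_EVIDENCE_FOR, occurrence))


def evidence_for(graph: Set[Triple], occurrence: str) -> List[str]:
    """List everything asserted as evidence for one Occurrence."""
    return sorted(o for s, p, o in graph if s == occurrence and p == HAS_EVIDENCE)


graph: Set[Triple] = set()
# One Occurrence documented by a still image and by an accessioned specimen.
link_evidence(graph, "occurrence-1", "image-1")
link_evidence(graph, "occurrence-1", "accessioned-unit-1")
```

The point of the sketch is that an image and an AccessionedUnit attach to the Occurrence in exactly the same way, so neither kind of evidence needs special treatment in the model.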

Whether or not action is taken on creating a class for what
I'm calling "AccessionedUnit", there is no reason to hold up
action on my Individual class proposal if people agree with
the points I've made here.

Steve

--
Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt
University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu

_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content
    

