I suppose that Rich and Rob W. have already looked at
http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity . I
think it pretty much encapsulates what they are talking about. I
should note that the way DSW defines dsw:IndividualOrganism does not
require it to be a single organism. It can be a collection of
organisms (herd, colony, school) or part of an organism (tissue). The
basic requirement is that it is a "taxonomically homogeneous entity".
In a variant form of DSW (dsw_alt.owl) we included "taxonomically
heterogeneous entity" (THeE) which would basically include what Rich
and Rob W. are talking about (lots of organisms which are separable
and aren't necessarily from the same lowest taxonomic level). It
should be no surprise that THeE does what Rich wants: we included it
in DSW because Rich said during the preceding discussion that he
wanted something like it. In dsw_alt.owl, properties like "hasPart"
and "isPartOf" are used to connect physical entities whose properties
can be inferred by inheritance. What this diagram includes that Rich
did not mention are "tokens" (evidence). We defined a class for
evidence, but we also considered not making evidence an explicit
class. Not defining an explicit Token class would have simplified the
diagram at the bottom of the page - one could just say that there
should be evidence and it should be linked to the resource it
documents. Token and THeE/IndividualOrganism are not disjoint classes
- the physical entity can be the evidence if somebody "owns" it and
makes it available for people to examine. However, in DSW, Token and
THeE are not synonymous because we allow evidence to include things
that are not physically derived from the entity (e.g. images, sounds,
string data records) in addition to physical specimens.
I think that we have to be careful when we say "we don't need X",
"there is pressing value for X but not for Y", "X is too vaguely
defined", etc. MaterialSample does exactly what the metagenomics
people need because they invented it to serve the purposes they want it
to serve (handle material samples in which one may or may not ever know
what all organisms are included or even if there are organisms in it).
Individual (sensu Pyle/Whitton)/THeE is just vague enough to do what
Rich and Rob W. want it to do with their lots and specimens, but is too
vague for Rob G. IndividualOrganism (sensu DSW) and Token do exactly
what Cam Webb and I want them to do with our images, specimens, DNA
samples, and data records. The requirement that IndividualOrganism
be taxonomically homogeneous allows us to infer that a determination
applied to one resource also applies to other resources which are
derived from the same IndividualOrganism (a requirement not stated by
the others), but it's too restrictive for both Rob G. and Rich. If we
start in on the game of saying "WE need the features that I think are
important but not the features that YOU think are important" then we
are in for another month of massive email traffic on this list and will
end up no better off than we were when we started.
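
To make the inference about determinations concrete, here is a rough
sketch in Python. It is not DSW itself: the identifiers (tree42,
image1, det1) and the property names (derivedFrom, identifies, taxon)
are made up for illustration, and it assumes the determination is
attached to the organism rather than to any one resource. The point is
only that a plain lookup over links is enough; no reasoner is involved.

```python
# Hypothetical data: (subject, property, object) statements.
# Property names are illustrative and are NOT quoted from DSW.
TRIPLES = {
    ("image1", "derivedFrom", "tree42"),
    ("specimen1", "derivedFrom", "tree42"),
    ("dna1", "derivedFrom", "tree42"),
    ("image9", "derivedFrom", "tree77"),
    ("det1", "identifies", "tree42"),
    ("det1", "taxon", "Quercus alba"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is stated."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects s such that (s, predicate, obj) is stated."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def taxa_for(resource):
    """Taxa determined for the organism(s) a resource is derived from.

    This is only valid if each organism is taxonomically homogeneous;
    for a mixed lot the same lookup could return a wrong answer.
    """
    taxa = set()
    for organism in objects(resource, "derivedFrom"):
        for determination in subjects("identifies", organism):
            taxa |= objects(determination, "taxon")
    return taxa


if __name__ == "__main__":
    for resource in ("image1", "specimen1", "dna1", "image9"):
        print(resource, sorted(taxa_for(resource)))
```

In this toy data the image, the specimen, and the DNA sample all pick
up the determination made for tree42, while image9 gets nothing because
no determination has been made for tree77.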
I think that it is clear from this and preceding discussions that there
is a need for some system of tracking things that are like
individuals/organisms/samples/lots. It is my belief that what needs
to happen is:
1. define clearly what the various stakeholders want to accomplish by
their version of individuals/organisms/samples/lots (i.e. use
cases/competency questions)
2. use set theory or some other kind of logical system to describe
clearly how the various versions of individuals/organisms/samples/lots
are related to each other
3. examine alternative mechanisms for defining the relationships among
the variously defined individuals/organisms/samples/lots terms and
determine how each approach can or cannot satisfy the use
cases/competency questions.
4. use one or more mechanisms which pass test #3 to define the terms
that are deemed necessary and include them in some TDWG standard which
may or may not be Darwin Core.
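
As a very small illustration of what item 2 might look like, the sketch
below treats each class as a set of example instances and reports how
two sets are related. The instance names are invented, and the
memberships are my assumptions for the sake of the example, not agreed
definitions. A real analysis would work from the use cases rather than
from a handful of examples.

```python
# Hypothetical example instances; memberships are assumptions made
# for illustration only, not definitions taken from any standard.
individual_organism = {"oak_tree_42", "wolf_pack_7", "leaf_tissue_3"}
thee = {"plankton_lot_5"}  # taxonomically heterogeneous entity
token = {"leaf_tissue_3", "plankton_lot_5", "image_881", "sound_12"}


def relation(a, b):
    """Describe how set a is related to set b."""
    if a == b:
        return "equal"
    if a < b:
        return "proper subset"
    if a > b:
        return "proper superset"
    if a & b:
        return "overlapping"
    return "disjoint"


if __name__ == "__main__":
    physical = individual_organism | thee
    # Token vs. the physical entities: neither disjoint nor synonymous.
    print("Token vs physical entities:", relation(token, physical))
    # Evidence that is not itself a physical entity (images, sounds).
    print("Non-physical tokens:", sorted(token - physical))
```

With these example sets, Token and the physical entities come out as
overlapping, which matches the statement earlier in this message that
the classes are not disjoint but are also not synonymous.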
In September 2011, John Wieczorek packaged several of the proposed
class additions to Darwin Core into a concrete proposal:
http://lists.tdwg.org/pipermail/tdwg-content/2011-September/002727.html
. This proposal was deferred by the Executive Committee (see the last
comment at
http://code.google.com/p/darwincore/issues/detail?id=117 )
"... until we can further examine broader changes including the new
classes and any insights that might come out of the RDF Interest
Group." So the RDF Task Group has specifically been charged with the
task of examining the addition of new classes to Darwin Core and
their implications. The RDF TG has assembled competency questions
http://code.google.com/p/tdwg-rdf/wiki/CompetencyQuestions and use
cases
http://code.google.com/p/tdwg-rdf/wiki/UseCases , which is a start on
Item #1 in the list above. However, the process has not moved beyond
that. I recently appealed to the TG for someone to take up the work of
making concrete progress on deliverables, but got no responses. I
cannot be the person to move
this forward for two reasons. One is that I already have my hands full
with the DwC RDF guide (which doesn't address these issues) and the
other is that I have reached the limits of my technical skills and am
not able to take leadership on items #2-#4. Who will champion this?
At the risk of making this email too long, I will add one more
comment. There seems to be a developing consensus that an OWL ontology
structured according to the OBO Foundry (
http://www.obofoundry.org/)
principles is the answer to #2 and #3 above. However, I have yet to
see the evidence that the complexity introduced by a formal OWL
ontology is necessary or any actual concrete examples of how an
OBO-style ontology would be used to satisfy the use cases. We have
shown with DSW that some use cases can be met using only simple RDF and
SPARQL (i.e. no actual reasoner involved). I presume that Rich and Rob
W. have in hand a technical solution to their use cases that doesn't
involve RDF at all. So I think that there need to be some iterations
of defining and testing before we adopt a technology by acclamation.
We've been down that road before with the TDWG Ontology and look how
that turned out.
Steve
Robert Guralnick wrote:
I agree with John and Gregor. The term "individual"
doesn't quite seem to capture the concept or usage. However, I think
there is more general agreement that there is a pressing need - and
immediate value - for a term to represent "material sample" and
derivatives. It seems that the proposal on the table serves that need
with the right definition, that is explicit, and that provides
necessary linkages to other related domains.
Best, Rob
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
http://bioimages.vanderbilt.edu