[tdwg-content] New Darwin Core terms proposed relating to material samples

Richard Pyle deepreef at bishopmuseum.org
Sun May 26 23:34:02 CEST 2013

Thanks for the *EXCELLENT* post -- this gets to the heart of what I was
trying to ask. I don't have time to respond in detail, but will come back to
this in a bit.






From: tdwg-content-bounces at lists.tdwg.org
[mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Steve Baskauf
Sent: Sunday, May 26, 2013 3:09 AM
To: Robert Guralnick
Cc: TDWG Content Mailing List; Robert Whitton; John Deck;
rlwalls2008 at gmail.com
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples


I suppose that Rich and Rob W. have already looked at
http://code.google.com/p/darwin-sw/wiki/TaxonomicHeterogeneity .  I think it
pretty much encapsulates what they are talking about.  I should note that
the way DSW defines dsw:IndividualOrganism does not require it to be a
single organism.  It can be a collection of organisms (herd, colony, school)
or part of an organism (tissue).  The basic requirement is that it is a
"taxonomically homogeneous entity".  In a variant form of DSW (dsw_alt.owl)
we included "taxonomically heterogeneous entity" (THeE) which would
basically include what Rich and Rob W. are talking about (lots of organisms
which are separable and aren't necessarily from the same lowest taxonomic
level).  It should be no surprise that THeE does what Rich wants because we
included it in DSW because during the preceding discussion Rich said he
wanted something like it.  In dsw_alt.owl, properties like "hasPart" and
"isPartOf" are used to connect physical entities whose properties can be
inferred by inheritance.  What this diagram includes that Rich did not
mention are "tokens" (evidence).  We defined a class for evidence, but we
also considered not making evidence an explicit class.  Not defining
an explicit Token class would have simplified the diagram at the bottom of
the page - one could just say that there should be evidence and it should be
linked to the resource it documents.  Token and THeE/IndividualOrganism are
not disjoint classes - the physical entity can be the evidence if somebody
"owns" it and makes it available for people to examine.  However, in DSW,
Token and THeE are not synonymous because we allow evidence to include
things that are not physically derived from the entity (e.g. images, sounds,
string data records) in addition to physical specimens.
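
The "inferred by inheritance" idea can be illustrated with a minimal sketch.
This is not DSW itself: the Entity class, the lookup method, and the property
names and values below are all hypothetical, chosen only to show how a
property recorded on a whole might be found for its parts by following
isPartOf links.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    """A physical entity. Class and field names are hypothetical, not DSW terms."""
    label: str
    is_part_of: Optional["Entity"] = None
    properties: dict = field(default_factory=dict)

    def lookup(self, key):
        """Return the nearest value for key, walking up the isPartOf chain."""
        node = self
        while node is not None:
            if key in node.properties:
                return node.properties[key]
            node = node.is_part_of
        return None


# Hypothetical data: a mixed lot, one organism in it, and a tissue sample.
lot = Entity("lot-1", properties={"locality": "Locality X"})
organism = Entity("organism-A", is_part_of=lot,
                  properties={"determination": "Taxon A"})
tissue = Entity("tissue-A1", is_part_of=organism)

print(tissue.lookup("determination"))  # inherited from the organism
print(tissue.lookup("locality"))       # inherited from the lot
print(lot.lookup("determination"))     # the mixed lot has no single determination
```

In this sketch the lookup only travels from part to whole, so a determination
made on one organism is never pushed up onto the heterogeneous lot that
contains it.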

I think that we have to be careful when we say "we don't need X", "there is
pressing value for X but not for Y", "X is too vaguely defined", etc.
MaterialSample does exactly what the metagenomics people need because they
invented it to serve the purposes they want it to serve (handle material
samples in which one may or may not ever know what all organisms are
included or even if there are organisms in it).  Individual (sensu
Pyle/Whitton)/THeE is just vague enough to do what Rich and Rob W. want it
to do with their lots and specimens, but is too vague for Rob G.
IndividualOrganism (sensu DSW) and Token do exactly what Cam Webb and I
want them to do with our images, specimens, DNA samples, and data records, and
the requirement that IndividualOrganism be taxonomically homogeneous allows
us to infer that a determination applied to one resource also applies to
other resources which are derived from the same IndividualOrganism (a
requirement not stated by the others) but it's too restrictive for both Rob
G and Rich.  If we start in on the game of saying "WE need the features that
I think are important but not the features that YOU think are important"
then we are in for another month of massive email traffic on this list and
will end up no better off than we were when we started.

I think that it is clear from this and preceding discussions that there is a
need for some system of tracking things that are like
individuals/organisms/samples/lots.  It is my belief that what needs to
happen is:
1. define clearly what the various stakeholders want to accomplish by their
version of individuals/organisms/samples/lots (i.e. use cases/competency
questions)
2. use set theory or some other kind of logical system to describe clearly
how the various versions of individuals/organisms/samples/lots are related
to each other
3. examine alternative mechanisms for defining the relationships among the
variously defined individuals/organisms/samples/lots terms and determine how
each approach can or cannot satisfy the use cases/competency questions.
4. use one or more mechanisms which pass test #3 to define the terms that
are deemed necessary and include them in some TDWG standard which may or may
not be Darwin Core.

In September 2011, John Wieczorek had packaged several of the proposed class
additions to Darwin Core into a concrete proposal:
http://lists.tdwg.org/pipermail/tdwg-content/2011-September/002727.html .
This proposal was deferred by the Executive Committee (see the last comment
at http://code.google.com/p/darwincore/issues/detail?id=117 ) "... until we
can further examine broader changes including the new classes and any
insights that might come out of the RDF Interest Group."  So the RDF Task
Group has specifically been charged with examining the addition of new
classes to Darwin Core and their implications.  The RDF TG has
assembled competency questions
http://code.google.com/p/tdwg-rdf/wiki/CompetencyQuestions and use cases
http://code.google.com/p/tdwg-rdf/wiki/UseCases , which is a start on Item #1
in the list above.  However, the process has not moved beyond that.  I
recently appealed to the TG for someone to take up work on concrete
deliverables, but got no responses.  I cannot be the person to move this
forward for two
reasons.  One is that I already have my hands full with the DwC RDF guide
(which doesn't address these issues) and the other is that I have reached
the limits of my technical skills and am not able to take leadership on
items #2-#4.  Who will champion this?

At the risk of making this email too long, I will add one more comment.
There seems to be a developing consensus that an OWL ontology structured
according to the OBO Foundry (http://www.obofoundry.org/) principles is the
answer to #2 and #3 above.  However, I have yet to see the evidence that the
complexity introduced by a formal OWL ontology is necessary or any actual
concrete examples of how an OBO-style ontology would be used to satisfy the
use cases.  We have shown with DSW that some use cases can be met using only
simple RDF and SPARQL (i.e. no actual reasoner involved).  I presume that
Rich and Rob W. have in hand a technical solution to their use cases that
doesn't involve RDF at all.  So I think that there need to be some
iterations of defining and testing before we adopt a technology by
acclamation.  We've been down that road before with the TDWG Ontology and
look how that turned out.
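
As a rough illustration of the "no reasoner" point, here is a minimal sketch
in plain Python standing in for RDF and SPARQL.  The triples, the predicate
names (derivedFrom, hasDetermination), and both helper functions are
hypothetical, not actual DSW or Darwin Core terms.  It only shows that
finding the determinations that apply to a resource can be a plain join over
stated triples, provided every resource derived from the same
IndividualOrganism is assumed to be taxonomically homogeneous.

```python
# Hypothetical triples; predicate names are illustrative, not real IRIs.
triples = {
    ("image-1", "derivedFrom", "organism-1"),
    ("specimen-1", "derivedFrom", "organism-1"),
    ("dna-1", "derivedFrom", "organism-1"),
    ("image-2", "derivedFrom", "organism-2"),
    ("specimen-1", "hasDetermination", "Taxon A"),
}


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def determinations_for(resource):
    """Join: resource -> its organism -> sibling resources -> determinations."""
    found = set()
    for _, _, organism in match(s=resource, p="derivedFrom"):
        for sibling, _, _ in match(p="derivedFrom", o=organism):
            for _, _, taxon in match(s=sibling, p="hasDetermination"):
                found.add(taxon)
    return found


print(determinations_for("image-1"))  # determination made on specimen-1
print(determinations_for("image-2"))  # nothing stated for organism-2
```

The join is only valid because of the homogeneity requirement; with a
heterogeneous entity in the middle, a determination on one derived resource
could not safely be carried over to its siblings.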


Robert Guralnick wrote: 


I agree with John and Gregor.  The term "individual" doesn't quite seem to
capture the concept or usage.  However, I think there is more general
agreement that there is a pressing need - and immediate value - for a term
to represent "material sample" and derivatives.  It seems that the proposal
on the table serves that need with the right definition, that is explicit,
and that provides necessary linkages to other related domains.  


Best, Rob



On Sun, May 26, 2013 at 1:47 AM, Gregor Hagedorn <g.m.hagedorn at gmail.com>
wrote:

> Basically, we've been running with the idea of an "Individual" class - as
> originally proposed by Steve and discussed at some length on this list a
> while ago.  This has been documented for DSW:
> https://code.google.com/p/darwin-sw/wiki/ClassIndividual

> We define an "Individual" as the physical "something" that underpins an
> Occurrence.  In the case of organisms, this can be a group (herd, school,
> flock, etc.), specimen (either a single specimen, or a lot of multiple
> specimens), or any sort of derivative of a specimen (part, tissue sample,
> dna extraction, etc.).  It corresponds to the intended meaning of

I disagree with using "Individual" for sets of objects. It is
surprising and, lacking any clear definition of when to stop, it means
that a taxon is an individual, a collection is an individual, etc.



Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.