[tdwg-content] Occurrences, Organisms, and CollectionObjects: a review
Steve Baskauf
steve.baskauf at vanderbilt.edu
Thu Sep 8 17:22:24 CEST 2011
Well, this point brings to mind the previous discussions about
"collapsing" or "denormalizing" complex models into simpler models. In
those discussions, it was noted that many if not most people
"denormalize" out of existence parts of a more complex model when they
don't need them in their particular situation. People who have whole
dead organisms in a jar do not care about the distinction between an
Organism and a CollectionObject. However, people who cut five branches
from the same tree and send them to five different herbaria do care, as
do people who both photograph and collect samples from an organism,
people who make observations of the same organism over time, and people
who make measurements of a temporarily captured animal at the same time
they collect a DNA sample. I don't consider the last four examples to
be very, very, very few cases. If it seems like there are very few
cases, that's just because TDWG was started by (and is currently run
mostly by) people who run museums. If TDWG wants to broaden the
biodiversity informatics tent to include people outside of the museum
community, then it is important to create the classes and terms
necessary to accommodate those other people.
Simple DwC does not require that users include classes that they don't
need in their databases. If a museum only has specimens that are whole
dead organisms, then there is no need for them to have a table in their
database for Organism; they only need a table for CollectionObjects and
the Organism represented by that CollectionObject can be inferred. This
is no different than flattening the relationships among Location, Event,
and Occurrence by having a 1:1:1 relationship among those three classes,
which might be done in a Darwin Core Archive. The fact that some people
prefer to "flatten" those three classes into a single database table
doesn't mean that other people will not have the need to have several
events at a single Location, or several Occurrences at one Event.
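To make the flattening point concrete, here is a minimal sketch in Python. The class layout, the function name flatten, and all record values are assumptions made up for illustration; only the class names Location, Event, and Occurrence come from the discussion. It shows that a flat table still works when there are several Events at one Location, but only by repeating the Location values in every row.

```python
from dataclasses import dataclass


# Hypothetical normalized records; the field names are illustrative only.
@dataclass(frozen=True)
class Location:
    location_id: str
    locality: str


@dataclass(frozen=True)
class Event:
    event_id: str
    location_id: str  # link to the Location where the Event took place
    event_date: str


@dataclass(frozen=True)
class Occurrence:
    occurrence_id: str
    event_id: str  # link to the Event that documented the Occurrence
    recorded_by: str


def flatten(locations, events, occurrences):
    """Join the three classes into one flat row per Occurrence."""
    loc_by_id = {loc.location_id: loc for loc in locations}
    evt_by_id = {evt.event_id: evt for evt in events}
    rows = []
    for occ in occurrences:
        evt = evt_by_id[occ.event_id]
        loc = loc_by_id[evt.location_id]
        rows.append({
            "occurrenceID": occ.occurrence_id,
            "recordedBy": occ.recorded_by,
            "eventDate": evt.event_date,
            "locality": loc.locality,
        })
    return rows


# One Location, two Events there, and two Occurrences at the second Event.
locations = [Location("loc1", "Hypothetical Ridge Trail")]
events = [
    Event("evt1", "loc1", "2011-09-01"),
    Event("evt2", "loc1", "2011-09-08"),
]
occurrences = [
    Occurrence("occ1", "evt1", "A. Collector"),
    Occurrence("occ2", "evt2", "A. Collector"),
    Occurrence("occ3", "evt2", "B. Collector"),
]

flat_rows = flatten(locations, events, occurrences)
for row in flat_rows:
    print(row)
```

Someone with a strict 1:1:1 situation can store only the flat rows and never notice the difference; the separate classes matter only to those who need the one-to-many links.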
So I guess I would say that I don't agree that there are very, very,
very few cases where there are multiple CollectionObjects per Organism.
One of the major points of the BiSciCol initiative is that there needs
to be some way to connect diverse collection objects that originate
from the same organism but end up being scattered around in different
institutions. The lack of an Organism class is a big hole in modeling
these kinds of situations and I haven't heard of an alternative way to
fix that hole. There is a strong desire among a diverse group of TDWG
participants to move forward on more precisely modeling relationships
among classes using RDF. If we do not make the changes John has
suggested, it is difficult to see how this will be accomplished without
people "making up" classes on the fly to accommodate the more precise
modeling that is sought. That's what Cam and I had to do when we
created darwin-sw.
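As a rough sketch of what an Organism identifier buys you (this is not how darwin-sw or BiSciCol actually do it, and the field names, identifiers, and institution labels below are all invented for illustration), consider collection objects held at different institutions that each carry a link to the organism they came from:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionObject:
    collection_object_id: str
    institution: str
    organism_id: str  # hypothetical link to the source Organism


def group_by_organism(objects):
    """Collect the CollectionObjects that originate from each Organism."""
    groups = defaultdict(list)
    for obj in objects:
        groups[obj.organism_id].append(obj)
    return dict(groups)


# Five branches cut from one tree and sent to five (made-up) herbaria,
# plus one whole organism in a jar, which is the simple 1:1 case.
objects = [
    CollectionObject(f"branch-{n}", f"Herbarium {n}", "tree-1")
    for n in range(1, 6)
]
objects.append(CollectionObject("jar-1", "Museum A", "fish-1"))

groups = group_by_organism(objects)
for organism_id, members in groups.items():
    holders = ", ".join(obj.institution for obj in members)
    print(f"{organism_id}: {len(members)} object(s) at {holders}")
```

Without the shared organism identifier there is nothing in the five herbarium records that says they are the same tree; with it, the 1:1 jar specimen costs only one extra value that could just as well be inferred.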
Steve
Chuck Miller wrote:
> "in the many, many, many cases where there will be a 1:1 relationship between an Organism and a CollectionObject..."
>
> Rich has hit on an important point. The discussion continues to focus on perfecting logic that is all encompassing. But, is it the best thing to do for the community as a whole to implement solutions that are more complex in order to accommodate the very, very, very few cases (using the counter description to Rich's).
>
> Maybe we should consider keeping the solutions simple (ie current DwC ) for the many, many, many cases and introduce complex extensions only for the very, very, very few.
>
> Chuck
>
> -----Original Message-----
> From: tdwg-content-bounces at lists.tdwg.org [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Richard Pyle
> Sent: Thursday, September 08, 2011 3:52 AM
> To: 'Gregor Hagedorn'; tuco at berkeley.edu
> Cc: 'TDWG Content Mailing List'
> Subject: Re: [tdwg-content] Occurrences, Organisms,and CollectionObjects: a review
>
>
>> I see a problem with the "taxonomically homogeneous" since many taxa
>> are not.
>> All obligatory mutualistically symbiontic organisms are excluded (you
>> mention symbiont, but the symbiont is the part of a symbiontic relation,
>> e.g. both the algae taxon and fungus taxon each are a symbiont in a
>> lichen.
>>
>
> I don't understand the problem. Isn't this simply two instances of "Organism" (one symbiont and one host)?
>
> Together, they may comprise a single collectionObject (e.g., specimen); but I see no trouble treating obligatory mutualistic symbionts and their host(s) as distinct instances of "Organism".
>
>
>> Definition: The information class pertaining to a specific instance or
>> set of instances of a life form or organism (virus, bacteria, symbiontic
>> life forms, individual, colony, group, population). Sets must reliably be
>> known to taxonomically homogeneous (including obligatory symbiontic
>> associations).
>>
>
> I guess it could be defined that way, but I've come around to Steve's view that "taxonomically homogeneous" implies that in cases where more than one individual is involved (colonies, small groups, populations), all such individuals belong to a single species (independently of whether or not we can identify what that species is). When more than one species is discovered amongst a multi-individual instance of "Organism", then one would create additional instances of Organism to accommodate the heterogeneous taxa.
>
> All of my original examples of things that I would want to be taxonomically heterogeneous (e.g., a single fossil rock with multiple phyla/kingdoms, or a single rock with multiple phyla of invertebrates attached) can be easily aggregated via a single instance of collectionObject, associated with multiple instances of Organism (one for each species[ish] level taxon).
>
> I originally thought that both "Organism" and "collectionObject" would be redundant, and that only one was really needed. But I have now been convinced by Steve (and others) that this would be an unnecessary over-loading of "Organism". Now that we are contemplating two distinct classes, I have no problem with the more "refined" definition of "Organism".
>
> One concern I do have, however, is in the many, many, many cases where there will be a 1:1 relationship between an Organism and a CollectionObject (i.e., the vast majority of all Museum specimens). Does that mean that data providers will need to generate two separate Ids (one organismID and one collectionObjectID) to represent all of these specimens?
>
> Aloha,
> Rich
>
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu