I just want to say that Steve *PERFECTLY* captured my own (current) perspective on this in his message below.
Rich
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Steve Baskauf
Sent: Thursday, September 08, 2011 5:22 AM
To: Chuck Miller
Cc: TDWG Content Mailing List
Subject: Re: [tdwg-content] Occurrences, Organisms, and CollectionObjects: a review
Well, this point brings to mind the previous discussions about "collapsing" or "denormalizing" complex models into simpler models. In those discussions, it was noted that many if not most people "denormalize" out of existence parts of a more complex model when they don't need them in their particular situation. People who have whole dead organisms in a jar do not care about the distinction between an Organism and a CollectionObject. However, people who cut five branches from the same tree and send them to five different herbaria do care, as do people who both photograph and collect samples from an organism, people who make observations of the same organism over time, and people who make measurements of a temporarily captured animal at the same time they collect a DNA sample. I don't consider the latter four examples to be very, very, very few cases. If it seems like there are very few cases, that's just because TDWG was started by (and is currently run mostly by) people who run museums. If TDWG wants to broaden the biodiversity informatics tent to include people outside of the museum community, then it is important to create the classes and terms necessary to accommodate those other people.
Simple DwC does not require that users include classes that they don't need in their databases. If a museum only has specimens that are whole dead organisms, then there is no need for them to have a table in their database for Organism; they only need a table for CollectionObjects and the Organism represented by that CollectionObject can be inferred. This is no different than flattening the relationships among Location, Event, and Occurrence by having a 1:1:1 relationship among those three classes, which might be done in a Darwin Core Archive. The fact that some people prefer to "flatten" those three classes into a single database table doesn't mean that other people will not have the need to have several events at a single Location, or several Occurrences at one Event.
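The flattening described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Python class names, field names, and identifiers are hypothetical and are not taken from Darwin Core or from any existing database.

```python
from dataclasses import dataclass


# Hypothetical, simplified record; field names are illustrative only.
@dataclass(frozen=True)
class CollectionObject:
    collection_object_id: str
    organism_id: str   # the Organism this object was derived from
    institution: str


def flatten(objects):
    """Collapse to one flat row per CollectionObject (no separate Organism table)."""
    return [
        {
            "collectionObjectID": o.collection_object_id,
            "organismID": o.organism_id,
            "institution": o.institution,
        }
        for o in objects
    ]


def objects_per_organism(objects):
    """Count how many CollectionObjects originate from each Organism."""
    counts = {}
    for o in objects:
        counts[o.organism_id] = counts.get(o.organism_id, 0) + 1
    return counts


# A whole dead organism in a jar: 1:1, so the Organism can simply be inferred.
jar = [CollectionObject("co-1", "org-1", "Museum A")]

# Five branches cut from the same tree and sent to five herbaria: 1:many.
branches = [
    CollectionObject(f"co-{i}", "org-tree", f"Herbarium {i}") for i in range(2, 7)
]

print(objects_per_organism(jar + branches))
```

In the 1:1 case the flat rows lose nothing. In the five-branch case the flat rows are only useful if the shared organism identifier is carried along, which is the situation an Organism class is meant to cover.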
So I guess I would say that I don't agree that there are very, very, very few cases where there are multiple CollectionObjects per Organism. One of the major points of the BiSciCol initiative is that there needs to be some way for connecting diverse collection objects that originate from the same organism but end up being scattered around in different institutions. The lack of an Organism class is a big hole in modeling these kinds of situations and I haven't heard of an alternative way to fix that hole. There is a strong desire among a diverse group of TDWG participants to move forward on more precisely modeling relationships among classes using RDF. If we do not make the changes John has suggested, it is difficult to see how this will be accomplished without people "making up" classes on the fly to accommodate the more precise modeling that is sought. That's what Cam and I had to do when we created darwin-sw.
Steve
Chuck Miller wrote:
"in the many, many, many cases where there will be a 1:1 relationship between an Organism and a CollectionObject..."
Rich has hit on an important point. The discussion continues to focus on perfecting logic that is all-encompassing. But is it the best thing for the community as a whole to implement solutions that are more complex in order to accommodate the very, very, very few cases (using the counter description to Rich's)?
Maybe we should consider keeping the solutions simple (i.e., current DwC) for the many, many, many cases and introduce complex extensions only for the very, very, very few.
Chuck
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Richard Pyle
Sent: Thursday, September 08, 2011 3:52 AM
To: 'Gregor Hagedorn'; tuco@berkeley.edu
Cc: 'TDWG Content Mailing List'
Subject: Re: [tdwg-content] Occurrences, Organisms, and CollectionObjects: a review
I see a problem with the "taxonomically homogeneous" since many taxa are not. All obligatory mutualistically symbiontic organisms are excluded (you mention symbiont, but the symbiont is the part of a symbiontic relation, e.g. both the algae taxon and fungus taxon each are a symbiont in a lichen).
I don't understand the problem. Isn't this simply two instances of "Organism" (one symbiont and one host)?
Together, they may comprise a single collectionObject (e.g., specimen); but I see no trouble treating obligatory mutualistic symbionts and their host(s) as distinct instances of "Organism".
Definition: The information class pertaining to a specific instance or set of instances of a life form or organism (virus, bacteria, symbiontic life forms, individual, colony, group, population). Sets must reliably be known to be taxonomically homogeneous (including obligatory symbiontic associations).
I guess it could be defined that way, but I've come around to Steve's view that "taxonomically homogeneous" implies that in cases where more than one individual is involved (colonies, small groups, populations), all such individuals belong to a single species (independently of whether or not we can identify what that species is). When more than one species is discovered amongst a multi-individual instance of "Organism", then one would create additional instances of Organism to accommodate the heterogeneous taxa.
All of my original examples of things that I would want to be taxonomically heterogeneous (e.g., a single fossil rock with multiple phyla/kingdoms, or a single rock with multiple phyla of invertebrates attached) can be easily aggregated via a single instance of collectionObject, associated with multiple instances of Organism (one for each species[ish] level taxon).
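As a minimal sketch of that aggregation, assuming hypothetical class names, identifiers, and placeholder taxon labels (none of them come from Darwin Core or from this thread), one collectionObject can point to several Organism instances, each restricted to a single taxon:

```python
from dataclasses import dataclass, field


# Hypothetical structures for illustration; names are not Darwin Core terms.
@dataclass(frozen=True)
class Organism:
    organism_id: str
    taxon: str  # exactly one (roughly species-level) taxon per Organism


@dataclass
class CollectionObject:
    collection_object_id: str
    organisms: list = field(default_factory=list)

    def taxa(self):
        """Distinct taxa represented on this object, sorted for stable output."""
        return sorted({o.taxon for o in self.organisms})


# A single rock with invertebrates from several taxa attached.
rock = CollectionObject("co-rock-1")
rock.organisms.append(Organism("org-a", "taxon A"))
rock.organisms.append(Organism("org-b", "taxon B"))
rock.organisms.append(Organism("org-c", "taxon C"))

print(rock.taxa())
```

The collectionObject stays taxonomically heterogeneous, while every Organism it links to remains taxonomically homogeneous.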
I originally thought that both "Organism" and "collectionObject" would be redundant, and that only one was really needed. But I have now been convinced by Steve (and others) that this would be an unnecessary over-loading of "Organism". Now that we are contemplating two distinct classes, I have no problem with the more "refined" definition of "Organism".
One concern I do have, however, is in the many, many, many cases where there will be a 1:1 relationship between an Organism and a CollectionObject (i.e., the vast majority of all Museum specimens). Does that mean that data providers will need to generate two separate IDs (one organismID and one collectionObjectID) to represent all of these specimens?
Aloha, Rich
_______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content