On Thu, Nov 4, 2010 at 1:14 AM, Peter DeVries pete.devries@gmail.com wrote:
Uncertainty: http://arctos.database.museum/guid/KWP:Ento:1703 => This is a Genus Erebia species undetermined.
No, it isn't. We know more than that. It's not Erebia embla (http://arctos.database.museum/name/Erebia%20embla), for example.
On Thu, Nov 4, 2010 at 2:10 AM, Richard Pyle deepreef@bishopmuseum.org wrote:
Hi Dusty,
Collections contain things that do not map nicely to a single taxon name of any (or no) rank. It's not clear to me if this proposal will support those kinds of data or not. A few examples:
Uncertainty: http://arctos.database.museum/guid/KWP:Ento:1703
This is an excellent example of something I have to deal with occasionally, and was going to be part of my never-sent post on dealing with ambiguous identifications. In the context of DwC, my feeling is that this taxon should be represented as "Erebia" in dwc:scientificName, and the two possible species epithets included in dwc:identificationRemarks.
But that's not the data.
Composite specimens: http://arctos.database.museum/guid/UAM:Herb:12718
This one could be represented as "Bupleurum" for the Individual instance representing the sheet, but then I would be inclined to establish two "child" individuals (semantically related to the "parent" sheet), one each identified to the two different taxa.
So I picked an easy example. Here's a slightly harder one: http://arctos.database.museum/guid/MVZ:Egg:2355.
I think a lot of data models (including GNUB) treat hybrid formulae as though they are separate "taxa", with the hybrid formula as the name. Although it doesn't seem to be addressed in the DwC documentation, I would put "Canis latrans x Canis lupus familiaris" in dwc:scientificName.
Now... this may be one of those semantics-breaking pseudo-conventions that the RDF'ers will pull their hair out over (along the lines of Bob's post concerning different kinds of aggregations), in which case we should probably have another thread on this topic.
Things that aren't taxonomy at all:
http://arctos.database.museum/guid/UAM:ES:3405
Outside the scope of DwC?
Maybe so, but there it is: http://data.gbif.org/occurrences/242032297/. Excluding that would, I think, force you to exclude things like http://arctos.database.museum/guid/UAM:ES:3359 as well - it's all from the same administrative unit. I don't have or want any control over what Curators enter - any scope-limiting filter will have to happen elsewhere.
The point is simply that these are real data. We won't change them to some approximation of themselves or stuff them into a remarks field somewhere. They'll get more complicated before we're done. Anything that's to be useful to us must acknowledge the realities of collections data.
If anyone is interested, we accomplish the above by separating Identifications and Taxonomy. Arctos has roots deep in the ASC model discussed recently, but the link between specimens and taxonomy was one of our early divergences from that model. Assigning TaxonIDs directly to specimens is a no-win game - you either end up with the really valuable data buried in a remarks field somewhere, or you end up with an infinite list of strings that you must pretend are taxon names. Neither is acceptable. A fairly recent ER diagram can be had from http://arctos.googlecode.com/files/arctos_erd_20100129_single.pdf. Taxonomy and Identifications are in dark purple.
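For anyone who would rather read code than an ER diagram, here is a minimal sketch of the general idea of keeping Identifications separate from Taxonomy. It is not the Arctos schema: the class names, the field names, the "{A} x {B}" formula syntax, the GUID, and the "Genus alpha" / "Genus beta" names are all hypothetical, chosen only for illustration. The point it tries to show is that an identification can reference any number of taxa, so the hybrid or the "one of these two" case stays structured data instead of becoming an invented name string or a remark.

```python
from dataclasses import dataclass, field
from string import ascii_uppercase
from typing import List


@dataclass(frozen=True)
class Taxon:
    """A name in the taxonomy. It knows nothing about specimens."""
    scientific_name: str


@dataclass
class Identification:
    """An assertion about a specimen that may involve several taxa.

    `formula` is a hypothetical template: the slots {A}, {B}, ... stand
    for the entries of `taxa`, in order.
    """
    formula: str
    taxa: List[Taxon]

    def display(self) -> str:
        # Map slot letters to names: A -> first taxon, B -> second, ...
        slots = {
            ascii_uppercase[i]: taxon.scientific_name
            for i, taxon in enumerate(self.taxa)
        }
        return self.formula.format(**slots)


@dataclass
class Specimen:
    """A cataloged item. It links to identifications, never to a taxon."""
    guid: str
    identifications: List[Identification] = field(default_factory=list)


# Hybrid: two real taxa joined by a formula, not a third pseudo-taxon.
coyote = Taxon("Canis latrans")
dog = Taxon("Canis lupus familiaris")
hybrid = Identification("{A} x {B}", [coyote, dog])

# Uncertainty: "one of these two", with placeholder (made-up) names.
uncertain = Identification(
    "{A} or {B}", [Taxon("Genus alpha"), Taxon("Genus beta")]
)

# Hypothetical GUID, not a real catalog record.
specimen = Specimen("EXAMPLE:Mamm:1", [hybrid])

print(hybrid.display())
print(uncertain.display())
```

The display string is derived on demand, so a search for either parent taxon still finds the hybrid, and nothing has to pretend that "Canis latrans x Canis lupus familiaris" is itself a taxon name.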
--D