[tdwg-content] tdwg-content Digest, Vol 20, Issue 17

Steve Baskauf steve.baskauf at vanderbilt.edu
Thu Nov 4 02:38:04 CET 2010


I got Rich's several emails just before I left work and had several 
hours in the car to think about it before responding (probably always a 
good idea - the hours of thinking, not the car ride). Since I got home, 
I've read the subsequent emails from various people and they confirm my 
"car thinking".  I get the point Rich is making and after pondering, I'm 
now OK with just leaving the definition at "taxon".  As in the other 
classes he mentions (Location, time), we have the freedom to define the 
scale and I suppose we should have that here as well. 


What we need to realize is that in doing this, we are establishing 
dwc:Individual as the de facto "aggregation" unit of the 
sort we have talked about previously.  I suppose that is OK as long as 
we provide sufficient means for humans or semantic reasoners to be able 
to know what they are getting information about when they retrieve 
metadata about an Individual.  That can be done for humans by making 
appropriate comments in the proposed dwc:individualRemarks .  For 
machines, I have been re-thinking the idea of having dwc:individualCount 
being a property of an Individual.  In an earlier post, I suggested that 
it should remain a property of Occurrences, since the number of 
Individuals can change over time (think wolf pack or plant deme over 
time).  However, in the current context, something like 
dwc:individualCount (if not individualCount itself) that is a property 
of Individual could have controlled values of "1" and ">1".  If the 
value is "1", then the Individual is a biological individual (still 
subject to some uncertainty of definition in the case of clones) and we 
can safely assume that if taxonRank has a high level value then the 
determiner holds out the possibility that the same named biological 
individual (i.e. possessing a certain GUID) could be identified to a 
lower level.  If the value is ">1" then the Individual is an 
aggregation, bag, cluster, or whatever we want to call it.  The 
biological individuals involved could have a common identification on 
any level from "life" for a photo of a eubacterium and archaebacterium, 
to variety, subspecies, forma or whatever.  In the case of ">1" it would 
be understood that it may or may not be possible to refine the taxonomic 
level of the Identification without creating new Individuals from 
subsets of the original one.  If the Individual had no associated 
Identification, then it would just have to be assumed that there was 
more than one living thing included in the Individual and nothing more 
than that could be inferred about what kind of taxa were included (e.g. 
the raw marine samples right out of the scoop, unanalyzed satellite 
photo, etc.).   I suppose the value could even be "0", indicating that 
an unsuccessful effort was made to find the taxon linked through the 
identification during the Event associated with the Occurrence 
(something that came up earlier).
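
A minimal sketch of how a consumer might read the three controlled values described above. Everything here is an assumption made for illustration: the function name, the returned labels, and the idea of passing the count as a string are hypothetical and not part of Darwin Core.

```python
from typing import Optional

# Hypothetical controlled vocabulary: the three values floated in this
# message, not an adopted Darwin Core term or vocabulary.
CONTROLLED_COUNTS = ("0", "1", ">1")


def interpret_count(count: str, taxon_rank: Optional[str] = None) -> str:
    """Classify an Individual from its count value and, if it has an
    Identification, the dwc:taxonRank of the identified taxon."""
    if count not in CONTROLLED_COUNTS:
        raise ValueError(f"count must be one of {CONTROLLED_COUNTS}, got {count!r}")
    if count == "0":
        # An unsuccessful search for the taxon during the Event.
        return "absence"
    if count == "1":
        # One biological individual: a later, finer Identification can be
        # attached to the same GUID.
        return "biological individual"
    if taxon_rank is None:
        # ">1" with no Identification: only "more than one living thing".
        return "unidentified aggregate"
    # ">1" with an Identification: refining it may require splitting.
    return "identified aggregate"


print(interpret_count("1", "family"))
print(interpret_count(">1", "kingdom"))
print(interpret_count(">1"))
```

The rank is passed through untouched, so filtering out high-rank Identifications stays a separate step on dwc:taxonRank.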


The other issue would be how to indicate that one of these "aggregate" 
(individualCount>1) Individuals was segregated into several Individuals 
at a lower taxonomic level (e.g. partially sorted lots of marine 
organisms get more sorted, fossil aggregations get separated into 
individual fossils, an image of a forest has individual trees 
delineated).  Rich has stated that this would be necessary any time 
subsets of the original Individual were given divergent Identifications 
and I agree totally.  Assigning new GUIDs to these new Individuals would 
be no problem and I suppose Pete's suggestion of using "hasPart" and 
"isPartOf" could be used to establish the relationships between the 
original Individual and its "children".
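
The segregation described above could be sketched roughly as follows, using Rich's mixed-lot example. The class, its field names, and the use of random UUIDs as GUIDs are assumptions made for illustration; "has_part" and "is_part_of" only stand in for whatever hasPart/isPartOf properties would actually be used.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Individual:
    """Hypothetical record for a dwc:Individual (not a real DwC schema)."""
    taxon: Optional[str] = None            # taxon of the single Identification
    guid: str = field(default_factory=lambda: uuid.uuid4().hex)
    is_part_of: Optional[str] = None       # GUID of the parent Individual
    has_part: List[str] = field(default_factory=list)  # GUIDs of children


def segregate(parent: Individual, child_taxa: List[str]) -> List[Individual]:
    """Create one child Individual per taxon and link parent and children."""
    children = []
    for taxon in child_taxa:
        child = Individual(taxon=taxon, is_part_of=parent.guid)
        parent.has_part.append(child.guid)
        children.append(child)
    return children


# The unsorted lot can carry at best a very broad Identification.
lot = Individual(taxon="Eukarya")
children = segregate(lot, ["ophiuroid", "gastropod", "isopod", "algae", "fish"])
for child in children:
    print(child.taxon, "isPartOf", child.is_part_of)
```

Each child keeps exactly one Identification that applies to the whole child, and the parent's own GUID and Identification are left unchanged.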


I feel like there would be quite a bit of work that would be required to 
make this all work, but I guess I now believe that it COULD work, so I'm 
going to officially drop my objection to just inserting "taxon" into the 
definition as was proposed, since what others have said has swayed me 
into believing that the broader definition won't stop me from using 
Individual as I had originally intended.  If the proposed addition goes 
through, we need to have a really good Google Code entry that summarizes 
the understanding that we have come to in this discussion (if we have 
come to one! :-).

Steve


Richard Pyle wrote:
> Hi Dean,
>  
> I agree with you -- but the question is, would you want to require that your
> "rough sorts" create Individual instances on a 1:1 basis with taxa at the
> rank of species or lower?  I was using the "Animalia" example as an extreme
> case.  In the real world, it's more like you describe.  That is, it's
> usually relatively easy, and extremely useful, to pre-sort such lots down
> to the level of, say, family, and then generate instances of Individuals
> accordingly.  But the implication of the "species" requirement would be that
> you'd have to guesstimate how many distinct species are represented in the
> lot, then generate an Individual instance for each one of them.  This, I
> think, would not be so easy in many cases.
>  
> I guess in your case you've divided Identifications into two subclasses:
> "real", and....err...."artificial"? "temporary"? "provisional"?
>  
> I'm not sure that's a solution that will work for most data
> providers/consumers (i.e., defining subclasses of Identifications), so it
> may not be ideal to "go there" in DwC.
>  
> I guess what I'm still struggling with is why species-level identifications
> are somehow fundamentally different from higher-rank identifications
> (fundamental enough that the scope of the Individual class is constrained by
> them).  Especially when it's easy enough to filter out the
> higher-than-species-rank Identifications in cases where you want to exclude
> them.
>  
> I think what we're circling around is something to the effect of: "An
> individual should be circumscribed/scoped in such a way that it can be
> Identified to a single Taxon".  I'm just saying there should be no
> constraints on the scope of that taxon.
>  
> For example, let's take the example of a lot that includes ophiuroids,
> gastropods, isopods, algae, and fish larvae.  I think there is consensus
> that we would *not* want to establish a single Individual instance that has
> multiple and concurrently legitimate Identification instances applied to it.
> In other words, we don't want an Individual instance that has one
> Identification instance for "ophiuroid", one for "gastropod", one for
> "isopod", one for "algae", and one for "fish", all simultaneously and
> legitimately applied to the single Individual instance. Stated another way,
> we don't want to allow for an Identification instance to apply to only part
> of an Individual -- an Identification instance should apply to the *entire*
> Individual.
>  
> So, in this example, we either would want no Identification instance at all
> applied to the aggregated lot, or if we really wanted an Identification, the
> best we could do (depending on the kind of algae) would be "Eukarya".  But
> if you want to acknowledge that there are five different subsets within that
> lot, then you generate five "child" Individual instances, each of which has
> a single legitimate Identification instance to its respective higher-rank
> taxon.
>  
> So, I think what we're really after here is that, when there are multiple
> Identification instances linking an instance of Individual to more than one
> distinct Taxon, these should always be mutually exclusive -- they should
> never be simultaneously legitimate (i.e., they should never apply to
> different "parts" of a single Individual instance).
>
> Looking at the big picture....we seem to agree that the maximum scope of an
> Event instance in terms of place is "Planet Earth" (for now, anyway); the
> maximum scope for time is "any window of time since there has been life on
> the planet" (i.e., up to 4 billion years or so).  We have discussed the
> ideal of establishing scoping metadata for both place (e.g.,
> coordinateUncertaintyInMeters, coordinatePrecision), and Time (not yet
> defined in DwC).  I think the same logic should apply to Identifications.
> That is, the scope of Taxon that can be assigned to an Individual via an
> Identification should span anything from "forma" (or whatever the most
> precise taxon identification there is),  all the way up to Domain (or even
> simply "Life").  DwC:taxonRank already serves as a rough guide to defining
> the scope of any particular taxon Identification (with apologies to
> Nico-the-other-one).
>
> So the point is, I think the general rule for DwC should be "maximum
> flexibility for allowing breadth of usage, with sufficient metadata to allow
> for filtering to more restrictive usage".
>
> I'm guessing that most of that doesn't make much sense....sorry for that.
>
> Aloha,
> Rich
>
> Richard L. Pyle, PhD
> Database Coordinator for Natural Sciences
> Associate Zoologist in Ichthyology
> Dive Safety Officer
> Department of Natural Sciences, Bishop Museum
> 1525 Bernice St., Honolulu, HI 96817
> Ph: (808)848-4115, Fax: (808)847-8252
> email: deepreef at bishopmuseum.org
> http://hbs.bishopmuseum.org/staff/pylerichard.html
>
>
>
>
> ________________________________
>
> 	From: tdwg-content-bounces at lists.tdwg.org
> [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Dean Pentcheff
> 	Sent: Wednesday, November 03, 2010 9:52 AM
> 	To: tdwg-content at lists.tdwg.org
> 	Subject: Re: [tdwg-content] tdwg-content Digest, Vol 20, Issue 17
> 	
> 	
> 	Leveraging off my earlier toss-in of the parent-child collection
> scheme, let me toss in this observation.
> 	
> 	I'll preface it by saying that although it's a situation we deal
> with in reality, my gut impulse is that it should probably _not_ be
> accommodated by the "Individual" concept under development.
> 	
> 	We have many records for un- or partially-sorted lots of marine
> invertebrate samples. Often we can make very rough determinations of what
> are in those lots (e.g., we can see that a jar contains ophiuroids,
> gastropods, sphaeromatid isopods, red algae, and larval fish). Critically,
> these are multiple particular and disjunct parts of the taxonomic hierarchy,
> not just a single "highest containing rank" determination.
> 	
> 	It turns out to be super-useful to record that very rough
> determination because (as alluded to by Rich) we can then appropriately make
> that jar available to visitors seeking particular taxa (and save them the
> trouble of grubbing through shelves of jars where we already "know" there's
> nothing of interest to them).
> 	
> 	Right now, we do _not_ conflate this rough determination with a Real
> Taxonomic Determination (RT and all that): they are two completely separate
> fields. So to find all the jars we know have ophiuroids, one does indeed
> have to search both the real taxonomic determination field as well as the
> rough-determination (text) field (if one wants to include unsorted lots in
> the quest).
> 	
> 	I'm introducing this case more with the idea that it may usefully
> help define the outer limits for "Individual" -- something that the
> "Individual" concept should _not_ accommodate. I can't really wrap my head
> around how the developing "Individual" concept can usefully be mutilated to
> accommodate this case.
> 	
> 	-Dean
> 	-- 
> 	Dean Pentcheff
> 	pentcheff at gmail.com
> 	dpentche at nhm.org
> 	
> 	
> 	
> 	On Wed, Nov 3, 2010 at 11:53 AM, Richard Pyle
> <deepreef at bishopmuseum.org> wrote:
> 	
>
> 		> I think if I'm understanding what John wrote,
> 		> he was going to substitute "taxon" for "species
> 		> (or lower taxonomic rank if it exists)" with
> 		> the understanding that Individual is not
> 		> intended to be used for aggregates of
> 		> different taxa.  That would solve this problem, right?
> 		
> 		
> 		It depends on what you mean by "different taxa".  If you are using the word
> 		"taxa" here to imply "species or lower ranks", then I don't think it would
> 		solve the problem.  But if you mean it in a generic way, then I'm OK with
> 		that.  By "in a generic way", suppose I had a trawl sample or a plankton tow
> 		sample that included unidentified organisms from multiple phyla, all of
> 		which are animals.  I should not be prevented from representing this
> 		aggregate as an "Individual", with an identification instance linked to a
> 		taxon concept labelled as "Animalia".  This means the contents of the
> 		Individual all belong to a single taxon (Animalia), and therefore it does
> 		not violate the condition excluding aggregates of different taxa. An
> 		instance of Individual so identified would be almost useless for many
> 		purposes, I agree -- but it's easy enough to filter such Individuals out by
> 		looking at dwc:taxonRank of the Taxon to which the Individual was
> 		identified. Also, it's not useless for all purposes, because a botanist
> 		would like to know that s/he doesn't have to look through that sample to
> 		find stuff of interest.
> 		
> 		I guess my point is, there should not be any rank-based requirement for the
> 		implied taxon circumscription of an "Individual".
> 		
> 		Rich
> 		
>
>
> 		_______________________________________________
> 		tdwg-content mailing list
> 		tdwg-content at lists.tdwg.org
> 		http://lists.tdwg.org/mailman/listinfo/tdwg-content
> 		

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu


