[tdwg-content] tdwg-content Digest, Vol 20, Issue 17
Steve Baskauf
steve.baskauf at vanderbilt.edu
Thu Nov 4 02:38:04 CET 2010
I got Rich's several emails just before I left work and had several
hours in the car to think about it before responding (probably always a
good idea - the hours of thinking, not the car ride). Since I got home,
I've read the subsequent emails from various people and they confirm my
"car thinking". I get the point Rich is making and after pondering, I'm
now OK with just leaving the definition at "taxon". As in the other
classes he mentions (Location, time) we have the freedom to define the
scale and I suppose we should have that here as well.
What we need to realize is that in doing this, we are establishing
dwc:Individual as the de facto "aggregation" unit of the
sort we have talked about previously. I suppose that is OK as long as
we provide sufficient means for humans or semantic reasoners to be able
to know what they are getting information about when they retrieve
metadata about an Individual. That can be done for humans by making
appropriate comments in the proposed dwc:individualRemarks. For
machines, I have been re-thinking the idea of having dwc:individualCount
being a property of an Individual. In an earlier post, I suggested that
it should remain a property of Occurrences, since the number of
Individuals can change over time (think wolf pack or plant deme over
time). However, in the current context, something like
dwc:individualCount (if not individualCount itself) that is a property
of Individual could have controlled values of "1" and ">1". If the
value is "1", then the Individual is a biological individual (still
subject to some uncertainty of definition in the case of clones) and we
can safely assume that if taxonRank has a high level value then the
determiner holds out the possibility that the same named biological
individual (i.e. possessing a certain GUID) could be identified to a
lower level. If the value is ">1" then the Individual is an
aggregation, bag, cluster, or whatever we want to call it. The
biological individuals involved could have a common identification on
any level from "life" for a photo of a eubacterium and archaebacterium,
to variety, subspecies, forma or whatever. In the case of ">1" it would
be understood that it may or may not be possible to refine the taxonomic
level of the Identification without creating new Individuals from
subsets of the original one. If the Individual had no associated
Identification, then it would just have to be assumed that there was
more than one living thing included in the Individual and nothing more
than that could be inferred about what kind of taxa were included (e.g.
the raw marine samples right out of the scoop, unanalyzed satellite
photo, etc.). I suppose the value could even be "0", indicating that
an unsuccessful effort was made to find the taxon linked through the
identification during the Event associated with the Occurrence
(something that came up earlier).
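As a rough illustration of the controlled values described above, here is a minimal Python sketch. The function name `interpret_individual`, its parameters, and the strings it returns are hypothetical and not part of Darwin Core; the sketch only encodes the reading of "0", "1", and ">1" given in this message, and it simplifies the no-Identification case by handling it only for aggregates.

```python
from typing import Optional


def interpret_individual(count_value: str, taxon_rank: Optional[str] = None) -> str:
    """Hypothetical helper: map a controlled count value to the reading above.

    count_value is assumed to be one of the controlled strings "0", "1", ">1".
    taxon_rank is the rank of the linked Identification, or None if there is none.
    """
    if count_value == "0":
        # An unsuccessful search for the taxon during the Event.
        return "absence: taxon sought during the Event but not found"
    if count_value == "1":
        # A single biological individual; a high-rank identification
        # could later be refined for the same GUID.
        return "biological individual: identification may later be refined"
    if count_value == ">1":
        if taxon_rank is None:
            # e.g. a raw marine sample or an unanalyzed satellite photo
            return "aggregate: no Identification, contents unknown"
        return "aggregate: refinement may require splitting into new Individuals"
    raise ValueError(f"unexpected controlled value: {count_value!r}")
```

A consumer could call `interpret_individual(">1", "family")` to decide whether a record describes a single organism or a lot that may need to be subdivided before its identification can be refined.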
The other issue would be how to indicate that one of these "aggregate"
(individualCount>1) Individuals was segregated into several Individuals
at a lower taxonomic level (e.g. partially sorted lots of marine
organisms get more sorted, fossil aggregations get separated into
individual fossils, an image of a forest has individual trees
delineated). Rich has stated that this would be necessary any time
subsets of the original Individual were given divergent Identifications
and I agree totally. Assigning new GUIDs to these new Individuals would
be no problem and I suppose Pete's suggestion of using "hasPart" and
"isPartOf" could be used to establish the relationships between the
original Individual and its "children".
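The segregation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Individual` class, its field names (`is_part_of`, `has_part`), and the `segregate` function are hypothetical stand-ins for whatever "hasPart"/"isPartOf" properties are eventually adopted, and random UUIDs stand in for real GUIDs.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Individual:
    """Hypothetical record; field names are illustrative, not DwC terms."""
    taxon: Optional[str]                      # taxon of the current Identification
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    is_part_of: Optional[str] = None          # GUID of the parent Individual, if any
    has_part: List[str] = field(default_factory=list)  # GUIDs of child Individuals


def segregate(parent: Individual, child_taxa: List[str]) -> List[Individual]:
    """Create one child Individual per divergent identification.

    Each child gets its own new GUID and points back to the parent;
    the parent records the GUIDs of its children.
    """
    children = [Individual(taxon=t, is_part_of=parent.guid) for t in child_taxa]
    parent.has_part.extend(c.guid for c in children)
    return children


# Example: a mixed marine lot is sorted into three child Individuals.
lot = Individual(taxon="Eukarya")
kids = segregate(lot, ["Ophiuroidea", "Gastropoda", "Isopoda"])
```

The original Individual keeps its GUID and its broad identification, so anything already linked to it stays valid, while each child can carry its own narrower Identification.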
I feel like quite a bit of work would be required to
make this all work, but I guess I now believe that it COULD work, so I'm
going to officially drop my objection to just inserting "taxon" into the
definition as was proposed, since what others have said has swayed me into
believing that the broader definition won't stop me from using
Individual as I had originally intended. If the proposed addition goes
through, we need to have a really good Google Code entry that summarizes
the understanding that we have come to in this discussion (if we have
come to one! :-).
Steve
Richard Pyle wrote:
> Hi Dean,
>
> I agree with you -- but the question is, would you want to require that your
> "rough sorts" create Individual instances on a 1:1 basis with taxa at the
> rank of species or lower? I was using the "Animalia" example as an extreme
> case. In the real world, it's more like you describe. That is, it's
> usually relatively easy, and extremely useful, to pre-sort such lots down
> to the level of, say, family, and then generate instances of Individuals
> accordingly. But the implication of the "species" requirement would be that
> you'd have to guesstimate how many distinct species are represented in the
> lot, then generate an Individual instance for each one of them. This, I
> think, would not be so easy in many cases.
>
> I guess in your case you've divided Identifications into two subclasses:
> "real", and....err...."artificial"? "temporary"? "provisional"?
>
> I'm not sure that's a solution that will work for most data
> providers/consumers (i.e., defining subclasses of Identifications), so it
> may not be ideal to "go there" in DwC.
>
> I guess what I'm still struggling with is why species-level identifications
> are somehow fundamentally different from higher-rank identifications
> (fundamental enough that the scope of the Individual class is constrained by
> them). Especially when it's easy enough to filter out the
> higher-than-species-rank Identifications in cases where you want to exclude
> them.
>
> I think what we're circling around is something to the effect of: "An
> individual should be circumscribed/scoped in such a way that it can be
> Identified to a single Taxon". I'm just saying there should be no
> constraints on the scope of that taxon.
>
> For example, let's take the example of a lot that includes ophiuroids,
> gastropods, isopods, algae, and fish larvae. I think there is consensus
> that we would *not* want to establish a single Individual instance that has
> multiple and concurrently legitimate Identification instances applied to it.
> In other words, we don't want an Individual instance that has one
> Identification instance for "ophiuroid", one for "gastropod", one for
> "isopod", one for "algae", and one for "fish", all simultaneously and
> legitimately applied to the single Individual instance. Stated another way,
> we don't want to allow for an Identification instance to apply to only part
> of an Individual -- an Identification instance should apply to the *entire*
> Individual.
>
> So, in this example, we either would want no Identification instance at all
> applied to the aggregated lot, or if we really wanted an Identification, the
> best we could do (depending on the kind of algae) would be "Eukarya". But
> if you want to acknowledge that there are five different subsets within that
> lot, then you generate five "child" Individual instances, each of which has
> a single legitimate Identification instance to its respective higher-rank
> taxon.
>
> So, I think what we're really after here is that, when there are multiple
> Identification instances linking an instance of Individual to more than one
> distinct Taxon, these should always be mutually exclusive -- they should
> never be simultaneously legitimate (i.e., they should never apply to
> different "parts" of a single Individual instance).
>
> Looking at the big picture....we seem to agree that the maximum scope of an
> Event instance in terms of place is "Planet Earth" (for now, anyway); the
> maximum scope for time is "any window of time since there has been life on
> the planet" (i.e., up to 4 billion years or so). We have discussed the
> ideal of establishing scoping metadata for both place (e.g.,
> coordinateUncertaintyInMeters, coordinatePrecision), and Time (not yet
> defined in DwC). I think the same logic should apply to Identifications.
> That is, the scope of Taxon that can be assigned to an Individual via an
> Identification should span anything from "forma" (or whatever the most
> precise taxon identification there is), all the way up to Domain (or even
> simply "Life"). DwC:taxonRank already serves as a rough guide to defining
> the scope of any particular taxon Identification (with apologies to
> Nico-the-other-one).
>
> So the point is, I think the general rule for DwC should be "maximum
> flexibility for allowing breadth of usage, with sufficient metadata to allow
> for filtering to more restrictive usage".
>
> I'm guessing that most of that doesn't make much sense....sorry for that.
>
> Aloha,
> Rich
>
> Richard L. Pyle, PhD
> Database Coordinator for Natural Sciences
> Associate Zoologist in Ichthyology
> Dive Safety Officer
> Department of Natural Sciences, Bishop Museum
> 1525 Bernice St., Honolulu, HI 96817
> Ph: (808)848-4115, Fax: (808)847-8252
> email: deepreef at bishopmuseum.org
> http://hbs.bishopmuseum.org/staff/pylerichard.html
>
>
>
>
> ________________________________
>
> From: tdwg-content-bounces at lists.tdwg.org
> [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Dean Pentcheff
> Sent: Wednesday, November 03, 2010 9:52 AM
> To: tdwg-content at lists.tdwg.org
> Subject: Re: [tdwg-content] tdwg-content Digest, Vol 20, Issue 17
>
>
> Leveraging off my earlier toss-in of the parent-child collection
> scheme, let me toss in this observation.
>
> I'll preface it by saying that although it's a situation we deal
> with in reality, my gut impulse is that it should probably _not_ be
> accommodated by the "Individual" concept under development.
>
> We have many records for un- or partially-sorted lots of marine
> invertebrate samples. Often we can make very rough determinations of what
> are in those lots (e.g., we can see that a jar contains ophiuroids,
> gastropods, sphaeromatid isopods, red algae, and larval fish). Critically,
> these are multiple particular and disjunct parts of the taxonomic hierarchy,
> not just a single "highest containing rank" determination.
>
> It turns out to be super-useful to record that very rough
> determination because (as alluded to by Rich) we can then appropriately make
> that jar available to visitors seeking particular taxa (and save them the
> trouble of grubbing through shelves of jars where we already "know" there's
> nothing of interest to them).
>
> Right now, we do _not_ conflate this rough determination with a Real
> Taxonomic Determination (RT and all that): they are two completely separate
> fields. So to find all the jars we know have ophiuroids, one does indeed
> have to search both the real taxonomic determination field as well as the
> rough-determination (text) field (if one wants to include unsorted lots in
> the quest).
>
> I'm introducing this case more with the idea that it may usefully
> help define the outer limits for "Individual" -- something that the
> "Individual" concept should _not_ accommodate. I can't really wrap my head
> around how the developing "Individual" concept can usefully be mutilated to
> accommodate this case.
>
> -Dean
> --
> Dean Pentcheff
> pentcheff at gmail.com
> dpentche at nhm.org
>
>
>
> On Wed, Nov 3, 2010 at 11:53 AM, Richard Pyle
> <deepreef at bishopmuseum.org> wrote:
>
>
> > I think if I'm understanding what John wrote,
> > he was going to substitute "taxon" for "species
> > (or lower taxonomic rank if it exists)" with
> > the understanding that Individual is not
> > intended to be used for aggregates of
> > different taxa. That would solve this problem, right?
>
>
> It depends on what you mean by "different taxa". If you are using the word
> "taxa" here to imply "species or lower ranks", then I don't think it would
> solve the problem. But if you mean it in a generic way, then I'm OK with
> that. By "in a generic way", suppose I had a trawl sample or a plankton tow
> sample that included unidentified organisms from multiple phyla, all of
> which are animals. I should not be prevented from representing this
> aggregate as an "Individual", with an identification instance linked to a
> taxon concept labelled as "Animalia". This means the contents of the
> Individual all belong to a single taxon (Animalia), and therefore it does
> not violate the condition excluding aggregates of different taxa. An
> instance of Individual so identified would be almost useless for many
> purposes, I agree -- but it's easy enough to filter such Individuals out by
> looking at dwc:taxonRank of the Taxon to which the Individual was
> identified. Also, it's not useless for all purposes, because a botanist
> would like to know that s/he doesn't have to look through that sample to
> find stuff of interest.
>
> I guess my point is, there should not be any rank-based requirement for the
> implied taxon circumscription of an "Individual".
>
> Rich
>
>
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
VU Station B 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu