Re: [tdwg-content] tdwg-content Digest, Vol 20, Issue 17
Steve is using Species as ranks in his definition and I think this is the wrong approach. Let's make all this rank agnostic please! Use the word taxon! What if I have a group of organisms that represents a polyphyletic species and I want to name a lineage (group of organisms) within this traditionally recognized species that I am not recognizing as species per se (as in rank of species). In other words, Identifications and ranks are two different things, so let's abandon ranks for a more objective discussion on taxa. Individuals are definitively not species.
Nico (the other one)
On Wed, Nov 3, 2010 at 7:24 AM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
John, I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the Individual should only include groups of organisms that are reliably known to be of a single species - it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species). Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
Steve
John Wieczorek wrote: Most of you probably do not receive postings from the Google Code site for Darwin Core. Steve B. updated the proposal for the new term Individual, and then commentary ensued on the Issue tracker. Since there remains an unresolved issue, I'm bringing the discussion back here by adding the commentary stream below. The unresolved issue in Steve's amendment is the restriction in the definition to "a single species (or lower taxonomic rank if it exists)."
Rich argues that we should not obviate the capability of applying an Identification to an aggregate (e.g., fossil), where the aggregate consists of multiple taxa.
Steve argues that Identifications should be applied only to aggregates of a single taxon.
Steve, aside from the aggregate issue (which should be solved satisfactorily), your suggestion is too restrictive, because it would obviate the possibility of making an Identification (even for a single organism) to any rank less specific than a species. That is a loss of capability, and therefore unreasonable.
Comment 7 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c7 by baskaufs http://code.google.com/u/baskaufs/ , Today (8 hours ago) As a result of the discussion that has taken place on the tdwg-content email list during 2010 October and November, I am updating the term recommendation for Individual as follows:
Definition: The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single species (or lower taxonomic rank if it exists).
Comment: Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification.
Refines: N/A
Please note that as a precautionary measure, I have removed the statement that Individual refines http://purl.org/dc/dcmitype/PhysicalObject because the definition of PhysicalObject specifically mentions that the object is inanimate. I am not currently aware of any well-known term that defines living things.
Steve Baskauf
Comment 8 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c8 by deepreef@hawaii.rr.com http://code.google.com/u/deepreef@hawaii.rr.com/ , Today (8 hours ago) I think the definition should be "...represent a single taxon". We shouldn't restrict it to members of the same species (or lower), because then we technically can't include things that may represent more than one species, yet would best be treated within the scope of an Individual.
Also, I'm slightly partial to the term "Organism" for this class, rather than "Individual", because it's more clearly tied to the biology domain, and less likely to collide with the word "Individual" in other domains. I know such collision is not a technical problem, but it might lead to some confusion.
Comment 9 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c9 by baskaufs http://code.google.com/u/baskaufs/ , Today (8 hours ago) Well, the reason that I defined it to be members of the same species is to ensure that the term Individual can serve the primary function that I perceived was needed: to make the connection from occurrences to identifications. When I said one or more identifications, I meant one or more opinions about what that single species (or lower) was, not that there could be multiple identifications of several different species that happened to be in the same "bag" such as the contents of a pitfall trap containing multiple species, an image that contained several species, or a specimen that contained parasites of a different species. I think that there is a need for a term for this other kind of thing (a heterogeneous "lot", "batch", or something), but I think that including this in the definition of Individual defeats the purpose for which I proposed it. If there were several different species in the "Individual", then one would have to specify which identification went with which biological individual within the "lot", which would result in actually breaking down the "lot" into single-species "Individuals" anyway.
I think if I'm understanding what John wrote, he was going to substitute "taxon" for "species (or lower taxonomic rank if it exists)" with the understanding that Individual is not intended to be used for aggregates of different taxa. That would solve this problem, right? Steve
It depends on what you mean by "different taxa". If you are using the word "taxa" here to imply "species or lower ranks", then I don't think it would solve the problem. But if you mean it in a generic way, then I'm OK with that. By "in a generic way", suppose I had a trawl sample or a plankton tow sample that included unidentified organisms from multiple phyla, all of which are animals. I should not be prevented from representing this aggregate as an "Individual", with an identification instance linked to a taxon concept labelled as "Animalia". This means the contents of the Individual all belong to a single taxon (Animalia), and therefore it does not violate the condition excluding aggregates of different taxa. An instance of Individual so identified would be almost useless for many purposes, I agree -- but it's easy enough to filter such Individuals out by looking at dwc:taxonRank of the Taxon to which the Individual was identified. Also, it's not useless for all purposes, because a botanist would like to know that s/he doesn't have to look through that sample to find stuff of interest.
I guess my point is, there should not be any rank-based requirement for the implied taxon circumscription of an "Individual".
Rich
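The rank-based filtering Rich describes can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Individual` record, the `RANK_ORDER` list, the `at_or_below` helper, and the sample identifiers are all hypothetical and are not Darwin Core terms; only the idea of consulting dwc:taxonRank comes from the message above.

```python
from dataclasses import dataclass

# Hypothetical rank ordering, most inclusive first. This list is an assumption
# made for the sketch, not a controlled vocabulary from Darwin Core.
RANK_ORDER = [
    "kingdom", "phylum", "class", "order", "family",
    "genus", "species", "subspecies", "variety", "forma",
]


@dataclass
class Individual:
    individual_id: str  # hypothetical local identifier
    taxon_name: str     # name of the Taxon the Individual was identified to
    taxon_rank: str     # stands in for dwc:taxonRank of that Taxon


def at_or_below(individuals, rank="species"):
    """Keep only Individuals identified to `rank` or to a more specific rank.

    Raises ValueError if a rank is not listed in RANK_ORDER.
    """
    cutoff = RANK_ORDER.index(rank)
    return [
        ind for ind in individuals
        if RANK_ORDER.index(ind.taxon_rank.lower()) >= cutoff
    ]


# Hypothetical sample data: a trawl lot identified only to Animalia,
# plus two more narrowly identified Individuals.
samples = [
    Individual("trawl-001", "Animalia", "kingdom"),
    Individual("ind-A", "Metrosideros", "genus"),
    Individual("ind-B", "Quercus alba", "species"),
]

species_level = at_or_below(samples)            # drops the trawl lot and the genus-level record
genus_or_finer = at_or_below(samples, "genus")  # drops only the trawl lot
```

A filter like this only looks at the rank of the identification; it says nothing about whether a record is one organism or many.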
Thank you Rich! Ditto!
Nico
On Nov 3, 2010, at 2:53 PM, Richard Pyle wrote:
> I guess my point is, there should not be any rank-based requirement for the implied taxon circumscription of an "Individual".
<><><><><><><><><><><><><><><><><><><><><><><><> Nico Cellinese, Ph.D. Assistant Curator, Herbarium & Informatics Adjunct Assistant Professor, Department of Biology
Florida Museum of Natural History University of Florida 354 Dickinson Hall, PO Box 117800 Gainesville, FL 32611-7800, U.S.A. Tel. 352-273-1979 Fax 352-846-1861 http://cellinese.blogspot.com/
> It depends on what you mean by "different taxa". If you are using the word "taxa" here to imply "species or lower ranks", then I don't think it would solve the problem. But if you mean it in a generic way, then I'm OK with that. By "in a generic way", suppose I had a trawl sample or a plankton tow sample that included unidentified organisms from multiple phyla, all of which are animals. I should not be prevented from representing this aggregate as an "Individual", with an identification instance linked to a taxon concept labelled as "Animalia". This means the contents of the Individual all belong to a single taxon (Animalia), and therefore it does not violate the condition excluding aggregates of different taxa.
This is the circumstance where I would call it an "aggregation" (or whatever term) and not an Individual.
> An instance of Individual so identified would be almost useless for many purposes, I agree -- but it's easy enough to filter such Individuals out by looking at dwc:taxonRank of the Taxon to which the Individual was identified.
That's not true. Look at this: http://bioimages.vanderbilt.edu/ind-baskauf/04101 Here is an image from an individual that I know is a Metrosideros species. I have identified it to genus, but a Metrosideros expert could probably identify it to species at a later time. Your mechanism of looking at the taxon rank would not allow such Individuals to be separated from an "aggregation" that consisted of biological individuals belonging to two different species of Metrosideros. Click on the "Metrosideros [unknown]" link at the top of the page. You now see all of the images I have for Individuals that are in the Metrosideros genus. They are grouped by Individual by the locally unique identifiers under the images. The use case for which I want dwc:Individuals is to keep the images from the same individual (or a specimen and an image, or two specimens) grouped together. If I were later able to determine the various Individuals on that page to species, I would assign them to a more specific taxon and that identification would automatically be applied to all images (or images and specimens or whatever) that I have grouped under a single Individual identifier. Another example is: http://bioimages.vanderbilt.edu/ind-baskauf/70858 I know it's some kind of dicot and therefore have identified it to class Magnoliopsida. When I can identify it to species, I'll apply a more precise identification to that individual. That is different than calling some flower garden containing a bunch of dicots an "Individual" which would have a taxon rank of class. I feel strongly that kind of thing should be called something different (i.e. aggregation or something).
I really feel like "watering down" the meaning of Individual in the way that you are suggesting (allowing it to be an aggregation of species or whatever you want to call them) is going to defeat the whole purpose of why I suggested having the class in the first place. So if somebody can think of an inoffensive way to describe a terminal taxon or species (or ssp. or var. if it exists), that is really what I intend for Individual to be restricted to. I'm not proposing the "aggregation" view.
Steve
> Also, it's not useless for all purposes, because a botanist would like to know that s/he doesn't have to look through that sample to find stuff of interest.
> I guess my point is, there should not be any rank-based requirement for the implied taxon circumscription of an "Individual".
> Rich
Leveraging off my earlier toss-in of the parent-child collection scheme, let me toss in this observation.
I'll preface it by saying that although it's a situation we deal with in reality, my gut impulse is that it should probably _not_ be accommodated by the "Individual" concept under development.
We have many records for un- or partially-sorted lots of marine invertebrate samples. Often we can make very rough determinations of what is in those lots (e.g., we can see that a jar contains ophiuroids, gastropods, sphaeromatid isopods, red algae, and larval fish). Critically, these are multiple particular and disjunct parts of the taxonomic hierarchy, not just a single "highest containing rank" determination.
It turns out to be super-useful to record that very rough determination because (as alluded to by Rich) we can then appropriately make that jar available to visitors seeking particular taxa (and save them the trouble of grubbing through shelves of jars where we already "know" there's nothing of interest to them).
Right now, we do _not_ conflate this rough determination with a Real Taxonomic Determination (®™ and all that): they are two completely separate fields. So to find all the jars we know have ophiuroids, one does indeed have to search both the real taxonomic determination field as well as the rough-determination (text) field (if one wants to include unsorted lots in the quest).
I'm introducing this case more with the idea that it may usefully help define the outer limits for "Individual" -- something that the "Individual" concept should _not_ accommodate. I can't really wrap my head around how the developing "Individual" concept can usefully be mutilated to accommodate this case.
-Dean
Hi Dean,
I agree with you -- but the question is, would you want to require that your "rough sorts" create Individual instances on a 1:1 basis with taxa at the rank of species or lower? I was using the "Animalia" example as an extreme case. In the real world, it's more like you describe. That is, it's usually relatively easy, and extremely useful, to pre-sort such lots down to the level of, say, family, and then generate instances of Individuals accordingly. But the implication of the "species" requirement would be that you'd have to guesstimate how many distinct species are represented in the lot, then generate an Individual instance for each one of them. This, I think, would not be so easy in many cases.
I guess in your case you've divided Identifications into two subclasses: "real", and....err...."artificial"? "temporary"? "provisional"?
I'm not sure that's a solution that will work for most data providers/consumers (i.e., defining subclasses of Identifications), so it may not be ideal to "go there" in DwC.
I guess what I'm still struggling with is why species-level identifications are somehow fundamentally different from higher-rank identifications (fundamental enough that the scope of the Individual class is constrained by them). Especially when it's easy enough to filter out the higher-than-species-rank Identifications in cases where you want to exclude them.
I think what we're circling around is something to the effect of: "An individual should be circumscribed/scoped in such a way that it can be Identified to a single Taxon". I'm just saying there should be no constraints on the scope of that taxon.
For example, let's take the example of a lot that includes ophiuroids, gastropods, isopods, algae, and fish larvae. I think there is consensus that we would *not* want to establish a single Individual instance that has multiple and concurrently legitimate Identification instances applied to it. In other words, we don't want an Individual instance that has one Identification instance for "ophiuroid", one for "gastropod", one for "isopod", one for "algae", and one for "fish", all simultaneously and legitimately applied to the single Individual instance. Stated another way, we don't want to allow for an Identification instance to apply to only part of an Individual -- an Identification instance should apply to the *entire* Individual.
So, in this example, we either would want no Identification instance at all applied to the aggregated lot, or if we really wanted an Identification, the best we could do (depending on the kind of algae) would be "Eukarya". But if you want to acknowledge that there are five different subsets within that lot, then you generate five "child" Individual instances, each of which has a single legitimate Identification instance to its respective higher-rank taxon.
So, I think what we're really after here is that, when there are multiple Identification instances linking an instance of Individual to more than one distinct Taxon, these should always be mutually exclusive -- they should never be simultaneously legitimate (i.e., they should never apply to different "parts" of a single Individual instance).
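The parent/child arrangement described in the last few paragraphs could be modelled along these lines. This is a minimal sketch, not a proposed Darwin Core structure: the `Individual` and `Identification` classes, the `split_lot` helper, and the "jar-17" identifier are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identification:
    taxon: str  # the single Taxon this Identification points to


@dataclass
class Individual:
    individual_id: str
    # Every Identification here is meant to apply to the *entire* Individual.
    identifications: List[Identification] = field(default_factory=list)
    children: List["Individual"] = field(default_factory=list)


def split_lot(lot_id: str, rough_taxa: List[str],
              lot_taxon: Optional[str] = None) -> Individual:
    """Build a parent Individual for a mixed lot plus one child per rough taxon.

    The parent gets either no Identification or one very broad one
    (`lot_taxon`); each child gets exactly one Identification.
    """
    parent = Individual(lot_id)
    if lot_taxon is not None:
        parent.identifications.append(Identification(lot_taxon))
    for n, taxon in enumerate(rough_taxa, start=1):
        child = Individual(f"{lot_id}-{n}", [Identification(taxon)])
        parent.children.append(child)
    return parent


# The five-part lot from the example above, with the broadest possible
# Identification on the parent.
lot = split_lot(
    "jar-17",
    ["ophiuroid", "gastropod", "isopod", "algae", "fish"],
    lot_taxon="Eukarya",
)
```

With this shape, no Identification ever applies to only part of an Individual: the mixed lot carries at most one broad Identification, and the five subsets each carry their own.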
Looking at the big picture....we seem to agree that the maximum scope of an Event instance in terms of place is "Planet Earth" (for now, anyway); the maximum scope for time is "any window of time since there has been life on the planet" (i.e., up to 4 billion years or so). We have discussed the ideal of establishing scoping metadata for both place (e.g., coordinateUncertaintyInMeters, coordinatePrecision), and Time (not yet defined in DwC). I think the same logic should apply to Identifications. That is, the scope of Taxon that can be assigned to an Individual via an Identification should span anything from "forma" (or whatever the most precise taxon identification there is), all the way up to Domain (or even simply "Life"). DwC:taxonRank already serves as a rough guide to defining the scope of any particular taxon Identification (with apologies to Nico-the-other-one).
So the point is, I think the general rule for DwC should be "maximum flexibility for allowing breadth of usage, with sufficient metadata to allow for filtering to more restrictive usage".
I'm guessing that most of that doesn't make much sense....sorry for that.
Aloha, Rich
Richard L. Pyle, PhD Database Coordinator for Natural Sciences Associate Zoologist in Ichthyology Dive Safety Officer Department of Natural Sciences, Bishop Museum 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
________________________________
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Dean Pentcheff
Sent: Wednesday, November 03, 2010 9:52 AM
To: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] tdwg-content Digest, Vol 20, Issue 17
_______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
I got Rich's several emails just before I left work and had several hours in the car to think about it before responding (probably always a good idea - the hours of thinking, not the car ride). Since I got home, I've read the subsequent emails from various people and they confirm my "car thinking". I get the point Rich is making and after pondering, I'm now OK with just leaving the definition at "taxon". As in the other classes he mentions (Location,time) we have the freedom to define the scale and I suppose we should have that here as well.
What we need to realize is that in doing this, we are establishing dwc:Individual as the de facto "aggregation" unit of the sort we have talked about previously. I suppose that is OK as long as we provide sufficient means for humans or semantic reasoners to be able to know what they are getting information about when they retrieve metadata about an Individual. That can be done for humans by making appropriate comments in the proposed dwc:individualRemarks. For machines, I have been re-thinking the idea of having dwc:individualCount being a property of an Individual. In an earlier post, I suggested that it should remain a property of Occurrences, since the number of Individuals can change over time (think wolf pack or plant deme over time). However, in the current context, something like dwc:individualCount (if not individualCount itself) that is a property of Individual could have controlled values of "1" and ">1". If the value is "1", then the Individual is a biological individual (still subject to some uncertainty of definition in the case of clones) and we can safely assume that if taxonRank has a high level value then the determiner holds out the possibility that the same named biological individual (i.e. possessing a certain GUID) could be identified to a lower level. If the value is ">1" then the Individual is an aggregation, bag, cluster, or whatever we want to call it. The biological individuals involved could have a common identification on any level from "life" for a photo of a eubacterium and archaebacterium, to variety, subspecies, forma or whatever. In the case of ">1" it would be understood that it may or may not be possible to refine the taxonomic level of the Identification without creating new Individuals from subsets of the original one.
If the Individual had no associated Identification, then it would just have to be assumed that there was more than one living thing included in the Individual and nothing more than that could be inferred about what kind of taxa were included (e.g. the raw marine samples right out of the scoop, an unanalyzed satellite photo, etc.). I suppose the value could even be "0", indicating that an unsuccessful effort was made to find the taxon linked through the identification during the Event associated with the Occurrence (something that came up earlier).
The other issue would be how to indicate that one of these "aggregate" (individualCount>1) Individuals was segregated into several Individuals at a lower taxonomic level (e.g. partially sorted lots of marine organisms get more sorted, fossil aggregations get separated into individual fossils, an image of a forest has individual trees delineated). Rich has stated that this would be necessary any time subsets of the original Individual were given divergent Identifications and I agree totally. Assigning new GUIDs to these new Individuals would be no problem and I suppose Pete's suggestion of using "hasPart" and "isPartOf" could be used to establish the relationships between the original Individual and its "children".
I feel like there would be quite a bit of work required to make this all work, but I guess I now believe that it COULD work, so I'm going to officially drop my objection to just inserting "taxon" into the definition as was proposed, since what others have said has swayed me into believing that the broader definition won't stop me from using Individual as I had originally intended. If the proposed addition goes through, we need to have a really good Google Code entry that summarizes the understanding that we have come to in this discussion (if we have come to one! :-).
Steve
Richard Pyle wrote:
Hi Dean,
I agree with you -- but the question is, would you want to require that your "rough sorts" create Individual instances on a 1:1 basis with taxa at the rank of species or lower? I was using the "Animalia" example as an extreme case. In the real world, it's more like you describe. That is, it's usually relatively easy, and extremely useful, to pre-sort such lots down to the level of, say, family, and then generate instances of Individuals accordingly. But the implication of the "species" requirement would be that you'd have to guesstimate how many distinct species are represented in the lot, then generate an Individual instance for each one of them. This, I think, would not be so easy in many cases.
I guess in your case you've divided Identifications into two subclasses: "real", and....err...."artificial"? "temporary"? "provisional"?
I'm not sure that's a solution that will work for most data providers/consumers (i.e., defining subclasses of Identifications), so it may not be ideal to "go there" in DwC.
I guess what I'm still struggling with is why species-level identifications are somehow fundamentally different from higher-rank identifications (fundamental enough that the scope of the Individual class is constrained by them). Especially when it's easy enough to filter out the higher-than-species-rank Identifications in cases where you want to exclude them.
I think what we're circling around is something to the effect of: "An individual should be circumscribed/scoped in such a way that it can be Identified to a single Taxon". I'm just saying there should be no constraints on the scope of that taxon.
For example, let's take the example of a lot that includes ophiuroids, gastropods, isopods, algae, and fish larvae. I think there is consensus that we would *not* want to establish a single Individual instance that has multiple and concurrently legitimate Identification instances applied to it. In other words, we don't want an Individual instance that has one Identification instance for "ophiuroid", one for "gastropod", one for "isopod", one for "algae", and one for "fish", all simultaneously and legitimately applied to the single Individual instance. Stated another way, we don't want to allow for an Identification instance to apply to only part of an Individual -- an Identification instance should apply to the *entire* Individual.
So, in this example, we either would want no Identification instance at all applied to the aggregated lot, or if we really wanted an Identification, the best we could do (depending on the kind of algae) would be "Eukarya". But if you want to acknowledge that there are five different subsets within that lot, then you generate five "child" Individual instances, each of which has a single legitimate Identification instance to its respective higher-rank taxon.
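The splitting step described above can be sketched in a few lines. This is only an illustration under assumptions: the `Individual` dataclass, its `is_part_of`/`has_part` fields, and the `split_lot` helper are hypothetical names borrowed from the "hasPart"/"isPartOf" suggestion in this thread, not Darwin Core terms, and the identifiers are made up.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Individual:
    # Hypothetical record layout, not a DwC class definition.
    individual_id: str
    taxon: Optional[str] = None        # the single taxon the whole Individual is identified to
    is_part_of: Optional[str] = None   # individual_id of the parent, if any
    has_part: List[str] = field(default_factory=list)


def split_lot(parent: Individual, taxa: List[str]) -> List[Individual]:
    """Create one child Individual per taxon and link each child to the parent."""
    children = []
    for taxon in taxa:
        child = Individual(
            individual_id=f"urn:uuid:{uuid.uuid4()}",  # new GUID for each child
            taxon=taxon,
            is_part_of=parent.individual_id,
        )
        parent.has_part.append(child.individual_id)
        children.append(child)
    return children


# The aggregated lot keeps at most one whole-lot identification ("Eukarya").
lot = Individual(individual_id="urn:uuid:example-lot", taxon="Eukarya")
children = split_lot(lot, ["ophiuroid", "gastropod", "isopod", "algae", "fish"])
```

Each child carries exactly one identification that applies to the whole child, and the parent is left untouched apart from the links to its parts.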
So, I think what we're really after here is that, when there are multiple Identification instances linking an instance of Individual to more than one distinct Taxon, these should always be mutually exclusive -- they should never be simultaneously legitimate (i.e., they should never apply to different "parts" of a single Individual instance).
Looking at the big picture....we seem to agree that the maximum scope of an Event instance in terms of place is "Planet Earth" (for now, anyway); the maximum scope for time is "any window of time since there has been life on the planet" (i.e., up to 4 billion years or so). We have discussed the ideal of establishing scoping metadata for both place (e.g., coordinateUncertaintyInMeters, coordinatePrecision), and Time (not yet defined in DwC). I think the same logic should apply to Identifications. That is, the scope of Taxon that can be assigned to an Individual via an Identification should span anything from "forma" (or whatever the most precise taxon identification there is), all the way up to Domain (or even simply "Life"). DwC:taxonRank already serves as a rough guide to defining the scope of any particular taxon Identification (with apologies to Nico-the-other-one).
So the point is, I think the general rule for DwC should be "maximum flexibility for allowing breadth of usage, with sufficient metadata to allow for filtering to more restrictive usage".
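As a rough illustration of "filtering to more restrictive usage", a consumer who only wants species-level records could filter on taxonRank. This is a sketch under assumptions: the records are plain dictionaries keyed by DwC term names, the rank list in `SPECIES_OR_LOWER` is illustrative rather than a controlled vocabulary, and the sample values ("ind-1", "Aus bus") are placeholders.

```python
from typing import Dict, Iterable, List

# Ranks treated as "species or lower" for this example only (an assumption).
SPECIES_OR_LOWER = {"species", "subspecies", "variety", "forma"}


def species_level_only(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only records whose taxonRank is species or lower."""
    return [
        r for r in records
        if r.get("taxonRank", "").lower() in SPECIES_OR_LOWER
    ]


records = [
    {"individualID": "ind-1", "scientificName": "Animalia", "taxonRank": "kingdom"},
    {"individualID": "ind-2", "scientificName": "Aus bus", "taxonRank": "species"},
    {"individualID": "ind-3", "scientificName": "Aus bus var. cus", "taxonRank": "variety"},
]

kept = species_level_only(records)
```

The broad "Animalia" record stays in the data set for anyone who wants it; it is simply skipped by consumers who apply the filter.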
I'm guessing that most of that doesn't make much sense....sorry for that.
Aloha, Rich
Richard L. Pyle, PhD Database Coordinator for Natural Sciences Associate Zoologist in Ichthyology Dive Safety Officer Department of Natural Sciences, Bishop Museum 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Dean Pentcheff
Sent: Wednesday, November 03, 2010 9:52 AM
To: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] tdwg-content Digest, Vol 20, Issue 17
Leveraging off my earlier toss-in of the parent-child collection scheme, let me toss in this observation.
I'll preface it by saying that although it's a situation we deal with in reality, my gut impulse is that it should probably _not_ be accommodated by the "Individual" concept under development.
We have many records for un- or partially-sorted lots of marine invertebrate samples. Often we can make very rough determinations of what are in those lots (e.g., we can see that a jar contains ophiuroids, gastropods, sphaeromatid isopods, red algae, and larval fish). Critically, these are multiple particular and disjunct parts of the taxonomic hierarchy, not just a single "highest containing rank" determination.
It turns out to be super-useful to record that very rough determination because (as alluded to by Rich) we can then appropriately make that jar available to visitors seeking particular taxa (and save them the trouble of grubbing through shelves of jars where we already "know" there's nothing of interest to them).
Right now, we do _not_ conflate this rough determination with a Real Taxonomic Determination (RT and all that): they are two completely separate fields. So to find all the jars we know have ophiuroids, one does indeed have to search both the real taxonomic determination field as well as the rough-determination (text) field (if one wants to include unsorted lots in the quest).
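The two-field search described here might look something like the sketch below. The field names `determination` and `rough_contents` and the jar records are hypothetical stand-ins for the two separate fields; the actual collection database schema is not given in this message.

```python
from typing import Dict, List

# Made-up jar records: one sorted lot and one unsorted lot.
JARS: List[Dict[str, str]] = [
    {"jar": "jar-1", "determination": "ophiuroids", "rough_contents": ""},
    {
        "jar": "jar-2",
        "determination": "",
        "rough_contents": "ophiuroids, gastropods, sphaeromatid isopods, red algae, larval fish",
    },
    {"jar": "jar-3", "determination": "gastropods", "rough_contents": ""},
]


def jars_with(term: str, jars: List[Dict[str, str]]) -> List[str]:
    """Return jars where the term appears in either the real or the rough determination."""
    needle = term.lower()
    return [
        j["jar"]
        for j in jars
        if needle in j["determination"].lower() or needle in j["rough_contents"].lower()
    ]


hits = jars_with("ophiuroid", JARS)
```

Searching only the `determination` field would miss the unsorted jar, which is why both fields have to be consulted.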
I'm introducing this case more with the idea that it may usefully help define the outer limits for "Individual" -- something that the "Individual" concept should _not_ accommodate. I can't really wrap my head around how the developing "Individual" concept can usefully be mutilated to accommodate this case.
-Dean
Dean Pentcheff pentcheff@gmail.com dpentche@nhm.org
On Wed, Nov 3, 2010 at 11:53 AM, Richard Pyle deepreef@bishopmuseum.org wrote:
> I think if I'm understanding what John wrote, > he was going to substitute "taxon" for "species > (or lower taxonomic rank if it exists)" with > the understanding that Individual is not > intended to be used for aggregates of > different taxa. That would solve this problem, right?
It depends on what you mean by "different taxa". If you are using the word "taxa" here to imply "species or lower ranks", then I don't think it would solve the problem. But if you mean it in a generic way, then I'm OK with that. By "in a generic way", suppose I had a trawl sample or a plankton tow sample that included unidentified organisms from multiple phyla, all of which are animals. I should not be prevented from representing this aggregate as an "Individual", with an Identification instance linked to a taxon concept labelled as "Animalia". This means the contents of the Individual all belong to a single taxon (Animalia), and therefore it does not violate the condition excluding aggregates of different taxa. An instance of Individual so identified would be almost useless for many purposes, I agree -- but it's easy enough to filter such Individuals out by looking at dwc:taxonRank of the Taxon to which the Individual was identified. Also, it's not useless for all purposes, because a botanist would like to know that s/he doesn't have to look through that sample to find stuff of interest. I guess my point is, there should not be any rank-based requirement for the implied taxon circumscription of an "Individual". Rich
My turn to respond after traffic-jam reflections....
What we need to realize is that in doing this, we are establishing dwc:Individual as the de facto "aggregation" unit of the sort we have talked about previously. I suppose that is OK as long as we provide sufficient means for humans or semantic reasoners to be able to know what they are getting information about when they retrieve metadata about an Individual. That can be done for humans by making appropriate comments in the proposed dwc:individualRemarks. For machines, I have been re-thinking the idea of having dwc:individualCount being a property of an Individual. In an earlier post, I suggested that it should remain a property of Occurrences, since the number of Individuals can change over time (think wolf pack or plant deme over time).
In my mind, if the "Individual" is something like a wolf pack, and the wolf pack changes composition over time, then you're really dealing with different instances of a "wolf pack" (not just different occurrences of the same individual). There may be a desire to relate the two Wolf Packs together in some semantic way, of course.
However, in the current context, something like dwc:individualCount (if not individualCount itself) that is a property of Individual could have controlled values of "1" and ">1".
Hmmm....I'd rather keep individualCount as a numeric value (rather than a controlled vocabulary), to allow specific values to be provided for numbers of specimens in a lot, numbers of individuals seen in an observation, etc.
If you want a controlled vocabulary for semantic purposes, I'd favor something like a new term "individualScope" being used for such purposes. Values in a controlled vocabulary might include things like "Single", "Group", "Aggregate", "Absent", etc., each with carefully worded definitions.
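To make the distinction concrete, here is a minimal sketch that keeps individualCount numeric and adds a separate controlled "individualScope". The enum values are the four floated above; everything else is an assumption for illustration, in particular the consistency rules in `is_consistent`, since the "carefully worded definitions" have not been written yet.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IndividualScope(Enum):
    # Candidate vocabulary values from this thread; definitions still to be agreed.
    SINGLE = "Single"
    GROUP = "Group"
    AGGREGATE = "Aggregate"
    ABSENT = "Absent"


@dataclass
class Individual:
    individual_id: str
    scope: IndividualScope
    individual_count: Optional[int] = None  # stays numeric; None means unknown


def is_consistent(ind: Individual) -> bool:
    """Check that a known count does not contradict the declared scope.

    Assumed rules: Single means exactly 1, Absent means 0,
    Group and Aggregate mean more than 1.
    """
    if ind.individual_count is None:
        return True
    if ind.scope is IndividualScope.SINGLE:
        return ind.individual_count == 1
    if ind.scope is IndividualScope.ABSENT:
        return ind.individual_count == 0
    return ind.individual_count > 1
```

With this split, a reasoner can rely on the scope value alone, while the count remains available for lots and observations where an actual number is known.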
The other issue would be how to indicate that one of these "aggregate" (individualCount>1) Individuals was segregated into several Individuals at a lower taxonomic level (e.g. partially sorted lots of marine organisms get more sorted, fossil aggregations get separated into individual fossils, an image of a forest has individual trees delineated). Rich has stated that this would be necessary any time subsets of the original Individual were given divergent Identifications and I agree totally. Assigning new GUIDs to these new Individuals would be no problem and I suppose Pete's suggestion of using "hasPart" and "isPartOf" could be used to establish the relationships between the original Individual and its "children".
Yup, I think we're on the same page on this one.
If the proposed addition goes through, we need to have a really good Google Code entry that summarizes the understanding that we have come to in this discussion (if we have come to one! :-).
Agreed. We also need to be clear on what terms that currently fall within the Occurrence class should properly move to a new Individual Class. My vote would be for:
individualID (unless moved to Record-level terms)
individualCount
preparations
disposition
previousIdentifications
associatedSequences
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be re-assigned to Record-level terms. Was there some reason this isn't appropriate?
Some of these are certainly debatable. For example, the PreservedSpecimen-centric "preparations" and "disposition" really do seem to me to be properties of the Individual, not of the Occurrence at which the Individual was extracted from nature. But I can see a problem when the same Individual is first seen in nature, then is extracted later as a specimen; in which case it seems weird to include these properties at the level of Individual.
I think if we were going to be really pure about this, we should generate two instances of Individual for each PreservedSpecimen: one representing effectively an observation at the moment of capture (which would technically have as basisOfRecord "LivingSpecimen" or "HumanObservation", and is linked to the Occurrence), and the other representing the preserved specimen after it is curated and processed (linked appropriately to the first Individual instance).
But I think that's going too far.
OK, enough for a while....
Aloha, Rich
Comments inline
Richard Pyle wrote:
My turn to respond after traffic-jam reflections....
What we need to realize is that in doing this, we are establishing dwc:Individual as the de facto "aggregation" unit of the sort we have talked about previously. I suppose that is OK as long as we provide sufficient means for humans or semantic reasoners to be able to know what they are getting information about when they retrieve metadata about an Individual. That can be done for humans by making appropriate comments in the proposed dwc:individualRemarks. For machines, I have been re-thinking the idea of having dwc:individualCount being a property of an Individual. In an earlier post, I suggested that it should remain a property of Occurrences, since the number of Individuals can change over time (think wolf pack or plant deme over time).
In my mind, if the "Individual" is something like a wolf pack, and the wolf pack changes composition over time, then you're really dealing with different instances of a "wolf pack" (not just different occurrences of the same individual). There may be a desire to relate the two Wolf Packs together in some semantic way, of course.
If we go down the road we seem to be traveling now, I think we must allow a wolf pack to be an Individual. It is a definable aggregation of biological individuals which can be assigned an identifier that can be the subject of repeated Occurrences and it can reliably be identified to some taxon. The definition we have doesn't (at this point) forbid that or say that the number of biological individuals in an Individual must stay the same over time. It is very convenient for the use case that got this all started (having a small patch of plants of the same species) that we don't require that the number change. A major feature of Individual is to allow sampling over time (a la the definition of individualID) and if the number of individuals changes between Occurrences I shouldn't have to redefine a new individual. It is quite likely that I won't know or don't care whether the number has changed. Case in point: http://bioimages.vanderbilt.edu/ind-baskauf/51273 I have photographed this same patch of corn gromwell several times. I have no idea how many individual plants are in it nor do I care.
However, in the current context, something like dwc:individualCount (if not individualCount itself) that is a property of Individual could have controlled values of "1" and ">1".
Hmmm....I'd rather keep individualCount as a numeric value (rather than a controlled vocabulary), to allow specific values to be provided for numbers of specimens in a lot, numbers of individuals seen in an observation, etc.
If you want a controlled vocabulary for semantic purposes, I'd favor something like a new term "individualScope" being used for such purposes. Values in a controlled vocabulary might include things like "Single", "Group", "Aggregate", "Absent", etc., each with carefully worded definitions.
I agree. I think something "like individualCount" is what we need, but it's probably best to leave individualCount where it is (in the Occurrence class) for now so that the count can change over time. I would suggest that if Individual is accepted, those who want to use it can experiment with something like individualScope and find out what works. Then bring it up as a possible term addition.
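Leaving individualCount on the Occurrence means one Individual can be linked to several Occurrences, each with its own (possibly unknown) count. The sketch below assumes Occurrence records as plain dictionaries with ISO 8601 eventDate strings; the identifiers, dates, and counts are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Made-up Occurrence records: a wolf pack counted twice, a plant patch never counted.
occurrences = [
    {"individualID": "pack-1", "eventDate": "2010-06-15", "individualCount": 9},
    {"individualID": "patch-1", "eventDate": "2010-04-01", "individualCount": None},
    {"individualID": "pack-1", "eventDate": "2009-06-15", "individualCount": 7},
]


def counts_over_time(rows: List[dict]) -> Dict[str, List[Tuple[str, Optional[int]]]]:
    """Group Occurrences by individualID, ordered by eventDate."""
    history: Dict[str, List[Tuple[str, Optional[int]]]] = defaultdict(list)
    # ISO 8601 dates sort correctly as plain strings.
    for row in sorted(rows, key=lambda r: r["eventDate"]):
        history[row["individualID"]].append((row["eventDate"], row["individualCount"]))
    return dict(history)


history = counts_over_time(occurrences)
```

The pack keeps one identifier across both dates even though its count changed, and the patch needs no count at all.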
The other issue would be how to indicate that one of these "aggregate" (individualCount>1) Individuals was segregated into several Individuals at a lower taxonomic level (e.g. partially sorted lots of marine organisms get more sorted, fossil aggregations get separated into individual fossils, an image of a forest has individual trees delineated). Rich has stated that this would be necessary any time subsets of the original Individual were given divergent Identifications and I agree totally. Assigning new GUIDs to these new Individuals would be no problem and I suppose Pete's suggestion of using "hasPart" and "isPartOf" could be used to establish the relationships between the original Individual and its "children".
Yup, I think we're on the same page on this one.
If the proposed addition goes through, we need to have a really good Google Code entry that summarizes the understanding that we have come to in this discussion (if we have come to one! :-).
Agreed. We also need to be clear on what terms that currently fall within the Occurrence class should properly move to a new Individual Class. My vote would be for:
individualID (unless moved to Record-level terms)
Move to record level terms
individualCount
leave (see above)
preparations disposition
I think leave in Occurrence for now. I think that if we come up with a DwC that will work with RDF, we will be forced to create a PhysicalSpecimen class and these things will be in it. Time will tell... But they are closer to an Occurrence than an Individual.
previousIdentifications
Hmm. I suppose yes, but better to just have another instance of Identification. Why not?
associatedSequences
I suppose you won't agree on this, but I don't see sequences as any different than other tokens/evidence types that I think we should allow to document Occurrences. I would like this term to eventually go away, at least for people using RDF who will explicitly create resources for tokens and then type them.
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be re-assigned to Record-level terms. Was there some reason this isn't appropriate?
I think it is appropriate because they should be usable with at least two classes: Individual (for living specimens) and Occurrences (e.g. preserved specimens, images)
Some of these are certainly debatable. For example, the PreservedSpecimen-centric "preparations" and "disposition" really do seem to me to be properties of the Individual, not of the Occurrence at which the Individual was extracted from nature. But I can see a problem when the same Individual is first seen in nature, then is extracted later as a specimen; in which case it seems weird to include these properties at the level of Individual.
I haven't said this before, but are we allowing Individuals to be dead? I would have said "no", but now I'm not sure. I have convinced myself that if John's proverbial wildebeest calf (or whatever it was) gets captured and becomes a LivingSpecimen in a zoo, it is still an Individual but now one with curation (i.e. there is no such thing as a LivingSpecimen, it's just an Individual with curation). However, I'm having trouble doing the same thing with PreservedSpecimen. If we put it in a jar of alcohol and cut it into many separately-cataloged pieces, are all of the pieces still some of the Individual? I think Pete might have been suggesting modeling things that way with "partOf". What if we cut a branch from a tree, glue part of it to a page and turn part of it into a DNA sample that gets sequenced? Are those all a part of an Individual? I don't really want them to be, but maybe I must? Somehow we need to be able to handle road-kill, which will be dead when we make the observation/collection. If we cut a branch from a tree (an Individual), root it, and grow it in a botanical garden, do we call the resulting tree in the garden the same Individual? I would assign it a new identifier and call it a new Individual. I guess my point is that I would only apply the term Individual to dead stuff, pieces of dead stuff, and living pieces of things with extreme caution.
I think if we were going to be really pure about this, we should generate two instances of Individual for each PreservedSpecimen: one representing effectively an observation at the moment of capture (which would technically have as basisOfRecord "LivingSpecimen" or "HumanObservation", and is linked to the Occurrence), and the other representing the preserved specimen after it is curated and processed (linked appropriately to the first Individual instance).
But I think that's going too far.
I think maybe so. Maybe the appropriate course of action here as well is to let people try different approaches out and if they turn out to work and be needed, then we talk about applying them to Darwin Core. Steve
OK, enough for a while....
Aloha, Rich
If we go down the road we seem to be traveling now, I think we must allow a wolf pack to be an Individual. It is a definable aggregation of biological individuals which can be assigned an identifier that can be the subject of repeated Occurrences and it can reliably be identified to some taxon. The definition we have doesn't (at this point) forbid that or say that the number of biological individuals in an Individual must stay the same over time. It is very convenient for the use case that got this all started (having a small patch of plants of the same species) that we don't require that the number change. A major feature of Individual is to allow sampling over time (a la the definition of individualID) and if the number of individuals changes between Occurrences I shouldn't have to redefine a new individual. It is quite likely that I won't know or don't care whether the number has changed.
Fair enough. Obviously, if we're going to allow for a population to be an Individual, and we're going to want to track that population over time, then the composition of the population is going to change over time. It just seems weird from a specimen perspective to associate the individualCount with the Occurrence, rather than with the Individual. But I have to agree with your points above.
previousIdentifications
Hmm. I suppose yes, but better to just have another instance of Identification. Why not?
When the data are structured that way at the source, yes. But a number of DwC terms exist because many content sources have not parsed/normalized all their data to the full extent of the DwC classes. Therefore, I think previousIdentifications should be kept, and if so, it should be part of the Individual class.
associatedSequences
I suppose you won't agree on this, but I don't see sequences as any different than other tokens/evidence types that I think we should allow to document Occurrences. I would like this term to eventually go away, at least for people using RDF who will explicitly create resources for tokens and then type them.
OK, well I guess "Sequences" per se are functionally equivalent to images, in that they are not the organisms themselves, but rather a representation of some aspect of the organisms (in this case, a representation of the molecular structure of the DNA molecules contained within the cells of the organism, rather than a representation of light waves reflected off the exterior of an organism in the case of an image, or of x-ray waves transmitted through an organism in the case of a radiograph image). I was thinking more in terms of tissue samples -- which I will much more stubbornly defend as being in the Individual class -- but I guess more in terms of "individualScope".
Here's one thing I'm not so certain about, though. An in-situ image of an organism is clearly a token of an Occurrence, because it is evidence of the organism at the place/time. An image of the preserved specimen in a Museum, or an x-ray, etc., is not really a token of an Occurrence, because it's not evidence of the organism at the place/time of its capture. Same goes for Sequences -- they are a token of the Individual organism, not of the occurrence of the organism at a place and time. This is why I have a hard time thinking of such things as tokens of an Occurrence, when they are really more tokens of the Individual.
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be re-assigned to Record-level terms. Was there some reason this isn't appropriate? I think it is appropriate because they should be usable with at least two classes: Individual (for living specimens) and Occurrences (e.g. preserved specimens, images)
Hmmm...on that basis, should individualCount and the various tokens also be Record-level terms -- on the basis that they can apply either to an Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and such, isn't it possible to also represent a DNA Sequence as an attribute of a Taxon as well? If the purpose of Record-level terms is to aggregate terms that apply to more than one class, then perhaps that is the solution for a number of these things (including disposition, and maybe even preparation -- depending on how broadly those things are defined)?
I haven't said this before, but are we allowing Individuals to be dead?
Errr....fossils? Preserved specimens? Are they not Individuals? I know you think of them in terms of tracking a living organism over time. But that's only one of the reasons why I support an Individual class (not even the main reason). To me, the main reason is that an "Individual" represents the actual organism(s), separate from an Occurrence, which represents the presence of an organism at a particular place and time.
If we put it in a jar of alcohol and cut it into many separately-cataloged pieces, are all of the pieces still some of the Individual?
This is why we need two things for an Individual:
individualScope (which can range anywhere from the aggregates of multiple individuals, all the way down to the smallest parts of individuals)
And, a mechanism to track series of "derived from" Individuals. The ASC model covered this, I think (right, Stan?)
I think Pete might have been suggesting modeling things that way with "partOf". What if we cut a branch from a tree, glue part of it to a page and turn part of it into a DNA sample that gets sequenced? Are those all a part of an Individual?
It seems to me that each unit could represent a separate instance of Individual, but the "parts" need to be clearly aggregated around the single-organism parent Individual, which itself may be a part of another Individual instance that is an aggregate lot of specimens, which itself could be a subsampling of another Individual instance that represents a population in nature.
In my mind, we parse all of these things as separate instances of Individuals, but join them via a hierarchical (parent/child) relationship. If I'm not mistaken, this is how the ASC model managed instances of BiologicalObject (again....right, Stan?)
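The parent/child chain described above (part, single organism, lot, population) can be sketched as a simple lookup that is walked upward. This is an illustration only: the `PARENT_OF` mapping and the identifiers are hypothetical, and nothing here is meant to represent how the ASC model actually stored such links.

```python
from typing import Dict, List, Optional

# child individualID -> parent individualID (None marks the top of a chain)
PARENT_OF: Dict[str, Optional[str]] = {
    "tissue-sample": "organism",
    "organism": "specimen-lot",
    "specimen-lot": "population",
    "population": None,
}


def lineage(individual_id: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain from an Individual up to its outermost parent."""
    chain = [individual_id]
    seen = {individual_id}
    parent = parent_of.get(individual_id)
    while parent is not None:
        if parent in seen:
            # Guard against bad data that links an Individual back to itself.
            raise ValueError("cycle in parent/child links")
        chain.append(parent)
        seen.add(parent)
        parent = parent_of.get(parent)
    return chain


chain = lineage("tissue-sample", PARENT_OF)
```

Every unit remains its own Individual instance with its own identifier; the hierarchy only records how they are derived from one another.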
I don't really want them to be, but maybe I must? Somehow we need to be able to handle road-kill, which will be dead when we make the observation/collection. If we cut a branch from a tree (an Individual), root it, and grow it in a botanical garden, do we call the resulting tree in the garden the same Individual? I would assign it a new identifier and call it a new Individual. I guess my point is that I would only apply the term Individual to dead stuff, pieces of dead stuff, and living pieces of things with extreme caution.
Why extreme caution? What are the risks that we are cautioning ourselves against?
These are some principles that I always try to keep in mind when discussing these things:
- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data actually exist in content-provider databases, and how data *should* be represented in a normalised world
- When in doubt, DwC should be accommodating, rather than restrictive -- especially when more restrictive needs can be met via associated data filtering
There are other principles as well, but these are the ones I keep having to remind myself of.
I think maybe so. Maybe the appropriate course of action here as well is to let people try different approaches out and if they turn out to work and be needed, then we talk about applying them to Darwin Core.
Ultimately, I think people will use it in accordance with what terms are nested within it -- which is why I think it's important to have this conversation we're having now.
Aloha, Rich
For those of you who triage emails and don't read long emails, the bottom line is that although I agree with some of Rich's points, I think that the suggestion that parts of Individuals should be classified as Individuals does not fit the definition that is on the table for the proposed class dwc:Individual. I argue that allowing pieces of organisms to be called Individuals defeats the purpose of having the Individual class. I suggest an alternative approach that I think is the most straightforward method of separating tokens from the occurrences they document. The acceptance or rejection of Individual as a new class does not hinge on my suggested approach. Development of a system to handle more complicated resource relationships can take place independently of the proposal for the Individual class.
Responses inline below:
Richard Pyle wrote:
...
previousIdentifications
Hmm. I suppose yes, but better to just have another instance of Identification. Why not?
When the data are structured that way at the source, yes. But a number of DwC terms exist because many content sources have not parsed/normalized all their data to the full extent of the DwC classes. Therefore, I think previousIdentifications should be kept, and if so, it should be part of the Individual class.
Got it. It should go with Individual.
associatedSequences
I suppose you won't agree on this, but I don't see sequences as any different than other tokens/evidence types that I think we should allow to document Occurrences. I would like this term to eventually go away, at least for people using RDF who will explicitly create resources for tokens and then type them.
OK, well I guess "Sequences" per se are functionally equivalent to images, in that they are not the organism themselves, but rather a representation of some aspect of the organisms (in this case, a representation of the molecular structure of the DNA molecules contained within the cells of the organism, rather than a representation of light waves reflected off the exterior of an organism in the case of an image, or of x-ray waves transmitted through an organism in the case of a radiograph image). I was thinking more in terms of tissue samples -- which I will much more stubbornly defend as being in the Individual class -- but I guess more in terms of "individualScope".
OK, as usual you are warping my brain into thinking about things in a different way. I'm going to separate the issue of "dead" from the issue of "pieces" (for the moment I'm going to accept that it doesn't matter if a whole organism is dead or not). The advantage of letting pieces of the organism be considered as a type of Individual is that it allows us to avoid creating another class of things called "PreservedSpecimen" (although in a sense we already have it because of dwctype:PreservedSpecimen, which when used as a rdf:type would imply membership in some rdfs:Class called "PreservedSpecimen"). The pieces could share properties that one might want to also apply to the whole organism. One could differentiate among the two by the value of "individualScope".
But after another long commute to think about this, I'm realizing that pieces of organisms really must not be Individuals. First of all, the definition that is under consideration is "The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single taxon." [the Google Code entry, with substitution of "taxon" for "species (or lower taxonomic rank if it exists)" as was discussed]. That definition as it stands applies to an organism or group of organisms, but does not include parts of organisms. Obviously the definition could be changed, but if you consider the comment, which describes the primary function of Individual: "Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification" it becomes clear that making parts of organisms Individuals defeats this primary purpose for the term.
The major selling point for having Individuals at all is to get out of the business of applying determinations to all of the pieces of evidence such as specimens, images, sounds, etc. that get collected from the same biological individual through multiple Occurrences. This has the benefit that if one applies an Identification to the Individual, all physical and information resources that are derived from the individual automatically get associated with the Identification and hence the taxonomic information referenced by the Identification. If we call preserved specimens that are pieces of organisms Individuals having a value of individualScope="part", then do we do the same thing to them as we do with Individuals at higher levels, namely apply Identifications to them? If so, then we are back in the business of assigning Identifications to all of our derivative resources rather than the biological individuals from which they came. If we just say that we'll skip assigning separate Identifications to the derivative resources, then we have something that doesn't fit the functional role for which Individual was designed. In that case an "Individual" which is an organism part is such a different thing that one might as well call it something else (e.g. a PreservedSpecimen).
The case of a whole organism (live as a LivingSpecimen or dead as a PreservedSpecimen) is different because in that case we would have a single resource serving as the evidence (the whole organism itself). By definition, there can't be many of those (there would just be one) and it would already have an Identification assigned to it, because it is the same Individual that it is providing evidence for. So there is no superfluous assignment of Identifications in that case.
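The benefit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of Darwin Core: the class name `Resource`, the function `identifications_for`, the identifiers, and the taxon names are all hypothetical. The only ideas taken from the discussion are that an Identification is attached to the Individual and that derived resources point back to that Individual.

```python
from dataclasses import dataclass, field


@dataclass
class Identification:
    taxon: str          # hypothetical taxon name, for illustration only
    identified_by: str


@dataclass
class Individual:
    individual_id: str
    identifications: list = field(default_factory=list)


@dataclass
class Resource:
    """Any piece of evidence, typed by what it is (hypothetical class)."""
    resource_id: str
    rdf_type: str        # e.g. "PreservedSpecimen", "StillImage"
    individual_id: str   # link back to the Individual it came from


def identifications_for(resource, individuals):
    """Resolve a resource's Identifications through its Individual."""
    return individuals[resource.individual_id].identifications


tree = Individual("ind-1")
individuals = {tree.individual_id: tree}

branch = Resource("spec-1", "PreservedSpecimen", "ind-1")
image = Resource("img-1", "StillImage", "ind-1")

# One determination is applied to the Individual, never to the specimen
# or the image directly.
tree.identifications.append(
    Identification("Quercus alba", "hypothetical determiner")
)
```

With this arrangement a later re-determination is recorded once on `tree`, and both `branch` and `image` pick it up through the lookup. If the branch were itself treated as an Individual, it would need its own list of Identifications, which is the duplication the paragraph above argues against.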
Here's one thing I'm not so certain about, though. An in-situ image of an organism is clearly a token of an Occurrence, because it is evidence of the organism at the place/time. An image of the preserved specimen in a Museum, or an x-ray, etc., is not really a token of an Occurrence, because it's not evidence of the organism at the place/time of its capture. Same goes for Sequences -- they are a token of the Individual organism, not of the occurrence of the organism at a place and time. This is why I have a hard time thinking of such things as tokens of an Occurrence, when they are really more tokens of the Individual.
I think the solution to this is to not call it a "token of the Occurrence". Let's say that the token is derived from the Individual and that it MAY serve as evidence for an Occurrence.
I think that the solution is something like you suggest: link the chain of derivation of tokens to the Individual and not to the Occurrence. Then have a reference in the Occurrence record to the particular token that was created or collected during the event of the Occurrence. See http://bioimages.vanderbilt.edu/pages/tree-branch.gif . I have had the tendency of thinking that the tokens supported the Occurrence, but there does not need to be just one purpose for the token. They also support the existence of the Individual. This should probably make you happy, because the pieces of the Individual (preserved specimens, tissue samples) would be derived from the Individual. The "provenance" if you want to call it that, traces the connection of the tokens to the Individual. The chain of derivation can be traced using the property that I've called "derivedFrom". The branch specimen is "derivedFrom" the Individual and the specimen image is "derivedFrom" the specimen. Your desire to differentiate between things that are physically derived from the Individual vs. things that aren't can be handled by the "isPartOf" property. The branch specimen "isPartOf" the Individual tree, but the image is not a part of the branch. A token could have both the isPartOf property and the derivedFrom property (if it's a piece of the Individual), or only the derivedFrom property (if it's not).
In this diagram, the term "hasEvidence" is a property of the Occurrence. It has the branch specimen as its object, but not the image of the specimen because as you note, the event marking the creation of the image is not the same as the event documenting the Occurrence of the Individual (i.e. the collection of the branch specimen). Either of the "tokens" (the specimen or the image) could be used as evidence for the Identification (we could have a property of the Identification called "basedOn" that could have the specimen, the image, or both as its object - I did something similar to this in the Biodiversity Informatics paper).
Please note that for each of the properties I've listed on the diagram, there could and probably should be inverse properties (not shown): hasDerivative for derivedFrom, hasPart for isPartOf, isEvidenceFor for hasEvidence, and usedIn for basedOn. All of the "tokens" and the Occurrence could have the property individualID which would relate the resource directly to the Individual and its Identifications.
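As a rough sketch of the tree-branch example, the relationships can be written as subject/property/object triples in plain Python. The property names (derivedFrom, isPartOf, hasEvidence, basedOn and their inverses) are the ones proposed above; the resource identifiers, the helper functions `with_inverses` and `derivation_chain`, and the assumption that each resource has at most one derivedFrom parent are mine and purely illustrative.

```python
# Inverse property names as listed in the message above.
INVERSE = {
    "derivedFrom": "hasDerivative",
    "isPartOf": "hasPart",
    "hasEvidence": "isEvidenceFor",
    "basedOn": "usedIn",
}


def with_inverses(triples):
    """Return the triples plus the inverse statement for each one."""
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in INVERSE:
            result.add((obj, INVERSE[prop], subject))
    return result


def derivation_chain(resource, triples):
    """Follow derivedFrom links from a token back toward the Individual.

    Assumes at most one derivedFrom parent per resource (a simplification).
    """
    parents = {s: o for s, p, o in triples if p == "derivedFrom"}
    chain = [resource]
    # Stop at the root, and also if a link would revisit a resource.
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain


# Hypothetical identifiers for the tree-branch case.
triples = {
    ("branchSpecimen", "derivedFrom", "tree"),
    ("branchSpecimen", "isPartOf", "tree"),
    ("specimenImage", "derivedFrom", "branchSpecimen"),
    ("occurrence", "hasEvidence", "branchSpecimen"),
    ("identification", "basedOn", "branchSpecimen"),
    ("identification", "basedOn", "specimenImage"),
}

print(derivation_chain("specimenImage", triples))
```

Note that the image carries derivedFrom but no isPartOf statement, while the branch specimen carries both, which is how the physical/non-physical distinction is expressed.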
I have created a number of similar charts showing how these relationships could apply to various types of tokens:
http://bioimages.vanderbilt.edu/pages/tree-branch.gif (tree branch PreservedSpecimen)
http://bioimages.vanderbilt.edu/pages/tree-image.gif (image of a live tree)
http://bioimages.vanderbilt.edu/pages/whale-dna.gif (tissue sample and DNA sequence from a whale)
http://bioimages.vanderbilt.edu/pages/bird-observation.gif (bird observation)
http://bioimages.vanderbilt.edu/pages/wildebeest.gif (wildebeest calf captured and put in zoo)
http://bioimages.vanderbilt.edu/pages/botanical-garden.gif (twig removed and turned into a living specimen in a botanical garden)
Note that in every case, the "token" is typed based on the kind of thing that it is. We don't try to make it an Occurrence (my previous mistake) or an Individual (what I'm saying is Rich's mistake). Physical things that are a part of the Individual have the special status of "isPartOf", electronic representations never do. Only the token that was created during the event associated with the Occurrence record is connected to the Occurrence record. The token serving as evidence for the Occurrence can be anything - there is no special class called "token". In fact, the "token" can be the organism itself if the organism is curated (John's favorite wildebeest calf in the zoo or a whole dead fish in a jar). The token can be another individual such as a living specimen that originates as a clone (maybe also seed) from the Individual being documented in the Occurrence. We (DwC) only get into the business of creating types and properties of tokens if they don't already exist in other vocabularies. DwC needs to do that for specimens, but not for images that are already covered by MRTG. An observation may or may not have a token depending on whether there is some kind of evidence that can be referred to (see bird example).
In these diagrams a single Occurrence and a single "line" of derived tokens is shown. But there can be many tokens per Occurrence and many tokens per Individual. There can also be many Occurrences per Individual. I didn't try to show this on the diagram because it would be too complicated. Obviously many users will want to make this "flatter" and less complicated. But I think this model allows for just about any kind of relationship among occurrence-documenting resources that people want to handle. It was the kind of thing I was trying to do in the Biodiversity Informatics paper (e.g. http://bioimages.vanderbilt.edu/pages/conceptual-scheme-insect.gif) but better because I'm letting the tokens be what they are rather than trying to force them all to be Occurrences.
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be re-assigned to Record-level terms. Was there some reason this isn't appropriate? I think it is appropriate because they should be usable with at least two classes: Individual (for living specimens) and Occurrences (e.g. preserved specimens, images)
Hmmm...on that basis, should individualCount and the various tokens also be Record-level terms -- on the basis that they can apply either to an Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and such, isn't it possible to also represent a DNA Sequence as an attribute of a Taxon as well? If the purpose of Record-level terms is to aggregate terms that apply to more than one class, then perhaps that is the solution for a number of these things (including disposition, and maybe even preparation -- depending on how broadly those things are defined)?
individualCount would be metadata that results from an Occurrence, so I think that's the only place it belongs. Tokens aren't properties of anything, they are resources in their own right that are connected by some property term (e.g. hasDerivative/derivedFrom) to an Occurrence in which they were collected/recorded and to the Individual from which they derived (e.g. derivedFrom and hasDerivative). A DNA sequence is another resource that isn't an attribute of anything. It could be the object of numerous properties that could have a variety of subjects.
I haven't said this before, but are we allowing Individuals to be dead?
Errr....fossils? Preserved specimens? Are they not Individuals? I know you think of them in terms of tracking a living organism over time. But that's only one of the reasons why I support an Individual class (not even the main reason). To me, the main reason is that an "Individual" represents the actual organism(s), separate from an Occurrence, which represents the presence of an organism at a particular place and time.
I think I am prepared to accept this as long as they are the whole thing and not pieces. There could be some issues with a fossil since in many cases the tissues of the organism are replaced by minerals. But there is still a one-to-one relationship, so the problem I described in the long paragraph above doesn't apply.
If we put it in a jar of alcohol and cut it into many separately-cataloged pieces, are all of the pieces still some of the Individual?
This is why we need two things for an Individual:
individualScope (which can range anywhere from the aggregates of multiple individuals, all the way down to the smallest parts of individuals)
See above.
And, a mechanism to track series of "derived from" Individuals. The ASC model covered this, I think (right, Stan?)
I didn't see it in the flow chart, but it could be there somewhere. I had something like this in sernec:derivativeOccurrence and sernec:derivedFrom (http://bioimages.vanderbilt.edu/rdf/terms.htm) when I was making every token an Occurrence. But it's better to do as you suggest here which is what I did in the examples above.
I think Pete might have been suggesting modeling things that way with "partOf". What if we cut a branch from a tree, glue part of it to a page and turn part of it into a DNA sample that gets sequenced? Are those all a part of an Individual?
It seems to me that each unit could represent a separate instance of Individual, but the "parts" need to be clearly aggregated around the single-organism parent Individual, which itself may be a part of another Individual instance that is an aggregate lot of specimens, which itself could be a subsampling of another Individual instance that represents a population in nature.
Let the supporting evidence (tokens) be whatever type of thing they are rather than call them Individuals. Link them together through hierarchical relationships to the single-organism parent Individual.
In my mind, we parse all of these things as separate instances of Individuals, but join them via a hierarchical (parent/child) relationship. If I'm not mistaken, this is how the ASC model managed instances of BiologicalObject (again....right, Stan?)
I don't really want them to be, but maybe I must? Somehow we need to be able to handle road-kill, which will be dead when we make the observation/collection. If we cut a branch from a tree (an Individual), root it, and grow it in a botanical garden, do we call the resulting tree in the garden the same Individual? I would assign it a new identifier and call it a new Individual. I guess my point is that I would only apply the term Individual to dead stuff, pieces of dead stuff, and living pieces of things with extreme caution.
Why extreme caution? What are the risks that we are cautioning ourselves against?
The risk that we make the definition of Individual so broad that it can't perform any of the functions it was defined to serve. We've already lost one of them (the ability to infer duplicates) when I agreed to the broader definition, but that's the subject of another post.
These are some principles that I always try to keep in mind when discussing these things:
- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data actually exist in content-provider databases, and how data *should* be represented in a normalised world.
- When in doubt, DwC should be accommodating, rather than restrictive -- especially when more restrictive needs can be met via associated data filtering.
There are other principles as well, but these are the ones I keep having to remind myself of.
I think that what I have suggested above is very unrestrictive. We let evidence be the type of things that they are (PreservedSpecimens, Individuals, StillImages, SoundRecordings, DNA sequences, etc.). We don't determine their type by what we want to use them for. That was the mistake that I made in the Biodiversity Informatics paper. If we follow this approach, then a StillImage can fill any role that we want: evidence that an Occurrence happened, information to support an Identification, a character for a visual key, a logo, etc. We let it fulfill those roles by giving it an identifier and connecting it to other resources using appropriate terms (hasEvidence, derivedFrom, mrtg:attributionLogoURL, etc.).
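A small Python sketch of "type by what it is, role by how it is linked", under the assumption that resources are plain identifiers and links are triples. The identifiers and the helper `roles_of` are hypothetical; the property names are the ones mentioned in the paragraph above.

```python
# The type is assigned once, based on what the resource is.
types = {"img-1": "StillImage"}

# Roles come from the properties that point at the resource.
links = [
    ("occ-1", "hasEvidence", "img-1"),
    ("ident-1", "basedOn", "img-1"),
    ("page-1", "mrtg:attributionLogoURL", "img-1"),
    ("img-1", "derivedFrom", "ind-1"),
]


def roles_of(resource_id, links):
    """List the properties that have this resource as their object."""
    return sorted({prop for _, prop, obj in links if obj == resource_id})


print(types["img-1"], roles_of("img-1", links))
```

Adding another use for the image means adding another link; the entry in `types` never changes.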
I think maybe so. Maybe the appropriate course of action here as well is to let people try different approaches out and if they turn out to work and be needed, then we talk about applying them to Darwin Core.
Ultimately, I think people will use it in accordance with what terms are nested within it -- which is why I think it's important to have this conversation we're having now.
As I indicated at an earlier time, I think that there are very few terms that should be properties of Individual since it is primarily a node that connects Occurrences to Identifications (and I guess now to derived tokens).
Aloha, Rich
Looking forward to responses! But I don't think development of these ideas should hold up the proposal for the class Individual, which can stand on its own with its current (revised) definition. Steve
OK. I think this makes sense to me.
Seeing how my "jar of semi-identified scunge" example plays now:
Setting: A jar full of stuff collected from a coral reef. In it I can see larval fish, sphaeromatid isopods, one "Diadema antillarum" urchin, and a green alga.
"Individual"ization: 1. The record for the jarful of scunge is _not_ eligible to be an "Individual". 2. I can create four additional records, each of which is designated as "partOf" the jar record, each of which can be an "Individual" with a higher-level or species-level determination attached to it.
Two years from now, I sort and ID the sphaeromatid isopods and determine that they are indeed all in the Family Sphaeromatidae, and split them out to three separate jars, each of which is a different genus. Each of those jars can be referred to as an "Individual" (one or more objects from a single taxon collected at a single time/place). Has the older family-level record now lost its "Individual"ness, now that it's clear it's a composition of three lower-level taxa? Or does it still record the correct fact that a group of independently locomoting bugs from one collecting instance all belonged to the same family, and hence is still an "Individual" record, indicating that higher-level taxon group of bugs?
[As a side issue, note that I have IDed them to genus, not species, to sidestep the debate on the specialness of the species taxon.]
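The jar example can be laid out as a tiny Python sketch. Everything here is an assumption for illustration: the class `Lot`, the rule in `is_individual` (a record qualifies when it carries exactly one determination at any rank), the identifiers, and the placeholder genus labels. The sketch does not settle the question above about the family-level record; under this naive rule it simply stays eligible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lot:
    lot_id: str
    taxon: Optional[str]            # one determination at any rank, or None if mixed
    part_of: Optional[str] = None   # parent record, if any


def is_individual(lot):
    """Naive rule (an assumption): one taxon means eligible."""
    return lot.taxon is not None


def parts_of(parent_id, lots):
    """Records designated as partOf the given record, in listed order."""
    return [lot.lot_id for lot in lots if lot.part_of == parent_id]


lots = [
    Lot("jar", None),                               # mixed scunge
    Lot("fish", "larval fish (higher-level determination)", "jar"),
    Lot("isopods", "Sphaeromatidae", "jar"),
    Lot("urchin", "Diadema antillarum", "jar"),
    Lot("alga", "green alga (higher-level determination)", "jar"),
    # Two years later: the isopods are split into three genus-level jars.
    Lot("isopods-a", "genus A (placeholder)", "isopods"),
    Lot("isopods-b", "genus B (placeholder)", "isopods"),
    Lot("isopods-c", "genus C (placeholder)", "isopods"),
]

for lot in lots:
    print(lot.lot_id, is_individual(lot), parts_of(lot.lot_id, lots))
```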
-Dean -- Dean Pentcheff pentcheff@gmail.com dpentche@nhm.org
On Fri, Nov 5, 2010 at 12:29 PM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
For those of you who triage emails and don't read long emails, the bottom line is that although I agree with some of Rich's points, I think that the suggestion that parts of Individuals should be classified as Individuals does not fit the definition that is on the table for the proposed class dwc:Individual. I argue that allowing pieces of organisms to be called Individuals defeats the purpose of having the Individual class. I suggest an alternative approach that I think is the most straightforward method of separating tokens from the occurrences they document. The acceptance or rejection of Individual as a new class does not hinge on my suggested approach. Development of a system to handle more complicated resource relationships can take place independently of the proposal for the Individual class.
Responses inline below:
Richard Pyle wrote:
...
previousIdentifications
Hmm. I suppose yes, but better to just have another instance of Identification. Why not?
When the data are structured that way at the source, yes. But a number of DwC terms exist because many content sources have not parsed/normalized all their data to the full extent of the DwC classes. Therefore, I think previousIdentifications should be kept, and if so, it should be part of the Individual class.
Got it. It should go with Individual.
associatedSequences
I suppose you won't agree on this, but I don't see sequences as any different than other tokens/evidence types that I think we should allow to document Occurrences. I would like this term to eventually go away, at least for people using RDF who will explicitly create resources for tokens and then type them.
OK, well I guess "Sequences" per se are functionally equivalent to images, in that they are not the organism themselves, but rather a representation of some aspect of the organisms (in this case, a representation of the molecular structure of the DNA molecules contained within the cells of the organism, rather than a representation of light waves reflected off the exterior of an organism in the case of an image, or of x-ray waves transmitted through an organism in the case of a radiograph image). I was thinking more in terms of tissue samples -- which I will much more stubbornly defend as being in the Individual class -- but I guess more in terms of "individualScope".
OK, as usual you are warping my brain into thinking about things in a different way. I'm going to separate the issue of "dead" from the issue of "pieces" (for the moment I'm going to accept that it doesn't matter if a whole organism is dead or not). The advantage of letting pieces of the organism be considered as a type of Individual is that it allows us to avoid creating another class of things called "PreservedSpecimen" (although in a sense we already have it because of dwctype:PreservedSpecimen, which when used as a rdf:type would imply membership in some rdfs:Class called "PreservedSpecimen"). The pieces could share properties that one might want to also apply to the whole organism. One could differentiate among the two by the value of "individualScope".
But after another long commute to think about this, I'm realizing that pieces of organisms really must not be Individuals. First of all, the definition that is under consideration is "The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single taxon." [the Google Code entry, with substitution of "taxon" for "species (or lower taxonomic rank if it exists)" as was discussed]. That definition as it stands applies to an organism or group of organisms, but does not include parts of organisms. Obviously the definition could be changed, but if you consider the comment, which describes the primary function of Individual: "Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification" it becomes clear that making parts of organisms Individuals defeats this primary purpose for the term.
The major selling point for having Individuals at all is to get out of the business of applying determinations to all of the pieces of evidence such as specimens, images, sounds, etc. that get collected from the same biological individual through multiple Occurrences. This has the benefit that if one applies an Identification to the Individual, all physical and information resources that are derived from the individual automatically get associated with the Identification and hence the taxonomic information referenced by the Identification. If we call preserved specimens that are pieces of organisms Individuals having a value of individualScope="part", then do we do the same thing to them as we do with Individuals at higher levels, namely apply Identifications to them? If so, then we are back in the business of assigning Identifications to all of our derivative resources rather than the biological individuals from which they came. If we just say that we'll skip assigning separate Identifications to the derivative resources, then we have something that doesn't fit the functional role for which Individual was designed. In that case an "Individual" which is an organism part is such a different thing that one might as well call it something else (e.g. a PreservedSpecimen).
The case of a whole organism (live as a LivingSpecimen or dead as a PreservedSpecimen) is different because in that case we would have a single resource serving as the evidence (the whole organism itself). By definition, there can't be many of those (there would just be one) and it would already have an Identification assigned to it, because it is the same Individual that it is providing evidence for. So there is no superfluous assignment of Identifications in that case.
Here's one thing I'm not so certain about, though. An in-situ image of an organism is clearly a token of an Occurrence, because it is evidence of the organism at the place/time. An image of the preserved specimen in a Museum, or an x-ray, etc., is not really a token of an Occurrence, because it's not evidence of the organism at the place/time of its capture. Same goes for Sequences -- they are a token of the Individual organism, not of the occurrence of the organism at a place and time. This is why I have a hard time thinking of such things as tokens of an Occurrence, when they are really more tokens of the Individual.
I think the solution to this is to not call it a "token of the Occurrence". Let's say that the token is derived from the Individual and that it MAY serve as evidence for an Occurrence.
I think that the solution is something like you suggest: link the chain of derivation of tokens to the Individual and not to the Occurrence. Then have a reference in the Occurrence record to the particular token that was created or collected during the event of the Occurrence. See http://bioimages.vanderbilt.edu/pages/tree-branch.gif . I have had the tendency of thinking that the tokens supported the Occurrence, but there does not need to be just one purpose for the token. They also support the existence of the Individual. This should probably make you happy, because the pieces of the Individual (preserved specimens, tissue samples) would be derived from the Individual. The "provenance" if you want to call it that, traces the connection of the tokens to the Individual. The chain of derivation can be traced using the property that I've called "derivedFrom". The branch specimen is "derivedFrom" the Individual and the specimen image is "derivedFrom" the specimen. Your desire to differentiate between things that are physically derived from the Individual vs. things that aren't can be handled by the "isPartOf" property. The branch specimen "isPartOf" the Individual tree, but the image is not a part of the branch. A token could have both the isPartOf property and the derivedFrom property (if it's a piece of the Individual), or only the derivedFrom property (if it's not).
In this diagram, the term "hasEvidence" is a property of the Occurrence. It has the branch specimen as its object, but not the image of the specimen because as you note, the event marking the creation of the image is not the same as the event documenting the Occurrence of the Individual (i.e. the collection of the branch specimen). Either of the "tokens" (the specimen or the image) could be used as evidence for the Identification (we could have a property of the Identification called "basedOn" that could have the specimen, the image, or both as its object - I did something similar to this in the Biodiversity Informatics paper).
Please note that for each of the properties I've listed on the diagram, there could and probably should be inverse properties (not shown): hasDerivative for derivedFrom, hasPart for isPartOf, isEvidenceFor for hasEvidence, and usedIn for basedOn. All of the "tokens" and the Occurrence could have the property individualID which would relate the resource directly to the Individual and its Identifications.
I have created a number of similar charts showing how these relationships could apply to various types of tokens:
http://bioimages.vanderbilt.edu/pages/tree-branch.gif (tree branch PreservedSpecimen)
http://bioimages.vanderbilt.edu/pages/tree-image.gif (image of a live tree)
http://bioimages.vanderbilt.edu/pages/whale-dna.gif (tissue sample and DNA sequence from a whale)
http://bioimages.vanderbilt.edu/pages/bird-observation.gif (bird observation)
http://bioimages.vanderbilt.edu/pages/wildebeest.gif (wildebeest calf captured and put in zoo)
http://bioimages.vanderbilt.edu/pages/botanical-garden.gif (twig removed and turned into a living specimen in a botanical garden)
Note that in every case, the "token" is typed based on the kind of thing that it is. We don't try to make it an Occurrence (my previous mistake) or an Individual (what I'm saying is Rich's mistake). Physical things that are a part of the Individual have the special status of "isPartOf", electronic representations never do. Only the token that was created during the event associated with the Occurrence record is connected to the Occurrence record. The token serving as evidence for the Occurrence can be anything - there is no special class called "token". In fact, the "token" can be the organism itself if the organism is curated (John's favorite wildebeest calf in the zoo or a whole dead fish in a jar). The token can be another individual such as a living specimen that originates as a clone (maybe also seed) from the Individual being documented in the Occurrence. We (DwC) only get into the business of creating types and properties of tokens if they don't already exist in other vocabularies. DwC needs to do that for specimens, but not for images that are already covered by MRTG. An observation may or may not have a token depending on whether there is some kind of evidence that can be referred to (see bird example).
In these diagrams a single Occurrence and a single "line" of derived tokens is shown. But there can be many tokens per Occurrence and many tokens per Individual. There can also be many Occurrences per Individual. I didn't try to show this on the diagram because it would be too complicated. Obviously many users will want to make this "flatter" and less complicated. But I think this model allows for just about any kind of relationship among occurrence-documenting resources that people want to handle. It was the kind of thing I was trying to do in the Biodiversity Informatics paper (e.g. http://bioimages.vanderbilt.edu/pages/conceptual-scheme-insect.gif) but better because I'm letting the tokens be what they are rather than trying to force them all to be Occurrences.
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be re-assigned to Record-level terms. Was there some reason this isn't appropriate? I think it is appropriate because they should be usable with at least two classes: Individual (for living specimens) and Occurrences (e.g. preserved specimens, images)
Hmmm...on that basis, should individualCount and the various tokens also be Record-level terms -- on the basis that they can apply either to an Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and such, isn't it possible to also represent a DNA Sequence as an attribute of a Taxon as well? If the purpose of Record-level terms is to aggregate terms that apply to more than one class, then perhaps that is the solution for a number of these things (including disposition, and maybe even preparation -- depending on how broadly those things are defined)?
individualCount would be metadata that results from an Occurrence, so I think that's the only place it belongs. Tokens aren't properties of anything, they are resources in their own right that are connected by some property term (e.g. hasDerivative/derivedFrom) to an Occurrence in which they were collected/recorded and to the Individual from which they derived (e.g. derivedFrom and hasDerivative). A DNA sequence is another resource that isn't an attribute of anything. It could be the object of numerous properties that could have a variety of subjects.
I haven't said this before, but are we allowing Individuals to be dead?
Errr....fossils? Preserved specimens? Are they not Individuals? I know you think of them in terms of tracking a living organism over time. But that's only one of the reasons why I support an Individual class (not even the main reason). To me, the main reason is that an "Individual" represents the actual organism(s), separate from an Occurrence, which represents the presence of an organism at a particular place and time.
I think I am prepared to accept this as long as they are the whole thing and not pieces. There could be some issues with a fossil since in many cases the tissues of the organism are replaced by minerals. But there is still a one-to-one relationship, so the problem I described in the long paragraph above doesn't apply.
If we put it in a jar of alcohol and cut it into many separately-cataloged pieces, are all of the pieces still some of the Individual?
This is why we need two things for an Individual:
individualScope (which can range anywhere from the aggregates of multiple individuals, all the way down to the smallest parts of individuals)
See above.
And, a mechanism to track series of "derived from" Individuals. The ASC model covered this, I think (right, Stan?)
I didn't see it in the flow chart, but it could be there somewhere. I had something like this in sernec:derivativeOccurrence and sernec:derivedFrom (http://bioimages.vanderbilt.edu/rdf/terms.htm) when I was making every token an Occurrence. But it's better to do as you suggest here which is what I did in the examples above.
I think Pete might have been suggesting modeling things that way with "partOf". What if we cut a branch from a tree, glue part of it to a page and turn part of it into a DNA sample that gets sequenced? Are those all a part of an Individual?
It seems to me that each unit could represent a separate instance of Individual, but the "parts" need to be clearly aggregated around the single-organism parent Individual, which itself may be a part of another Individual instance that is an aggregate lot of specimens, which itself could be a subsampling of another Individual instance that represents a population in nature.
Let the supporting evidence (tokens) be whatever type of thing they are rather than call them Individuals. Link them together through hierarchical relationships to the single-organism parent Individual.
In my mind, we parse all of these things as separate instances of Individuals, but join them via a hierarchical (parent/child) relationship. If I'm not mistaken, this is how the ASC model managed instances of BiologicalObject (again....right, Stan?)
I don't really want them to be, but maybe I must? Somehow we need to be able to handle road-kill, which will be dead when we make the observation/collection. If we cut a branch from a tree (an Individual), root it, and grow it in a botanical garden, do we call the resulting tree in the garden the same Individual? I would assign it a new identifier and call it a new Individual. I guess my point is that I would only apply the term Individual to dead stuff, pieces of dead stuff, and living pieces of things with extreme caution.
Why extreme caution? What are the risks that we are cautioning ourselves against?
The risk that we make the definition of Individual so broad that it can't perform any of the functions it was defined to serve. We've already lost one of them (the ability to infer duplicates) when I agreed to the broader definition, but that's the subject of another post.
These are some principles that I always try to keep in mind when discussing these things:
- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data actually exist in content-provider databases, and how data *should* be represented in a normalised world
- When in doubt, DwC should be accommodating, rather than restrictive -- especially when more restrictive needs can be met via associated data filtering
There are other principles as well, but these are the ones I keep having to remind myself of.
I think that what I have suggested above is very unrestrictive. We let evidence be the type of things that they are (PreservedSpecimens, Individuals, StillImages, SoundRecordings, DNA sequences, etc.). We don't determine their type by what we want to use them for. That was the mistake that I made in the Biodiversity Informatics paper. If we follow this approach, then a StillImage can fill any role that we want: evidence that an Occurrence happened, information to support an Identification, a character for a visual key, a logo, etc. We let it fulfill those roles by giving it an identifier and connecting it to other resources using appropriate terms (hasEvidence, derivedFrom, mrtg:attributionLogoURL, etc.).
I think maybe so. Maybe the appropriate course of action here as well is to let people try different approaches out and if they turn out to work and be needed, then we talk about applying them to Darwin Core.
Ultimately, I think people will use it in accordance to what terms are nested within it -- which is why I think it's important to have this conversation we're having now.
As I indicated at an earlier time, I think that there are very few terms that should be properties of Individual since it is primarily a node that connects Occurrences to Identifications (and I guess now to derived tokens).
Aloha, Rich
Looking forward to responses! But I don't think development of these ideas should hold up the proposal for the class Individual, which can stand on its own with its current (revised) definition. Steve
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Dean Pentcheff wrote:
OK. I think this makes sense to me.
Seeing how my "jar of semi-identified scunge" example plays now:
Setting: A jar full of stuff collected from a coral reef. In it I can see larval fish, sphaeromatid isopods, one "Diadema antillarum" urchin, and a green alga.
"Individual"ization:
- The record for the jarful of scunge is _not_ eligible to be an "Individual".
I think under the definition as it stands and the way the discussion has been going, it could be an Individual with an Identification of urkingdom=Eukarya (or whatever taxonomists prefer to call it) because there are animals and algae in it, so that would be the lowest taxon common to all organisms known to be in the jar. Whether this would be wise or whether you would actually want to do it or not is for you to decide. But I think it's allowed in the definition.
- I can create four additional records, each of which is designated as "partOf" the jar record, each of which can be an "Individual" with a higher-level or species-level determination attached to it.
Yes, as long as each of those four Individuals isn't Identified to a level below the lowest taxon common to all organisms known to be in that lot.
Two years from now, I sort and ID the sphaeromatid isopods and determine that they are indeed all in the Family Sphaeromatidae, and split them out to three separate jars, each of which is a different genus. Each of those jars can be referred to as an "Individual" (one or more objects from a single taxon collected at a single time/place). Has the older family-level record now lost its "Individual"ness, now that it's clear it's a composition of three lower-level taxa? Or does it still record the correct fact that a group of independently locomoting bugs from one collecting instance all belonged to the same family, and hence is still an "Individual" record, indicating that higher-level taxon group of bugs?
The older jar Identified to family=Sphaeromatidae is still an Individual. It has three "child" Individuals, each of which isPartOf the Sphaeromatidae Individual. Each of the "child" Individuals can have an Identification to genus. You could continue this process until you have every individual isopod in a separate jar identified to species or whatever the lowest taxon is that you believe in. What I'm saying in my post is that it is NOT allowed by the definition (or useful based on the reason why the Individual class was proposed) to cut the isopods into pieces and then call the pieces Individuals (for the reasons I gave and won't repeat here). You can call the pieces PreservedSpecimens, choppedOrganismParts, derivedLifeBits, or whatever the community decides this kind of thing should be. But my assertion is that you can't call them Individuals.
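The "lowest taxon common to all organisms in the lot" rule used in these replies can be sketched roughly as follows. This is only an illustration: the lineages are simplified (ranks are skipped), GenusA, GenusB and GenusC are hypothetical placeholder names, and the function name is an assumption rather than anything defined in Darwin Core.

```python
def lowest_common_taxon(lineages):
    """Return the most specific taxon shared by every lineage, or None.

    Each lineage is a list of taxon names ordered from broadest to narrowest.
    """
    common = None
    for names_at_level in zip(*lineages):
        if len(set(names_at_level)) != 1:
            break  # the lineages diverge at this level
        common = names_at_level[0]
    return common


# Simplified, partly hypothetical lineages for the jar example.
GENUS_A = ["Eukarya", "Animalia", "Arthropoda", "Isopoda", "Sphaeromatidae", "GenusA"]
GENUS_B = ["Eukarya", "Animalia", "Arthropoda", "Isopoda", "Sphaeromatidae", "GenusB"]
GENUS_C = ["Eukarya", "Animalia", "Arthropoda", "Isopoda", "Sphaeromatidae", "GenusC"]
GREEN_ALGA = ["Eukarya", "Viridiplantae"]

# The unsplit isopod jar can be Identified no lower than the family.
family_level = lowest_common_taxon([GENUS_A, GENUS_B, GENUS_C])

# The whole jar of scunge (animals plus algae) only shares the root taxon.
whole_jar = lowest_common_taxon([GENUS_A, GREEN_ALGA])
```

Under these assumptions the family-level jar remains a valid Individual after the split, and each child jar can carry its own genus-level Identification.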
Steve
[As a side issue, note that I have IDed them to genus, not species, to sidestep the debate on the specialness of the species taxon.]
-Dean
Dean Pentcheff pentcheff@gmail.com dpentche@nhm.org
On Fri, Nov 5, 2010 at 12:29 PM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
For those of you who triage emails and don't read long emails, the bottom line is that although I agree with some of Rich's points, I think that the suggestion that parts of Individuals should be classified as Individuals does not fit the definition that is on the table for the proposed class dwc:Individual. I argue that allowing pieces of organisms to be called Individuals defeats the purpose of having the Individual class. I suggest an alternative approach that I think is the most straightforward method of separating tokens from the occurrences they document. The acceptance or rejection of Individual as a new class does not hinge on my suggested approach. Development of a system to handle more complicated resource relationships can take place independently of the proposal for the Individual class.
Responses inline below:
Richard Pyle wrote:
...
previousIdentifications
Hmm. I suppose yes, but better to just have another instance of Identification. Why not?
When the data are structured that way at the source, yes. But a number of DwC terms exist because many content sources have not parsed/normalized all their data to the full extent of the DwC classes. Therefore, I think previousIdentifications should be kept, and if so, it should be part of the Individual class.
Got it. It should go with Individual.
associatedSequences
I suppose you won't agree on this, but I don't see sequences as any different than other tokens/evidence types that I think we should allow to document Occurrences. I would like this term to eventually go away, at least for people using RDF who will explicitly create resources for tokens and then type them.
OK, well I guess "Sequences" per se are functionally equivalent to images, in that they are not the organism themselves, but rather a representation of some aspect of the organisms (in this case, a representation of the molecular structure of the DNA molecules contained within the cells of the organism, rather than a representation of light waves reflected off the exterior of an organism in the case of an image, or of x-ray waves transmitted through an organism in the case of a radiograph image). I was thinking more in terms of tissue samples -- which I will much more stubbornly defend as being in the Individual class -- but I guess more in terms of "individualScope".
OK, as usual you are warping my brain into thinking about things in a different way. I'm going to separate the issue of "dead" from the issue of "pieces" (for the moment I'm going to accept that it doesn't matter if a whole organism is dead or not). The advantage of letting pieces of the organism be considered as a type of Individual is that it allows us to avoid creating another class of things called "PreservedSpecimen" (although in a sense we already have it because of dwctype:PreservedSpecimen, which when used as a rdf:type would imply membership in some rdfs:Class called "PreservedSpecimen"). The pieces could share properties that one might want to also apply to the whole organism. One could differentiate among the two by the value of "individualScope".
But after another long commute to think about this, I'm realizing that pieces of organisms really must not be Individuals. First of all, the definition that is under consideration is "The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single taxon." [the Google Code entry, with substitution of "taxon" for "species (or lower taxonomic rank if it exists)" as was discussed]. That definition as it stands applies to an organism or group of organisms, but does not include parts of organisms. Obviously the definition could be changed, but if you consider the comment, which describes the primary function of Individual: "Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification" it becomes clear that making parts of organisms Individuals defeats this primary purpose for the term.
The major selling point for having Individuals at all is to get out of the business of applying determinations to all of the pieces of evidence such as specimens, images, sounds, etc. that get collected from the same biological individual through multiple Occurrences. This has the benefit that if one applies an Identification to the Individual, all physical and information resources that are derived from the individual automatically get associated with the Identification and hence the taxonomic information referenced by the Identification. If we call preserved specimens that are pieces of organism Individuals having a value of individualScope="part", then do we do the same thing to them as we do with Individuals at higher levels, namely apply Identifications to them? If so, then we are back in the business of assigning Identifications to all of our derivative resources rather than the biological individuals from which they came. If we just say that we'll skip assigning separate Identifications to the derivative resources, then we have something that doesn't fit the functional role for which Individual was designed. In that case an "Individual" which is an organism part is such a different thing that one might as well call it something else (i.e. a PreservedSpecimen).
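A minimal sketch of that selling point, assuming each token simply carries the identifier of the Individual it came from. The class names, field names and the taxon string are illustrative assumptions, not Darwin Core definitions.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    individual_id: str
    identifications: list = field(default_factory=list)


@dataclass
class Token:
    token_id: str
    kind: str            # the token's own type, e.g. "PreservedSpecimen" or "StillImage"
    individual_id: str   # link back to the Individual the token came from


def identifications_for(token, individuals):
    """Look up Identifications through the Individual, never on the token itself."""
    return list(individuals[token.individual_id].identifications)


tree = Individual("tree1")
individuals = {tree.individual_id: tree}
branch = Token("branch1", "PreservedSpecimen", "tree1")
image = Token("image1", "StillImage", "tree1")

# One Identification applied to the Individual (illustrative value) ...
tree.identifications.append("Quercus alba")

# ... is picked up by every token without touching the tokens.
branch_ids = identifications_for(branch, individuals)
image_ids = identifications_for(image, individuals)
```

If organism parts were themselves Individuals carrying their own Identifications, this single point of update would be lost, which is the objection raised above.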
The case of a whole organism (live as a LivingSpecimen or dead as a PreservedSpecimen) is different because in that case we would have a single resource serving as the evidence (the whole organism itself). By definition, there can't be many of those (there would just be one) and it would already have an Identification assigned to it, because it is the same Individual that it is providing evidence for. So there is no superfluous assignment of Identifications in that case.
Here's one thing I'm not so certain about, though. An in-situ image of an organism is clearly a token of an Occurrence, because it is evidence of the organism at the place/time. An image of the preserved specimen in a Museum, or an x-ray, etc., is not really a token of an Occurrence, because it's not evidence of the organism at the place/time of its capture. Same goes for Sequences -- they are a token of the Individual organism, not of the occurrence of the organism at a place and time. This is why I have a hard time thinking of such things as tokens of an Occurrence, when they are really more tokens of the Individual.
I think the solution to this is to not call it a "token of the Occurrence". Let's say that the token is derived from the Individual and that it MAY serve as evidence for an Occurrence.
I think that the solution is something like you suggest: link the chain of derivation of tokens to the Individual and not to the Occurrence. Then have a reference in the Occurrence record to the particular token that was created or collected during the event of the Occurrence. See http://bioimages.vanderbilt.edu/pages/tree-branch.gif . I have had the tendency of thinking that the tokens supported the Occurrence, but there does not need to be just one purpose for the token. They also support the existence of the Individual. This should probably make you happy, because the pieces of the Individual (preserved specimens, tissue samples) would be derived from the Individual. The "provenance" if you want to call it that, traces the connection of the tokens to the Individual. The chain of derivation can be traced using the property that I've called "derivedFrom". The branch specimen is "derivedFrom" the Individual and the specimen image is "derivedFrom" the specimen. Your desire to differentiate between things that are physically derived from the Individual vs. things that aren't can be handled by the "isPartOf" property. The branch specimen "isPartOf" the Individual tree, but the image is not a part of the branch. A token could have both the isPartOf property and the derivedFrom property (if it's a piece of the Individual), or only the derivedFrom property (if it's not).
In this diagram, the term "hasEvidence" is a property of the Occurrence. It has the branch specimen as its object, but not the image of the specimen because as you note, the event marking the creation of the image is not the same as the event documenting the Occurrence of the Individual (i.e. the collection of the branch specimen). Either of the "tokens" (the specimen or the image) could be used as evidence for the Identification (we could have a property of the Identification called "basedOn" that could have the specimen, the image, or both as its object - I did something similar to this in the Biodiversity Informatics paper).
Please note that for each of the properties I've listed on the diagram, there could and probably should be inverse properties (not shown): hasDerivative for derivedFrom, hasPart for isPartOf, isEvidenceFor for hasEvidence, and usedIn for basedOn. All of the "tokens" and the Occurrence could have the property individualID which would relate the resource directly to the Individual and its Identifications.
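The tree-branch example can be sketched as plain subject/property/object triples. The resource identifiers below are hypothetical, the property names are the ones used in this message, and the sketch assumes a single line of derivation with no cycles, as in the diagrams.

```python
# Hypothetical resource identifiers; property names follow the message above.
TRIPLES = {
    ("occurrence1", "hasEvidence", "branchSpecimen"),
    ("branchSpecimen", "derivedFrom", "treeIndividual"),
    ("branchSpecimen", "isPartOf", "treeIndividual"),
    ("specimenImage", "derivedFrom", "branchSpecimen"),
    ("identification1", "basedOn", "branchSpecimen"),
    ("identification1", "basedOn", "specimenImage"),
}

# Inverse property for each property shown on the diagram.
INVERSES = {
    "derivedFrom": "hasDerivative",
    "isPartOf": "hasPart",
    "hasEvidence": "isEvidenceFor",
    "basedOn": "usedIn",
}


def with_inverses(triples):
    """Return the triples plus one inverse statement for each of them."""
    result = set(triples)
    for subject, prop, obj in triples:
        result.add((obj, INVERSES[prop], subject))
    return result


def derivation_chain(resource, triples):
    """Follow derivedFrom links from a token back to where it came from."""
    parents = {s: o for s, p, o in triples if p == "derivedFrom"}
    chain = [resource]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


image_chain = derivation_chain("specimenImage", TRIPLES)
all_statements = with_inverses(TRIPLES)
```

Note that the image has a derivedFrom statement but no isPartOf statement, and that only the branch specimen is the object of hasEvidence, matching the distinctions drawn in the text.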
I have created a number of similar charts showing how these relationships could apply to various types of tokens:
- http://bioimages.vanderbilt.edu/pages/tree-branch.gif (tree branch PreservedSpecimen)
- http://bioimages.vanderbilt.edu/pages/tree-image.gif (image of a live tree)
- http://bioimages.vanderbilt.edu/pages/whale-dna.gif (tissue sample and DNA sequence from a whale)
- http://bioimages.vanderbilt.edu/pages/bird-observation.gif (bird observation)
- http://bioimages.vanderbilt.edu/pages/wildebeest.gif (wildebeest calf captured and put in zoo)
- http://bioimages.vanderbilt.edu/pages/botanical-garden.gif (twig removed and turned into a living specimen in a botanical garden)
Note that in every case, the "token" is typed based on the kind of thing that it is. We don't try to make it an Occurrence (my previous mistake) or an Individual (what I'm saying is Rich's mistake). Physical things that are a part of the Individual have the special status of "isPartOf", electronic representations never do. Only the token that was created during the event associated with the Occurrence record is connected to the Occurrence record. The token serving as evidence for the Occurrence can be anything - there is no special class called "token". In fact, the "token" can be the organism itself if the organism is curated (John's favorite wildebeest calf in the zoo or a whole dead fish in a jar). The token can be another individual such as a living specimen that originates as a clone (maybe also seed) from the Individual being documented in the Occurrence. We (DwC) only get into the business of creating types and properties of tokens if they don't already exist in other vocabularies. DwC needs to do that for specimens, but not for images that are already covered by MRTG. An observation may or may not have a token depending on whether there is some kind of evidence that can be referred to (see bird example).
In these diagrams a single Occurrence and a single "line" of derived tokens is shown. But there can be many tokens per Occurrence and many tokens per Individual. There can also be many Occurrences per Individual. I didn't try to show this on the diagram because it would be too complicated. Obviously many users will want to make this "flatter" and less complicated. But I think this model allows for just about any kind of relationship among occurrence-documenting resources that people want to handle. It was the kind of thing I was trying to do in the Biodiversity Informatics paper (e.g. http://bioimages.vanderbilt.edu/pages/conceptual-scheme-insect.gif) but better because I'm letting the tokens be what they are rather than trying to force them all to be Occurrences.
This also assumes that dwc:catalogNumber and dwc:otherCatalogNumbers be re-assigned to Record-level terms. Was there some reason this isn't appropriate?
I think it is appropriate because they should be usable with at least two classes: Individual (for living specimens) and Occurrences (e.g. preserved specimens, images)
Hmmm...on that basis, should individualCount and the various tokens also be Record-level terms -- on the basis that they can apply either to an Occurrence, or to an Individual? Actually, in the case of DNA Barcodes and such, isn't it possible to also represent a DNA Sequence as an attribute of a Taxon as well? If the purpose of Record-level terms is to aggregate terms that apply to more than one class, then perhaps that is the solution for a number of these things (including disposition, and maybe even preparation -- depending on how broadly those things are defined)?
individualCount would be metadata that results from an Occurrence, so I think that's the only place it belongs. Tokens aren't properties of anything, they are resources in their own right that are connected by some property term (e.g. hasDerivative/derivedFrom) to an Occurrence in which they were collected/recorded and to the Individual from which they derived (e.g. derivedFrom and hasDerivative). A DNA sequence is another resource that isn't an attribute of anything. It could be the object of numerous properties that could have a variety of subjects.
I haven't said this before, but are we allowing Individuals to be dead?
Errr....fossils? Preserved specimens? Are they not Individuals? I know you think of them in terms of tracking a living organism over time. But that's only one of the reasons why I support an Individual class (not even the main reason). To me, the main reason is that an "Individual" represents the actual organism(s), separate from an Occurrence, which represents the presence of an organism at a particular place and time.
I think I am prepared to accept this as long as they are the whole thing and not pieces. There could be some issues with a fossil since in many cases the tissues of the organism are replaced by minerals. But there is still a one-to-one relationship, so the problem I described in the long paragraph above doesn't apply.
If we put it in a jar of alcohol and cut it into many separately-cataloged pieces, are all of the pieces still some of the Individual?
This is why we need two things for an Individual:
individualScope (which can range anywhere from the aggregates of multiple individuals, all the way down to the smallest parts of individuals)
See above.
And, a mechanism to track series of "derived from" Individuals. The ASC model covered this, I think (right, Stan?)
I didn't see it in the flow chart, but it could be there somewhere. I had something like this in sernec:derivativeOccurrence and sernec:derivedFrom (http://bioimages.vanderbilt.edu/rdf/terms.htm) when I was making every token an Occurrence. But it's better to do as you suggest here which is what I did in the examples above.
I think Pete might have been suggesting modeling things that way with "partOf". What if we cut a branch from a tree, glue part of it to a page and turn part of it into a DNA sample that gets sequenced? Are those all a part of an Individual?
It seems to me that each unit could represent a separate instance of Individual, but the "parts" need to be clearly aggregated around the single-organism parent Individual, which itself may be a part of another Individual instance that is an aggregate lot of specimens, which itself could be a subsampling of another Individual instance that represents a population in nature.
Let the supporting evidence (tokens) be whatever type of thing they are rather than call them Individuals. Link them together through hierarchical relationships to the single-organism parent Individual.
In my mind, we parse all of these things as separate instances of Individuals, but join them via a hierarchical (parent/child) relationship. If I'm not mistaken, this is how the ASC model managed instances of BiologicalObject (again....right, Stan?)
I don't really want them to be, but maybe I must? Somehow we need to be able to handle road-kill, which will be dead when we make the observation/collection. If we cut a branch from a tree (an Individual), root it, and grow it in a botanical garden, do we call the resulting tree in the garden the same Individual? I would assign it a new identifier and call it a new Individual. I guess my point is that I would only apply the term Individual to dead stuff, pieces of dead stuff, and living pieces of things with extreme caution.
Why extreme caution? What are the risks that we are cautioning ourselves against?
The risk that we make the definition of Individual so broad that it can't perform any of the functions it was defined to serve. We've already lost one of them (the ability to infer duplicates) when I agreed to the broader definition, but that's the subject of another post.
These are some principles that I always try to keep in mind when discussing these things:
- DwC is a data exchange standard, not so much a physical data model.
- There is a necessary balance between structuring DwC around how data actually exist in content-provider databases, and how data *should* be represented in a normalised world
- When in doubt, DwC should be accommodating, rather than restrictive -- especially when more restrictive needs can be met via associated data filtering
There are other principles as well, but these are the ones I keep having to remind myself of.
I think that what I have suggested above is very unrestrictive. We let evidence be the type of things that they are (PreservedSpecimens, Individuals, StillImages, SoundRecordings, DNA sequences, etc.). We don't determine their type by what we want to use them for. That was the mistake that I made in the Biodiversity Informatics paper. If we follow this approach, then a StillImage can fill any role that we want: evidence that an Occurrence happened, information to support an Identification, a character for a visual key, a logo, etc. We let it fulfill those roles by giving it an identifier and connecting it to other resources using appropriate terms (hasEvidence, derivedFrom, mrtg:attributionLogoURL, etc.).
I think maybe so. Maybe the appropriate course of action here as well is to let people try different approaches out and if they turn out to work and be needed, then we talk about applying them to Darwin Core.
Ultimately, I think people will use it in accordance to what terms are nested within it -- which is why I think it's important to have this conversation we're having now.
As I indicated at an earlier time, I think that there are very few terms that should be properties of Individual since it is primarily a node that connects Occurrences to Identifications (and I guess now to derived tokens).
Aloha, Rich
Looking forward to responses! But I don't think development of these ideas should hold up the proposal for the class Individual, which can stand on its own with its current (revised) definition. Steve
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
Dean, I intended to comment on your earlier post but got caught up in other threads and hadn't yet taken the time. I know on at least one previous occasion this issue of mixed aggregations was discussed on the list but I spent about 15 minutes looking for it in the archives this morning and couldn't find it. I remember somebody pointed out some software that allows places within an image to be demarcated (i.e. call the image an aggregation and define spots in it as Individuals). Also, I tried to handle the issue of tokens (or evidence) that are derived from other tokens in my paper (see http://bioimages.vanderbilt.edu/pages/conceptual-scheme-insect.gif for an example) by using sernec:derivedFrom and sernec:derivativeOccurrence (see http://bioimages.vanderbilt.edu/rdf/terms.htm). But I think my approach needs to be rethought in the context of separating tokens from occurrences. I am very keen to work this out, but I really don't think that piling this onto Individual is the way to do it. I'm in the middle of running and photographing gels at the moment but look forward to continuing the discussion on these points at a later time.
Steve
Dean Pentcheff wrote:
Leveraging off my earlier toss-in of the parent-child collection scheme, let me toss in this observation.
I'll preface it by saying that although it's a situation we deal with in reality, my gut impulse is that it should probably _not_ be accommodated by the "Individual" concept under development.
We have many records for un- or partially-sorted lots of marine invertebrate samples. Often we can make very rough determinations of what are in those lots (e.g., we can see that a jar contains ophiuroids, gastropods, sphaeromatid isopods, red algae, and larval fish). Critically, these are multiple particular and disjunct parts of the taxonomic hierarchy, not just a single "highest containing rank" determination.
It turns out to be super-useful to record that very rough determination because (as alluded to by Rich) we can then appropriately make that jar available to visitors seeking particular taxa (and save them the trouble of grubbing through shelves of jars where we already "know" there's nothing of interest to them).
Right now, we do _not_ conflate this rough determination with a Real Taxonomic Determination (®™ and all that): they are two completely separate fields. So to find all the jars we know have ophiuroids, one does indeed have to search both the real taxonomic determination field as well as the rough-determination (text) field (if one wants to include unsorted lots in the quest).
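As a rough illustration of the two-field search described above, here is a minimal Python sketch. The record layout and the field names (`determination`, `rough_determination`) are hypothetical, invented for this example; they are not taken from any actual collection database, and a real system would likely match against a taxonomic hierarchy rather than substrings.

```python
from dataclasses import dataclass


@dataclass
class LotRecord:
    # Hypothetical record layout; field names are illustrative only.
    catalog_id: str
    determination: str        # formal taxonomic determination (may be empty)
    rough_determination: str  # free-text note on what an unsorted lot contains


def find_lots(records, taxon):
    """Return lots whose formal OR rough determination mentions `taxon`."""
    needle = taxon.lower()
    return [
        r for r in records
        if needle in r.determination.lower()
        or needle in r.rough_determination.lower()
    ]


records = [
    LotRecord("J-001", "Ophiuroidea", ""),
    LotRecord("J-002", "", "ophiuroids, gastropods, sphaeromatid isopods, red algae, larval fish"),
    LotRecord("J-003", "Gastropoda", ""),
]

hits = find_lots(records, "ophiuroid")
print([r.catalog_id for r in hits])
```

Keeping the two fields separate, as in the sketch, means a query can choose whether to include unsorted lots at all.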
I'm introducing this case more with the idea that it may usefully help define the outer limits for "Individual" -- something that the "Individual" concept should _not_ accommodate. I can't really wrap my head around how the developing "Individual" concept can usefully be mutilated to accommodate this case.
-Dean
Dean Pentcheff pentcheff@gmail.com dpentche@nhm.org
On Wed, Nov 3, 2010 at 11:53 AM, Richard Pyle <deepreef@bishopmuseum.org> wrote:
> I think if I'm understanding what John wrote, he was going to substitute "taxon" for "species (or lower taxonomic rank if it exists)" with the understanding that Individual is not intended to be used for aggregates of different taxa. That would solve this problem, right?
It depends on what you mean by "different taxa". If you are using the word "taxa" here to imply "species or lower ranks", then I don't think it would solve the problem. But if you mean it in a generic way, then I'm OK with that. To illustrate what I mean by "in a generic way": suppose I had a trawl sample or a plankton tow sample that included unidentified organisms from multiple phyla, all of which are animals. I should not be prevented from representing this aggregate as an "Individual", with an Identification instance linked to a taxon concept labelled as "Animalia". This means the contents of the Individual all belong to a single taxon (Animalia), and therefore it does not violate the condition excluding aggregates of different taxa. An instance of Individual so identified would be almost useless for many purposes, I agree -- but it's easy enough to filter such Individuals out by looking at dwc:taxonRank of the Taxon to which the Individual was identified. Also, it's not useless for all purposes, because a botanist would like to know that s/he doesn't have to look through that sample to find stuff of interest.
I guess my point is, there should not be any rank-based requirement for the implied taxon circumscription of an "Individual".
Rich
I think we need to be pretty clear as to what each of these classes represent, otherwise the data associated with them will be useless.
If someone wants to document that they have a jar full of different individuals of several species taken from a particular place and time then perhaps that should be modeled as a separate kind of thing.
Eventually one of the jars will be opened and something will be identified to species.
They will want to relate that specimen back to the jar it came from.
If we think of these jars as a collection set, then there may be some utility in being able to map where various collection sets are from.
However, do we want these collection sets to show up in a search of species occurrence records?
I also question the utility of analyzing occurrence records at clades higher than species.
Species are assumed to be made up of populations of interbreeding individuals.
But what is a genus vs. a subgenus vs. a tribe?
If these higher clades were somewhat stable and had some agreed-upon understanding, then this might make sense.
There is no clear reasoning behind why one clade is a family in Mammals and a clade of similar age is a genus in Beetles.
- Pete
On Wed, Nov 3, 2010 at 3:27 PM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
Here's the thing:
What is fundamentally different between this statement:
> If someone wants to document that they have a jar full of different individuals of several species taken from a particular place and time then perhaps that should be modeled as a separate kind of thing.
...and this statement:
> If someone wants to document that they have a jar full of individuals from several different populations of the same species taken from a particular place and time then perhaps that should be modeled as a separate kind of thing.
...and this statement:
> If someone wants to document that they have a jar full of different individuals of the same species taken from a particular place and time then perhaps that should be modeled as a separate kind of thing.
My point is that DwC should not single out "species" as a special kind of Taxon that is allowed to participate in certain kinds of relationships that taxa of other ranks (or even rankless taxa) are not allowed to participate in. There have been endless debates about the "specialness" of species on Taxacom, and to my knowledge, none of those debates have ended in any sort of consensus other than "agree to disagree". Let's not "go there" in DwC. Let's treat relationships with taxa in a completely rank-agnostic way.
> Eventually one of the jars will be opened and something will be identified to species.
Maybe the "somethings" will all be identified to the same species, and maybe they'll all be identified to different species. When we get to that stage, if more than one taxon is represented within what was previously regarded as a single Individual, then we generate new Individual instances that correspond to the different taxon identifications, and tag them accordingly with their own Identifications (maintaining, of course, the appropriate "derived from" relationships among the Individuals).
> They will want to relate that specimen back to the jar it came from.
Yes, definitely! We certainly want a mechanism to address Dean's point about Individuals derived from Individuals; whether it should be a parent-child sort of derivation or something more general is a topic for another thread.
> However, do we want these collection sets to show up in a search of species occurrence records?
If we want to do a search of "species" occurrence records, then we add the appropriate filter to the search that limits the results according to dwc:taxonRank. But we shouldn't constrain the underlying architecture to prevent people from doing a search of genus occurrence records, or family occurrence records.
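The rank filter can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Taxon` and `Occurrence` classes are hypothetical, `taxon_rank` stands in for dwc:taxonRank, and the taxon names are example values rather than real records.

```python
from dataclasses import dataclass


@dataclass
class Taxon:
    name: str
    taxon_rank: str  # stands in for dwc:taxonRank


@dataclass
class Occurrence:
    # Hypothetical flattened record: an occurrence plus the taxon it was identified to.
    occurrence_id: str
    taxon: Taxon


def by_rank(occurrences, rank):
    """Keep only occurrences identified to a taxon of the requested rank."""
    wanted = rank.lower()
    return [o for o in occurrences if o.taxon.taxon_rank.lower() == wanted]


occurrences = [
    Occurrence("occ-1", Taxon("Erebia youngi", "species")),
    Occurrence("occ-2", Taxon("Pomacentridae", "family")),
    Occurrence("occ-3", Taxon("Animalia", "kingdom")),
]

print([o.occurrence_id for o in by_rank(occurrences, "species")])
print([o.occurrence_id for o in by_rank(occurrences, "family")])
```

The underlying data stay rank-agnostic; the restriction to species (or to any other rank) lives entirely in the query.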
> I also question the utility of analyzing occurrence records at clades higher than species.
Really? In the coral-reef fish world, there are some fascinating biogeographic patterns where certain families occur only at continental localities, and others occur only at oceanic insular localities. When I do queries of, say, larval fish holdings -- where species-level identifications are hard to come by, but family-level identifications are comparatively easy (e.g., a jar of 50 formalin-fixed larval specimens may all be reliably identified to the same family, but may be unidentifiable to species) -- I sure would like to know if the larvae of certain insular families occur at continental regions (and vice versa), because that would be informative of the likely mechanisms behind the family-level geographic distributions.
This is just an example I came up with off the top of my head just now. I seriously doubt that it is unique -- I'm sure there are other kinds of biogeographic questions one would like to ask of our collective data that overlap with situations where identifications may be reliable only to the higher-rank level.
> Species are assumed to be made up of populations of interbreeding individuals.
> But what is a genus vs. a subgenus vs. a tribe?
Again... these debates have happened ad nauseam (almost literally) on Taxacom, and I think it would be unwise to clutter DwC with presumptions about the outcomes of these unresolved debates (or worse, repeat the debates on this forum). By its fundamental nature, DwC is accommodating to a broad spectrum of user needs. Why restrict its utility to a subset of needs -- especially when the subset can be accommodated by adding appropriate filter criteria (e.g., in this case, via dwc:taxonRank)?
> If these higher clades were somewhat stable and had some agreed-upon understanding, then this might make sense.
In many groups they do.
> There is no clear reasoning behind why one clade is a family in Mammals and a clade of similar age is a genus in Beetles.
Perhaps not. But within Mammals, or within Beetles, or within fishes, they might have more useful meaning.
But that's not the point. The real point is that non-trivial amounts of data exist in our domain for which aggregated organisms cannot be reliably circumscribed within the same species-rank taxon concept, yet there are many reasons why we want to be able to assign reliable higher-rank taxon identifications to these aggregates. As such, if an Identification is a tuple of Individual and Taxon, then not only do we want to allow Identifications to link to higher-rank taxa, but we also don't want to be forced to restrict such links to cases where we are confident that the members of an instance of Individual reliably fall within a single species-rank taxon concept, even if they are not labelled as such.
I just re-read the preceding paragraph, and it makes my head spin; so apologies in advance to the dizzy among us....
Aloha, Rich
I'm going to revisit this tomorrow morning, to be able to think carefully about the prior postings on this. But a quick reply to this posting (which had the advantage of crossing my desk when I was painfully in need of a distraction from what I was doing).
I think there may be a useful distinction to be made between these two cases (which match Rich's most recent cases 1 and 2+3):
1. A jar of mixed semi-identified organisms from known-to-be disjunct taxa (e.g. two molluscan genera, a green alga, and some fish larvae).
2. A jar of one or more organisms collectively identified down to a single taxon of any rank (e.g. a jar full of ophiuroids only, all of which may or may not be the same species, but they're all "known" to be ophiuroids).
Without (yet) wrapping my head around this entirely, I think I agree with Rich that we definitely find it useful to be able to search taxonomically for higher level taxa as well as searching for species. Without being able to do so, we are forced to ignore all instances that are not identified to species, and that's just not a happy solution.
I've made the claim that we sometimes find it interesting to be able to search taxonomically for (actually, to find) records that describe the mixed-taxon case #1 above. However, without allowing a one-to-many relationship between "Individual" and a determination, I don't see how to accommodate it in a straightforward scheme, nor do I think it is necessarily a good idea to do so.
Unless a simple solution presents itself to me, I'm willing to accept the idea that my case #1 and case #2 cannot be folded into one tidy model.
-Dean
> I think there may be a useful distinction to be made between these two cases (which match Rich's most recent cases 1 and 2+3):
> 1. A jar of mixed semi-identified organisms from known-to-be disjunct taxa (e.g. two molluscan genera, a green alga, and some fish larvae).
> 2. A jar of one or more organisms collectively identified down to a single taxon of any rank (e.g. a jar full of ophiuroids only, all of which may or may not be the same species, but they're all "known" to be ophiuroids).
That wasn't exactly the distinction I was making. I started writing a separate post on this issue, but only got about half-way through before I discovered that I wasn't even sure what my own point was, so I shelved it for a later time.
What concerns me, though, is that you're still effectively introducing "rank-ism" in the distinction above; as if a set of mollusks, alga and fish were somehow fundamentally different from a set of organisms from seven different families of ophiuroids.
My proposed solution is to rigidly maintain that an instance of "Individual" cannot be partitioned to have multiple separate but concurrently legitimate Identifications associated with it. It can have multiple Identifications, but they would be considered to either be competing with each other (when different taxa are asserted) or reinforcing each other (when the same taxon is asserted). They would *not* be considered concurrent with each other, and applying to separate "parts" or "members" of a single aggregated Individual instance.
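To make the competing/reinforcing distinction concrete, here is a small Python sketch. The `Identification` class, its fields, and the determiner labels are hypothetical; the sketch treats any two different taxon names as competing and does not model a taxon hierarchy, so a later refinement to a nested lower-rank taxon would need extra handling that is not shown.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Identification:
    # Hypothetical tuple of Individual and Taxon, plus who asserted it.
    individual_id: str
    taxon: str
    identified_by: str


def relate(a, b):
    """Classify two Identifications of the SAME Individual."""
    if a.individual_id != b.individual_id:
        raise ValueError("Identifications refer to different Individuals")
    # Same taxon asserted twice: reinforcing. Different taxa: competing.
    # They are never read as applying to separate parts of the Individual.
    return "reinforcing" if a.taxon == b.taxon else "competing"


ids = [
    Identification("ind-1", "Erebia youngi", "determiner A"),
    Identification("ind-1", "Erebia lafontainei", "determiner A"),
    Identification("ind-1", "Erebia youngi", "determiner B"),
]

relations = [relate(a, b) for a, b in combinations(ids, 2)]
print(relations)
```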
When we know what the taxonomic breakdown is with some additional precision, we partition out "child" individuals down to the level at which we are able to confidently discern distinct taxa (and have time/resources to attach Identifications as such). That way, we keep our basic agreement that an Individual must represent an aggregate of the same taxon, but we don't impose any rank limitations on what that taxon might be.
So, in your example, we would have one instance of Individual representing the whole jar (assuming we wanted such an instance to record the fact that they're in the same jar, or whatever). That instance may have no Identification instances associated with it, or maybe only one for "Eukarya". But if you wanted to acknowledge the specific representations of mollusks, algae, and fishes, you would generate (at least) three "child" Individual instances, derived from the one "parent", and then each of those child instances would have its appropriate taxon Identification attached.
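The parent/child arrangement for the jar could look something like the following Python sketch. Everything here is an assumption made for illustration: the `Individual` class, the `derived_from` link, the `partition` helper, and the identifier scheme are invented, and the taxon labels are informal placeholders rather than formal names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Individual:
    individual_id: str
    derived_from: Optional[str] = None  # hypothetical link back to the parent Individual
    identifications: list = field(default_factory=list)  # taxon labels asserted for this Individual


def partition(parent, taxa, registry):
    """Create one child Individual per discerned taxon, each derived from `parent`."""
    children = []
    for n, taxon in enumerate(taxa, start=1):
        child = Individual(
            individual_id=f"{parent.individual_id}.{n}",
            derived_from=parent.individual_id,
            identifications=[taxon],
        )
        registry[child.individual_id] = child
        children.append(child)
    return children


registry = {}
jar = Individual("jar-42", identifications=["Eukarya"])  # the whole jar
registry[jar.individual_id] = jar

children = partition(jar, ["mollusks", "algae", "fishes"], registry)
for c in children:
    print(c.individual_id, c.derived_from, c.identifications)
```

The parent keeps its single coarse Identification, while each child carries exactly one more precise one, so no Individual ever holds concurrent Identifications for separate parts of itself.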
> Without (yet) wrapping my head around this entirely, I think I agree with Rich that we definitely find it useful to be able to search taxonomically for higher level taxa as well as searching for species. Without being able to do so, we are forced to ignore all instances that are not identified to species, and that's just not a happy solution.
It's a little more subtle than that. As Steve pointed out early on (in his response to John), we can still leave an organism identified only to the level of, say, Order or Family; but still "believe" (confidently?) that when we do make a more careful determination, we will find only a single species among that aggregate.
The real question concerns the cases where we *don't* know (with confidence?) that the constituents of an aggregated "Individual" will ultimately be shown to represent a single taxon at the rank of species (or lower) -- or, going even further, where we *already know* that more than one species-rank taxon is likely to be involved. In those cases we should not be discouraged from treating the aggregate as a single instance of an Individual, with a single legitimate Taxon Identification, at whatever rank (Animalia, Ophiuroidea, whatever) -- IF and WHEN circumstances warrant it. At the moment, the only such circumstances of the latter sort (i.e., we know that more than one species-level taxon is probably involved) that I can imagine are when we have material, but we do not have the time/staff/expertise/whatever to go through and add more taxonomic precision to the Identification(s).
> I've made the claim that we sometimes find it interesting to be able to search taxonomically for (actually, to find) records that describe the mixed-taxon case #1 above. However, without allowing a one-to-many relationship between "Individual" and a determination, I don't see how to accommodate it in a straightforward scheme, nor do I think it is necessarily a good idea to do so.
That was buried in my last big post on this. I agree that it's not a good idea to do so. The solution is, when you are aware of what the subgroups are, you generate instances of "Individual" to represent each one, then attach an appropriate Identification to each. You can still keep them in the same jar; it's just that your database knows that the "parent" Individual instance (the whole jar) has no Identifications associated with it, or maybe only "Eukarya" or "Animalia" associated with it; and further that this one "parent" Individual has multiple (semantically linked) "derived" (or "child") Individuals, each of which has a more precise taxon Identification.
> Unless a simple solution presents itself to me, I'm willing to accept the idea that my case #1 and case #2 cannot be folded into one tidy model.
I think the solution I proposed in my earlier email is sufficiently simple in all respects *except* in terms of trying to explain it using the English Language rendered in ASCII.
When I get time, maybe I'll whip up some diagrams and a better articulation of what I mean. But first, read my earlier post and see if I managed to explain it just coherently enough....
Thanks,
Rich
Comment inline
Dean Pentcheff wrote:
> I'm going to revisit this tomorrow morning, to be able to think carefully about the prior postings on this. But a quick reply to this posting (which had the advantage of crossing my desk when I was painfully in need of a distraction from what I was doing).
> I think there may be a useful distinction to be made between these two cases (which match Rich's most recent cases 1 and 2+3):
> 1. A jar of mixed semi-identified organisms from known-to-be disjunct taxa (e.g. two molluscan genera, a green alga, and some fish larvae).
> 2. A jar of one or more organisms collectively identified down to a single taxon of any rank (e.g. a jar full of ophiuroids only, all of which may or may not be the same species, but they're all "known" to be ophiuroids).
> Without (yet) wrapping my head around this entirely, I think I agree with Rich that we definitely find it useful to be able to search taxonomically for higher level taxa as well as searching for species. Without being able to do so, we are forced to ignore all instances that are not identified to species, and that's just not a happy solution.
> I've made the claim that we sometimes find it interesting to be able to search taxonomically for (actually, to find) records that describe the mixed-taxon case #1 above. However, without allowing a one-to-many relationship between "Individual" and a determination, I don't see how to accommodate it in a straightforward scheme, nor do I think it is necessarily a good idea to do so.
For clarification, in cases like #1, we ARE allowing one-to-many relationships between "composite" Individuals and Identifications as long as those Identifications represent differences of opinion about the common taxon, or refinements to a lower taxonomic level (e.g. I'm not capable or don't have time to determine the lowest taxonomic level common to all of the biological individuals in the jar, but later I am able to find that out). What we AREN'T allowing is for subsets of the composite "Individual" that belong to different lower level taxa to be identified to those taxa without first separating them into different Individuals. I think I'm stating the principle that Rich laid out correctly.
> Unless a simple solution presents itself to me, I'm willing to accept the idea that my case #1 and case #2 cannot be folded into one tidy model.
This is where having something like individualScope would help. There could be one model, but it would require (I think) a metadata term to differentiate between case #1 and case #2.
Steve
> For clarification, in cases like #1, we ARE allowing one-to-many relationships between "composite" Individuals and Identifications as long as those Identifications represent differences of opinion about the common taxon, or refinements to a lower taxonomic level (e.g. I'm not capable or don't have time to determine the lowest taxonomic level common to all of the biological individuals in the jar, but later I am able to find that out). What we AREN'T allowing is for subsets of the composite "Individual" that belong to different lower level taxa to be identified to those taxa without first separating them into different Individuals. I think I'm stating the principle that Rich laid out correctly.
Yes, exactly. But see also my reply to Dusty about the "Erebia youngi or Erebia lafontainei" example. I don't think this breaks the rule, because it's still two competing and mutually exclusive assertions of taxonomic identity -- they just happen to have been made by the same person at the same time.
Aloha, Rich
Thank you Rich for saying what I want to say (well...almost) so much better than I can.
Just want to add a couple of things:
> I also question the utility of analyzing occurrence records at clades higher than species.
So you are assuming that species are clades. I agree, but that is a very controversial statement. It is pretty much the reason why the PhyloCode doesn't extend to cover species. Species are not clades to many of us. So, that is to say that many would argue that analyzing occurrence records within/between clades may exclude species altogether.
In very practical terms, when you look at a clade and attach the "species" label to it, you overlay a ranking system on a clade system. That is completely arbitrary, and that is why the species rank is no different from any of the higher ranks.
> Species are assumed to be made up of populations of interbreeding individuals.
Well, again that very much depends on who you talk to. As Rich said, there is another forum where these things are discussed ad nauseam. We will never agree on what species are, maybe because we are talking about fiction.
> But what is a genus vs. a subgenus vs. a tribe?
Fiction vs. fiction vs. fiction.
> If these higher clades were somewhat stable and had some agreed-upon understanding, then this might make sense.
A clade is not more stable than a "genus" or "family" or "tribe". Every time we add a species to a genus, or a genus to a family, etc. we change the original taxon concept. That's certainly not stability. Every time we add a taxon to a clade we may change the ingroup content but we do not change the reference to the ancestor. In that regard, clades are much more stable.
> There is no clear reasoning behind why one clade is a family in Mammals and a clade of similar age is a genus in Beetles.
Absolutely correct. Ranks are not comparable, so two families or genera can't be compared; in fact two species can't be compared. Hennig suggested starting the ranking from a specific age, e.g. Cretaceous or so, and that would make ranks more comparable, but the entire exercise would not be practicable and we would keep arguing endlessly anyway.
> Perhaps not. But within Mammals, or within Beetles, or within fishes, they might have more useful meaning.
Actually, I disagree with Rich on this one. Ranks are useful to the extent that they are compatible (they go well) with any hierarchy, but they do not add any value to our knowledge, e.g. what we really know of a group and its descendants.
Nico
> Thank you Rich for saying what I want to say (well...almost) so much better than I can.
Thanks! But....(and I had a feeling this was coming...)
>> Perhaps not. But within Mammals, or within Beetles, or within fishes, they might have more useful meaning.
> Actually, I disagree with Rich on this one. Ranks are useful to the extent that they are compatible (they go well) with any hierarchy, but they do not add any value to our knowledge, e.g. what we really know of a group and its descendants.
I can't tell you what a wonderful, reassuring, hope-inspiring pleasure it is that two people with such *profound* differences in how we view the classification of biodiversity (me, an ICZN Commissioner, fully committed to and embracing of the Linnean system of nomenclature, and who philosophically rejects the idea that "species" are somehow "special"; and Nico, an architect of the PhyloCode, who rejects the value of ranks in nomenclature altogether, and embraces the notion that "species" are not only "special", but aren't even clades), can still be friends, colleagues on the same grant proposal, in full agreement on most of this thread (albeit for diametrically opposed reasons), and, perhaps most important of all, be completely civil in our often spirited (sometimes *very* spirited -- especially if err... "spirits" are involved) debates.
Jon Stewart would be proud of us both.
As for the utility of ranks -- well, I'm ready to agree to disagree, as long as we can both acknowledge that lots and lots of existing data "out there" are still classified (in a taxonomic sense) against the Linnaean nomenclatural system, for which dwc:taxonRank is a relevant (at least) attribute.
Aloha, Rich
Wonderful to agree and disagree with you, yes indeed, even better with *spirit*!
Jon Stewart would be proud of us both.
I went to the rally in DC last Saturday! We two are indeed sane :-)
As for the utility of ranks -- well, I'm ready to agree to disagree, as long as we can both acknowledge that lots and lots of existing data "out there" are still classified (in a taxonomic sense) against the Linnaean nomenclatural system, for which dwc:taxonRank is a relevant (at least) attribute.
Indeed! As mentioned in a previous email to you, I can live (even happy) with dwc:taxonRank :-)
Nico (who does not believe in species but thinks only clades are real)
<><><><><><><><><><><><><><><><><><><><><><><><> Nico Cellinese, Ph.D. Assistant Curator, Herbarium & Informatics Adjunct Assistant Professor, Department of Biology
Florida Museum of Natural History University of Florida 354 Dickinson Hall, PO Box 117800 Gainesville, FL 32611-7800, U.S.A. Tel. 352-273-1979 Fax 352-846-1861 http://cellinese.blogspot.com/
Ah, but here's the rub when it comes to Propositional or First Order Logic: If there is any assertion of Rich's---or anybody else's---about which you, Nico C, both agree and disagree, then every assertion whatsoever of yours is correct. Now that part, I realize you already accept. The catch is that every assertion whatsoever of yours is also false. See http://en.wikipedia.org/wiki/Principle_of_explosion --Bob
Of course Bob!!
Nico (picking up her pieces after exploding)
On Nov 3, 2010, at 11:15 PM, Bob Morris wrote:
Ah, but here's the rub when it comes to Propositional or First Order Logic: If there is any assertion of Rich's---or anybody else's---about which you, Nico C, both agree and disagree, then every assertion whatsoever of yours is correct. Now that part, I realize you already accept. The catch is that every assertion whatsoever of yours is also false. See http://en.wikipedia.org/wiki/Principle_of_explosion --Bob
-- Robert A. Morris Emeritus Professor of Computer Science UMASS-Boston 100 Morrissey Blvd Boston, MA 02125-3390 Associate, Harvard University Herbaria email: morris.bob@gmail.com web: http://bdei.cs.umb.edu/ web: http://etaxonomy.org/mw/FilteredPush http://www.cs.umb.edu/~ram phone (+1) 857 222 7992 (mobile)
Nice pic! I remember when you took it. For those who don't know us personally, Bob is the one standing, waving his arms (as Bob is prone to do). I am the one in the lower right foreground, looking like I've gone without sleep for days (as I am prone to do).
At the time, I was looking over your shoulder and saw that you were tweeting back and forth with Jim Croft (and I think Rod Page as well). What made that funny to me was that Jim was in the room, just out of frame on the left side of the photo, and Rod was (presumably) in Glasgow, and the three of you were having a virtual conversation in real time, during a conference presentation. It's a different world....
Anyway, to bring it back into scope for this email list, I'll note that in the photo I'm wearing my treasured GBIF/TDWG T-shirt, customized with the embroidered caption, "GUID's Rule". I think of "GUID's" as possessive, so it's comparable to something like "God's Rule". Coincidentally, I happen to be wearing that very same shirt right now.
So...where were we?
Rich
P.S. One final off-topic salvo: Nico, you have no idea how jealous I am that you were at the rally!
________________________________
From: Nico Cellinese [mailto:ncellinese@flmnh.ufl.edu]
Sent: Wednesday, November 03, 2010 5:21 PM
To: Bob Morris
Cc: Richard Pyle; tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] A question of Ranks [was: Re: tdwg-content Digest, Vol 20, Issue 17]

Of course Bob!!
Nico (picking up her pieces after exploding)
I think the pic was stripped by the listserv. Here it is http://bit.ly/9YQHMf
Nico
No need to explode! Explosions happen in classical logics, but as a link from Bob's link already shows, paraconsistent logics (http://en.wikipedia.org/wiki/Paraconsistent_logic) can deal more gracefully with inconsistencies. I used to converse with some paraconsistent people. There's even a handbook on it: Handbook of Paraconsistency, http://www.amazon.com/dp/1904987737/ (Jean-Yves Béziau, http://en.wikipedia.org/wiki/Jean-Yves_B%C3%A9ziau; Walter Carnielli, http://en.wikipedia.org/wiki/Walter_Carnielli; and Dov Gabbay, http://en.wikipedia.org/wiki/Dov_Gabbay, eds). London: King's College, 2007. ISBN 978-1-904987-73-4, http://en.wikipedia.org/wiki/Special:BookSources/9781904987734
Maybe it should be a companion field guide for the next Bioblitz, when dealing with, well, logical issues in taxonomy ;-)
Bertram
--
Bertram Ludäscher
Professor of Computer Science
Dept of Computer Science & Genome Center
University of California, Davis
ludaesch@ucdavis.edu / daks.ucdavis.edu
Phone: +1-530-554-1800
See also http://xkcd.com/816/ on the subject.
I looked up "hasPart" and "isPartOf" and came up with: http://dublincore.org/documents/dcmi-terms/#terms-hasPart and http://dublincore.org/documents/dcmi-terms/#terms-isPartOf
The RDF associated with the URIs http://purl.org/dc/terms/hasPart (i.e. dcterms:hasPart) and http://purl.org/dc/terms/isPartOf (i.e. dcterms:isPartOf) seems relatively "safe" in that I don't think it implies anything that is not true about the relationship between "composite" Individuals and their child/subset Individuals that have been identified to a more precise taxonomic level. Both terms are subproperties of dcterms:relation, which also seems to correctly represent the connection between the parent and child Individuals.
If one wanted to convey more information about the relationship (such as the fact that the child Individuals would be members of all of the taxonomic groups that their parents were members of, or things like that), I suppose one could define more specific DwC properties (perhaps subproperties of dcterms:hasPart and dcterms:isPartOf) that carried more "baggage". Again, careful thought would be needed first.
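As a rough sketch of what this could look like, the following builds the part/whole statements as plain (subject, predicate, object) tuples and then derives the more general dcterms:relation statements from the subproperty declarations. The three dcterms URIs are the real ones cited above; the example.org Individual URIs and the helper names are assumptions made up for illustration, and no RDF library is used.

```python
DCTERMS = "http://purl.org/dc/terms/"
HAS_PART = DCTERMS + "hasPart"
IS_PART_OF = DCTERMS + "isPartOf"
RELATION = DCTERMS + "relation"

# Both part/whole properties are declared as subproperties of dcterms:relation.
SUBPROPERTY_OF = {HAS_PART: RELATION, IS_PART_OF: RELATION}


def link_parts(parent, children):
    """Return hasPart / isPartOf triples connecting a composite to its children."""
    triples = set()
    for child in children:
        triples.add((parent, HAS_PART, child))
        triples.add((child, IS_PART_OF, parent))
    return triples


def with_inferred_relations(triples):
    """Add the dcterms:relation triple implied by each subproperty statement."""
    inferred = {(s, SUBPROPERTY_OF[p], o) for (s, p, o) in triples if p in SUBPROPERTY_OF}
    return triples | inferred


# Hypothetical URIs for a composite Individual and two child Individuals.
parent = "http://example.org/individual/jar-42"
children = [parent + "/a", parent + "/b"]

graph = with_inferred_relations(link_parts(parent, children))
for triple in sorted(graph):
    print(triple)
```

Nothing beyond "is part of" and "is related to" is asserted here, which matches the "safe" reading; any stronger semantics, such as children inheriting the parent's taxon membership, would need a more specific property.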
Steve
Richard Pyle wrote:
Maybe the "somethings" will all be identified to the same species, and maybe they'll all be identified to different species. When we get to that stage, if more than one taxon is represented within what was previously regarded as a single Individual, then we generate new Individual instances that correspond to the different taxon identifications, and tag them accordingly with their own Identifications (maintaining, of course, the appropriate "derived from" relationships among the Individuals).
They will want to relate that specimen back to the jar it came from.
Yes, definitely! We certainly want to have a mechanism to address Dean's point about Individuals derived from Individuals. Whether it should be a parent-child sort of derivation, or something more general, is a topic for another thread.
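The splitting step described above can be sketched as follows. This is a minimal illustration, not a proposed schema: the derived_from field, the members list, the split_by_taxon helper, and all identifiers are hypothetical, and a simple parent-child link stands in for whatever derivation mechanism is eventually chosen.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Individual:
    individual_id: str
    taxon: Optional[str] = None
    derived_from: Optional[str] = None  # hypothetical "derived from" link to the parent
    members: list = field(default_factory=list)


def split_by_taxon(composite, member_taxa):
    """Return one new Individual per taxon found among the composite's members."""
    groups = {}
    for member, taxon in member_taxa.items():
        groups.setdefault(taxon, []).append(member)
    return [
        Individual(
            individual_id=f"{composite.individual_id}-{index}",
            taxon=taxon,
            derived_from=composite.individual_id,
            members=sorted(members),
        )
        for index, (taxon, members) in enumerate(sorted(groups.items()), start=1)
    ]


# A jar first identified only to genus; later its members are sorted to species.
jar = Individual("jar-42", taxon="Erebia", members=["m1", "m2", "m3"])
children = split_by_taxon(
    jar,
    {"m1": "Erebia youngi", "m2": "Erebia lafontainei", "m3": "Erebia youngi"},
)
for child in children:
    print(child.individual_id, child.taxon, child.members, "derived from", child.derived_from)
```

Because each child keeps a pointer to the jar, a specimen can always be related back to the lot it came from, while the jar itself keeps its original genus-level Identification.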
Hi Nico,
Are you wanting to model things that you think are actually species within a larger clade or are these more like subspecies?
I think this gets at a bit of what these are to be used for.
Are they for modeling classifications or occurrences and other data about individuals or sets of individuals which represent instances of species?
- Pete
On Wed, Nov 3, 2010 at 1:19 PM, Nico Cellinese ncellinese@flmnh.ufl.edu wrote:
Steve is using Species as ranks in his definition and I think this is the wrong approach. Let's make all this rank agnostic please! Use the word taxon! What if I have a group of organisms that represents a polyphyletic species and I want to name a lineage (group of organisms) within this traditionally recognized species that I am not recognizing as species per se (as in rank of species). In other words, Identifications and ranks are two different things, so let's abandon ranks for a more objective discussion on taxa. Individuals are definitively not species.
Nico (the other one)
On Wed, Nov 3, 2010 at 7:24 AM, Steve Baskauf < steve.baskauf@vanderbilt.edu> wrote:
John, I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the *Individual should only include groups of organisms that are reliably known to be of a single species* - *it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species*). Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
Steve
John Wieczorek wrote:
Most of you probably do not receive postings from the Google Code site for Darwin Core. Steve B. updated the proposal for the new term Individual, and then commentary ensued on the Issue tracker. Since there remains an unresolved issue, I'm bringing the discussion back here by adding the commentary stream below. The unresolved issue in Steve's amendment is the restriction in the definition to "a single species (or lower taxonomic rank if it exists)."
Rich argues that we should not obviate the capability of applying an Identification to an aggregate (e.g., fossil), where the aggregate consists of multiple taxa.
Steve argues that Identifications should be applied only to aggregates of a single taxon.
Steve, aside from the aggregate issue (which should be solved satisfactorily), your suggestion is too restrictive, because it would obviate the possibility of making an Identification (even for a single organism) to any rank less specific than a species. That is a loss of capability, and therefore unreasonable.
Comment 7 < http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Typ... by baskaufs http://code.google.com/u/baskaufs/ , Today (8 hours ago) As a result of the discussion that has taken place on the tdwg-content email list during 2010 October and November, I am updating the term recommendation for Individual as follows:
Definition: The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single species (or lower taxonomic rank if it exists).
Comment: Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification.
Refines: N/A
Please note that as a precautionary measure, I have removed the statement that Individual refines http://purl.org/dc/dcmitype/PhysicalObject because the definition of PhysicalObject specifically mentions that the object is inanimate. I am not currently aware of any well-known term that defines living things.
Steve Baskauf
Comment 8 < http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Typ... by deepreef@hawaii.rr.com http://code.google.com/u/deepreef@hawaii.rr.com/ , Today (8 hours ago) I think the definition should be "...represent a single taxon". We shouldn't restrict it to members of the same species (or lower), because then we technically can't include things that may represent more than one species, yet would best be treated within the scope of an Individual.
Also, I'm slightly partial to the term "Organism" for this class, rather than "Individual", because it's more clearly tied to the biology domain, and less likely to collide with the word "Individual" in other domains. I know such collision is not a technical problem, but it might lead to some confusion.
Comment 9 < http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Typ... by baskaufs http://code.google.com/u/baskaufs/ , Today (8 hours ago) Well, the reason that I defined it to be members of the same species is to ensure that the term Individual can serve the primary function that I perceived was needed: to make the connection from occurrences to identifications. When I said one or more identifications, I meant one or more opinions about what that single species (or lower) was, not that there could be multiple identifications of several different species that happened to be in the same "bag" such as the contents of a pitfall trap containing multiple species, an image that contained several species, or a specimen that contained parasites of a different species. I think that there is a need for a term for this other kind of thing, (a heterogeneous "lot", "batch", or something), but I think that including this in definition of Individual defeats the purpose for which I proposed it. If there were several different species in the "Individual", then one would have to specify which identification went with which biological individual within the "lot", which would result in actually breaking down the "lot" into single species "Individuals" anyway.
I want to model things in a way that makes sense to me and to everyone else who disagrees with me too. If we all disagree on what a species is, then let's not deal with species or even ranks. Individuals = taxa. As you said, species can be sets of individuals, so how does that differ from Steve's aggregates?
Nico
On Nov 3, 2010, at 4:42 PM, Peter DeVries wrote:
Hi Nico,
Are you wanting to model things that you think are actually species within a larger clade or are these more like subspecies?
I think this gets at a bit of what these are to be used for.
Are they for modeling classifications or occurrences and other data about individuals or sets of individuals which represent instances of species?
- Pete
On Wed, Nov 3, 2010 at 1:19 PM, Nico Cellinese ncellinese@flmnh.ufl.edu wrote:
Steve is using Species as ranks in his definition and I think this is the wrong approach. Let's make all this rank agnostic please! Use the word taxon! What if I have a group of organisms that represents a polyphyletic species and I want to name a lineage (group of organisms) within this traditionally recognized species that I am not recognizing as species per se (as in rank of species). In other words, Identifications and ranks are two different things, so let's abandon ranks for a more objective discussion on taxa. Individuals are definitively not species.
Nico (the other one)
On Wed, Nov 3, 2010 at 7:24 AM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
John, I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the Individual should only include groups of organisms that are reliably known to be of a single species - it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species). Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
Steve
John Wieczorek wrote:
Most of you probably do not receive postings from the Google Code site for Darwin Core. Steve B. updated the proposal for the new term Individual, and then commentary ensued on the Issue tracker. Since there remains an unresolved issue, I'm bringing the discussion back here by adding the commentary stream below. The unresolved issue in Steve's amendment is the restriction in the definition to "a single species (or lower taxonomic rank if it exists)."
Rich argues that we should not obviate the capability of applying an Identification to an aggregate (e.g., fossil), where the aggregate consists of multiple taxa.
Steve argues that Identifications should be applied only to aggregates of a single taxon.
Steve, aside from the aggregate issue (which should be solved satisfactorily), your suggestion is too restrictive, because it would obviate the possibility of making an Identification (even for a single organism) to any rank less specific than a species. That is a loss of capability, and therefore unreasonable.
Comment 7 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c7 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
As a result of the discussion that has taken place on the tdwg-content email list during 2010 October and November, I am updating the term recommendation for Individual as follows:
Definition: The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single species (or lower taxonomic rank if it exists).
Comment: Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification.
Refines: N/A
Please note that as a precautionary measure, I have removed the statement that Individual refines http://purl.org/dc/dcmitype/PhysicalObject because the definition of PhysicalObject specifically mentions that the object is inanimate. I am not currently aware of any well-known term that defines living things.
Steve Baskauf
Comment 8 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c8 by deepreef@hawaii.rr.com http://code.google.com/u/deepreef@hawaii.rr.com/, Today (8 hours ago)
I think the definition should be "...represent a single taxon". We shouldn't restrict it to members of the same species (or lower), because then we technically can't include things that may represent more than one species, yet would best be treated within the scope of an Individual.
Also, I'm slightly partial to the term "Organism" for this class, rather than "Individual", because it's more clearly tied to the biology domain, and less likely to collide with the word "Individual" in other domains. I know such collision is not a technical problem, but it might lead to some confusion.
Comment 9 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c9 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
Well, the reason that I defined it to be members of the same species is to ensure that the term Individual can serve the primary function that I perceived was needed: to make the connection from occurrences to identifications. When I said one or more identifications, I meant one or more opinions about what that single species (or lower) was, not that there could be multiple identifications of several different species that happened to be in the same "bag" such as the contents of a pitfall trap containing multiple species, an image that contained several species, or a specimen that contained parasites of a different species. I think that there is a need for a term for this other kind of thing (a heterogeneous "lot", "batch", or something), but I think that including this in the definition of Individual defeats the purpose for which I proposed it. If there were several different species in the "Individual", then one would have to specify which identification went with which biological individual within the "lot", which would result in actually breaking down the "lot" into single-species "Individuals" anyway.
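Editorial sketch, not part of the original thread: the relationships argued for above (an Individual linking one or more Occurrences to one or more Identifications, and a separate aggregate holding many Individuals) could be modeled roughly as follows. The class name `Lot`, all field names, and the `taxa` helper are hypothetical illustrations, not Darwin Core terms or anything the authors proposed in this form.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Identification:
    """One opinion about which taxon an Individual belongs to (any rank)."""
    taxon: str
    identified_by: str


@dataclass
class Occurrence:
    """A record that an Individual was observed or collected."""
    occurrence_id: str


@dataclass
class Individual:
    """One organism, or a group believed to represent a single taxon."""
    individual_id: str
    occurrences: List[Occurrence] = field(default_factory=list)
    identifications: List[Identification] = field(default_factory=list)


@dataclass
class Lot:
    """Hypothetical heterogeneous aggregate: one Lot to many Individuals."""
    lot_id: str
    individuals: List[Individual] = field(default_factory=list)

    def taxa(self) -> List[str]:
        """Latest identification of each member, de-duplicated and sorted."""
        return sorted({ind.identifications[-1].taxon
                       for ind in self.individuals if ind.identifications})


# A pitfall trap holding two different organisms: each gets its own
# Individual, and the beetle carries two successive opinions (family, then species).
beetle = Individual(
    "ind-1",
    [Occurrence("occ-1")],
    [Identification("Carabidae", "A"), Identification("Carabus nemoralis", "B")],
)
spider = Individual("ind-2", [Occurrence("occ-1")], [Identification("Lycosidae", "A")])
trap = Lot("lot-1", [beetle, spider])
print(trap.taxa())
```

In this sketch the beetle's family-level identification sits alongside a later species-level one, which mirrors the point that multiple Identifications are differing opinions about one taxon, while the mixed contents of the trap are split into separate Individuals under the aggregate.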
I think that there was something for this kind of thing in the current DarwinCore, something like batchNumber?
Once you let the URI stand for anything then you are unable to make useful valid assertions about that thing.
- Pete
On Wed, Nov 3, 2010 at 3:59 PM, Nico Cellinese ncellinese@flmnh.ufl.edu wrote:
I want to model things in a way that makes sense to me and to everyone else who disagrees with me too. If we all disagree on what a species is, then let's not deal with species or even ranks. Individuals = taxa. As you said, species can be sets of individuals, so how does that differ from Steve's aggregates?
Nico
participants (7)
- Bertram Ludaescher
- Bob Morris
- Dean Pentcheff
- Nico Cellinese
- Peter DeVries
- Richard Pyle
- Steve Baskauf