Request for vote on proposals to add Individual as a Darwin Core class and to add the term individualRemarks as a term within that class
I am pleased with the significant and thoughtful discussion that has taken place on the tdwg-content email list regarding the relationships among Occurrences, Individuals, and other entities that are a part of the community's thinking about biodiversity metadata and the way that those metadata are structured. It appears from the discussion that there is widespread acceptance of the idea that Individual as a concept has a place in the structuring of biodiversity metadata and that there is some consensus on what "Individual" means (i.e. an entity ranging from actual biological individuals to small coherent populations that can reliably be asserted to represent a single taxon). Whether that acceptance and consensus constitute a compelling need for adding two new terms (the class dwc:Individual and dwc:individualRemarks) to the Darwin Core standard is the point of a TAG "vote". Given the discussion that has occurred, it seems to me that there are two reasons why there is an actual need for those terms. One reason is that if members of the Darwin Core constituency intend to structure their metadata in a fully normalized manner that includes grouping Occurrences by Individuals (and it appears that there are at least several who intend to do this), the term dwc:individualRemarks is needed to provide a means to indicate the nature of the individual (i.e. is it a biological individual, clonal individuals, a small population, etc.?) and the class dwc:Individual is needed as the category within which to put individualRemarks so as to indicate that individualRemarks is a property of Individuals. The second reason for explicitly recognizing Individual as a class is that it would place a term representing the concept of "Individual" within a "well-known vocabulary". I feel that would be critical for facilitating the ultimate development of a recommendation for the representation of Darwin Core as RDF.
At this point, it is not clear to me that there are any other existing DwC terms that should be moved to a new Individual class. Originally, I suggested that individualCount should be placed in that class, but I no longer think so. Counting the number of individuals is really something that happens when an Occurrence takes place, and a small cohesive group of a single taxon (e.g. wolf pack or plant population) could have an individualCount that changes over time. As was discussed earlier on the email list, the xxxxxxID terms probably really belong in the Record-level terms category rather than being listed within particular classes. So I don't believe that dwc:individualID should be in the proposed class either. As I detailed in my Biodiversity Informatics paper, an Individual is really an entity that serves primarily as a node that allows the grouping of other resources (namely Occurrences and Identifications). As such, it really has few (or no) properties that can be known outside of Occurrences.
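As a rough illustration of the grouping role described above, the following is a minimal sketch and not part of the proposal. It assumes plain Python dataclasses; the class layout, field names, identifiers, dates, and counts are all hypothetical and are not a Darwin Core serialization. It shows an Individual acting as a node that groups Occurrences and Identifications, with the count recorded on each Occurrence so that it can change over time.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Occurrence:
    occurrence_id: str
    event_date: str
    # Recorded per Occurrence (assumption of this sketch), so the count
    # for the same group can differ between observations.
    individual_count: int


@dataclass
class Identification:
    identification_id: str
    scientific_name: str


@dataclass
class Individual:
    individual_id: str
    # Free text describing the nature of the individual,
    # e.g. "biological individual" or "small population".
    individual_remarks: str
    occurrences: List[Occurrence] = field(default_factory=list)
    identifications: List[Identification] = field(default_factory=list)


# Hypothetical example data: one wolf pack observed twice.
pack = Individual("ind-1", "small population (wolf pack)")
pack.occurrences.append(Occurrence("occ-1", "2010-05-01", 7))
pack.occurrences.append(Occurrence("occ-2", "2010-10-01", 9))
pack.identifications.append(Identification("id-1", "Canis lupus"))

# The Individual itself carries almost no properties; the counts live
# on the Occurrences it groups.
counts = [occ.individual_count for occ in pack.occurrences]
print(counts)
```

In this layout the Individual holds only its remarks and the links to other records, which matches the observation that it has few properties that can be known outside of Occurrences.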
Thus I would like to "call the question" on the issue of the proposal. I would suggest that the issue of adding the class dwc:Individual and the term dwc:individualRemarks within it be addressed in a single vote, since there is little point in having one term without the other. I would also hope that those on the TAG who choose to vote would review the list discussion carefully first. The fact that the question of "what exactly is an Individual?" came up a few times after it had been clearly answered in the thread is an indication that some people entered the thread later on without the benefit of having read some of the earlier posts.
Steve
At this point, it is not clear to me that there are any other existing DwC terms that should be moved to a new Individual class.
My feeling is that, if an Individual class was established, it would be a logical place for dwc:preparations, dwc:previousIdentifications, dwc:associatedSequences, and possibly dwc:disposition (depending on how, exactly, that term is scoped).
I was tempted to also add dwc:catalogNumber and dwc:otherCatalogNumbers; but now I'm wondering why those two terms are not included among the "Record-level Terms". Does anyone know the logic behind why things like dwc:institutionCode and dwc:collectionCode (and related terms) are in "Record-level Terms", while dwc:catalogNumber and dwc:otherCatalogNumbers are in the Occurrence class?
Originally, I suggested that individualCount should be placed in that class, but I no longer think so. Counting the number of individuals is really something that happens when an Occurrence takes place and a small cohesive group of a single taxon (e.g. wolf pack or plant population) could have an individualCount that changes over time.
Hmmm...I'm not sure. If "Individual" can range from a single organism, to several organisms, to a single colony of organisms, to a defined population of organisms (possibly up to an entire taxon), then how would you indicate the "scope" of organisms represented by a single instance of "Individual"? Wouldn't individualCount serve this function? Or, maybe you're right that individualCount should describe the Occurrence, and another term (e.g., "individualScope") would be needed to define the scope of the "Individual" instance.
As was discussed earlier on the email list, the xxxxxxID terms probably really belong in the Record-level terms category rather than being listed within particular classes. So I don't believe that dwc:individualID should be in the proposed class either.
I missed that conversation. This doesn't seem right to me, but if the consensus is that the xxxID terms are all best treated as "Record-level Terms", then certainly "individualID" should be treated accordingly.
Aloha, Rich
Not sure if this has been mentioned as I have struggled to keep up with this thread, but it sounds to me like the benefit of the Individual class/properties is to be able to link together various web resources that refer to data obtained from the same individual in some manner, so we probably need terms that allow the description of how these individuals, or parts of individuals relate to each other. The Scope idea will help, but maybe there is a need for terms like "partOfIndividual", "derivedFromIndividual"?
Then, I suppose we are heading into interaction territory, so we then need to model how interactions between individuals can be defined, e.g. "foundOnHostIndividual" ...
This is starting to sound quite appealing, as I think the "Individual" resources could be seen as the atoms for defining occurrences, in the same way that NameUsages are the atoms for defining Taxon Names and Concepts.
Kevin
-----Original Message----- From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Richard Pyle Sent: Monday, 1 November 2010 7:46 a.m. To: 'Steve Baskauf'; tdwg-content@lists.tdwg.org; tdwg-tag@lists.tdwg.org Subject: Re: [tdwg-content] Request for vote on proposals to add Individual as a Darwin Core class and to add the term individualRemarks as a term within that class
[...]
On Sun, Oct 31, 2010 at 4:21 PM, Kevin Richards <RichardsK@landcareresearch.co.nz> wrote:
Not sure if this has been mentioned as I have struggled to keep up with this thread, but it sounds to me like the benefit of the Individual class/properties is to be able to link together various web resources that refer to data obtained from the same individual in some manner, so we probably need terms that allow the description of how these individuals, or parts of individuals relate to each other. The Scope idea will help, but maybe there is a need for terms like "partOfIndividual", "derivedFromIndividual"?
Now you're talkin Kevin! Actually, now you're talking about ontology, and I plead: Go slow, develop use cases; develop competency questions; develop tools. I note that Steve was careful to separate the question of adding terms to the normative, representation free, DwC, from the problem(s) of making an RDF representation of same:
On Sun, Oct 31, 2010 at 10:38 AM, Steve Baskauf <steve.baskauf@vanderbilt.edu> wrote:
I am pleased with the significant and thoughtful discussion that has taken place on the tdwg-content email list regarding the relationships among Occurrences, Individuals, and other entities that are a part of the community's thinking about biodiversity metadata and the way that those metadata are structured. [...] I feel that would be critical for facilitating the ultimate development of a recommendation for the representation of Darwin Core as RDF. [...]
This separation is important, because what use one intends to make of an RDF representation has a lot of bearing on what "gotchas" one has to take care about.
For example, if there is a desire to exploit the formal semantics available for the RDF stack (which will probably emerge as a requirement once one starts talking about relations between properties), then the surprises that emerge will differ from the pitfalls encountered if one "merely" wishes to put SKOS relations on the properties and reason about the SKOS instead of the science. But I guess, for example, that Miranker's Morphster project [1] will benefit most in its current use cases from good mereological ontologies for descriptive data, not just stuff like "more general than".
There are surprises even in the simplest use of RDFS and formalisms about classes. I've previously whined about premature assignment of rdfs:domain while conceding (did I???) that it can sometimes make a designer's intention clearer to humans. Perhaps more startling is that type assignment automatically "creates" an rdfs:Class if one was not already available, due to the formal semantics of rdf:type [3]. Thus, in an earlier posting, Paul Murray has (unintentionally?) introduced a new class apni:TaxonName in 33407.rdf [2] via <rdf:type rdf:resource="http://biodiversity.org.au/voc/apni/APNI#TaxonName"/>
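To make the two surprises concrete, here is a minimal sketch of the relevant RDFS inferences, written as a toy forward-chaining loop over string triples rather than with any real RDF library. The subject "apni:33407", the resource "ex:thing", and the rdfs:domain statement on dwc:individualRemarks are assumptions made up for illustration; no such domain is declared in Darwin Core.

```python
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_DOMAIN = "rdfs:domain"


def rdfs_closure(triples):
    """Apply two RDFS inference patterns until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        # Collect (property, class) pairs from rdfs:domain statements.
        domains = {(s, o) for (s, p, o) in inferred if p == RDFS_DOMAIN}
        for (s, p, o) in inferred:
            # The range of rdf:type is rdfs:Class, so the object of any
            # rdf:type triple is inferred to be a class.
            if p == RDF_TYPE:
                new.add((o, RDF_TYPE, RDFS_CLASS))
            # If property p has a declared domain, any subject that uses
            # p is inferred to be an instance of that domain class.
            for (prop, cls) in domains:
                if p == prop:
                    new.add((s, RDF_TYPE, cls))
        if new <= inferred:
            return inferred
        inferred |= new


data = {
    # Hypothetical subject; only the typing pattern matters here.
    ("apni:33407", RDF_TYPE, "apni:TaxonName"),
    # Assumed (premature) domain declaration, for illustration only.
    ("dwc:individualRemarks", RDFS_DOMAIN, "dwc:Individual"),
    ("ex:thing", "dwc:individualRemarks", "a clonal stand"),
}

closure = rdfs_closure(data)
print(("apni:TaxonName", RDF_TYPE, RDFS_CLASS) in closure)
print(("ex:thing", RDF_TYPE, "dwc:Individual") in closure)
```

Nobody asserted that apni:TaxonName is a class or that ex:thing is an Individual; both statements follow from the typing and the domain declaration alone, which is why a domain assigned too early can quietly type resources the designer never meant to classify.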
Then there is the question of adequate tools for the desired style of ontology architecture. The OWL community's important tools are not friendly even to DublinCore, whose style is(?) what DwC follows. (Steve Baskauf has complained to me in private email that the Manchester validators don't seem to even check rdfs vocabulary correctly; Paul complained that Protege4 makes a big mishmash of Properties when importing DwC. Both of these are probably false positives in cases of insufficient typing of properties themselves, and the OWL community probably doesn't care about the origin or utility of such weak typing [4].)
So the hard part is yet to come. But I agree with you. It is quite appealing.
Bob Morris
Robert A. Morris
Emeritus Professor of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: morris.bob@gmail.com
web: http://bdei.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1) 857 222 7992 (mobile)
[1] http://www.cs.utexas.edu/~miranker/studentWeb/MorphsterHomePage.html
[2] http://biodiversity.org.au/apni.name/33407.rdf
[3] http://www.w3.org/TR/rdf-schema/#ch_type
[4] https://mailman.stanford.edu/pipermail/p4-feedback/2009-October/002448.html
--
Yes, I agree with statements made by Bob, Kevin, and Rich: if we get the class Individual, we need to think carefully about what properties go with it. The only one that is obvious to me at present is individualRemarks. In line with what Bob said, I think that figuring out how we will use terms and trying to model relationships and properties in RDF will make it clearer what terms can be appropriately used with instances of various classes and whether it "works" to have a term be applied to several classes. I think individualCount is a case in point. At first it seemed to me that it should be in the Individual class, because it's a property of Individual the way I've defined Individual. But since the count may change with time, then maybe it should be a property of Occurrences (which are essentially observations of the Individual at different times). But how often is anyone actually going to track the number of individuals over time? I could see it happening with wolf packs, but in that case one would probably rather track each individual wolf rather than calling the whole pack an "individual". So what makes sense depends on how people need to use individualCount.
Rich commented that he wasn't sure why catalog number wasn't in the Record Level terms. I'm also wondering about that, particularly since I would use a catalog number with a token (e.g. PreservedSpecimen, StillImage) or Individual (if it were a living specimen) but probably not for an Occurrence (although somebody else might). I believe that in a previous discussion of the xxxxxxxID terms on tdwg-content (can't check because Internet is out at my house and I'm working offline) someone said that all of those terms should go into the Record Level terms because one might use them as properties for various classes. But that hasn't happened. Is that because nobody has gotten around to it or because it takes some kind of official action to "change the standard"? To recap that conversation from memory, it was agreed that people appropriately use the xxxxxxxID terms two ways: as column headings in a database table to indicate what the identifier is for the row, and as ID references connecting the row to records in other tables. In the first kind of use, it makes sense to have xyzID in the class xyz, but in the second case it does not. You can see an example of these dual uses in the DwC XML guide. This is another example where some clear instructions on the use of terms in the DwC documentation would help (i.e. to say that both of these kinds of uses are acceptable). When I was trying to understand Paul's email on Taxon and Name, I was having trouble because I was continually confused about which of the two ways of using xxxxxxID terms he was intending as either would apparently be correct.
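The two uses of the xxxID terms recalled above can be sketched with two small tables. This is only an illustration under assumed data: the rows, identifier values, and the helper function are hypothetical, and the tables are plain Python lists of dictionaries rather than any official Darwin Core format.

```python
# Table of Individuals: here individualID identifies the row itself
# (the first kind of use).
individuals = [
    {"individualID": "ind-1", "individualRemarks": "biological individual"},
    {"individualID": "ind-2", "individualRemarks": "small population"},
]

# Table of Occurrences: occurrenceID identifies the row, while
# individualID is a reference to a row in the other table
# (the second kind of use).
occurrences = [
    {"occurrenceID": "occ-1", "individualID": "ind-1"},
    {"occurrenceID": "occ-2", "individualID": "ind-1"},
    {"occurrenceID": "occ-3", "individualID": "ind-2"},
]


def occurrences_for(individual_id):
    """Return the occurrenceIDs whose individualID refers to the given row."""
    return [
        row["occurrenceID"]
        for row in occurrences
        if row["individualID"] == individual_id
    ]


# Follow the references from each Individual row to its Occurrences.
grouped = {row["individualID"]: occurrences_for(row["individualID"])
           for row in individuals}
print(grouped)
```

The same term name appears in both tables but plays a different role in each, which is why it is hard to place it in a single class.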
With regard to "individualScope", I was assuming that could be covered in individualRemarks. I suppose individualScope could have a controlled vocabulary and individualRemarks not. But again, I think we can see how people want to use Individuals before deciding what terms need to go with them (assuming that Individual gets added to DwC).
Steve
Bob Morris wrote:
[...]
Most of you probably do not receive postings from the Google Code site for Darwin Core. Steve B. updated the proposal for the new term Individual, and then commentary ensued on the Issue tracker. Since there remains an unresolved issue, I'm bringing the discussion back here by adding the commentary stream below. The unresolved issue with Steve's amendment is the restriction in the definition to "a single species (or lower taxonomic rank if it exists)."
Rich argues that we should not obviate the capability of applying an Identification to an aggregate (e.g., fossil), where the aggregate consists of multiple taxa. Steve argues that Identifications should be applied only to aggregates of a single taxon.
Steve, aside from the aggregate issue (which should be solved satisfactorily), your suggestion is too restrictive, because it would obviate the possibility of making an Identification (even for a single organism) to any rank less specific than a species. That is a loss of capability, and therefore unreasonable.
Comment 7 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c7 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
As a result of the discussion that has taken place on the tdwg-content email list during 2010 October and November, I am updating the term recommendation for Individual as follows:
Definition: The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single species (or lower taxonomic rank if it exists).
Comment: Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification.
Refines: N/A
Please note that as a precautionary measure, I have removed the statement that Individual refines http://purl.org/dc/dcmitype/PhysicalObject because the definition of PhysicalObject specifically mentions that the object is inanimate. I am not currently aware of any well-known term that defines living things.
Steve Baskauf
Comment 8 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c8 by deepreef@hawaii.rr.com http://code.google.com/u/deepreef@hawaii.rr.com/, Today (8 hours ago)
I think the definition should be "...represent a single taxon". We shouldn't restrict it to members of the same species (or lower), because then we technically can't include things that may represent more than one species, yet would best be treated within the scope of an Individual.
Also, I'm slightly partial to the term "Organism" for this class, rather than "Individual", because it's more clearly tied to the biology domain, and less likely to collide with the word "Individual" in other domains. I know such collision is not a technical problem, but it might lead to some confusion.
Comment 9 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c9 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
Well, the reason that I defined it to be members of the same species is to ensure that the term Individual can serve the primary function that I perceived was needed: to make the connection from occurrences to identifications. When I said one or more identifications, I meant one or more opinions about what that single species (or lower) was, not that there could be multiple identifications of several different species that happened to be in the same "bag" such as the contents of a pitfall trap containing multiple species, an image that contained several species, or a specimen that contained parasites of a different species. I think that there is a need for a term for this other kind of thing (a heterogeneous "lot", "batch", or something), but I think that including this in the definition of Individual defeats the purpose for which I proposed it. If there were several different species in the "Individual", then one would have to specify which identification went with which biological individual within the "lot", which would result in actually breaking down the "lot" into single-species "Individuals" anyway.
John, I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the Individual should only include groups of organisms that are reliably known to be of a single species - it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species). Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
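The one:many relationship suggested above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name Aggregate, the identifiers, and the example taxon names are hypothetical, since no term for aggregates has been proposed in this thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Individual:
    individual_id: str
    # Opinions about the one taxon this Individual represents; an
    # opinion may be at any rank (family, genus, species, ...).
    identifications: List[str] = field(default_factory=list)


@dataclass
class Aggregate:
    """Hypothetical class for a heterogeneous lot, e.g. pitfall trap contents."""
    aggregate_id: str
    individuals: List[Individual] = field(default_factory=list)


# One aggregation : many Individuals, each representing a single taxon.
trap = Aggregate("lot-1")
beetle = Individual("ind-1", ["Carabidae", "Carabus nemoralis"])
spider = Individual("ind-2", ["Araneae"])
trap.individuals.extend([beetle, spider])

# Each identification attaches to one Individual, never to the lot.
names = {ind.individual_id: ind.identifications for ind in trap.individuals}
print(names)
```

Here the second Individual is identified only to a rank above species, which is still compatible with it being known to represent a single species.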
Steve
John Wieczorek wrote:
[...]
I have no further objection if the wording is changed as you propose here. If no one else has any objections, this one seems ready to be prepared for the TAG following the voting mechanism proposed at TDWG this year. I'll take the responsibility to see that through as soon as I can.
On Wed, Nov 3, 2010 at 7:24 AM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
John, I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the Individual should only include groups of organisms that are reliably known to be of a single species - it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species). Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
Steve
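The one:many relationship Steve describes (one aggregation : many Individuals) can be sketched as below. This is only an illustration under stated assumptions: the class name Aggregation, the field names, and the sample records are all hypothetical, and none of them is a proposed Darwin Core term.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Individual:
    """One organism, or a group reliably known to represent a single taxon."""
    individual_id: str   # hypothetical identifier
    remarks: str = ""    # free text describing the nature of the individual


@dataclass
class Aggregation:
    """Hypothetical container for a heterogeneous lot (pitfall trap, vial, image)."""
    aggregation_id: str
    individuals: List[Individual] = field(default_factory=list)

    def add(self, individual: Individual) -> None:
        # Each single-taxon group inside the lot becomes its own Individual.
        self.individuals.append(individual)


# Made-up example: one trap sample split into two single-taxon Individuals.
trap = Aggregation("trap-01")
trap.add(Individual("ind-001", "several beetles, believed to be one species"))
trap.add(Individual("ind-002", "single spider"))
```

Keeping the aggregate as a separate class leaves Individual restricted to a single taxon, which is the property the proposal relies on.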
John Wieczorek wrote:
Most of you probably do not receive postings from the Google Code site for Darwin Core. Steve B. updated the proposal for the new term Individual, and then commentary ensued on the Issue tracker. Since there remains an unresolved issue, I'm bringing the discussion back here by adding the commentary stream below. The unresolved issue with Steve's amendment is the restriction in the definition to "a single species (or lower taxonomic rank if it exists)."
Rich argues that we should not obviate the capability of applying an Identification to an aggregate (e.g., fossil), where the aggregate consists of multiple taxa. Steve argues that Identifications should be applied only to aggregates of a single taxon.
Steve, aside from the aggregate issue (which should be solved satisfactorily), your suggestion is too restrictive, because it would obviate the possibility of making an Identification (even for a single organism) to any rank less specific than a species. That is a loss of capability, and therefore unreasonable.
Comment 7 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c7 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
As a result of the discussion that has taken place on the tdwg-content email list during 2010 October and November, I am updating the term recommendation for Individual as follows:
Definition: The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single species (or lower taxonomic rank if it exists).
Comment: Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification.
Refines: N/A
Please note that as a precautionary measure, I have removed the statement that Individual refines http://purl.org/dc/dcmitype/PhysicalObject because the definition of PhysicalObject specifically mentions that the object is inanimate. I am not currently aware of any well-known term that defines living things.
Steve Baskauf
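As a rough illustration of the Comment above, the sketch below shows an Individual acting as the link between several Occurrences and several Identifications. The class layout, field names, helper function, and sample records are assumptions made for this example only; they are not taken from the Darwin Core term list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Occurrence:
    occurrence_id: str   # hypothetical identifier
    event_date: str


@dataclass
class Identification:
    scientific_name: str
    identified_by: str


@dataclass
class Individual:
    individual_id: str
    occurrences: List[Occurrence] = field(default_factory=list)
    identifications: List[Identification] = field(default_factory=list)


def names_for_occurrence(individuals: List[Individual], occurrence_id: str) -> List[str]:
    """Return every name asserted for the Individual documented by an occurrence."""
    for ind in individuals:
        if any(o.occurrence_id == occurrence_id for o in ind.occurrences):
            return [i.scientific_name for i in ind.identifications]
    return []


# Made-up records: one tree observed twice, identified first to genus, later to species.
tree = Individual(
    "ind-1",
    occurrences=[Occurrence("occ-1", "2009-05-01"), Occurrence("occ-2", "2010-05-01")],
    identifications=[
        Identification("Quercus", "first observer"),
        Identification("Quercus alba", "second observer"),
    ],
)
```

Because both identifications hang off the Individual rather than off either Occurrence, a later refinement of the name applies to every occurrence of that Individual without duplicating data.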
Comment 8 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c8 by deepreef@hawaii.rr.com http://code.google.com/u/deepreef@hawaii.rr.com/, Today (8 hours ago)
I think the definition should be "...represent a single taxon". We shouldn't restrict it to members of the same species (or lower), because then we technically can't include things that may represent more than one species, yet would best be treated within the scope of an Individual.
Also, I'm slightly partial to the term "Organism" for this class, rather than "Individual", because it's more clearly tied to the biology domain, and less likely to collide with the word "Individual" in other domains. I know such collision is not a technical problem, but it might lead to some confusion.
Comment 9 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c9 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
Well, the reason that I defined it to be members of the same species is to ensure that the term Individual can serve the primary function that I perceived was needed: to make the connection from occurrences to identifications. When I said one or more identifications, I meant one or more opinions about what that single species (or lower) was, not that there could be multiple identifications of several different species that happened to be in the same "bag" such as the contents of a pitfall trap containing multiple species, an image that contained several species, or a specimen that contained parasites of a different species. I think that there is a need for a term for this other kind of thing (a heterogeneous "lot", "batch", or something), but I think that including this in the definition of Individual defeats the purpose for which I proposed it. If there were several different species in the "Individual", then one would have to specify which identification went with which biological individual within the "lot", which would result in actually breaking down the "lot" into single species "Individuals" anyway.
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707
http://bioimages.vanderbilt.edu
All:
What happened to changing the term to organism? I think the word organism, at least strictly speaking, is closer to our intention than individual.
Compare the definitions you get from:
http://www.google.com/search?hl=en&q=define:+individual&btnG=Search
And
http://www.google.com/search?hl=en&q=define:+organism&btnG=Search
-Stan
On 11/3/10 9:02 AM, "John Wieczorek" tuco@berkeley.edu wrote:
I have no further objection if the wording is changed as you propose here. If no one else has any objections, this one seems ready to be prepared for the TAG following the voting mechanism proposed at TDWG this year. I'll take the responsibility to see that through as soon as I can.
This is what happened to Organism, from the thread Re: [tdwg-content] Treatise on Occurrence, tokens, and basisOfRecord [SEC=UNCLASSIFIED].
John said:
"I like Organism, but I don't like the inconsistency it would make with individualID and individualCount on the one hand, or extra work to change these to organismID and organismCount on the other. Individual doesn't carry these extra burdens, and could be added without breaking any existing applications."
Rich said "OK" so it must be OK. ;-)
On Wed, Nov 3, 2010 at 10:03 AM, Blum, Stan SBlum@calacademy.org wrote:
All:
What happened to changing the term to organism? I think the word organism, at least strictly speaking, is closer to our intention than individual.
Compare the definitions you get from:
http://www.google.com/search?hl=en&q=define:+individual&btnG=Search
And
http://www.google.com/search?hl=en&q=define:+organism&btnG=Search
-Stan
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Sorry, it was late when I read John's original comment on Individual versus Organism. It didn't penetrate. I accept that Individual is more consistent and less work.
-Stan
One way to deal with this is to make it clear in the documentation that Individual is short for "IndividualOrganism".
When this individual is broken up into different parts, like a "tissue specimen", this can be handled through subproperties of "hasPart" and "isPartOf".
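A minimal sketch of the sub-property idea, assuming plain (subject, predicate, object) tuples rather than any RDF library. The property names hasTissueSample and isTissueSampleOf, the identifiers, and the lookup function are hypothetical and exist only for this illustration.

```python
# Hypothetical sub-properties, each mapped to the general property it refines.
SUB_PROPERTY_OF = {
    "hasTissueSample": "hasPart",
    "isTissueSampleOf": "isPartOf",
}

# Made-up statements about one individual and a tissue specimen taken from it.
triples = [
    ("individual:42", "hasTissueSample", "specimen:42-liver"),
    ("specimen:42-liver", "isTissueSampleOf", "individual:42"),
]


def parts_of(subject, statements):
    """Objects linked to subject by hasPart or by a declared sub-property of it.

    Only one level of sub-property is followed in this sketch.
    """
    return [
        obj
        for subj, pred, obj in statements
        if subj == subject
        and (pred == "hasPart" or SUB_PROPERTY_OF.get(pred) == "hasPart")
    ]


print(parts_of("individual:42", triples))
```

A consumer that only understands the general "hasPart" relation can still find the tissue specimen, while producers remain free to use the more specific property.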
On Wed, Nov 3, 2010 at 12:38 PM, John Wieczorek tuco@berkeley.edu wrote:
This is what happened to Organism, from the thread Re: [tdwg-content] Treatise on Occurrence, tokens, and basisOfRecord [SEC=UNCLASSIFIED].
John said:
"I like Organism, but I don't like the inconsistency it would make with individualID and individualCount on the one hand, or extra work to change these to organismID and organismCount on the other. Individual doesn't carry these extra burdens, and could be added without breaking any existing applications."
Rich said "OK" so it must be OK. ;-)
Stan wrote:
What happened to changing the term to organism? I think the word organism, at least strictly speaking, is closer to our intention than individual.
I agree, but John pointed out that we already have individualID and individualCount -- it would be more cumbersome to change those existing terms to organismID and organismCount.
Steve wrote:
I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the Individual should only include groups of organisms that are reliably known to be of a single species - it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species).
I agree on this point. There are two possible meanings to having an Individual identified at a higher rank like the family "Pomacanthidae": 1) "This is a single species within the family Pomacanthidae that I am unable to pinpoint just now"; and 2) "All of these things are in the Pomacanthidae, but I don't know how many species are represented." The former would meet your definition; the latter would not.
Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I still prefer "taxon", but I'm not so keen on revising the definition to indicate the intent that it should be a single species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
I'm not comfortable with that approach. Can you elaborate on this:
If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose.
I don't understand why the original definition hinges on "species". I am especially uncomfortable with an insinuation within DwC that a "species" is somehow a special taxon rank, more meaningful than higher ranks. I know many biologists believe this, but many do not (it's a contentious issue). I think DwC should remain agnostic on this point.
I guess what I really need to understand is what are the benefits (from a data exchange perspective) for representing Individuals as only those sets of organisms believed to be circumscribed within a taxon concept of the rank of species or higher.
Thinking about this does raise a slightly related issue, involving the Identification class and the Taxon class; but I'll include that in a separate posting.
Aloha, Rich
A remark about aggregations, but with no insight as to its value:
As a biologically naive lurker, it strikes me that the struggle here may be an artifact of mixing all the reasons for aggregating organisms in an observation, whereas there are plenty of use cases that suit some sets of those reasons but not others. For example, some aggregates are artifacts of collecting or observation methodology, e.g. lots of stuff in a vial. Some, such as on paleo specimens, are a consequence of a bunch of critters being at the same watering spot when the mud came roaring down the river. Some are ecosystem entanglements. The reason underlying the aggregation might itself participate in some kinds of scientific inference but not in other kinds. For example, if a bunch of species are all in the same paleo specimen, that helps a lot with establishing the era for each of the species' presence on earth and I suppose has implications for evolutionary history. Lichens, I gather, are treated as a single individual and given a taxon name, even though for much about lichen biology it's important that there are two individuals, from different taxa, in a specimen. Etc., etc. So to me, it looks like the problem might be less in whether there are disagreements about what constitutes an individual, so much as that there are several different kinds of aggregation based on \why/ the aggregation arose. It might be that \that/ is why defining 'individual' just in terms of an unadorned notion of 'aggregation' seems to lead to debates about the fitness-for-use of such a definition.
Bob Morris
On Wed, Nov 3, 2010 at 10:24 AM, Steve Baskauf <steve.baskauf@vanderbilt.edu> wrote:
John, I'm not sure that I agree with your analysis that the definition prevents the possibility of making an Identification at a rank less specific than a species. My revised definition says that the Individual should only include groups of organisms that are reliably known to be of a single species - it doesn't say that we need to know what that species is (i.e. an identification to genus or family can be made with the hope that someone down the line would be able to refine the identification to species). Clarification on this point could be added to the comment or the Google Code page, but I don't think there is a problem with the definition per se. However, if there is a consensus that the definition is too restrictive, I would not object to changing the wording of the definition from "species (or lower taxonomic rank if it exists)" to "taxon" if there were clarification added to the comments or Google Code page that Individual was not intended to include aggregations of multiple species.
I agree that there is a need for a term that represents "collections", "bags", "aggregations", or whatever you want to call an aggregation that includes multiple species. But I have never intended that Individual should be that term. If we expand Individual to include aggregates, then it becomes unusable for its original intended purpose. I would prefer for someone to propose a different term for aggregates of individuals instead of adding that function to Individual. Then define the relationship of this new thing to Individual as a one:many relationship (one aggregation:many Individuals).
Steve
John Wieczorek wrote:
Most of you probably do not receive postings from the Google Code site for Darwin Core. Steve B. updated the proposal for the new term Individual, and then commentary ensued on the Issue tracker. Since there remains an unresolved issue, I'm bringing the discussion back here by adding the commentary stream below. The unresolved issue in Steve's amendment is the restriction in the definition to "a single species (or lower taxonomic rank if it exists)."
Rich argues that we should not obviate the capability of applying an Identification to an aggregate (e.g., fossil), where the aggregate consists of multiple taxa. Steve argues that Identifications should be applied only to aggregates of a single taxon.
Steve, aside from the aggregate issue (which should be solved satisfactorily), your suggestion is too restrictive, because it would obviate the possibility of making an Identification (even for a single organism) to any rank less specific than a species. That is a loss of capability, and therefore unreasonable.
Comment 7 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c7 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
As a result of the discussion that has taken place on the tdwg-content email list during 2010 October and November, I am updating the term recommendation for Individual as follows:
Definition: The category of information pertaining to an individual organism or a group of individual organisms that can reliably be known to represent a single species (or lower taxonomic rank if it exists).
Comment: Instances of this class can serve the purpose of connecting one or more instances of the Darwin Core class Occurrence to one or more instances of the Darwin Core class Identification.
Refines: N/A
Please note that as a precautionary measure, I have removed the statement that Individual refines http://purl.org/dc/dcmitype/PhysicalObject because the definition of PhysicalObject specifically mentions that the object is inanimate. I am not currently aware of any well-known term that defines living things.
Steve Baskauf
Comment 8 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c8 by deepreef@hawaii.rr.com http://code.google.com/u/deepreef@hawaii.rr.com/, Today (8 hours ago)
I think the definition should be "...represent a single taxon". We shouldn't restrict it to members of the same species (or lower), because then we technically can't include things that may represent more than one species, yet would best be treated within the scope of an Individual.
Also, I'm slightly partial to the term "Organism" for this class, rather than "Individual", because it's more clearly tied to the biology domain, and less likely to collide with the word "Individual" in other domains. I know such collision is not a technical problem, but it might lead to some confusion.
Comment 9 http://code.google.com/p/darwincore/issues/detail?id=69&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Reporter%20Summary%20Opened#c9 by baskaufs http://code.google.com/u/baskaufs/, Today (8 hours ago)
Well, the reason that I defined it to be members of the same species is to ensure that the term Individual can serve the primary function that I perceived was needed: to make the connection from occurrences to identifications. When I said one or more identifications, I meant one or more opinions about what that single species (or lower) was, not that there could be multiple identifications of several different species that happened to be in the same "bag" such as the contents of a pitfall trap containing multiple species, an image that contained several species, or a specimen that contained parasites of a different species. I think that there is a need for a term for this other kind of thing (a heterogeneous "lot", "batch", or something), but I think that including this in the definition of Individual defeats the purpose for which I proposed it. If there were several different species in the "Individual", then one would have to specify which identification went with which biological individual within the "lot", which would result in actually breaking down the "lot" into single species "Individuals" anyway.
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu
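The relationships argued over in this exchange can be made concrete with a small data-model sketch. It is only an illustration of the proposal as described above, not anything defined by Darwin Core: the class name Lot, all field names, and the sample records are hypothetical. It assumes an Individual groups Occurrences and carries one or more Identifications (opinions about a single taxon), while a heterogeneous aggregate is a separate thing with a one:many relationship to Individuals.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Identification:
    """One opinion about which single taxon an Individual belongs to."""
    taxon: str          # may be a family or genus if the species is not yet known
    identified_by: str


@dataclass
class Occurrence:
    """A record that an Individual was observed or collected."""
    occurrence_id: str


@dataclass
class Individual:
    """An organism, or a group reliably known to represent a single taxon."""
    individual_id: str
    occurrences: List[Occurrence] = field(default_factory=list)
    identifications: List[Identification] = field(default_factory=list)


@dataclass
class Lot:
    """Hypothetical class for a heterogeneous aggregate (one Lot : many Individuals)."""
    lot_id: str
    individuals: List[Individual] = field(default_factory=list)

    def taxa(self) -> List[str]:
        # Collect every taxon asserted for any member Individual.
        return sorted({ident.taxon
                       for ind in self.individuals
                       for ident in ind.identifications})


# Made-up pitfall trap sample: two single-taxon Individuals in one Lot.
beetle = Individual(
    "ind-1",
    [Occurrence("occ-1")],
    [Identification("Carabidae", "A"), Identification("Carabus nemoralis", "B")],
)
spider = Individual("ind-2", [Occurrence("occ-2")], [Identification("Lycosidae", "A")])
trap = Lot("lot-1", [beetle, spider])
```

Here the beetle carries two Identifications that are successive opinions about the same organism, whereas the two different taxa in the trap are kept apart by splitting the Lot into separate Individuals.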
Just a quick comment:
I think Bob's points are insightful and very much relevant to this discussion overall. We should visit this point with some care and scrutiny, to see if we need more terms and/or controlled vocabularies to be able to make the sort of reasoning assumptions that Bob is touching on.
Having said that, I think the argument-of-the-moment is to what extent we limit the scope of an aggregated Individual based on how narrowly or broadly circumscribed the taxon assigned to the aggregate may be. I saw, but haven't yet read, Steve's most recent post. Gotta run now, but I'll be back...
Rich
On 01/11/2010, at 8:51 AM, Bob Morris wrote:
There are surprises even in the simplest use of RDFS and formalisms about classes. I've previously whined about premature assignment of rdfs:domain while conceding (did I???) that it can sometimes make a designer's intention clearer to humans. Perhaps more startling is that type assignment automatically "creates" an rdfs:Class if one was not already available, due to the formal semantics of rdf:type [3]. Thus, in an earlier posting, Paul Murray has (unintentionally?) introduced a new class apni:TaxonName in 33407.rdf [2] via <rdf:type rdf:resource="http://biodiversity.org.au/voc/apni/APNI#TaxonName"/>
No, it was deliberate. We have done some work on our hosting here at our end, and as of last Friday the boa (biodiversity.org.au) vocabulary files are now available. http://biodiversity.org.au/voc/apni/APNI redirects to http://biodiversity.org.au/voc/apni/APNI.rdf, which Protege (for instance) understands.
We have created class hierarchies for our various objects: an APNI name is a BOA name is a TDWG name. As you have noticed, our individual name objects are explicitly declared both as TDWG names and also APNI names. Although the TDWG type is implied, I include it explicitly so that people can ignore our vocabulary if they wish when looking at our data. We have created "de novo" properties and named individuals for things in our data for which we could not find a suitable equivalent in the TDWG vocabulary, and these are available too, eg: http://www.biodiversity.org.au/voc/apni/NomenclaturalQualifierTerm .
We have gone through a similar exercise for the XML vocabularies: see http://www.biodiversity.org.au/afd.name/468562.xml . An xsi:schemaLocation attribute is included, allowing XML validators to find our schema files.
Of course, "deliberately" does not necessarily mean "a good idea" or "done correctly", but you have to start somewhere.
The nice thing about the semantic web is that you can in fact do this. All of our extra bits are identified with URIs, and the URIs all start with "http://biodiversity.org.au". At present, these extra bits mean nothing outside of the data here at BOA. A human could make sense of many of them, but many of the types, properties, and named individuals do not even contain titles and descriptions: as I am not a taxonomist myself, I have only the vaguest idea what the difference might be between "nom illeg" and "nom rej". Better to leave it blank.
Of course, this is not a problem for the TDWG vocabularies, but that's because I am working the other way around: I was not trying to create a vocabulary that the general community could use, but to document an existing (albeit implied) one. Our properties declare explicit domains. For well-discussed reasons, the TDWG vocabularies do not. But I don't think that those reasons (unintentional type declarations made by people using your terms) apply. Indeed - the reverse is almost the whole point. I don't think we *want* other people using biodiversity.org.au terms: their meanings potentially are idiosyncratic to the systems here (perhaps subtly) because they don't have proper descriptions - descriptions I am not able to supply.
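The two effects mentioned in this exchange (a declared rdfs:domain silently typing whatever a property is used on, and rdf:type making its object a class) can be illustrated with a toy forward-chaining loop. This is a minimal sketch in plain Python rather than a real reasoner; it covers only two RDFS entailment patterns, and the ex: names are hypothetical, not terms from the biodiversity.org.au or TDWG vocabularies.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_CLASS = "rdfs:Class"


def rdfs_closure(triples):
    """Apply two RDFS entailment patterns until nothing new is inferred.

    1. (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
    2. (s rdf:type C)                 =>  (C rdf:type rdfs:Class)
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Every (property, class) pair declared with rdfs:domain so far.
        domains = {(s, o) for s, p, o in inferred if p == RDFS_DOMAIN}
        new = set()
        for s, p, o in inferred:
            for prop, cls in domains:
                if p == prop:
                    new.add((s, RDF_TYPE, cls))
            if p == RDF_TYPE:
                new.add((o, RDF_TYPE, RDFS_CLASS))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


# Hypothetical vocabulary: a property whose domain is declared to be ex:TaxonName.
facts = {
    ("ex:nomenclaturalStatus", RDFS_DOMAIN, "ex:TaxonName"),
    # Someone else applies the property to a photograph.
    ("ex:photo42", "ex:nomenclaturalStatus", "nom. illeg."),
}
closure = rdfs_closure(facts)
unintended = ("ex:photo42", RDF_TYPE, "ex:TaxonName") in closure
```

With these inputs the photograph ends up typed as a taxon name purely because of the domain declaration, which is the kind of unintentional type declaration referred to above.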
Once there are standards we can back-fit our data, just as everyone else will back-fit theirs. But in the meantime, the data is out there. (You can work with it, if you wish, using our splendid JSON interface. But it's subject to change, I'm afraid.) Again, the nice thing about the semantic web is that you can do this - gradually pulling together the strands of meaning using a common vocabulary as that vocabulary is developed. It might become a bit of a wild west in some areas, but those areas are explicitly fenced in with URI prefixes. The key is that our object identifiers - the URIs and LSIDs for the taxa and names - will remain persistent. Over time, we can clarify, enrich, and correct what we say *about the things that those identifiers identify*.
A serious problem that we are aware of is aggregators - systems holding copies of data and reasoning over vocabularies which we at a later stage fix. I don't know what to do about that - it seems to me that one of the problems in the semantic web is provenance and data ageing. How do you keep the whole thing from turning into mush? (Speaking of which: I would like to cryptographically sign our outgoing data with a certificate issued by TDWG, which indicates that we are indeed the TDWG-approved source of data coming from the biodiversity.org.au LSID authority. But that's a whole new area.) Although we address many of these issues with OAI-PMH.
So: yes, we have a custom, idiosyncratic vocabulary, we declare and use nonstandard types and properties, we declare owl:domain - but I believe it's been properly done at the "machine" level. At the higher level, it's a work-in-progress. It helps to have something concrete to discuss, I think. When I was discussing using the DwC properties and types in our RDF, of putting it out on the web, I was thinking of a timeframe of weeks, not years.
------ If you have received this transmission in error please notify us immediately by return e-mail and delete all copies. If this e-mail or any attachments have been sent to you in error, that error does not constitute waiver of any confidentiality, privilege or copyright in respect of information in the e-mail or attachments.
Please consider the environment before printing this email.
------
Paul, This morning I have been reading through some of the posts that came by too fast for me to fully process during the work week. This one resonated particularly with me.
Paul Murray wrote:
No, it was deliberate. We have done some work on our hosting here at our end, and as of last Friday the boa (biodiversity.org.au) vocabulary files are now available. http://biodiversity.org.au/voc/apni/APNI redirects to http://biodiversity.org.au/voc/apni/APNI.rdf, which Protege (for instance) understands.
We have created class hierarchies for our various objects: an APNI name is a BOA name is a TDWG name. As you have noticed, our individual name objects are explicitly declared both as TDWG names and also APNI names. Although the TDWG type is implied, I include it explicitly so that people can ignore our vocabulary if they wish when looking at our data. We have created "de novo" properties and named individuals for things in our data for which we could not find a suitable equivalent in the TDWG vocabulary, and these are available too, eg: http://www.biodiversity.org.au/voc/apni/NomenclaturalQualifierTerm .
We have gone through a similar exercise for the XML vocabularies: see http://www.biodiversity.org.au/afd.name/468562.xml . An xsi:schemaLocation attribute is included, allowing XML validators to find our schema files.
Of course, "deliberately" does not necessarily mean "a good idea" or "done correctly", but you have to start somewhere.
Yes, you have to start somewhere. That was the reason why I created the sernec: namespace (http://bioimages.vanderbilt.edu/rdf/terms#). I've been trying out the terms there in my RDF. I now know that some of them don't work very well and I'll create better ones. But I wouldn't have known that if I hadn't tried to use them.
The nice thing about the semantic web is that you can in fact do this. All of our extra bits are identified with URIs, and the URIs all start with "http://biodiversity.org.au". At present, these extra bits mean nothing outside of the data here at BOA. A human could make sense of many of them, but many of the types, properties, and named individuals do not even contain titles and descriptions: as I am not a taxonomist myself, I have only the vaguest idea what the difference might be between "nom illeg" and "nom rej". Better to leave it blank.
Of course, this is not a problem for the TDWG vocabularies, but that's because I am working the other way around: I was not trying to create a vocabulary that the general community could use, but to document an existing (albeit implied) one. Our properties declare explicit domains. For well-discussed reasons, the TDWG vocabularies do not. But I don't think that those reasons (unintentional type declarations made by people using your terms) apply. Indeed - the reverse is almost the whole point. I don't think we *want* other people using biodiversity.org.au terms: their meanings potentially are idiosyncratic to the systems here (perhaps subtly) because they don't have proper descriptions - descriptions I am not able to supply.
Once there are standards we can back-fit our data, just as everyone else will back-fit theirs. But in the meantime, the data is out there. (You can work with it, if you wish, using our splendid JSON interface. But it's subject to change, I'm afraid.) Again, the nice thing about the semantic web is that you can do this - gradually pulling together the strands of meaning using a common vocabulary as that vocabulary is developed. It might become a bit of a wild west in some areas, but those areas are explicitly fenced in with URI prefixes. The key is that our object identifiers - the URIs and LSIDs for the taxa and names - will remain persistent. Over time, we can clarify, enrich, and correct what we say *about the things that those identifiers identify*.
Yes! I don't intend for the URI for the image http://bioimages.vanderbilt.edu/baskauf/66921 to ever change, although its web page and the RDF representation may change (maybe a lot!). That's OK. As you say, the data is out there. Hopefully a consensus will evolve about what terms (properties) mean to everyone in the community and then my RDF will mean something to somebody else. That's what set me off down the road to trying to get the Individual class as a part of DwC.
A serious problem that we are aware of is aggregators - systems holding copies of data and reasoning over vocabularies which we at a later stage fix. I don't know what to do about that - it seems to me that one of the problems in the semantic web is provenance and data ageing. How do you keep the whole thing from turning into mush? (Speaking of which: I would like to cryptographically sign our outgoing data with a certificate issued by TDWG, which indicates that we are indeed the TDWG-approved source of data coming from the biodiversity.org.au LSID authority. But that's a whole new area.) Although we address many of these issues with OAI-PMH.
You aren't kidding. Once I started tracking down information about myself using OpenLink, I discovered that there were all kinds of bizarre inferences that were being made about me in the LOD cloud. Most of the wrong ones were made when someone tried to take non-RDF documents and translate them into triples. But there is still the problem where bad triples get out into the cloud (e.g. errors). You can fix them in the RDF that you are serving, but how do you make them "go away" when they are being held and used for reasoning by somebody else? Or what if somebody somewhere makes the assertion
<foaf:Person rdf:about="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me">
  <foaf:name>Donald Duck</foaf:name>
</foaf:Person>
Anybody can do that, so how do we certify metadata sources as "trusted" in our community? The state of the LOD cloud at the moment reminds me of the early days of email and the Web, when it was reasonably "safe" to assume that users' intentions were good. Then came viruses, trojans, phishing scams, etc. If those kinds of things had been considered at the start of email and the Web and considered in its design, it would have been easier to prevent (or reduce) the evolution of nefarious uses of the Web. Perhaps we should be thinking about that more now when we are in the early stages of designing for the "semantic web".
So: yes, we have a custom, idiosyncratic vocabulary, we declare and use nonstandard types and properties, we declare owl:domain - but I believe it's been properly done at the "machine" level. At the higher level, it's a work-in-progress. It helps to have something concrete to discuss, I think. When I was discussing using the DwC properties and types in our RDF, of putting it out on the web, I was thinking of a timeframe of weeks, not years.
There has been the suggestion made by several people that we need a second kind of Darwin Core, an RDF recommendation that will allow for deep semantic reasoning. That might be nice, but given the amount of discussion that it's taken just to come to an agreement about what a dwc:Individual should be, I think "years" would be an accurate estimate for that task. What I personally really want is a recommendation for Darwin Core RDF "Lite". That is, a quick (and possibly dirty for the time being) set of guidelines for using DwC terms AS THEY EXIST in RDF with only the minimal number of changes or additions necessary to get the job done. THAT is something that I could see happening in a timeframe of weeks, not years.
Steve
Anybody can do that, so how do we certify metadata sources as "trusted" in our community? The state of the LOD cloud at the moment reminds me of the early days of email and the Web, when it was reasonably "safe" to assume that users' intentions were good. Then came viruses, trojans, phishing scams, etc. If those kinds of things had been considered at the start of email and the Web and considered in its design, it would have been easier to prevent (or reduce) the evolution of nefarious uses of the Web. Perhaps we should be thinking about that more now when we are in the early stages of designing for the "semantic web".
"The cloud" is never going to be a consistent ontology (it only takes one person asserting "A is not A"). When you ask a reasoner to reason, you always give it a limited set of triples that you trust for its axioms. I haven't looked into it yet in any detail, but I think that this is the role of SPARQL - it becomes possible to say "using the reasoning rules *here* and *here*, reason over all the triples served up at biodiversity.org.au and zoobank.org". The other alternative is "importing" all of the individual URIs at biodiversity.org.au, which is obviously infeasible.
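The idea of handing a reasoner only the triples you trust can be sketched as a filter over statements that remember where they came from. This is plain Python rather than SPARQL, and it is only an assumption-laden illustration: the Quad type, the function name, the prank source URL, and the sample names are all hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Quad(NamedTuple):
    """A triple plus the document it was retrieved from."""
    subject: str
    predicate: str
    obj: str
    source: str


def trusted_triples(quads: Iterable[Quad],
                    trusted_sources: Set[str]) -> List[Tuple[str, str, str]]:
    # Keep only statements whose source is on the caller's trust list;
    # everything else is never shown to the reasoner.
    return [(q.subject, q.predicate, q.obj)
            for q in quads
            if q.source in trusted_sources]


ME = "http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"
OWN_DOC = "http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf"

harvested = [
    Quad(ME, "foaf:name", "Steve Baskauf", OWN_DOC),
    # A statement about the same URI published by an unrelated, made-up source.
    Quad(ME, "foaf:name", "Donald Duck", "http://example.org/prank.rdf"),
]

axioms = trusted_triples(harvested, {OWN_DOC})
```

Anyone can still publish the second statement; it simply never enters the set of axioms that gets reasoned over. Deciding which sources belong on the trust list is the part this sketch leaves open.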
There has been the suggestion made by several people that we need a second kind of Darwin Core, an RDF recommendation that will allow for deep semantic reasoning.
I don't think you need an entirely different DwC. What will serve the purpose is an auxiliary vocabulary document: a separate document with OWL rules about the DwC predicates, which you can choose to import and reason over. Those rules don't need to be in the document defining the vocabulary. There could perhaps be more than one ruleset.
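The two ideas above (reasoning only over triples from sources you trust, and keeping the rules in a separate document that you choose to import) can be sketched in a few lines of plain Python. This is not SPARQL or OWL; it is a toy stand-in. The data, the ex: names, the "untrusted.example" source, and the single domain rule tying dwc:individualRemarks to dwc:Individual are all assumptions made up for the illustration.

```python
# Statements as (subject, predicate, object, source). All data is invented.
QUADS = [
    ("ex:tree42", "dwc:individualRemarks", "small clonal population", "biodiversity.org.au"),
    ("dwc:Individual", "rdfs:subClassOf", "ex:Thing", "zoobank.org"),
    ("ex:tree42", "rdf:type", "ex:Rock", "untrusted.example"),
]

# An optional ruleset kept apart from the vocabulary itself. It mimics an
# rdfs:domain statement: using the predicate implies the subject's class.
DOMAIN_RULES = {"dwc:individualRemarks": "dwc:Individual"}


def trusted_triples(quads, trusted_sources):
    """Keep only the triples that come from a trusted source."""
    return {(s, p, o) for s, p, o, src in quads if src in trusted_sources}


def reason(triples, domain_rules=None):
    """Apply the imported rules and subclass inheritance until nothing new appears."""
    inferred = set(triples)
    rules = domain_rules or {}
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in inferred:
            if p in rules:
                new.add((s, "rdf:type", rules[p]))
            if p == "rdf:type":
                for c, q, d in inferred:
                    if q == "rdfs:subClassOf" and c == o:
                        new.add((s, "rdf:type", d))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


def types_of(subject, triples):
    """List the classes asserted or inferred for a subject."""
    return sorted(o for s, p, o in triples if s == subject and p == "rdf:type")


axioms = trusted_triples(QUADS, {"biodiversity.org.au", "zoobank.org"})
print(types_of("ex:tree42", reason(axioms)))                # no ruleset imported
print(types_of("ex:tree42", reason(axioms, DOMAIN_RULES)))  # ruleset imported
```

The statement from the untrusted source never reaches the reasoner, and the type inferences only appear when the ruleset is passed in, which is the sense in which the rules are optional.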
On a somewhat related note:
Has someone pointers to "real-world reasoning examples", i.e., where datasets, assertions, or other semantic web or LOD "knowledge" is used to make "interesting" inferences?
Additionally, what about simple querying (as opposed to reasoning) examples over LOD and other "data out there"? (I realize that the borderline between querying and reasoning is not as clear-cut as the terminology seems to imply; e.g., Datalog rules are commonly viewed as queries, while logic programming rules are viewed as inference rules)
I'm also asking because next quarter I'll be teaching an undergraduate class on scientific data management and I'm looking for interesting datasets to play with.
Thanks, best
Bertram
-- Bertram Ludäscher Professor of Computer Science Dept of Computer Science & Genome Center University of California, Davis ludaesch@ucdavis.edu / daks.ucdavis.edu Phone: +1-530-554-1800
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
On Nov 10, 2010, at 7:20 PM, Bertram Ludaescher wrote:
Has someone pointers to "real-world reasoning examples", i.e., where datasets, assertions, or other semantic web or LOD "knowledge" is used to make "interesting" inferences?
We reason over assertions of evolutionary phenotype diversity and those of genetic mutants to generate hypotheses for genes underlying evolutionary phenotype transitions: http://kb.phenoscape.org
-hilmar
I have not tested these with the most recent versions of the data set, but they should be close.
http://www.taxonconcept.org/example-sparql-queries/
http://about.geospecies.org/sparql.xhtml
This live query works on the LOD cloud; I just tried it:
http://bit.ly/aBBFCA
If you look through my previous emails on this list you will see other examples.
Sindice also does some inferencing; you can try it here:
http://inspector.sindice.com/index.jsp
You can try it with this occurrence record:
http://ocs.taxonconcept.org/ocs/a71fda68-024a-4020-a34c-64dce7935e5f.rdf
- Pete
participants (10)
- Bertram Ludaescher
- Blum, Stan
- Bob Morris
- Hilmar Lapp
- John Wieczorek
- Kevin Richards
- Paul Murray
- Peter DeVries
- Richard Pyle
- Steve Baskauf