The current DwC Terms [1] (carrying Identifier http://rs.tdwg.org/dwc/2015-03-19/terms/ and Date Modified: 2015-06-02) is confusing (confused? silent?) about the relation of the "sister classes" to the terms intended (?) to be used therewith. For example, dwc:Event, dwc:MachineObservation, and dwc:HumanObservation could reasonably (?) all have dwc:eventDate applied to them. But http://rs.tdwg.org/dwc/terms/index.htm#eventDate suggests Class=dwc:Event.
Now it's human-clear that for each of the "multiple" boldface lines in the index, the second and subsequent terms are meant to be special cases of the leftmost one. What I can't see is whether [1] intends to encourage (require?) this in some explicit way, and where that explicit way is to be found. (I had a dream that the DwC RDF Guide might take a position....)
Thanks.
--Bob p.s. I concede that some answers might lurk in the ongoing move of resources to http://tdwg.github.io/dwc/terms/. My second dream was that tdwg.github.io does Content Negotiation and curl would rescue me....
[1] http://rs.tdwg.org/dwc/terms/index.htm
I think that this has been left intentionally vague because at this point we don't have well-defined relationships among Darwin Core classes. It seems to me that placing the class terms on the same line is a clue that they are somehow related, but it isn't apparent to me that the subsequent terms are always special cases of the left-most terms. I can provide an example where a LivingSpecimen isn't a MaterialSample because it was never collected. I can also imagine Event instances which include many HumanObservations (i.e. Event serves to group observations, not to serve as a superclass for observations).
There have been various attempts to lay out how the Darwin Core classes are related to each other. But I'm not aware that there has ever been a consensus on it. That's why the RDF Guide didn't touch the issue. We were afraid that the guide would never be finished if we took up that subject.
I think it would be an excellent exercise to try to lay out how Darwin Core classes are related to each other. But first, I would suggest that we lay out the use cases that we intend to satisfy by nailing down those relationships, then show how establishing those relationships helps us. For example, I could suggest that we establish that
dwc:HumanObservation rdfs:subClassOf dwc:Event.
But what would I gain by doing that? What would it prevent me from doing? Steve
It sounds like perhaps now is the time to focus on the ontological relationships among classes, as the next major focus of DwC advancement.
I would NOT regard "HumanObservation" as a subclass of "Event". I believe that "Event" should be kept clean as fundamentally an intersection between a Location instance and a point in time. After much thinking and testing on this, we've finally come to the conclusion in our data models that "Location" is defined by two of the four space-time dimensions (X & Y; effectively represented as Geocoordinates -- whether as a point, Point/radius, track, polygon, etc.), and "Event" is defined by one Location instance plus the other two space-time dimensions (Z & T; effectively represented as elevation, depth and date/time). I'd be happy to explain why we came to this conclusion, but that's another thread.
The point is that I think "Event" should remain as an abstract four-dimensional address, created as an instance to capture space-time information for something else. In DwC, that "something else" is an Occurrence.
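As a rough illustration of the layering described above (Location carries X and Y, Event adds Z and T to one Location, and an Occurrence points at an Event), here is a minimal Python sketch. The class layout, field names, and sample coordinates are all hypothetical, chosen only for illustration; they are not taken from Darwin Core or from any actual data model mentioned in this thread.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Location:
    """Horizontal position only (X and Y), reduced here to a single point."""
    decimal_latitude: float
    decimal_longitude: float
    uncertainty_m: Optional[float] = None  # hypothetical point/radius support


@dataclass(frozen=True)
class Event:
    """One Location plus the remaining two dimensions (Z and T)."""
    location: Location
    min_depth_m: Optional[float] = None
    max_depth_m: Optional[float] = None
    start: Optional[datetime] = None
    end: Optional[datetime] = None


@dataclass(frozen=True)
class Occurrence:
    """The 'something else' that an Event supplies a space-time address for."""
    occurrence_id: str
    event: Event


# Illustrative values only: two occurrences sharing one space-time address.
reef = Location(decimal_latitude=21.3, decimal_longitude=-157.9)
dive = Event(location=reef, min_depth_m=10.0, max_depth_m=30.0,
             start=datetime(2015, 8, 8, 9, 0), end=datetime(2015, 8, 8, 10, 0))
occurrences = [Occurrence("occ-1", dive), Occurrence("occ-2", dive)]
```

In this layout an Event never carries organism information; it only groups the where and when, and several Occurrence instances can reference the same Event.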
Within the confines of existing DwC, "HumanObservation" comes closest to being a subclass of Occurrence. However, there is still one missing class that I believe we need to complete the core ontology space of DwC -- which is what we refer to as "Evidence", and Darwin-SW refers to as "Token" (https://code.google.com/p/darwin-sw/). Our model draws the lines slightly differently from the diagram for D-SW, but in general they represent a convergence of thinking on the relationships among DwC classes.
In answer to Steve's question:
dwc:HumanObservation rdfs:subClassOf dwc:[Occurrence]
But what would I gain by doing that? What would it prevent me from doing?
I'm not technically savvy enough to answer that question from an implementation perspective; but from a DwC comprehension perspective, it moves us a step closer to mutual understanding of how to transform DwC content into a functional data model. We all kinda/sorta know that already, but as evidenced by the different perspectives of "HumanObservation as a subclass of Event" vs. "HumanObservation as a subclass of Occurrence" just now revealed & expressed, it probably wouldn't hurt to be more explicit about these sorts of things in DwC documentation.
-----Original Message----- From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Steve Baskauf Sent: Saturday, August 08, 2015 2:47 AM To: Bob Morris Cc: tdwg-tag@tdwg.org Subject: Re: [tdwg-tag] "Class" attribute in DwC
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: PMB 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 322-4942 If you fax, please phone or
so that I will know to look for it. http://bioimages.vanderbilt.edu http://vanderbilt.edu/trees
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
Just to further elaborate on the example: If we assert:
dwc:HumanObservation rdfs:subClassOf dwc:Event.
and then someone stated:
<birdObservation1> rdf:type dwc:HumanObservation.
I think that would entail:
<birdObservation1> rdf:type dwc:Event.
because of the semantics of rdfs:subClassOf
On the other hand, if we assert something like
dwc:HumanObservation skos:narrower dwc:Event.
that would NOT entail
<birdObservation1> rdf:type dwc:Event.
but instead would entail different stuff like:
dwc:Event skos:broader dwc:HumanObservation.
dwc:HumanObservation skos:narrowerTransitive dwc:Event.
etc.
None of these entailments are intrinsically good or bad. But if we make assertions like dwc:HumanObservation rdfs:subClassOf dwc:Event. or dwc:HumanObservation skos:narrower dwc:Event. we must be aware that a machine could reason the entailed relationships, and should only make those assertions if we want a machine to be able to do those kinds of reasoning. In other words, we should only make assertions with semantic implications to accomplish some purpose related to machine reasoning, and not just because it seems like it might be a good idea. If our purpose is just to make things more clear to a human, then providing a better human-readable definition would be a better way to accomplish that.
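The difference between the two assertions can be made concrete with a small sketch. The Python below is a hand-rolled forward-chaining loop over plain tuples, not a real RDF library; it applies only the two RDFS rules that matter here (named rdfs9 and rdfs11 in the RDF Semantics specification) and deliberately knows nothing about SKOS. The function name and the string encoding of the terms are assumptions made for illustration.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_subclass_closure(triples):
    """Return the input triples plus everything entailed by rdfs9 and rdfs11."""
    inferred = set(triples)
    while True:
        new = set()
        for c, p, d in inferred:
            if p != SUBCLASS_OF:
                continue
            for s, p2, o in inferred:
                if o != c:
                    continue
                if p2 == RDF_TYPE:
                    # rdfs9: s type C, C subClassOf D  =>  s type D
                    new.add((s, RDF_TYPE, d))
                elif p2 == SUBCLASS_OF:
                    # rdfs11: S subClassOf C, C subClassOf D  =>  S subClassOf D
                    new.add((s, SUBCLASS_OF, d))
        if new <= inferred:  # fixed point reached, nothing left to add
            return inferred
        inferred |= new


observed = ("<birdObservation1>", RDF_TYPE, "dwc:HumanObservation")
entailed = ("<birdObservation1>", RDF_TYPE, "dwc:Event")

with_subclass = rdfs_subclass_closure(
    {observed, ("dwc:HumanObservation", SUBCLASS_OF, "dwc:Event")})
with_skos = rdfs_subclass_closure(
    {observed, ("dwc:HumanObservation", "skos:narrower", "dwc:Event")})

print(entailed in with_subclass)  # typing as dwc:Event follows from subClassOf
print(entailed in with_skos)      # the skos:narrower assertion does not yield it
```

A SKOS-aware reasoner would add the skos:broader and skos:narrowerTransitive triples mentioned above to the second graph, but it still would not type the observation as a dwc:Event.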
Steve
On 9/08/2015 3:21 am, Richard Pyle wrote:
I would NOT regard "HumanObservation" as a subclass of "Event". I believe that "Event" should be kept clean as fundamentally an intersection between a Location instance and a point in time.
In which case, you would need a separate class for a thing that happens at a time, but which is not located (and perhaps, as the philosophers say, does not have an extension).
And this in a way is a reply to Bob's original question - why aren't these relationships explicit? The reason is that the second you try to make them so, you almost immediately start running into philosophical conundrums that people have been debating for thousands of years. The old "How many angels can dance on the head of a pin?" problem is an exercise in distinguishing between things that are located and things that are extended.
Something as abstract and basic as "thing that happens at a place and a time" should be borrowed from someone else's vocabulary. The first problem there is that if you do that, then if that other vocabulary defines inference rules, then anyone that uses your vocabulary must respect those rules or their ontology becomes inconsistent.
Another problem is that other people's vocabularies are never really quite exactly what you need.
And the underlying problem is that something as simple as "a point in time" is actually a really hard question. What about things that have a duration, that happen over a couple of weeks? Things that are cyclic? Points in time that are uncertain or vaguely defined? What about the distinction between something that happens over a two week period, and something that is instantaneous but that is known to have happened over a certain two week period; or that is known to have happened at least once over that period; or that almost certainly will happen at some period in the future (e.g., a scheduled observation)? The second you try to nail this stuff down, it immediately starts sprouting hair. It's intractable and perhaps not the kind of thing that biologists can best spend their time doing.
Here at biodiversity.org.au, our new APNI dataset is exposed using a custom-made vocabulary (aside from things like dc:title) without much of an attempt to describe our objects in terms of well-known classes. Maybe at some point in the future we might be able to pull the TDWG, DwC, SKOS terms into the data set. But that can't be done if those terms are so strictly defined that our data in fact does not meet the strict definitions of those terms. As it stands, our ontology does not connect to other ontologies by way of the vocabulary in which it is described.
So is there any hope at all that we can create a distributed semantic web of facts relating to taxonomy (the output of the taxonomic work that people are doing)?
I think so, because the biologists and taxonomists are working on the same stuff, certainly the same kinds of stuff, within the constraints of the Real World™. And this maybe is a clue about where to look for useful vocabulary. Rather than attempting to solve age-old questions about the nature of time and thing, look to the specific subject matter.
After all - why have an 'event' class at all? The only thing you can do with such a class is to construct a query that asks, for instance, "tell me about everything whose foaf:person is Dr Joe Bloggs and that is a thing that happened". On the other hand, "specimen" in the strict sense of "something in a collection with an accession number" is very important, fairly specific to the subject matter, and entirely worth having a common term for.
Likewise, 'location' may mean 'geographic polygon', or it may conceivably mean a collection, or an institution. (How so? Because a location is anything that might be given in reply to the question 'where is X?'. Notice that this is language-dependent). Each of these three things has an existing vocabulary defined by geographers, librarians, and (I suppose) company registrars respectively. Does anyone really need a higher layer linking them together?
Enough rambling, I think. FWIW:
* define the lower-layer objects and terms that people actually use when they do taxonomy
* don't bother with the higher abstractions. That way lieth madness, and is a time-sink.
A few quick comments:
In which case, you would need a separate class for a thing that happens at a time, but which is not located (and perhaps, as the philosophers say, does not have an extension).
Not necessarily; at least not from the perspective of modelling biodiversity data. If you really wanted to capture a load of metadata about the "time" aspect (which one could actually do), I could see justification in establishing a class of object for the "Time" component, in the same way that we have a class for "Location". But fundamentally, we're just talking about coordinates for four-dimensional space-time; so really "Location" and "Time" might best be wrapped into the same "Space-Time" class (i.e., the "Where/When" class), and the Event would then become something along the lines of a "Who/What" class. But at some point, perfecting the conceptual data model represents an impediment to practical progress.
And this in a way is a reply to Bob's original question - why aren't these relationships explicit? The reason is that the second you try to make them so, you almost immediately start running into philosophical conundrums that people have been debating for thousands of years. The old "How many angels can dance on the head of a pin?" problem is an exercise in distinguishing between things that are located and things that are extended.
Agreed! The art of all this is in balancing the "Normalize until it hurts" exercise with the "De-Normalize until it works" reality. There's no objectively correct answer. Just a cloud of possible options that we're gradually trying to collectively sharpen down.
Something as abstract and basic as "thing that happens at a place and a time" should be borrowed from someone else's vocabulary. The first problem there is that if you do that, then if that other vocabulary defines inference rules, then anyone that uses your vocabulary must respect those rules or their ontology becomes inconsistent.
Another problem is that other people's vocabularies are never really quite exactly what you need.
DEFINITELY agreed!!!!
And the underlying problem is that something as simple as "a point in time" is actually a really hard question. What about things that have a duration, that happen over a couple of weeks? Things that are cyclic?
Indeed! And it was a poor choice of words on my side to use the word "point". "Time" should be regarded in the same way that we regard the other three dimensions. That is, either in the form of an arbitrarily precise point with an explicitly stated error (as we do in DwC for geocoordinates), or in an explicit min/max range (as we do in DwC for things like elevation and depth), or in a way that handles real-world data (actual ranges, ranges representing imprecision/uncertainty, and multiple points within a scope bound by min and max values, etc.) We could deal with this stuff by treating "time" as a class, but in my experience (and yes, I actually did go down that road many years ago), the payoff ain't worth the investment.
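To make the two representations mentioned above concrete (a point with an explicit error, and an explicit min/max range), here is a minimal Python sketch that normalizes both into a single earliest/latest interval. The class name, method names, and dates are hypothetical; nothing here corresponds to an existing Darwin Core term or to the model described in this message. It also does not distinguish a true duration from an uncertainty window, which is exactly the kind of extra bookkeeping that makes a full "time" class expensive.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TimeExtent:
    """A closed interval on the time axis, analogous to a min/max depth range."""
    earliest: datetime
    latest: datetime

    def __post_init__(self):
        if self.latest < self.earliest:
            raise ValueError("latest must not precede earliest")

    @classmethod
    def from_point(cls, when: datetime, error: timedelta) -> "TimeExtent":
        """Build an interval from a point plus a symmetric stated error."""
        return cls(when - error, when + error)

    def overlaps(self, other: "TimeExtent") -> bool:
        """True if the two intervals share at least one instant."""
        return self.earliest <= other.latest and other.earliest <= self.latest


# Illustrative values only.
sighting = TimeExtent.from_point(datetime(2015, 8, 8, 12, 0), timedelta(hours=1))
survey = TimeExtent(datetime(2015, 8, 1), datetime(2015, 8, 14))
print(sighting.overlaps(survey))
```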
So is there any hope at all that we can create a distributed semantic web of facts relating to taxonomy (the output of the taxonomic work that people are doing)?
Probably not. At least not in my lifetime (or career-time). But one can always hope! :-)
I think so, because the biologists and taxonomists are working on the same stuff, certainly the same kinds of stuff, within the constraints of the Real World™. And this maybe is a clue about where to look for useful vocabulary. Rather than attempting to solve age-old questions about the nature of time and thing, look to the specific subject matter.
Yeah....but.... we can't even get past the tired old arguments about "what is a species", and the difference between nomenclature and taxonomy (names and concepts), and even the word "name" has a staggering heterogeneity of meaning even within the confines of biological nomenclature/taxonomy. It baffles me that the same arguments that were going on when Taxacom/TDWG were born continue almost unabated today. But, like I said.... one can always hope! :-)
After all - why have an 'event' class at all? The only thing you can do with such a class is to construct a query that asks, for instance, "tell me about everything whose foaf:person is Dr Joe Bloggs and that is a thing that happened". On the other hand, "specimen" in the strict sense of "something in a collection with an accession number" is very important, fairly specific to the subject matter, and entirely worth having a common term for.
We've debated this one back and forth as well. But it always ends up in the same place. That is, all of these arguments ultimately end up with the same conclusion: everything should be reduced to a triple-store. Indeed, life would be so easy if that were actually practical. At this stage of human digestion of biodiversity data, software availability and consumer fluency, and a number of other Real World™ issues, however, it does not seem to be the path of least resistance towards progress.
Likewise, 'location' may mean 'geographic polygon', or it may conceivably mean a collection, or an institution. (How so? Because a location is anything that might be given in reply to the question 'where is X?'. Notice that this is language-dependent). Each of these three things has an existing vocabulary defined by geographers, librarians, and (I suppose) company registrars respectively. Does anyone really need a higher layer linking them together?
And this is trivial compared to the analogous issues surrounding the term "taxon" (or "taxon name", or "taxon concept", etc.) Oy vey!
Enough rambling, I think. FWIW:
Likewise.
Aloha, Rich
P.S. Bob -- yes, I transcribed both posts to GitHub....
Richard L. Pyle, PhD Database Coordinator for Natural Sciences | Associate Zoologist in Ichthyology | Dive Safety Officer Department of Natural Sciences, Bishop Museum, 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
I have opened https://github.com/tdwg/vocab/issues/26 about DwC Class hierarchy and hope all discussion will continue there instead of in the tdwg-tag mailing list. If nothing else, discussion in the issue list is less prone to proliferation of many embedded copies of the traffic. -Bob
On Fri, Aug 7, 2015 at 8:23 PM, Bob Morris morris.bob@gmail.com wrote:
-- Robert A. Morris
Emeritus Professor of Computer Science UMASS-Boston 100 Morrissey Blvd Boston, MA 02125-3390
Filtered Push Project Harvard University Herbaria Harvard University
email: morris.bob@gmail.com web: http://efg.cs.umb.edu/ web: http://wiki.filteredpush.org http://wiki.datakurator.net http://taxonconceptexplorer.org/ http://www.cs.umb.edu/~ram
participants (4)
- Bob Morris
- pmurray
- Richard Pyle
- Steve Baskauf