Rob,
Thanks - responses below.
On Mon, 24 Jun 2013, Robert Guralnick wrote:
Joel -- From an insider-outsider perspective, a couple quick comments:
- Do you mean Darwin Core is frequently misunderstood by standards
developers? Or do you mean Darwin Core is frequently misunderstood by people without specialized skills to read and understand standards?
Good questions. What most concerns me is that, amongst those who are expressing occurrence data on the semantic web, there is a divergence of opinion regarding the semantics of core Darwin Core terms.
- I see the point that some clean-up would be useful but my view is that
Darwin Core fulfills its intended purpose for most people who want to map their headers in a spreadsheet to a set of terms in the Core.
I think you're right. The question is, can we clarify the semantics of existing terms in a way that will enable Darwin Core to be used effectively on the semantic web, without breaking existing infrastructure, and without altering the interpretations of existing Darwin Core records? I think we should try.
This supports an ecosystem of data that has become available online over the last 15 years. I was talking to Tim Robertson, and I think the number is 3 records per second (on average) coming online via GBIF, the vast majority in Darwin Core format.
- Is it enough to clean up Darwin Core somehow, wipe our hands and walk
away? I guess maybe we could be sharper with term definitions. But is that the problem, or is the problem that what we want to do with Darwin Core doesn't fit its history and intended use as an exchange format?
Most of the arguments and discussions I've seen on this list have involved doing pretty basic things with Darwin Core - mainly representing occurrences of organisms at particular times and places. We saw (or, at least, I saw) a shift away from a specimen-centric point of view when the rdfs:subClassOf backbone was removed from the dwctype vocabulary (previously called the type hierarchy) circa 2010/2011. But the documentation surrounding other parts of the standard has not caught up with this shift. (This is, IMO, a source of confusion.)
- I see the bigger challenge being how we grow more semantically
meaningful representations that let us do new things
Yes - this is the bigger challenge! Clarifying whether an occurrence is a category of information, a superclass of preserved specimen, a record of an organism's appearance at a particular place and time, or all of the above should be the easy part.
(an example might be the Biocollections Ontology (BCO)) versus more limited things we do with Darwin Core.
This is just my naive impression. I am not an expert in RDF or the semantic web.
No one is. (Well, maybe this one guy at MIT ...)
I'd like yet more clarity before we get into what might be a challenging task. Could we be even more focused?
Yes, we must be more focussed than simply saying "we'll clean it all up".
Can we surgically repair the key things in DwC rather than do a "clean up"?
Possibly. Steve has been working tirelessly to do just that, and he recently asked for suggestions on how to repair our fractured understanding of dwc:Occurrence. My suggestion is to tackle a few other key terms while we're at it, in the context of a clarification of the purposes of the two Darwin Core namespaces, and the status of other relevant standards.
Cheers, Joel.
Best, Rob
On Mon, Jun 24, 2013 at 1:45 PM, joel sachs <jsachs@csee.umbc.edu> wrote:

Hi Everyone,

Darwin Core remains poorly documented, occasionally inconsistent, and frequently misunderstood. (Does anyone disagree with that characterization?) I believe this is one of the reasons we're seeing a proliferation of overlapping and sometimes incompatible ontologies building on Darwin Core terms.

One of the suggestions that came up on the TDWG-RDF mailing list is to have a clean-up-a-thon/document-a-thon for TDWG namespaces and terms. I suggest that, until such a clean up of Darwin Core occurs, TDWG accept no additions to the Darwin Core standard.

There are several examples in support of my claim that we're building on a shaky foundation - an obvious one is that, as Steve is currently pointing out, there is no consensus on what constitutes a Darwin Core occurrence. (Can anyone name an instance of the class "http://rs.tdwg.org/dwc/terms/Occurrence"?)

The clean-up-a-thon proposal was enthusiastically endorsed within the RDF group, but no one volunteered to organize it. I propose that we self-organize, and find a way to carve out two days at the coming meeting to hash out as much as we can, with a follow-on workshop if necessary. But first, I'd be interested to know - am I the only one who feels this way?

Sincerely,
Joel.

p.s. I've said this before, but it bears repeating - Darwin Core is almost an excellent standard, and almost ideally suited to be the foundation for a semantic web for biodiversity informatics. I have great respect for those who were involved in its creation and continued curation - for their hard work, and clear thinking, and patience for people like me struggling to understand. But all that work, thought, and patience will be for naught, if the gyre is allowed to widen much further.

_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content