[tdwg-content] A radical proposal for Darwin Core

joel sachs jsachs at csee.umbc.edu
Tue Jun 25 04:39:07 CEST 2013


Thanks - responses below.

On Mon, 24 Jun 2013, Robert Guralnick wrote:

>   Joel -- From an insider-outsider perspective, a couple quick comments:
> 1)  Do you mean Darwin Core is frequently misunderstood by standards
> developers?  Or do you mean Darwin Core is frequently misunderstood by
> people without specialized skills to read and understand standards? 

Good questions. What most concerns me is that, amongst those who are 
expressing occurrence data on the semantic web, there is a divergence of 
opinion regarding the semantics of core Darwin Core terms.

> 2)  I see the point that some clean-up would be useful but my view is that
> Darwin Core fulfills its intended purpose for most people who want to map
> their headers in a spreadsheet to a set of terms in the Core.  

I think you're right. The question is, can we clarify the semantics of 
existing terms in a way that will enable Darwin Core to be used 
effectively on the semantic web, without breaking existing infrastructure, 
and without altering the interpretations of existing Darwin Core records? 
I think we should try.

> This supports
> an ecosystem of data that has become available online over the last 15 years.
>  I was talking to Tim Robertson, and I think the number is 3 records per
> second (on average) coming online via GBIF, the vast majority in Darwin
> Core format.

> 3)  Is it enough to clean up Darwin Core somehow, wipe our hands and walk
> away?  I guess maybe we could be sharper with term definitions.  But is that
> the problem or is the problem that what we want to do with Darwin Core
> doesn't fit its history and intended use as an exchange format?

Most of the arguments and discussions I've seen on this list have involved 
doing pretty basic things with Darwin Core - mainly representing 
occurrences of organisms at particular times and places. We saw (or, at 
least, I saw) a shift away from a specimen-centric point of view when the 
rdfs:subClassOf backbone was removed from the dwctype vocabulary (previously 
called the type hierarchy) circa 2010/2011. But the documentation 
surrounding other parts of the standard has not caught up with this 
shift. (This is, IMO, a source of confusion.)

> 4)  I see the bigger challenge being how we grow more semantically
> meaningful representations that let us do new things

Yes - this is the bigger challenge! Clarifying whether an occurrence is a 
category of information, a superclass of preserved specimen, a record of 
an organism's appearance at a particular place and time, or all of the 
above should be the easy part.
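
To make the ambiguity concrete, here is a minimal sketch of the three 
readings as plain subject/predicate/object triples. Everything under 
http://example.org/ (record42, specimen7, sighting3, PreservedSpecimen) 
is hypothetical, as are the date and coordinates; only the Occurrence 
class URI comes from this thread. The helper instances_of is mine, not 
part of any library.

```python
# Illustrative only: three competing readings of dwc:Occurrence, as triples.
DWC = "http://rs.tdwg.org/dwc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace for made-up resources

READINGS = {
    # Reading 1: an occurrence is a category of information (a record).
    "category_of_information": {
        (EX + "record42", RDF_TYPE, DWC + "Occurrence"),
    },
    # Reading 2: Occurrence is a superclass of preserved specimen.
    "superclass_of_specimen": {
        (EX + "PreservedSpecimen", RDFS_SUBCLASS, DWC + "Occurrence"),
        (EX + "specimen7", RDF_TYPE, EX + "PreservedSpecimen"),
    },
    # Reading 3: an organism's appearance at a particular place and time.
    "organism_at_place_and_time": {
        (EX + "sighting3", RDF_TYPE, DWC + "Occurrence"),
        (EX + "sighting3", DWC + "eventDate", "2013-06-24"),
        (EX + "sighting3", DWC + "decimalLatitude", "39.25"),
        (EX + "sighting3", DWC + "decimalLongitude", "-76.71"),
    },
}


def instances_of(triples, cls):
    """Return subjects typed as cls or as any transitive subclass of cls."""
    classes = {cls}
    changed = True
    while changed:  # keep expanding until no new subclass is found
        changed = False
        for s, p, o in triples:
            if p == RDFS_SUBCLASS and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return {s for s, p, o in triples if p == RDF_TYPE and o in classes}


if __name__ == "__main__":
    for name, triples in READINGS.items():
        found = sorted(instances_of(triples, DWC + "Occurrence"))
        print(name, "->", found)
```

Each reading answers "name an instance of Occurrence" with a different 
kind of thing (a record, a specimen, a sighting), which is the 
divergence I'm worried about.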

> (an example might be
> the Biocollections Ontology (BCO)) versus more limited things we do with
> Darwin Core.  
>   This is just my naive impression.  I am not an expert in RDF or the
> semantic web.  

No one is. (Well, maybe this one guy at MIT ...)

> I'd like yet more clarity before we get into what might be a
> challenging task.  Could we be even more focused?  

Yes, we must be more focussed than simply saying "we'll clean it all up".

> Can we surgically repair
> the key things in DwC rather than do a "clean up"?  

Possibly. Steve has been working tirelessly to do just that, and he 
recently asked for suggestions on how to repair our fractured 
understanding of dwc:Occurrence. My suggestion is to tackle a few other 
key terms while we're at it, in the context of a clarification of the 
purposes of the two Darwin Core namespaces, and the status of other 
relevant standards.


> Best, Rob
> On Mon, Jun 24, 2013 at 1:45 PM, joel sachs <jsachs at csee.umbc.edu> wrote:
>       Hi Everyone,
>       Darwin Core remains poorly documented, occasionally inconsistent,
>       and frequently misunderstood. (Does anyone disagree with that
>       characterization?) I believe this is one of the reasons we're
>       seeing a proliferation of overlapping and sometimes incompatible
>       ontologies building on Darwin Core terms.
>       One of the suggestions that came up on the TDWG-RDF mailing list
>       is to have a clean-up-a-thon/document-a-thon for TDWG namespaces
>       and terms. I suggest that, until such a clean up of Darwin Core
>       occurs, TDWG accept no additions to the Darwin Core standard.
>       There are several examples in support of my claim that we're
>       building on a shaky foundation - an obvious one is that, as Steve
>       is currently pointing out, there is no consensus on what
>       constitutes a Darwin Core occurrence. (Can anyone name an
>       instance of the class "http://rs.tdwg.org/dwc/terms/Occurrence"?)
>       The clean-up-a-thon proposal was enthusiastically endorsed within
>       the RDF group, but no one volunteered to organize it. I propose
>       that we self-organize, and find a way to carve out two days at
>       the coming meeting to hash out as much as we can, with a
>       follow-on workshop if necessary. But first, I'd be interested to
>       know - am I the only one who feels this way?
>       Sincerely,
>       Joel.
>       p.s.
>       I've said this before, but it bears repeating - Darwin Core is
>       almost an excellent standard, and almost ideally suited to be the
>       foundation for a semantic web for biodiversity informatics. I
>       have great respect for those who were involved in its creation
>       and continued curation - for their hard work, and clear thinking,
>       and patience for people like me struggling to understand. But all
>       that work, thought, and patience will be for naught, if the gyre
>       is allowed to widen much further.
