[tdwg-content] More Strange Monkey Business-like things in GBIF KOS Document

joel sachs jsachs at csee.umbc.edu
Wed Feb 16 02:24:58 CET 2011

Hi Hilmar,

No argument from me, just my prejudice against "solution via 
ontology", and my enthusiasm for "schema-last" - the idea that the schema 
reveals itself after you've populated the knowledge base. This was never 
really possible with relational databases, where a table must be defined 
before it can be populated. But graph databases (especially the "anyone can say 
anything" semantic web) practically invite a degree of schema-last.
Examples include Freebase (schema-last by design), and FOAF, whose 
specification is so widely ignored and mis-used (often to good effect), 
that the de-facto spec is the one that can be abstracted from FOAF files 
in the wild.
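
A minimal sketch of what "abstracting the de-facto spec" could look like, 
assuming the instance data is available as plain (subject, predicate, object) 
tuples. The sample triples, the ex: prefix, and the infer_schema helper are 
all hypothetical and only illustrate the idea; a real harvest of FOAF files 
would use an RDF parser.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical instance data, standing in for FOAF files found in the wild.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:bob", "foaf:mbox", "mailto:bob@example.org"),
    ("ex:doc1", RDF_TYPE, "foaf:Document"),
    ("ex:doc1", "foaf:maker", "ex:alice"),
]


def infer_schema(triples):
    """Count which predicates are actually used on instances of each class."""
    # First pass: collect the declared types of every subject.
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)

    # Second pass: tally predicate usage per class.
    usage = defaultdict(Counter)
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for cls in types.get(s, {"(untyped)"}):
            usage[cls][p] += 1
    return usage


schema = infer_schema(triples)
for cls in sorted(schema):
    print(cls, dict(schema[cls]))
```

The per-class predicate counts are the schema revealing itself: nothing was 
declared up front, and the tallies show which properties people actually use.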

The semantic web is littered with ontologies lacking instance data; my 
hope is that generating instance data is a significant part of the 
ontology building process for each of the ontologies proposed by the 
report. By "generating instance 
data" I mean not simply marking up a few example records, but generating 
millions of triples to query over as part of the development cycle. This 
will indicate both the suitability of the ontology to the use 
cases, and also its ease of use.
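
As a rough sketch of querying instance data during development, assume a 
competency question such as "in which countries has a given taxon been 
recorded?". The ex: terms, the sample records, and both helper functions 
below are hypothetical, and a real test would run over a triple store 
rather than a Python list.

```python
# Hypothetical occurrence records expressed as triples.
triples = [
    ("ex:occ1", "ex:taxon", "ex:Puma_concolor"),
    ("ex:occ1", "ex:country", "ex:Chile"),
    ("ex:occ2", "ex:taxon", "ex:Puma_concolor"),
    ("ex:occ2", "ex:country", "ex:Peru"),
    ("ex:occ3", "ex:taxon", "ex:Lama_guanicoe"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def countries_for_taxon(triples, taxon):
    """Competency question: where has this taxon been recorded?"""
    occurrences = {s for s, _, _ in match(triples, p="ex:taxon", o=taxon)}
    return sorted(
        o for s, _, o in match(triples, p="ex:country") if s in occurrences
    )


print(countries_for_taxon(triples, "ex:Puma_concolor"))
print(countries_for_taxon(triples, "ex:Lama_guanicoe"))
```

An empty answer for the second taxon is the useful signal here: either the 
data lacks a country for that record, or the vocabulary makes it awkward to 
state one.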

I like the order in which the GBIF report lists its infrastructure 
recommendations. Persistent URIs (the underpinning of everything);
followed by competency questions and use cases (very helpful in the 
prevention of mental masturbation); followed by OWL ontologies (to 
facilitate reasoning). Perhaps the only place where we differ is that 
you're comfortable with "incorporate instance data into the ontology 
design process" being implicit, while I never tire of seeing that point 
hammered home.

Regards - 

On Mon, 14 Feb 2011, Hilmar Lapp wrote:

> On Feb 14, 2011, at 12:05 PM, joel sachs wrote:
>> I think the recommendations are heavy on building ontologies, and light on 
>> suggesting paths to linked data representations of instance data.
> Good observation. I can't speak for all of the authors, but in my experience 
> building Linked Data representations is mostly a technical problem, and thus 
> much easier compared to building soundly engineered, commonly agreed upon 
> ontologies with deep domain knowledge capture. The latter is hard, because it 
> requires overcoming a lot of social challenges.
> As for the GBIF report, personally I think linked biodiversity data 
> representations will come at about the same pace whether or not GBIF pushes 
> on that front (though GBIF can help make those representations better by 
> provisioning stable resolvable identifier services, URIs etc). There is a 
> unique opportunity though for "neutral" organizations such as GBIF (or, in 
> fact, TDWG), to significantly accelerate the development of sound ontologies 
> by catalyzing the community engagement, coherence, and discourse that is 
> necessary for them.
> 	-hilmar
> -- 
> ===========================================================
> : Hilmar Lapp  -:- Durham, NC -:- informatics.nescent.org :
> ===========================================================
