[tdwg-tag] Re: TDM Ontology

Tue May 8 22:25:45 CEST 2007

Hi Roger,

I'm not sure I share this vision of a "law of conservation of pain".
It's true that one of the points in the other message was to ease the 
process of sharing data, but this doesn't mean that clients will 
necessarily have trouble (I hope not!).

From the TAPIR perspective, we handle extensibility by allowing 
providers to work with multiple conceptual schemas. If you produce a 
list of concepts from the TDM terms, anyone is free to produce other 
complementary lists in the future, without breaking compatibility.

You know that in TAPIR it's also possible to produce outputs in 
different XML formats, even RDF. This should facilitate the work of 
clients. 

I suppose that clients will usually request data in formats that 
include elements that they know something about. But anyway, nothing 
prevents them from requesting things that they don't have any 
knowledge about. The TapirLink browser that I demonstrated during the TAPIR 
workshop is one of those clients: it dynamically builds an output 
model based on what the provider declared to have, and it simply 
displays this data in a tabular form.
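
To make the idea concrete, here is a minimal Python sketch of that 
kind of client. It is not TapirLink's actual code and it does not 
talk to a TAPIR provider: the concept names and records below are 
invented for illustration. It only shows that a client can build a 
table from whatever concepts were declared, without knowing their 
meaning.

```python
# Hypothetical data: the concept names and records are made up for this
# sketch and do not come from TAPIR, TapirLink or any real provider.

def build_table(declared_concepts, records):
    """Return a header row plus one row per record.

    Columns are taken from whatever the provider declared, in the
    declared order. The client needs no knowledge of what they mean.
    """
    header = list(declared_concepts)
    rows = [[str(rec.get(concept, "")) for concept in header] for rec in records]
    return [header] + rows


def render(table):
    """Format the table as plain text with padded columns."""
    widths = [max(len(row[i]) for row in table) for i in range(len(table[0]))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    )


declared = ["ScientificName", "Country", "SomeUnknownConcept"]
records = [
    {"ScientificName": "Puma concolor", "Country": "Brazil",
     "SomeUnknownConcept": "x1"},
    {"ScientificName": "Panthera onca", "Country": "Peru"},
]

table = build_table(declared, records)
print(render(table))
```

A concept the client has never seen simply becomes one more column, 
and a record that lacks a value for it gets an empty cell.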

Now let's assume that we decide to work with a generic conceptual 
schema with two main concepts, category of InfoItem and InfoItem 
value. Let's also assume that providers will be able to easily share 
their data according to this conceptual model. In TAPIR, the output 
formats will be very limited - they will need to follow this generic 
approach. But let's suppose that this will not be a problem.
What is going to happen is that clients will get almost anything from 
there - basically values of things that can be categorised in many 
ways. If clients want to perform validation they will need to do it 
themselves (the output format will be too generic, so we cannot use 
XML validation). Perhaps RDF validation will offer more 
possibilities, but then you're only considering data exchange in an 
RDF world. You would get the meaning of InfoItems from a dictionary 
of categories, in the same way that you could get the meaning of 
elements from a dictionary (DarwinCore for instance, or some 
ontology).
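
As a rough illustration of what "doing validation themselves" could 
mean for a client, here is a small Python sketch. Everything in it is 
an assumption made for the example: the category names, the checks 
and the representation of an InfoItem as a (category, value) pair are 
not taken from the TDM ontology or from TAPIR.

```python
# Hypothetical category dictionary: the names and the rules are
# assumptions for this sketch, not part of any TDWG standard.
CATEGORY_DICTIONARY = {
    "Distribution": lambda v: isinstance(v, str) and v.strip() != "",
    "ChromosomeNumber": lambda v: isinstance(v, str) and v.isdigit(),
}


def validate(info_items):
    """Split (category, value) pairs into accepted and rejected lists.

    With a generic output format the schema cannot say anything about
    individual categories, so every check has to live in the client.
    """
    accepted, rejected = [], []
    for category, value in info_items:
        check = CATEGORY_DICTIONARY.get(category)
        if check is None:
            rejected.append((category, value, "unknown category"))
        elif not check(value):
            rejected.append((category, value, "invalid value"))
        else:
            accepted.append((category, value))
    return accepted, rejected


items = [
    ("Distribution", "Brazil"),
    ("ChromosomeNumber", "thirty-eight"),
    ("Habitat", "forest"),
]
accepted, rejected = validate(items)
print(accepted)
print(rejected)
```

The client still needs a dictionary per category to do anything 
useful, which is the same knowledge a more specific schema would have 
carried.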

In this case, it's not clear to me what the big benefits of using 
the generic model approach would be, but maybe I'm missing something.
The more knowledge you have about the elements or concepts, the more 
interesting and powerful the applications will be. It's a 
philosophical issue. 

If we decide to avoid the more "traditional" way of structuring and 
modelling data because we feel it somehow limits our applications, 
then I think we first need to clearly understand what these 
limitations are. Otherwise, by doing things in a very different way we 
may miss the opportunity of using existing tools and resources, while 
still running the risk of meeting, further down a different road, the 
same data structuring issues that we tried to avoid.

Best Wishes,
--
Renato

PS: I'm sorry for crossposting. I'll send any follow-ups only to the 
new taxon-model mailing list:
http://lists.tdwg.org/mailman/listinfo/taxon-model

On 8 May 2007 at 9:35, Roger Hyam wrote:
> 
> Renato,
> 
> Thanks for your comments. That is an interesting view of the problem  
> and I think you may be  correct for the supplier databases (though I  
> don't have first hand knowledge of these database schemas). Generally  
> the nearer the exchange format is to the supplier's schema the easier  
> it will be for them to publish. Taking the approach Markus suggests  
> would produce the result you are after I believe.
> 
> There is just one problem that you didn't address.
> 
> Who wants to consume the data and what do they want to do with it?
> 
> To have something that is easy to produce, easy to consume and easy  
> to extend is more or less impossible. There has to be some pain  
> somewhere!
> 
> What is your vision of a client application? How would it handle  
> elements it hadn't seen before - or is this not a requirement?
> 
> All the best,
> 
> Roger