[tdwg-tag] Re: TDM Ontology

Tue May 8 10:35:22 CEST 2007

Renato,

Thanks for your comments. That is an interesting view of the problem
and I think you may be correct for the supplier databases (though I
don't have first-hand knowledge of these database schemas). Generally,
the nearer the exchange format is to the supplier's schema, the easier
it will be for them to publish. I believe taking the approach Markus
suggests would produce the result you are after.

There is just one problem that you didn't address.

Who wants to consume the data and what do they want to do with it?

To have something that is easy to produce, easy to consume and easy  
to extend is more or less impossible. There has to be some pain  
somewhere!

What is your vision of a client application? How would it handle  
elements it hadn't seen before - or is this not a requirement?
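
To make the question concrete, here is a minimal sketch of one possible
answer: a client that keeps unrecognised elements aside instead of
rejecting the record. The element names, the sample record and the
read_record function are all made up for illustration; none of them come
from the TDM terms, TAPIR or any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of element names this client was written to understand.
KNOWN_ELEMENTS = {"ScientificName", "Behaviour", "Distribution"}


def read_record(xml_text):
    """Split a record's child elements into understood and unrecognised ones."""
    root = ET.fromstring(xml_text)
    known, unknown = {}, {}
    for child in root:
        text = (child.text or "").strip()
        # Elements outside the known set are kept, not dropped, so the
        # client can still display or log them as opaque text.
        target = known if child.tag in KNOWN_ELEMENTS else unknown
        target[child.tag] = text
    return known, unknown


# Made-up sample record, for illustration only.
sample = """<SpeciesRecord>
  <ScientificName>Puma concolor</ScientificName>
  <Behaviour>Solitary and mostly nocturnal.</Behaviour>
  <Evolution>Free text the client has never seen before.</Evolution>
</SpeciesRecord>"""

known, unknown = read_record(sample)
```

Whether showing unknown elements as opaque text is good enough depends on
what the consumer wants to do with the data, which is exactly the open
question.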

All the best,

Roger

On 7 May 2007, at 17:35, Renato De Giovanni wrote:

> OK, so I "think" I'm starting to understand the problem that has led
> to the current approach taken by the taxon data model.
>
> Trying to make a quite simplified analogy with specimen data, imagine
> a collection that used a simple OCR process on all labels and now it
> has only one table with a single textual field. Since we can find
> different things in labels (some may have coordinates, others not,
> some may have collecting date, etc.) the suggestion for this kind of
> situation would then be to tag all records individually. For instance
> saying that "in this specific record you may find something about the
> location, in this other record you may find something about location
> and date" and so on.
>
> Now back to species databases, if tagging is really something at the
> record level (and I suppose it is), I would be really surprised to
> see a species database which is ready to use some kind of tagging
> mechanism. Tagging at the record level would therefore require
> changing the data structure and revising all records.
>
> If this kind of work is being considered, then why not restructure
> everything according to the new terminology that was proposed during
> the meeting? Unless we are talking about some kind of data that
> simply cannot be separated and structured according to the proposed
> terms...
>
> Looking at the results of the meeting, it's really tempting to take
> all terms and put them into a simple conceptual schema like
> DarwinCore. It would not only provide a common XML vocabulary, but we
> would almost instantly benefit from the existing technology for
> sharing/accessing distributed data. From the TAPIR perspective, data
> exchange schemas like PlinianCore could be seen as output models.
> All providers from the different networks could still try to map the
> same agreed terms/concepts.
>
> If tagging will not take place at the record level, but at the field
> level, like "I have field X which sometimes has content about
> behaviour, sometimes about evolution, sometimes both, so I will
> automatically tag all records with both terms", then I see no big
> difference if in the current way of using TAPIR we just take the two
> corresponding concepts and map them against the same local field.
>
> RDF could still be one of the TAPIR outputs, but the ontology would
> probably need a different approach (as discussed in previous
> messages).
>
> Best Regards,
> --
> Renato
>
> On 4 May 2007 at 13:16, Bob Morris wrote:
>>>
>>> Anyway, I'm not quite familiar with species-level data sources. From
>>> the previous messages, it seems that the main reason for using the
>>> generic tagging approach is that most data sources will have chunks
>>> of text including information about one or more TDM categories, and
>>> it will be impractical to separate this information in a more
>>> structured way. Did I understand the problem correctly?
>>
>> Yes, but it is worse. Many such sources have \both/ textual---but
>> categorized---data and structured data. And both may need ontological
>> mapping so that both machine integration and human display
>> applications have a chance of putting together the right stuff and
>> also not ignoring what the client wishes not to be ignored.
>>>
>>> In this case, then you're right that it would be interesting if
>>> someone could investigate this a bit more, make some tests and give
>>> us more practical feedback. If most participants of the species
>>> model workshop have this kind of database, maybe they could try to
>>> map their fields to the TDM categories.
>>
>> I am presently doing some of that, albeit first trying to hand code
>> some instances with Protege and Altova SemanticWorks. I guess the
>> interesting part will come for stuff that \doesn't/ map well. At the
>> moment, I am somewhat at a loss for what our intent was in this case,
>> but maybe in another few hours I will have figured that out. ...
>>
>> Bob
>>
>>>
>>> Best Regards,
>>> --
>>> Renato