Re: [tdwg-tag] Re: TDM Ontology

8 May 2007

Renato,

Thanks for your comments. That is an interesting view of the problem
and I think you may be correct for the supplier databases (though I
don't have first-hand knowledge of these database schemas). Generally,
the nearer the exchange format is to the supplier's schema, the easier
it will be for them to publish. I believe that taking the approach
Markus suggests would produce the result you are after.

There is just one problem that you didn't address.

Who wants to consume the data and what do they want to do with it?

To have something that is easy to produce, easy to consume and easy  
to extend is more or less impossible. There has to be some pain  
somewhere!

What is your vision of a client application? How would it handle  
elements it hadn't seen before - or is this not a requirement?

All the best,

Roger

On 7 May 2007, at 17:35, Renato De Giovanni wrote:
...
OK, so I "think" I'm starting to understand the problem that has led
to the current approach taken by the taxon data model.
To make a much simplified analogy with specimen data, imagine
a collection that ran a simple OCR process on all its labels and now
has only one table with a single textual field. Since we can find
different things in labels (some may have coordinates, others not,
some may have a collecting date, etc.), the suggestion for this kind
of situation would then be to tag all records individually. For
instance, saying that "in this specific record you may find something
about the location, in this other record you may find something about
location and date" and so on.
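A rough sketch of what record-level tagging could look like for the
label analogy above. The records, tag names and the records_about
helper are all made up for illustration; they are not taken from any
real collection or from the proposed terminology.

```python
# Hypothetical OCR'd label records: one free-text field per record.
records = {
    1: "Brazil, Sao Paulo, Campinas. 22.9S 47.06W",
    2: "Collected near Manaus, 12 Mar 1987",
}

# Record-level tags, assigned individually to each record
# (by hand in this sketch; the tag names are assumptions).
tags = {
    1: {"location"},
    2: {"location", "date"},
}


def records_about(term):
    """Return the sorted ids of records tagged with the given term."""
    return sorted(rid for rid, record_tags in tags.items() if term in record_tags)


print(records_about("location"))
print(records_about("date"))
```

Note that every record needs its own entry in the tag table, which is
the per-record revision effort discussed here.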
Now back to species databases, if tagging is really something at the
record level (and I suppose it is), I would be really surprised to
see a species database which is ready to use some kind of tagging
mechanism. Tagging at the record level would therefore require
changing the data structure and revising all records.
If this kind of work is being considered, then why not restructure
everything according to the new terminology that was proposed during
the meeting? Unless we are talking about some kind of data that
simply cannot be separated and structured according to the proposed
terms...
Looking at the results of the meeting, it's really tempting to take
all terms and put them into a simple conceptual schema like
DarwinCore. It would not only provide a common XML vocabulary, but we
would almost instantly benefit from the existing technology for
sharing/accessing distributed data. From the TAPIR perspective, data
exchange schemas like PlinianCore could be seen as output models.
All providers from the different networks could still try to map the
same agreed terms/concepts.
If tagging will not take place at the record level, but at the field
level, like "I have field X which sometimes has content about
behaviour, sometimes about evolution, sometimes both, so I will
automatically tag all records with both terms", then I see no big
difference if in the current way of using TAPIR we just take the two
corresponding concepts and map them against the same local field.
RDF could still be one of the TAPIR outputs, but the ontology would
probably need a different approach (as discussed in previous
messages).
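A minimal sketch of the field-level case above, where two concepts are
mapped against the same local field. This is plain Python, not actual
TAPIR configuration; the concept names, the local field names and the
resolve helper are assumptions made up for illustration.

```python
# Hypothetical mapping from agreed concepts to local database fields.
# "Behaviour" and "Evolution" both point at the same local field.
concept_to_field = {
    "ScientificName": "name",
    "Behaviour": "notes",
    "Evolution": "notes",
}

# One illustrative local record with a mixed free-text field.
row = {
    "name": "Panthera onca",
    "notes": "Solitary and mostly nocturnal.",
}


def resolve(record, concepts):
    """Return the local value for each requested concept."""
    return {concept: record[concept_to_field[concept]] for concept in concepts}


result = resolve(row, ["Behaviour", "Evolution"])
print(result)
```

Both concepts return the same text, so a consumer cannot tell which
part of the field is about which concept. That is the same information
a blanket tag on field X would give.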
Best Regards,
--
Renato
On 4 May 2007 at 13:16, Bob Morris wrote:
...
...
Anyway, I'm not quite familiar with species-level data sources. From
the previous messages, it seems that the main reason for using the
generic tagging approach is that most data sources will have chunks of
text including information about one or more TDM categories, and it
will be impractical to separate this information in a more structured
way. Did I understand the problem correctly?
Yes, but it is worse. Many such sources have \both/ textual---but
categorized---data and structured data. And both may need ontological
mapping so that both machine integration and human display
applications have a chance of putting together the right stuff and
also not ignoring what the client wishes not to be ignored.
...
In this case, then you're right that it would be interesting if
someone could investigate this a bit more, make some tests and give us
more practical feedback. If most participants of the species model
workshop have this kind of database, maybe they could try to map their
fields to the TDM categories.
I am presently doing some of that, albeit first trying to hand code
some instances with Protege and Altova SemanticWorks. I guess the
interesting part will come for stuff that \doesn't/ map well. At the
moment, I am somewhat at a loss for what our intent was in this case,
but maybe in another few hours I will have figured that out. ...
Bob
...
Best Regards,
--
Renato
_______________________________________________
tdwg-tag mailing list
tdwg-tag@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-tag

Roger Hyam