[tdwg-tag] Re: TDM Ontology
Renato De Giovanni
renato at cria.org.br
Mon May 7 18:35:00 CEST 2007
OK, so I "think" I'm starting to understand the problem that has led
to the current approach taken by the taxon data model.
To make a much simplified analogy with specimen data, imagine
a collection that ran a simple OCR process on all its labels and now
has only one table with a single text field. Since labels can contain
different things (some have coordinates, others not, some have a
collecting date, etc.), the suggestion for this kind of
situation would then be to tag all records individually. For instance,
saying that "in this specific record you may find something about the
location, in this other record you may find something about location
and date" and so on.
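To make the analogy concrete, here is a minimal sketch of record-level tagging over a single text field. The record contents, the tag names ("location", "date") and the helper records_with are all made up for illustration; nothing here comes from an actual collection or from the TDM terms.

```python
# Hypothetical OCR'd label records: one free-text field per record,
# plus a set of tags assigned to each record individually.
records = [
    {"id": 1, "text": "Near Campinas, SP", "tags": {"location"}},
    {"id": 2, "text": "Near Manaus, AM; 12 Mar 1998", "tags": {"location", "date"}},
    {"id": 3, "text": "Leg. J. Silva", "tags": set()},
]


def records_with(tag, records):
    """Return the ids of the records whose tag set contains `tag`."""
    return [r["id"] for r in records if tag in r["tags"]]
```

The point of the sketch is only that every record carries its own tag set, so each record has to be visited and revised once to assign it.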
Now back to species databases, if tagging is really something at the
record level (and I suppose it is), I would be really surprised to
see a species database which is ready to use some kind of tagging
mechanism. Tagging at the record level would therefore require
changing the data structure and revising all records.
If this kind of work is being considered, then why not restructure
everything according to the new terminology that was proposed during
the meeting? Unless we are talking about some kind of data that
simply cannot be separated and structured according to the proposed
Looking at the results of the meeting, it's really tempting to take
all terms and put them into a simple conceptual schema like
DarwinCore. It would not only provide a common XML vocabulary, but we
would almost instantly benefit from the existing technology for
sharing/accessing distributed data. From the TAPIR perspective, data
exchange schemas like PlinianCore could be seen as output models.
All providers from the different networks could still try to map the
same agreed terms/concepts.
If tagging does not take place at the record level, but at the field
level, like "I have field X which sometimes has content about
behaviour, sometimes about evolution, sometimes both, so I will
automatically tag all records with both terms", then I see no big
difference from the current way of using TAPIR, where we would just
take the two corresponding concepts and map them to the same local field.
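A rough sketch of that equivalence, assuming a plain dictionary as the concept-to-field mapping. This is not actual TAPIR configuration syntax; the names concept_mapping, field_x, resolve and tags_for_field are hypothetical, and the row content is invented.

```python
# Hypothetical mapping of two concepts to the same local field.
concept_mapping = {
    "behaviour": "field_x",
    "evolution": "field_x",
}

# One illustrative row from the local database.
row = {"id": 1, "field_x": "Nocturnal forager; diverged from its sister species."}


def resolve(concept, row):
    """Return the local field content that a concept is mapped to."""
    return row.get(concept_mapping[concept])


def tags_for_field(field):
    """Return every concept mapped to `field`, i.e. its field-level tags."""
    return sorted(c for c, f in concept_mapping.items() if f == field)
```

Asking for either concept returns the same text, and reading the mapping backwards gives exactly the set of terms that would have been used to tag every record with content in that field.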
RDF could still be one of the TAPIR outputs, but the ontology would
probably need a different approach (as discussed in previous
On 4 May 2007 at 13:16, Bob Morris wrote:
> > Anyway, I'm not quite familiar with species-level data sources. From the
> > previous messages, it seems that the main reason for using the generic
> > tagging approach is that most data sources will have chunks of text
> > including information about one or more TDM categories, and it will be
> > impractical to separate this information in a more structured way. Did I
> > understand the problem correctly?
> Yes, but it is worse. Many such sources have \both/ textual---but
> categorized---data and structured data. And both may need ontological
> mapping so that both machine integration and human display applications
> have a chance of putting together the right stuff and also not ignoring
> what the client wishes not be ignored.
> > In this case, then you're right that it would be interesting if someone
> > could investigate this a bit more, make some tests and give us a more
> > practical feedback. If most participants of the species model workshop
> > have this kind of database, maybe they could try to map their fields to
> > the TDM categories.
> I am presently doing some of that, albeit first trying to hand code some
> instances with Protege and Altova SemanticWorks. I guess the interesting
> part will come for stuff that \doesn't/ map well. At the moment, I am
> somewhat at a loss for what our intent was in this case, but maybe in
> another few hours I will have figured that out. ...
> > Best Regards,
> > --
> > Renato