[tdwg-tapir] Interpretation of TAPIR filters
Renato De Giovanni
renato at cria.org.br
Tue Dec 4 18:17:54 CET 2007
I think we could still recommend that providers share their data using
the same datatypes defined by the conceptual schemas, but without being
too strict. After all, it's better to share something using a
different datatype than not to share anything. In the future, the TAPIR
Tester could try to check this and raise warnings when the concept
datatype is different from the mapped (declared) datatype.
So if we all agree, here's a summary of the necessary changes:
* Add an optional attribute (datatype) for each mapped concept in
* Datatypes would come from XML Schema built-in datatypes and should
be declared with the full URI, such as:
* Providers should declare the underlying datatype used when mapping
the concept, which should preferably be the same datatype defined by
the corresponding conceptual schema.
* The default datatype is http://www.w3.org/2001/XMLSchema#string
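To make the proposal concrete, here is a minimal Python sketch of how
a client or the TAPIR Tester might resolve the declared datatype of a
mapped concept, fall back to the default, and warn when it differs
from the conceptual schema. The function names, the dictionary layout
and the example.org concept id are assumptions for illustration only;
none of them come from TAPIR, PyWrapper or TapirLink.

```python
XSD_NS = "http://www.w3.org/2001/XMLSchema#"
DEFAULT_DATATYPE = XSD_NS + "string"


def declared_datatype(mapped_concept):
    # A missing or empty "datatype" attribute falls back to the default.
    return mapped_concept.get("datatype") or DEFAULT_DATATYPE


def datatype_warning(mapped_concept, schema_datatype):
    # Return a warning message on a mismatch, or None when both agree.
    declared = declared_datatype(mapped_concept)
    if declared == schema_datatype:
        return None
    return "%s: declared datatype %s differs from schema datatype %s" % (
        mapped_concept["id"], declared, schema_datatype)


# Hypothetical concept mapped without an explicit datatype attribute.
year = {"id": "http://example.org/concepts/Year"}
print(declared_datatype(year))
print(datatype_warning(year, XSD_NS + "gYear"))
```

A mismatch only produces a warning here, in line with the idea of
recommending the schema datatype without being too strict.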
However, there's one remaining issue: Should we handle custom
datatypes? For example, what would be the corresponding datatype for
DarwinCore/ABCD collecting dates? (now it's a custom DateTimeISO).
Regarding standard TAPIR errors, I certainly agree it would be
interesting to define them. I wish I had more time to revise what we
have and make a proposal.
About TapirLink, I think I used a similar approach. An error is
raised if you try to use "like" with non-string datatypes. Also, the
configurator doesn't check whether the underlying datatype is
compatible with the one defined by the conceptual schema.
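As a rough illustration of this kind of check (not the actual
TapirLink or PyWrapper code), the sketch below rejects "like" on any
concept whose declared datatype is not a string. The function
check_cop and the example.org concept ids are hypothetical, and the
exception name simply reuses the one suggested in the quoted message
below; it is not a standard TAPIR error.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class CopNotSupportedByLocalDatatype(Exception):
    """Hypothetical error raised for an unsupported comparison operator."""


def check_cop(cop, concept_id, datatype=XSD_STRING):
    # Only "like" is restricted in this sketch; other operators pass.
    if cop.lower() == "like" and datatype != XSD_STRING:
        raise CopNotSupportedByLocalDatatype(
            '"like" cannot be applied to %s (datatype %s)'
            % (concept_id, datatype))


# Accepted: the datatype defaults to string.
check_cop("like", "http://example.org/concepts/Locality")

# Rejected: "like" on an integer concept.
try:
    check_cop("like", "http://example.org/concepts/Year",
              "http://www.w3.org/2001/XMLSchema#integer")
except CopNotSupportedByLocalDatatype as err:
    print(err)
```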
On 4 Dec 2007 at 10:48, Döring, Markus wrote:
> Just catching up.
> I agree with Renato about all the filter issues as you probably have
> guessed. Regarding sorting now. In pywrapper I have left the "how" it gets
> sorted to the underlying database type. And that might be very different to
> the conceptual or model one. But I wonder if it is really important to be
> specific about the sorting order? At least it is sorted in some stable way.
> I agree it makes sense to announce that datatype in the capabilities, so you
> understand the sorting. That can easily be done using xml schema datatypes
> (should cover nearly all db types). The underlying datatype also affects
> what COPs you can use with it. You will get an error with PyWrapper for
> example if you do a LIKE on a date or integer type.
> If we will force people to adapt their datatypes in the underlying database
> this is quite a burden. This way every data provider will need a copy of
> their database and they will never be able to use the original dataset. That
> might not be a problem and in fact this allows you to do quite some data
> transformation in between, but for many providers I know this will be too
> much - or they will close to never update their data clone.
> So for now I would suggest to indicate the underlying db type in the concept
> capabilities and just use whatever there is for ordering. We should probably
> come up with a standard error for the CopNotSupportedByLocalDatatype. How do
> you deal with this in TapirLink Renato?