[tdwg-tapir] Fwd: Tapir protocol - Harvest methods?

Renato De Giovanni renato at cria.org.br
Wed May 21 20:16:44 CEST 2008


If we want to ensure the lowest possible barrier for providers, then 
I think zipped CSV files need to be supported. If we really want to 
handle complex data using the same format, then we need something 
like the CSV star scheme you mentioned (with well-defined rules about 
all files and how the records are related).

The limitation in this case is that we would only handle one-level 
relationships, so it is not a generic solution. Providers with complex 
data would probably also need to write some code to generate the dumps 
(not sure how many providers would do it), unless wrappers that can 
handle complex data implement additional functionality to produce 
these dumps.
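
To give a rough idea of the code a provider might need to write, here 
is a minimal sketch in Python. Everything in it is an assumption made 
for illustration: the file names, the function name write_star_dump, 
and the rule that the first column of every extension row must match 
an identifier in the first column of the core file. None of this is a 
format agreed on the list.

```python
import csv
import io
import zipfile


def _to_csv(rows):
    """Serialise a list of rows to CSV text."""
    out = io.StringIO(newline="")
    csv.writer(out).writerows(rows)
    return out.getvalue()


def write_star_dump(core_rows, extensions, core_name="core.csv"):
    """Build a zipped dump: one core file plus one file per extension.

    core_rows:  rows whose first column is the record identifier.
    extensions: dict of file name -> rows whose first column points
                back to a core identifier (one-level relationship only).
    Returns the zip archive as bytes.
    """
    core_ids = {row[0] for row in core_rows}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(core_name, _to_csv(core_rows))
        for name, rows in extensions.items():
            # Hypothetical integrity rule: no pointer without a core record.
            for row in rows:
                if row[0] not in core_ids:
                    raise ValueError(
                        f"{name}: pointer {row[0]!r} has no core record"
                    )
            archive.writestr(name, _to_csv(rows))
    return buffer.getvalue()
```

A provider would call it with the core rows and a dictionary of 
extension rows, for example write_star_dump(core, {"names.csv": names}). 
Anything deeper than one level of relationship does not fit this 
layout, which is the limitation mentioned above.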

On the other hand, if we allow more than one format, complex data 
could be handled with compact XML representations (in a generic way) 
which could be automatically produced by existing wrappers.

So my understanding is that the biggest decision is: use a single 
format (CSV) with additional rules for complex data, or allow 
different formats (one for simple and another for complex data).

Although I know it's usually much better for clients to deal with a 
single format, my *feeling* in this case is that it would be more 
effective to allow different formats. I'm also not sure if it would 
be easier for clients to handle additional star scheme rules when 
importing complex data than it would be to parse a single XML file 
encoded in some compact structure.

Just some thoughts...

Best Regards,

On 20 May 2008 at 17:36, Markus Döring wrote:

> Renato,
> complex data can also be represented by tab files, with a file for  
> each extension that has a pointer in the first column.
> That is what we originally had in mind with the star scheme.
> Markus
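
On the client side, the layout Markus describes (one tab file per 
extension, with a pointer in the first column) could be imported along 
the lines of the following Python sketch. The function name 
read_star_dump, the default core file name core.txt, and the choice to 
silently skip extension rows whose pointer matches no core record are 
all assumptions for illustration, not rules from the thread.

```python
import csv
import io
import zipfile
from collections import defaultdict


def _read_tab(archive, name):
    """Yield the non-empty rows of one tab-delimited file in the archive."""
    with archive.open(name) as raw:
        text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
        for row in csv.reader(text, delimiter="\t"):
            if row:
                yield row


def read_star_dump(zip_bytes, core_name="core.txt"):
    """Join extension rows to core records via the first-column pointer.

    Returns a dict: identifier -> {"core": [...], "extensions": {...}},
    where "extensions" maps each extension file name to its rows.
    """
    records = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for row in _read_tab(archive, core_name):
            records[row[0]] = {
                "core": row[1:],
                "extensions": defaultdict(list),
            }
        for name in archive.namelist():
            if name == core_name:
                continue
            for row in _read_tab(archive, name):
                # Assumed policy: ignore pointers with no core record.
                if row[0] in records:
                    records[row[0]]["extensions"][name].append(row[1:])
    return records
```

The join itself is a single dictionary lookup per extension row, so 
the extra work for a client is mostly in knowing which file is the 
core and what each extension file means.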
