[tdwg-tag] Final changes in TAPIR

Fri Jan 30 16:30:48 CET 2009

Renato,

I really think #2 is worth including.  There are times when I wish to 
send small requests through the TAPIR protocol, but there are other times 
when, especially on first inspection, it would be nice to pull an initial 
dump.  Your XML format for archives looks fine, but I would consider 
adding attributes like dateCreated and numberOfRecords.  That way, if 
there are monthly archives, for example, I could pull the latest one, and 
the record count would tell me how much is in the dump, not just the 
file size.  Depending on the actual dump files, my other question is 
whether each dump is a delta against the previous dump or a complete 
dump.  That way I would know whether I am getting the full data set or 
have to fetch all the dumps and merge them together.
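
To make the idea concrete, here is a rough sketch of how a harvester 
might use such attributes.  Everything in it is an assumption for 
illustration only: dateCreated and numberOfRecords are the attributes 
suggested above, dumpType ("full" or "delta") is a made-up name for the 
full-versus-delta flag, and the example.org locations and record counts 
are invented.  None of this is part of the TAPIR schema.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical capabilities fragment. dateCreated, numberOfRecords and
# dumpType are suggested/assumed attributes, not part of any TAPIR schema.
CAPABILITIES = """
<archives>
  <archive format="xml" location="http://example.org/dumps/2008-12-full.xml"
           dateCreated="2008-12-01" numberOfRecords="120000" dumpType="full"/>
  <archive format="xml" location="http://example.org/dumps/2009-01-delta.xml"
           dateCreated="2009-01-01" numberOfRecords="350" dumpType="delta"/>
  <archive format="xml" location="http://example.org/dumps/2008-11-full.xml"
           dateCreated="2008-11-01" numberOfRecords="118000" dumpType="full"/>
</archives>
"""


def archives_to_fetch(xml_text):
    """Return the newest full dump followed by any later deltas, oldest first."""
    # Sort all declared archives by creation date, oldest first.
    archives = sorted(
        ET.fromstring(xml_text).findall("archive"),
        key=lambda a: date.fromisoformat(a.get("dateCreated")),
    )
    fulls = [a for a in archives if a.get("dumpType") == "full"]
    if not fulls:
        # Only deltas are declared, so every one of them has to be merged.
        return archives
    latest_full = fulls[-1]
    start = archives.index(latest_full)
    # Keep only the deltas created after the newest full dump.
    later_deltas = [a for a in archives[start + 1:] if a.get("dumpType") == "delta"]
    return [latest_full] + later_deltas


plan = archives_to_fetch(CAPABILITIES)
for archive in plan:
    print(archive.get("location"), archive.get("numberOfRecords"))
```

With a full/delta flag and a creation date, the client can skip older 
full dumps entirely and merge only the deltas created after the newest 
one.  Without them it would have to download everything to find out.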

These are my initial thoughts.

Regards,

Michael Giddens
Biodiversity Informatics Software Development
www.SilverBiology.com
Baton Rouge, LA
phone: +1 225-937-9657
email: mikegiddens at silverbiology.com
skype: mikegiddens

renato at cria.org.br wrote:
> Dear all,
>
> There are just two items left on the list of possible changes before
> submitting TAPIR to the TDWG standards track:
>
> 1) Allow custom operations to be declared as part of capabilities.
>
> I would suggest to simply include a new custom slot for this in the schema
> in case someone needs to use it in the future.
>
> 2) Allow dump files to be declared.
>
> This has been discussed some time ago in the TAPIR mailing list but we
> didn't come to a final conclusion.
>
> Since some networks are starting to harvest data by fetching entire dump
> files, I think it's important to allow TAPIR services to declare any dump
> files that may be available. Fetching a dump file from a provider and
> using incremental harvesting in later interactions with the service will
> probably be the most efficient approach.
>
> Since TAPIR generates XML output, it makes more sense to me to see dump
> files in XML. However, Tim/Markus (GBIF) are proposing another format for
> dump files using tab/csv files together with a metafile. It should be easy
> to allow both options when declaring a dump file in TAPIR capabilities,
> but I don't think it's the role of TAPIR to define specific formats. We
> can probably use something like this to declare dump files:
>
> <archives>
>   <archive format="" location="" outputModel=""/>
>   ...
> </archives>
>
> Where format could be "xml" or any custom term, and outputModel would be
> optional (only used with "xml" format). Things like date when the dump
> file was generated and whether it's gzipped or not could be additional
> attributes, but in most cases this can be discovered through the protocol
> used to retrieve the file, so in principle I would not include the
> attributes.
>
> Please let me know if this is an acceptable solution or if you have any
> different thoughts. Also let me know if you have any other ideas or
> suggestions about TAPIR in general. This is the time.
>
> I would like to finally submit specification & schema to the standards
> track in the beginning of the week.
>
> Best Regards,
> --
> Renato
>
>
>