[tdwg-tapir] Hosting strategies

Wouter Addink wouter at eti.uva.nl
Mon May 14 14:12:44 CEST 2007

We are currently following a kind of push strategy in developing a 
checklists provider tool for GBIF. The tool converts data in 
spreadsheets or text files into a public database compatible with TCS, with 
a TAPIR service. That database can be at the local provider or directly at 


----- Original Message ----- 
From: "Roger Hyam" <roger at tdwg.org>
To: "Dave Vieglais" <vieglais at ku.edu>
Cc: <tdwg-tapir at lists.tdwg.org>
Sent: Monday, May 14, 2007 11:40 AM
Subject: Re: [tdwg-tapir] Hosting strategies

> Dave,
> I think you are right. Whatever strategy is used, there has to be an 
> element of push in it. The production database has to push data to the 
> publicly visible database that can then be scraped/searched by interested 
> parties. The difference between pushing data to a public database that is 
> managed by your own institution and one that is managed by a third party 
> is really quite minor.
> I am concerned because I am not sure of the technical facilities and 
> human resources of potential data suppliers.
> I wonder if anyone has some figures on this stuff?
> All the best,
> Roger.
> On 11 May 2007, at 12:02, Dave Vieglais wrote:
>> Hi Everyone,
>> Not really a TAPIR-specific response, but perhaps the right audience. 
>> I'm probably stating the obvious, but the simplest way to get around the 
>> hassles of running a server and the associated firewall headaches is not 
>> to serve the data but instead to push it.  By adding an authentication 
>> layer, it would be an extension to the GBIF REST services to allow 
>> POSTing data, rather than just GET (I may be wrong on this - not exactly 
>> sure what degree of REST implementation has been done by GBIF).  Add in 
>> DELETE and UPDATE and, instead of GBIF running harvesters to capture 
>> data, contributors could simply push their data when necessary.  This 
>> would, I expect, be an attractive solution for those data providers that 
>> would prefer not to operate servers but would still like to contribute 
>> to the global knowledge pool of biodiversity.
>> More than likely a mixed model would be closer to ideal, with some data 
>> sources acting as servers, and others pushing their data.  How about if 
>> those institutions that were comfortable running and maintaining servers 
>> also adopted the same complete REST implementation as the hypothetical 
>> GBIF mentioned above?  And what if the servers were, for the most part, 
>> aware of each other and so could act as proxies or mirrors for the other 
>> servers (perhaps even automatically replicating content)?  The end 
>> result would be more of a mesh topology of comparatively high 
>> reliability and availability.
>> It would, I think, be a more scalable solution, and perhaps more 
>> maintainable in the long term, since it is a relatively simple thing to 
>> update a standalone application that could push the data compared with 
>> updating and securing a server.  The expense of participation in the 
>> networks would drop, and resources could be directed towards operation 
>> of a few high quality / high reliability services for accessing the 
>> data.
>> Just a thought.  There are obvious social implications, such as the 
>> perception of losing control of one's data - but then it could also be 
>> argued that if a provider had the ability to DELETE their records from a 
>> server, then they actually have more control over the distribution of 
>> their data than currently.
>> cheers,
>>   Dave V.
>> On May 11, 2007, at 19:19, Roger Hyam wrote:
>>> Hi Everyone,
>>> There is a requirement that all wrapper type applications (TAPIR, 
>>> DiGIR, BioCASe and others) have but that I don't think we address.
>>> All instances need to have:
>>> Either a database on a server in a DMZ or with an ISP with the ability 
>>> to export data from the production database to the public database and 
>>> then keep changes in the production database synchronized with the 
>>> public database.
>>> Or the ability to provide a secured/restricted connection directly to 
>>> the production database through the firewall.
>>> Configuring the wrapper software against a database seems a smaller 
>>> problem than getting a handle on an up to date database to configure it 
>>> against!
>>> Should we have a recommended strategy or best practice for overcoming 
>>> these problems? Do we have any figures on how they are overcome in the 
>>> existing BioCASe and DiGIR networks?
>>> Many thanks for your thoughts,
>>> Roger
>>> _______________________________________________
>>> tdwg-tapir mailing list
>>> tdwg-tapir at lists.tdwg.org
>>> http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
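
To make the push model discussed in this thread a little more concrete,
here is a rough sketch in Python of what a standalone push client might
do: compare the provider's local records with what the public server
already holds, then work out which records to POST, replace or DELETE.
Everything specific here is an assumption for illustration only: the
endpoint URL, the JSON record layout, the bearer-token header and the
function names plan_push and build_request are all hypothetical, and
nothing below reflects an actual GBIF or TAPIR interface. PUT stands in
for the "UPDATE" mentioned in the thread, since HTTP itself defines no
UPDATE method.

```python
import json
from urllib.request import Request

# Hypothetical endpoint; the thread does not name a real push URL.
BASE_URL = "https://example.org/records"


def plan_push(local, remote):
    """Return the (method, record_id) steps that make `remote` match `local`.

    Both arguments map a record id to a dict of field values.
    """
    steps = []
    for record_id in sorted(local):
        if record_id not in remote:
            steps.append(("POST", record_id))    # new record
        elif local[record_id] != remote[record_id]:
            steps.append(("PUT", record_id))     # changed record
    for record_id in sorted(remote):
        if record_id not in local:
            steps.append(("DELETE", record_id))  # withdrawn by the provider
    return steps


def build_request(method, record_id, local, token):
    """Build (but do not send) an authenticated HTTP request for one step."""
    body = None
    if method != "DELETE":
        body = json.dumps(local[record_id]).encode("utf-8")
    # New records go to the collection URL, existing ones to their own URL.
    url = BASE_URL if method == "POST" else f"{BASE_URL}/{record_id}"
    request = Request(url, data=body, method=method)
    # Assumed authentication scheme; the thread only says "an
    # authentication layer" without specifying one.
    request.add_header("Authorization", f"Bearer {token}")
    if body is not None:
        request.add_header("Content-Type", "application/json")
    return request
```

A provider-side tool could call plan_push after each export from the
production database and send the resulting requests with
urllib.request.urlopen. Because the client only makes outgoing
connections, nothing on the provider's side has to be reachable through
the firewall, which is the point Dave raises above.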
