Robert Huber wrote:
It would be very interesting to me whether OAI ever was discussed in the TDWG or for use in GBIF?
---------------------
Robert,
Back in the 90's issues of "control" (integrity and availability, even down to classes of users) and attribution really dominated discussions. Many institutions wanted assurance that users would understand who created, or at least digitized, and maintained the data. Opposing schools of thought emerged: the "data hoarders" versus the "data anarchists". Given that context, several forays into data integration (Species Analyst, REMIB, AVH, etc.) prior to DiGIR and BioCASe all assumed a requirement for distributed queries (and later indexing), rather than harvesting and wholesale caching. We were aware of OAI by 2000, and it was even implemented in the vPlants project (http://www.vplants.org/), but the rest of us avoided that approach because we fully expected "let us serve your data for you" to be a much harder sell. The community is STILL very wary of empire building.
Regarding Rod Page's contrast between the approaches in biodiversity- and (molecular) bio-informatics: I think one of the drivers behind the two approaches is that (lab-)bench scientists tend to view sequence data as observation-events that are relatively fixed once they are submitted, whereas collections people tend to view specimens (and maybe even classifications) as evolving entities that must be updated. We have a longer view of the specimens and the data about them.
Having said that, there is no doubt that building large warehouses is a simpler and better-performing architecture for integration. We just need to address data replication explicitly and deliberately, in a heterogeneous environment where "subscribers" can go offline (i.e., a database-transaction-oriented approach would be too stringent).
A GUID system that incorporates versioning should help support data replication. Open and standards-based protocols appear to be very effective in calming fears about empire building.
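As a rough illustration of how versioned GUIDs could support replication to subscribers that go offline, here is a minimal Python sketch. It is not DiGIR, BioCASe, OAI, or any existing protocol; the names Record, Provider, Subscriber, changes_since, and the example GUID strings are all hypothetical, and the change log is assumed to be a simple in-memory, append-only list. The idea is that a subscriber remembers how far it has read and catches up whenever it reconnects, so no transaction has to span both sides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One version of a specimen record (hypothetical structure)."""
    guid: str      # globally unique identifier, assumed stable across versions
    version: int   # incremented by the provider on every update
    data: dict


class Provider:
    """Data owner: keeps an append-only log of every record version."""

    def __init__(self):
        self._latest = {}  # guid -> most recent Record
        self._log = []     # every Record version, in order of creation

    def put(self, guid, data):
        previous = self._latest.get(guid)
        version = previous.version + 1 if previous else 1
        record = Record(guid, version, dict(data))
        self._latest[guid] = record
        self._log.append(record)
        return record

    def changes_since(self, cursor):
        """Return log entries after `cursor` plus the new cursor position."""
        return self._log[cursor:], len(self._log)


class Subscriber:
    """Warehouse or cache: pulls changes whenever it happens to be online."""

    def __init__(self):
        self.cache = {}  # guid -> newest Record seen so far
        self.cursor = 0  # position in the provider's log already processed

    def sync(self, provider):
        changes, new_cursor = provider.changes_since(self.cursor)
        applied = 0
        for record in changes:
            current = self.cache.get(record.guid)
            # The version number decides which copy wins, so replaying
            # or re-sending the same change is harmless.
            if current is None or record.version > current.version:
                self.cache[record.guid] = record
                applied += 1
        self.cursor = new_cursor
        return applied
```

A subscriber that misses several updates simply calls sync() when it comes back and receives everything recorded after its cursor. Because each cached record carries its GUID and version, a user of the warehouse can still tell which provider record, and which revision of it, they are looking at.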
-Stan
Stanley D. Blum, Ph.D.
Research Information Manager
California Academy of Sciences
875 Howard St.
San Francisco, CA
+1 (415) 321-8183