RDF query and inference in a distributed environment

Tue Jan 3 23:33:59 CET 2006

> I'm a great believer in caching, but it comes equipped with one trivial
> technical cost which is not always trivial socially. That is, the data
> originator must be willing to enter into a contract (in the software,
> not legal, sense) to update the data only at a specific time, that is to
> provide a "good until" promise.

Not if automated and robust synchronization protocols are established.  This
is what I meant by "issues of replication/synchronization".  Data records
would not need to have a pre-determined lifespan.  Rather, the lifespan
would be defined as "good until updated by an authorized editor".  Once an
authorized editor updates *any* copy of a given GUID-tagged record, the
update would be automatically propagated to all protocol-compliant
replicas of that record. Unlike dynamic DNS, most of our records (taxon
names, specimen objects, defined & georeferenced localities, publications,
etc.) are static and perpetual, changing only when errors or
omissions are corrected. Thus, most of the data traffic among replicate
copies would be from new record additions, rather than existing record
modifications.
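
To make the "good until updated by an authorized editor" idea concrete,
here is a minimal sketch in Python. It is not an existing protocol: the
names Record, Replica, edit and receive, the per-record version counter,
and the in-memory peer list are all assumptions made for illustration.
An authorized edit on any replica bumps the record's version and is
pushed to every peer, which accepts it only if it is newer than its own
copy.

```python
from dataclasses import dataclass


@dataclass
class Record:
    guid: str         # globally unique identifier of the record
    data: dict        # record content (hypothetical free-form fields)
    version: int = 0  # bumped on every authorized edit


class Replica:
    """One copy of the data store (hypothetical, in-memory only)."""

    def __init__(self, name, authorized_editors):
        self.name = name
        self.authorized_editors = set(authorized_editors)
        self.records = {}  # guid -> Record
        self.peers = []    # other Replica objects to notify

    def edit(self, editor, guid, data):
        """Apply an authorized edit locally, then push it to all peers."""
        if editor not in self.authorized_editors:
            raise PermissionError(f"{editor} may not edit {guid}")
        current = self.records.get(guid)
        version = current.version + 1 if current else 1
        update = Record(guid, dict(data), version)
        self.receive(update)
        for peer in self.peers:
            peer.receive(update)

    def receive(self, update):
        """Accept an update only if it is newer than the local copy."""
        current = self.records.get(update.guid)
        if current is None or update.version > current.version:
            # Store a private copy so replicas never share mutable state.
            self.records[update.guid] = Record(
                update.guid, dict(update.data), update.version
            )
```

A client reading from any replica then sees the latest authorized
version without needing an expiry date on the record. The sketch
deliberately ignores the hard parts (unreachable peers, and two editors
changing the same record on different replicas at the same moment),
which is where a real protocol would need retry and conflict-resolution
rules.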

> Otherwise, a client of the cached
> version is clueless whether the data in the cache is accurate.

Not if robust automated replication/synchronization protocols are in place.

Aloha,
Rich