RDF query and inference in a distributed environment

Richard Pyle deepreef at BISHOPMUSEUM.ORG
Tue Jan 3 08:24:53 CET 2006


> Long term what I think might happen is that users have their own triple
> stores, and as they do queries the results get added to their own
> triple store and they can make inferences locally that they are
> interested in. MIT's Piggy bank project
> (http://simile.mit.edu/piggy-bank/) is an example of this sort of
> approach.

With hard drive sizes spiraling skyward, and $/GB ($/TB) spiraling
downward, I'm wondering whether the "distributed" system that
serves us best might be "distributed mirror copies", rather than
distributed complementary data.  I've been pushing this approach for
taxonomic data for a while, but perhaps it would be useful for other shared
data as well (geographic localities, people/agents, publications/references,
etc.).  Even for specimen data -- where "ownership" is unambiguous -- it
seems that as long as the ownership is clearly embedded in the core
metadata, there are more fundamental advantages in storing and serving data
from multiple data resources than in serving it from a single data
resource.

One way to look at it would be "robust caching", with automated update
capabilities.  The main benefits would be:

1) Large-scale distributed backup of the world's biodata (ensuring
perpetuity across a changing technological landscape);
2) Performance and reliability enhancement for local data authority needs;
3) Essentially 100% data availability (like DNS), regardless of which
servers are up or down at any given moment;
4) Maximization of distributed work/effort for data "maintenance and
repair".

The point is, the technology discussions would focus less on issues of
distributed queries, and more on issues of replication/synchronization and
data edit authorization protocols.
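
A minimal sketch of what replication/synchronization plus edit
authorization might look like, assuming every record carries its owner in
its core metadata and a version counter that only the owner increments.
The names Record, Mirror, edit and pull_from are hypothetical, invented for
this illustration, and do not refer to any existing system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    record_id: str
    owner: str    # authority allowed to edit this record (assumed metadata)
    version: int  # incremented on each accepted edit (assumed scheme)
    data: str


class Mirror:
    """One full copy of the shared data set (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def edit(self, editor, record_id, data):
        """Accept an edit only from the record's owner."""
        current = self.records.get(record_id)
        if current is None:
            # First writer becomes the owner in this simplified model.
            new = Record(record_id, editor, 1, data)
        elif current.owner != editor:
            raise PermissionError(
                f"{editor} is not the owner of {record_id}"
            )
        else:
            new = Record(record_id, current.owner, current.version + 1, data)
        self.records[record_id] = new
        return new

    def pull_from(self, other):
        """Copy records that are missing here or newer on the other mirror.

        Returns the number of records updated.
        """
        updated = 0
        for record_id, theirs in other.records.items():
            mine = self.records.get(record_id)
            if mine is None or theirs.version > mine.version:
                self.records[record_id] = theirs
                updated += 1
        return updated


# Usage: an owner edits on one mirror, another mirror catches up.
mirror_a = Mirror("mirror-a")
mirror_b = Mirror("mirror-b")
mirror_a.edit("museum-a", "specimen-1", "initial determination")
mirror_b.pull_from(mirror_a)
```

This deliberately ignores the hard parts (the same owner editing on two
mirrors at once, deletions, network failures); it is only meant to show
where the ownership check and the version comparison would sit.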

Perhaps this would be reaching too far, too soon.  But on the other hand, I
don't see why implementing a "distributed mirror" system would be any more
technically, financially, or socially challenging than implementing a
distributed query system for distributed data.

Aloha,
Rich

Richard L. Pyle, PhD
Database Coordinator for Natural Sciences
  and Associate Zoologist in Ichthyology
Department of Natural Sciences, Bishop Museum
1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html




More information about the tdwg-tag mailing list