RDF query and inference in a distributed environment
Dear Rod,
Thanks for putting those ideas together and sharing them with the group. We really needed some more concrete examples to explore the use of LSID and RDF in the GUID metadata. It definitely sounds like a good combination of technologies to explore.
Reading your manuscript, one question kept coming to my mind. The most important benefits of using RDF seem to be the graph representation, the possibility of merging separate graphs into larger ones (the merge operation), and the possibility of making inferences on the resulting graph. All those features are supported by the so-called triple stores.
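As a rough illustration of the merge and inference operations mentioned above, here is a minimal sketch in plain Python rather than a real triple store. It assumes a graph is just a set of (subject, predicate, object) tuples with no blank nodes, so that merging reduces to set union, and it applies only two RDFS-style rules. The LSID and the `ex:` terms are made up for the example.

```python
# A graph is a set of (subject, predicate, object) tuples.
# Assumption: no blank nodes, so merging is a plain set union.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def merge(*graphs):
    """Merge any number of graphs into one by set union."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def infer(graph):
    """Apply two RDFS-style rules until no new triples appear:
    (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    (x type A),       (A subClassOf B) => (x type B)
    """
    closure = set(graph)
    while True:
        new = set()
        for (s, p, o) in closure:
            for (s2, p2, o2) in closure:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:
                    new.add((s, TYPE, o2))
        if new <= closure:  # fixed point reached
            return closure
        closure |= new


# Hypothetical data: one provider serves metadata for a name,
# another serves a small class hierarchy.
NAME = "urn:lsid:example.org:names:1"
provider_graph = {(NAME, TYPE, "ex:Species")}
ontology_graph = {
    ("ex:Species", SUBCLASS, "ex:Taxon"),
    ("ex:Taxon", SUBCLASS, "ex:Concept"),
}

inferred = infer(merge(provider_graph, ontology_graph))
print((NAME, TYPE, "ex:Taxon") in inferred)
```

Neither graph on its own yields the statement that the name is an `ex:Taxon`; it only appears after the two graphs are merged and the rules are applied, which is why the location of the triples matters for inference.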
However, the triple store examples I come across always use a local and centralized architecture. I'm aware that those are simplifications needed to get the new concepts across, but I suppose we need to address the scalability issues of those technologies if we want to use them in a production environment.
So the issue I wanted to explore a little on this list is the following: how do the RDF inference mechanisms scale up to a distributed environment like ours? In my opinion, GUID technologies provide only a bit (although a fundamental bit) of the infrastructure needed to make the RDF metadata framework fully functional, i.e., they provide the function "get me all metadata for this ID" (besides persistence and maybe provenance). I suppose that other pieces of functionality will be required, such as: i) other metadata discovery mechanisms (to find out, for example, who has metadata using this or that ontology); ii) a more flexible search mechanism (such as RDQL or SPARQL). In other words, we might need an RDF-compatible query protocol.
Then it makes a lot of sense to have "aggregators" all around (such as GBIF or others) harvesting and merging RDF triples of interest into local triple stores and providing query and inference services, and providing links (for humans and machines) back to the information source.
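To make the aggregator idea a little more concrete, here is a small sketch in plain Python. It is only an illustration under stated assumptions: the triples are assumed to be already fetched and parsed, the `Aggregator` class and the provider URLs are hypothetical, and the pattern query is a stand-in for a real query language such as SPARQL.

```python
from collections import defaultdict


class Aggregator:
    """Hypothetical aggregator: merges harvested triples into one local
    store and remembers which source(s) each triple came from."""

    def __init__(self):
        # triple -> set of source URLs that asserted it
        self.sources = defaultdict(set)

    def harvest(self, source_url, triples):
        """Add already-parsed triples from one source to the local store."""
        for triple in triples:
            self.sources[tuple(triple)].add(source_url)

    def query(self, s=None, p=None, o=None):
        """Return (triple, sources) pairs matching the pattern.
        None acts as a wildcard for that position."""
        pattern = (s, p, o)
        results = []
        for triple, srcs in self.sources.items():
            if all(want is None or want == got
                   for want, got in zip(pattern, triple)):
                results.append((triple, sorted(srcs)))
        return sorted(results)


# Made-up providers and data for illustration only.
NAME = "urn:lsid:example.org:names:1"
agg = Aggregator()
agg.harvest("http://provider-a.example/rdf", [
    (NAME, "dc:title", "Aus bus"),
    (NAME, "rdf:type", "ex:Species"),
])
agg.harvest("http://provider-b.example/rdf", [
    (NAME, "rdf:type", "ex:Species"),       # same triple, second source
    (NAME, "ex:publishedIn", "ex:paper42"),
])

# "Get me all metadata for this ID", with links back to each source.
for triple, srcs in agg.query(s=NAME):
    print(triple, srcs)
```

Keeping the source URL next to each triple is one simple way for the aggregator to link back to the information source; a real deployment would more likely use named graphs or a similar provenance mechanism.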
Does anyone have an idea of how well SPARQL or related technologies currently support this kind of functionality?
Cheers,
Ricardo
Roderic Page wrote:
I've finally managed to put down some ideas on LSIDs, metadata, and taxonomic names in the form of a manuscript. It's a bit rough, but I thought members of this list might find it of interest. You can grab a PDF here: http://darwin.zoology.gla.ac.uk/~rpage/lsid/examples/lsid.pdf
The web site http://darwin.zoology.gla.ac.uk/~rpage/lsid/examples has links to some example RDF files that are mentioned in the manuscript (where they are trimmed to save space).
I'd welcome any comments.
Regards
Rod
Professor Roderic D. M. Page
Editor, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom
Phone: +44 141 330 4778
Fax: +44 141 330 2792
email: r.page@bio.gla.ac.uk
web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org
Search for taxon names at http://darwin.zoology.gla.ac.uk/~rpage/portal/
Find out what we know about a species at http://ispecies.org
participants (1)
-
Ricardo Scachetti Pereira