[tdwg-tapir] Output models based on RDFS class definitions?
Hi Guys,
This may be a crazy idea but I thought it was worth running past you.
I have been working on an RDFS vocabulary for TCS and this has spun off a bunch of ideas. One is stolen from the prismstandard.org approach of having avowed serializations of RDF data. You can read more here http://biodiv.hyam.net/schemas/TCS_RDF/tcs_rdf_examples.zip if you haven't already.
Following on from this, I have had the idea of generating Tapir output models from RDFS class definitions. The TaxonName class, for instance, has 23 properties, most of which are literals. It might be possible to generate a Tapir output model automatically from this class definition using XSLT. This would give people the ability to query over Tapir for TaxonName objects that are an avowed XML Schema serialization of RDF. The data mapping done by the client could be quite simple, because they only have to retrieve a single object (more or less a table per class if we were building a cache database), which would also make it fast. It would be easy for people to understand and work through a data mapping process to set up a provider, and the process could make use of querying a central repository of object types that is periodically updated.
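To make the generation step a bit more concrete, here is a rough sketch in Python (standing in for the XSLT mentioned above) that turns a trimmed-down class definition into an XML Schema fragment. The sample RDFS, the property name nameComplete, the function name class_to_xsd and the choice of xs:string / xs:anyURI are all assumptions for illustration. It only produces a schema fragment, not a complete Tapir output model.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, heavily trimmed class definition; the real TaxonName
# class has 23 properties and may be laid out differently.
SAMPLE_RDFS = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="TaxonName"/>
  <rdf:Property rdf:ID="nameComplete">
    <rdfs:domain rdf:resource="#TaxonName"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
  <rdf:Property rdf:ID="authorshipTeam">
    <rdfs:domain rdf:resource="#TaxonName"/>
    <rdfs:range rdf:resource="#Team"/>
  </rdf:Property>
</rdf:RDF>
"""


def class_to_xsd(rdfs_xml, class_name):
    """Build an xs:schema element with one child element per property of class_name."""
    root = ET.fromstring(rdfs_xml)
    ET.register_namespace("xs", XS)  # so serialized output uses the xs: prefix
    schema = ET.Element(f"{{{XS}}}schema")
    element = ET.SubElement(schema, f"{{{XS}}}element", name=class_name)
    ctype = ET.SubElement(element, f"{{{XS}}}complexType")
    seq = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for prop in root.findall(f"{{{RDF}}}Property"):
        domain = prop.find(f"{{{RDFS}}}domain")
        if domain is None or domain.get(f"{{{RDF}}}resource") != f"#{class_name}":
            continue  # property belongs to some other class
        rng = prop.find(f"{{{RDFS}}}range")
        is_literal = (
            rng is not None
            and rng.get(f"{{{RDF}}}resource") == f"{RDFS}Literal"
        )
        # Assumption: literals become strings, resource pointers become URIs (GUIDs).
        ET.SubElement(
            seq,
            f"{{{XS}}}element",
            name=prop.get(f"{{{RDF}}}ID"),
            type="xs:string" if is_literal else "xs:anyURI",
            minOccurs="0",
        )
    return schema


if __name__ == "__main__":
    print(ET.tostring(class_to_xsd(SAMPLE_RDFS, "TaxonName"), encoding="unicode"))
```

Each literal property ends up as a simple string element, which is what would keep the mapping close to one table per class.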
The problem comes with properties that are not literals but resource pointers. An example is TaxonName->authorshipTeam. In RDF this would be a resource that points to a team of people. If we all use GUIDs then the data source could just issue a GUID as the resource identifier, and the client could then run another query to get the author team (or do it by resolving an LSID...). This multi-call process is obviously inefficient. It effectively gets the client to do its own joins. But the client is going to want to do this over multiple data sources anyway, and it could be prohibitive to allow clients to run arbitrary joins on a single host machine, so it might be a price worth paying. It also solves the non-relational problem, i.e. what if the data source can provide AuthorTeams and TaxonNames but can't permit arbitrary queries between the two (e.g. one is a researcher's Access db and the other is the institution's collections management system)?
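A minimal sketch of that client-side join, in Python. The two data sources are stubbed as in-memory dictionaries keyed by GUID, and the StubSource class, the urn:example identifiers and the record fields are all made up for illustration. A real client would issue Tapir queries or resolve LSIDs instead. Caching the teams is one way to take some of the sting out of the extra calls.

```python
class StubSource:
    """Hypothetical stand-in for a remote provider; counts how often it is called."""

    def __init__(self, records):
        self.records = records
        self.calls = 0

    def get(self, guid):
        self.calls += 1
        return self.records[guid]


def join_names_with_teams(name_source, team_source, name_guids):
    """Fetch each name, then follow its authorshipTeam GUID to a second source."""
    team_cache = {}  # each team GUID is fetched at most once
    joined = []
    for guid in name_guids:
        name = dict(name_source.get(guid))  # copy so the source record is untouched
        team_guid = name.get("authorshipTeam")
        if team_guid is not None:
            if team_guid not in team_cache:
                team_cache[team_guid] = team_source.get(team_guid)
            name["authorshipTeam"] = team_cache[team_guid]
        joined.append(name)
    return joined


# Placeholder data: two names sharing one author team, held by different sources.
names = StubSource({
    "urn:example:name:1": {"label": "Aus bus", "authorshipTeam": "urn:example:team:9"},
    "urn:example:name:2": {"label": "Aus cus", "authorshipTeam": "urn:example:team:9"},
})
teams = StubSource({
    "urn:example:team:9": {"members": ["A. Author", "B. Author"]},
})

result = join_names_with_teams(
    names, teams, ["urn:example:name:1", "urn:example:name:2"]
)
```

The two sources never need to know about each other, which is the point of the Access db versus collections management system case.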
Effectively this is a mechanism for formally creating lots of Darwin Core schemas to use....
Does this sound like a crazy idea or not?
All the best,
Roger
participants (1)
-
Roger Hyam