[tdwg-content] Software available for LSID resolution, Linked Data, OAI-PMH [SEC=UNCLASSIFIED]

Paul Murray pmurray at anbg.gov.au
Tue Jun 28 09:37:25 CEST 2011


We have uploaded to Google Code the software currently used at biodiversity.org.au to serve LSID metadata, serve linked-data objects identified by URI, and reply to OAI-PMH requests.

The code is available here:
	svn checkout http://ala-nsl.googlecode.com/svn/service-layer/trunk service-layer

or
	svn checkout https://ala-nsl.googlecode.com/svn/service-layer/trunk service-layer

(I find that http doesn't work for me - something to do with authenticating to google code through the corporate firewall)

The system is our "quick, get something working" interim solution for creating a web presence for our data, pending a full National Species List application. It is a suite of XSQL source files that are loaded into an instance of the eXist XML database (exist.sourceforge.net).

Using it is a matter of

* acquiring a domain name that you wish to use for your LSID authority and hosting the software
* generating dumps of your data as XML
* identifying your XML elements with LSIDs
* writing XSL stylesheets to convert your XML into the various output formats that you wish to support.
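
For the third step, an LSID is a URN of the form urn:lsid:<authority>:<namespace>:<object>, optionally followed by :<revision>. A minimal Python sketch of assembling and splitting such identifiers follows. The names Lsid, make_lsid and parse_lsid are made up for illustration and are not part of the service-layer code.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """The parts of an LSID (illustrative type, not from the repository)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def make_lsid(authority: str, namespace: str, object_id: str,
              revision: Optional[str] = None) -> str:
    """Assemble an LSID string from its parts (hypothetical helper)."""
    parts = ["urn", "lsid", authority, namespace, object_id]
    if revision is not None:
        parts.append(revision)
    return ":".join(parts)


def parse_lsid(text: str) -> Lsid:
    """Split an LSID into authority, namespace, object and optional revision."""
    parts = text.split(":")
    # Expect "urn", "lsid", then three or four non-empty components.
    # The "urn:lsid" prefix is compared case-insensitively here.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    if any(p == "" for p in parts[2:]):
        raise ValueError(f"empty component in LSID: {text!r}")
    return Lsid(*parts[2:])
```

For example, make_lsid("biodiversity.org.au", "apni.taxon", "54321") produces the kind of identifier you would attach to an element in your XML dump.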

As this system imposes few constraints on the underlying XML, you do not need to "massage" your data into some specific format. Being small and self-contained, it could be a useful way of exposing small data sets peculiar to some space - although it accommodates millions of records perfectly well at our installation.

The payoff is that the system correctly implements the LSID standard (etc). This can be seen by navigating to
	http://lsid.tdwg.org/summary/urn:lsid:biodiversity.org.au:apni.taxon:54321

As you can see, the software at TDWG has no difficulty talking to our resolver and fetching the data from it.
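
The URL above is simply the TDWG resolver's summary path with the LSID appended. As a rough Python sketch, the following derives that URL from an LSID, and also the linked-data URI that appears inside the validator link further down this message. The LSID-to-URI mapping (http://<authority>/<namespace>/<object>) is inferred from those two examples only, so treat it as an assumption rather than a documented rule of the software. The function names are illustrative.

```python
TDWG_SUMMARY_BASE = "http://lsid.tdwg.org/summary/"


def _split_unversioned(lsid: str) -> tuple:
    """Return (authority, namespace, object) for an LSID without a revision."""
    parts = lsid.split(":")
    if len(parts) != 5 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"expected an unversioned LSID, got {lsid!r}")
    return tuple(parts[2:])


def tdwg_summary_url(lsid: str) -> str:
    """URL of the TDWG resolver's summary page for this LSID."""
    _split_unversioned(lsid)  # validate only; the LSID is appended as-is
    return TDWG_SUMMARY_BASE + lsid


def linked_data_uri(lsid: str) -> str:
    """Linked-data URI for this LSID (mapping assumed from the examples)."""
    authority, namespace, object_id = _split_unversioned(lsid)
    return f"http://{authority}/{namespace}/{object_id}"
```

Versioned LSIDs are rejected on purpose, since the system has not been tested with them.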

Likewise, linked data is implemented - although HEAD requests are not getting through correctly and so the first test on this page fails.
	http://validator.linkeddata.org/vapour?vocabUri=http%3A%2F%2Fbiodiversity.org.au%2Fapni.taxon%2F54321&classUri=http%3A%2F%2F&propertyUri=http%3A%2F%2F&instanceUri=http%3A%2F%2F&defaultResponse=dontmind&userAgent=vapour.sourceforge.net

Nevertheless: uriburner, zeitgeist and so on manage to retrieve and process the data from our URIs correctly.
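
If you want to run the same validator check against one of your own URIs, the long link above can be rebuilt programmatically. This Python sketch copies the query parameter names and placeholder values from that link; the function name vapour_url is made up here, and nothing about the validator service beyond what the link shows is assumed.

```python
from urllib.parse import urlencode

VAPOUR_ENDPOINT = "http://validator.linkeddata.org/vapour"


def vapour_url(resource_uri: str) -> str:
    """Build a Vapour validator link for resource_uri (hypothetical helper)."""
    # Parameter names, order and placeholder values are taken verbatim
    # from the example link in this message.
    params = [
        ("vocabUri", resource_uri),
        ("classUri", "http://"),
        ("propertyUri", "http://"),
        ("instanceUri", "http://"),
        ("defaultResponse", "dontmind"),
        ("userAgent", "vapour.sourceforge.net"),
    ]
    # urlencode percent-encodes ":" and "/" in the values.
    return VAPOUR_ENDPOINT + "?" + urlencode(params)
```

Calling vapour_url("http://biodiversity.org.au/apni.taxon/54321") reproduces the link given above.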

Caveats:

* The system is not tested with LSIDs with versions. I believe that the usual use for versioned LSIDs is for serving up data (eg: media) rather than metadata, and that is not the focus of this bit of software.
* The system requires some configuration, both internally and at the web server. Most particularly, we needed to perform mappings at our reverse proxy to get URIs passed through to the correct targets behind our firewall.
* The script that loads the executable XSQL into eXist is a Unix script which works on Solaris and on OS X. The repository does not include an equivalent Windows command file.
* You will almost certainly need to write at least some XSLT to get the output you want. The repository comes with a demo dataset.
* There is minimal documentation, I'm afraid. The READMEs point you at our Confluence page, but there's not much there about getting this bit of software going.
* The system does not magically make the issues around RDF and XML vocabularies and the like vanish. Conforming to standards is a matter of generating and loading conformant XML, or tweaking your XSLT.