[Tdwg-guid] Zoobank LSID Service


peter.hollas＠thomson.com

22 Sep 2006

10:42

Hi everyone,

I am in the process of putting together the LSID service component of Zoobank.org. After a few initial hiccups and difficulties, I have a basic service up and running in the development environment.

Now I have arrived at the contentious issue of what data/metadata to return and in which flavour.

I have a few questions...

1. What format should I return the metadata response in? From discussions with Roger it seems appropriate to go with RDF/TCS for the metadata response. Would this be a good way to go, and does anyone have any static examples of such a response?

2. If a Data service were implemented, should the response simply be the LSID or the taxon name, as we don't currently hold any holotype photos etc.?

3. Is there any standard scheme for LSID discovery? i.e. Would it be a good/bad idea to extend the LSID service to allow machine queries of LSIDs by taxon name rather than discovering them through the web interface?

Any comments and suggestions are very welcome!

Regards,
Peter.

Peter Hollas MSc BSc(hons) (Peter.Hollas@thomson.com)
Software Engineer / Systems Administrator
Thomson Zoological Innovation Centre
York Science Park
Heslington
York YO10 5DG
Tel: 01904-435113
Fax: 01904-435114


Sally Hinchcliffe

22 Sep

11:49

Hi

I'm attaching a copy of an RDF file of IPNI data in RDF/TCS (I thought it was on the wiki but now I can't find it). It covers all ranks and various examples (records with basionyms, records with types etc.).

If you look on the wiki there is also information about our LSID server. We've not been working on it for a while, so it's pretty much in the state we left it after the GUIDs workshop. Over the next few weeks we hope to get going with it again and update the wiki, particularly with the details of how we handle versioning.

There are also some notes on the wiki about things I had problems with in the version of TCS/RDF that Roger had circulated. I don't know if it's been changed since ...

HTH

Sally
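The versioning mentioned above shows up as the trailing revision component of the IPNI identifiers in the attached file. As a minimal sketch, assuming the usual `urn:lsid:authority:namespace:object[:revision]` layout, an LSID can be split into its parts like this. The `Lsid` type and `parse_lsid` function are names invented for illustration and are not part of any LSID toolkit.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Components of an LSID (hypothetical helper type for this sketch)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID into its components; raise ValueError if malformed."""
    parts = text.split(":")
    # Expect urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    prefix = [p.lower() for p in parts[:2]]
    if len(parts) not in (5, 6) or prefix != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)


# Example using an identifier from the attached IPNI file.
lsid = parse_lsid("urn:lsid:ipni.org:names:30000959-2:1.1.2.1")
print(lsid.authority, lsid.object_id, lsid.revision)
```

An LSID without a revision, such as `urn:lsid:ipni.org:names:70028178-1`, parses with `revision` set to `None`.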


_______________________________________________ TDWG-GUID mailing list TDWG-GUID@mailman.nhm.ku.edu http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid

*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe@rbgkew.org.uk

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY tn 'http://tdwg.org/2006/03/12/TaxonNames/'>
]>
<rdf:RDF xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:tn="http://tdwg.org/2006/03/12/TaxonNames/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:30000959-2:1.1.2.1">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Amaryllidaceae J.St.-Hil.</dc:title>
    <dcterms:created>2004-01-20 00:00:00.0</dcterms:created>
    <dcterms:modified>2005-06-23 15:45:33.0</dcterms:modified>
    <tn:rankString>fam.</tn:rankString>
    <tn:nameComplete>Amaryllidaceae</tn:nameComplete>
    <tn:uninomial>Amaryllidaceae</tn:uninomial>
    <tn:authorship>J.St.-Hil.</tn:authorship>
    <tn:publishedIn>Expos. Fam. Nat. 1: 134. 1805 [Feb-Apr 1805]</tn:publishedIn>
    <tn:year>1805</tn:year>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>Amaryllis Linnaeus, nom. cons.</dc:title>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:312219-2:1.2">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>subtrib. Baccaridinae Less.</dc:title>
    <dcterms:created>2004-01-20 00:00:00.0</dcterms:created>
    <dcterms:modified>2005-08-19 15:59:04.0</dcterms:modified>
    <tn:rankString>subtrib.</tn:rankString>
    <tn:nameComplete>subtrib. Baccaridinae</tn:nameComplete>
    <tn:uninomial>Baccaridinae</tn:uninomial>
    <tn:authorship>Less.</tn:authorship>
    <tn:publishedIn>Linnaea 5: 145. 1830</tn:publishedIn>
    <tn:year>1830</tn:year>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>Baccharis L.</dc:title>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:1786-1:1.1.2.1.1.1">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Dodonaea Plum. ex Adans.</dc:title>
    <dcterms:created>2003-07-02 00:00:00.0</dcterms:created>
    <dcterms:modified>2005-11-29 09:12:00.0</dcterms:modified>
    <tn:rankString>gen.</tn:rankString>
    <tn:nameComplete>Dodonaea</tn:nameComplete>
    <tn:uninomial>Dodonaea</tn:uninomial>
    <tn:authorship>Plum. ex Adans.</tn:authorship>
    <tn:publishedIn>Fam. ii. 342 (1763).</tn:publishedIn>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:60435733-2:1.1.2.1">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Renistipula Borhidi</dc:title>
    <dcterms:created>2004-09-07 06:46:08.0</dcterms:created>
    <dcterms:modified>2004-09-07 06:50:11.0</dcterms:modified>
    <tn:rankString>gen.</tn:rankString>
    <tn:nameComplete>Renistipula</tn:nameComplete>
    <tn:uninomial>Renistipula</tn:uninomial>
    <tn:authorship>Borhidi</tn:authorship>
    <tn:publishedIn>Acta Bot. Hung. 46(1-2): 122 (-123). 2004</tn:publishedIn>
    <tn:year>2004</tn:year>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>Renistipula galeottii (Standl.) Borhidi</dc:title>
        <tn:typeName rdf:resource="urn:lsid:ipni.org:Names:60435735-2"/>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:70029497-1:1.1">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Astrophytum subgen. Stigmatodactylus D.R.Hunt</dc:title>
    <dcterms:created>2003-10-15 08:31:34.0</dcterms:created>
    <dcterms:modified>2003-10-15 08:31:34.0</dcterms:modified>
    <tn:rankString>subgen.</tn:rankString>
    <tn:nameComplete>Astrophytum subgen. Stigmatodactylus</tn:nameComplete>
    <tn:genusPart>Astrophytum</tn:genusPart>
    <tn:infragenericEpithet>Stigmatodactylus</tn:infragenericEpithet>
    <tn:authorship>D.R.Hunt</tn:authorship>
    <tn:publishedIn>Cactaceae Syst. Init. 16: 4 (11 Oct. 2003).</tn:publishedIn>
    <tn:year>2003</tn:year>
    <tn:hasAnnotation>
      <tn:NomenclaturalNote>
        <tn:noteType rdf:resource="&tn;#basedOn"/>
        <tn:objectTaxon rdf:resource="urn:lsid:ipni.org:names:70028178-1"/>
      </tn:NomenclaturalNote>
    </tn:hasAnnotation>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:907328-1:1.1.4.2.2.1">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Gentlea costaricensis Lundell</dc:title>
    <dcterms:created>2003-07-02</dcterms:created>
    <dcterms:modified>2006-01-15</dcterms:modified>
    <tn:rankString>spec.</tn:rankString>
    <tn:nameComplete>Gentlea costaricensis</tn:nameComplete>
    <tn:genusPart>Gentlea</tn:genusPart>
    <tn:specificEpithet>costaricensis</tn:specificEpithet>
    <tn:authorship>Lundell</tn:authorship>
    <tn:publishedIn>Wrightia 6(5): 115 (1980), nom. nov.:.</tn:publishedIn>
    <tn:hasAnnotation>
      <tn:NomenclaturalNote>
        <tn:noteType rdf:resource="&tn;#replacementNameFor"/>
        <tn:objectTaxon rdf:resource="urn:lsid:ipni.org:names:587264-1"/>
      </tn:NomenclaturalNote>
    </tn:hasAnnotation>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:1002492-1:1.1.2.2">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Maxillaria machinazensis D.E.Benn. &amp; Christenson</dc:title>
    <dcterms:created>2003-07-02 00:00:00.0</dcterms:created>
    <dcterms:modified>2006-01-15 06:10:58.0</dcterms:modified>
    <tn:rankString>spec.</tn:rankString>
    <tn:nameComplete>Maxillaria machinazensis</tn:nameComplete>
    <tn:genusPart>Maxillaria</tn:genusPart>
    <tn:specificEpithet>machinazensis</tn:specificEpithet>
    <tn:authorship>D.E.Benn. &amp; Christenson</tn:authorship>
    <tn:publishedIn>in Lindleyana, 13(2): 71 (1998).</tn:publishedIn>
    <tn:year>1998</tn:year>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>M. Cavero B. et al. 1642, NY (holo)</dc:title>
        <tn:typeSpecimen>M. Cavero B. et al. 1642, NY</tn:typeSpecimen>
        <tn:typeOfType rdf:resource="&tn;#holo"/>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>M. Cavero B. et al. 1642, USM (iso)</dc:title>
        <tn:typeSpecimen>M. Cavero B. et al. 1642, USM</tn:typeSpecimen>
        <tn:typeOfType rdf:resource="&tn;#iso"/>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:199571-2:1.2">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Piper wrightii C.DC.</dc:title>
    <dcterms:created>2004-01-20 00:00:00.0</dcterms:created>
    <dcterms:modified>2004-12-23 14:43:26.0</dcterms:modified>
    <tn:rankString>spec.</tn:rankString>
    <tn:nameComplete>Piper wrightii</tn:nameComplete>
    <tn:genusPart>Piper</tn:genusPart>
    <tn:specificEpithet>wrightii</tn:specificEpithet>
    <tn:authorship>C.DC.</tn:authorship>
    <tn:publishedIn>Symb. Antill. 3: 189. 1902</tn:publishedIn>
    <tn:year>1902</tn:year>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>Wright 1687, GOET (lecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#lecto"/>
        <tn:typeSpecimen>Wright 1687, GOET</tn:typeSpecimen>
        <tn:lectotypePublication>Saralegui in Fl. Rep. Cuba, Ser. A. Pl. Vasc. 9(3): 80. 2004</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:265591-2:1.3.2.1">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Vincetoxicum gentlei Lundell &amp; Standl.</dc:title>
    <dcterms:created>2004-01-20 00:00:00.0</dcterms:created>
    <dcterms:modified>2006-01-18 13:49:55.0</dcterms:modified>
    <tn:rankString>spec.</tn:rankString>
    <tn:nameComplete>Vincetoxicum gentlei</tn:nameComplete>
    <tn:genusPart>Vincetoxicum</tn:genusPart>
    <tn:specificEpithet>gentlei</tn:specificEpithet>
    <tn:authorship>Lundell &amp; Standl.</tn:authorship>
    <tn:publishedIn>Field Mus. Nat. Hist., Bot. Ser. 17: 269. 1937</tn:publishedIn>
    <tn:year>1937</tn:year>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>P. Gentle 1779, MICH (lecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#lecto"/>
        <tn:typeSpecimen>P. Gentle 1779, MICH</tn:typeSpecimen>
        <tn:lectotypePublication>W.D.Stevens in Novon 15(4): 612. 2005 [12 Dec 2005]</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>P. Gentle 1779, A (isolecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#iso"/>
        <tn:typeSpecimen>P. Gentle 1779, A</tn:typeSpecimen>
        <tn:lectotypePublication>W.D.Stevens in Novon 15(4): 612. 2005 [12 Dec 2005]</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>P. Gentle 1779, F (isolecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#iso"/>
        <tn:typeSpecimen>P. Gentle 1779, F</tn:typeSpecimen>
        <tn:lectotypePublication>W.D.Stevens in Novon 15(4): 612. 2005 [12 Dec 2005]</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>P. Gentle 1779, MO (isolecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#iso"/>
        <tn:typeSpecimen>P. Gentle 1779, MO</tn:typeSpecimen>
        <tn:lectotypePublication>W.D.Stevens in Novon 15(4): 612. 2005 [12 Dec 2005]</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>P. Gentle 1779, P (isolecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#iso"/>
        <tn:typeSpecimen>P. Gentle 1779, P</tn:typeSpecimen>
        <tn:lectotypePublication>W.D.Stevens in Novon 15(4): 612. 2005 [12 Dec 2005]</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
    <tn:typifiedBy>
      <tn:NomenclaturalType>
        <dc:title>P. Gentle 1779, US (isolecto)</dc:title>
        <tn:typeOfType rdf:resource="&tn;#iso"/>
        <tn:typeSpecimen>P. Gentle 1779, US</tn:typeSpecimen>
        <tn:lectotypePublication>W.D.Stevens in Novon 15(4): 612. 2005 [12 Dec 2005]</tn:lectotypePublication>
      </tn:NomenclaturalType>
    </tn:typifiedBy>
  </tn:TaxonName>

  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:60433970-2:1.2">
    <tn:nomenclaturalCode rdf:resource="&tn;#botanical"/>
    <dc:title>Malus orientalis var. montana (Uglitzk.) Ponomar.</dc:title>
    <dcterms:created>2004-07-01 04:44:21.0</dcterms:created>
    <dcterms:modified>2004-11-25 11:58:58.0</dcterms:modified>
    <tn:rankString>var.</tn:rankString>
    <tn:nameComplete>Malus orientalis var. montana</tn:nameComplete>
    <tn:genusPart>Malus</tn:genusPart>
    <tn:specificEpithet>orientalis</tn:specificEpithet>
    <tn:infraspecificEpithet>montana</tn:infraspecificEpithet>
    <tn:authorship>(Uglitzk.) Ponomar.</tn:authorship>
    <tn:basionymAuthorship>Uglitzk.</tn:basionymAuthorship>
    <tn:combinationAuthorship>Ponomar.</tn:combinationAuthorship>
    <tn:publishedIn>Sborn. Nauchn. Trudov Prikl. Bot. Genet. Selekts. 146: 7. 1992</tn:publishedIn>
    <tn:year>1992</tn:year>
    <tn:hasBasionym rdf:resource="urn:lsid:ipni.org:Names:895825-1"/>
    <tn:hasAnnotation>
      <tn:NomenclaturalNote>
        <tn:noteType rdf:resource="&tn;#laterHomonymOf"/>
        <tn:objectTaxon rdf:resource="urn:lsid:ipni.org:names:985539-1"/>
      </tn:NomenclaturalNote>
    </tn:hasAnnotation>
    <tn:hasAnnotation>
      <tn:NomenclaturalNote>
        <tn:noteType rdf:resource="&tn;#publicationStatus"/>
        <tn:note>nom. illeg. later homonym</tn:note>
      </tn:NomenclaturalNote>
    </tn:hasAnnotation>
  </tn:TaxonName>

</rdf:RDF>
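For anyone wanting to consume a response shaped like this attachment, the following is a small sketch using only Python's standard library. It assumes the `tn:` namespace URI and element names used in the attached file; the `SAMPLE` document is a cut-down copy of one record, and `list_names` is a name made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TN = "http://tdwg.org/2006/03/12/TaxonNames/"

# Cut-down copy of one record from the attachment (no DOCTYPE entity needed).
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:tn="http://tdwg.org/2006/03/12/TaxonNames/">
  <tn:TaxonName rdf:about="urn:lsid:ipni.org:names:199571-2:1.2">
    <tn:rankString>spec.</tn:rankString>
    <tn:nameComplete>Piper wrightii</tn:nameComplete>
    <tn:authorship>C.DC.</tn:authorship>
  </tn:TaxonName>
</rdf:RDF>
"""


def list_names(rdf_xml: str) -> List[Tuple[str, str, str]]:
    """Return (LSID, complete name, rank) for each tn:TaxonName element."""
    root = ET.fromstring(rdf_xml)
    results = []
    for el in root.findall(f"{{{TN}}}TaxonName"):
        lsid = el.get(f"{{{RDF}}}about", "")
        name = el.findtext(f"{{{TN}}}nameComplete", default="")
        rank = el.findtext(f"{{{TN}}}rankString", default="")
        results.append((lsid, name, rank))
    return results


for lsid, name, rank in list_names(SAMPLE):
    print(lsid, name, rank)
```

This treats the file as plain XML rather than as an RDF graph, which is enough for a quick look but would miss the same data serialized differently; a real consumer would probably want an RDF parser.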

Roderic Page

12:18

Dear Peter,

Here are some thoughts.

On 22 Sep 2006, at 11:42, <peter.hollas@thomson.com> wrote:


1. What format should I return the metadata response in? From discussions with Roger it seems appropriate to go with RDF/TCS for the metadata response. Would this be a good way to go, and does anyone have any static examples of such a response?

I think TCS is a bit limited as it currently stands. For example, I'd strongly encourage linking to publications via GUIDs wherever possible, including DOIs. Including bibliographic information simply as text seems retrograde to me. This is particularly an issue for ZooBank, because you potentially could serve LSIDs for all the zoological literature indexed in Zoological Record (as well as DOIs wherever they have been assigned).

...

2. If a Data service were implemented, should the response simply be the LSID or the taxon name as we don't currently hold any holotype photos etc.

I'm not sure you have any "data" as such. I would treat names and/or concepts as metadata. If you wanted to include other items, such as publications or images, then I would include these by linking them in the metadata, using rdf:resource attributes. Let's say you have a tag "holotypeImage" (I'm not advocating you have such a tag), then

<holotypeImage rdf:resource="urn:lsid:zoobank.org:image:156545" />

would do the trick.

For an image, I think you do have data (namely the stream of bytes corresponding to the image file; in other words, if the image is a TIFF file, the data is that file). For example, we serve images from SID (http://sid.zoology.gla.ac.uk/) as data. To see an example of RDF metadata for an image, try http://lsidres.org/urn:lsid:sid.zoology.gla.ac.uk:id:2000. If you use LSID LaunchPad, you should be able to get the actual image from this LSID:

lsidres://urn:lsid:sid.zoology.gla.ac.uk:id:2000

I'm also guessing that Zoobank won't itself actually have many (any?) holotype images, as these will be stored by other sites, so most times you'll want to link to the database that actually curates the images. In the same way, presumably you'd link to an external provider's information on the holotype specimen, rather than duplicate that information.

This may provoke a fire storm, but I think the data/metadata distinction is in most cases overblown.
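As a rough illustration of the rdf:resource linking described above, here is a minimal sketch that emits such a link with Python's standard library. The `ex:` vocabulary URI, the `holotypeImage` property (which the message itself does not advocate), the name LSID, and the `name_metadata` function are all placeholders assumed for this example; only the image LSID comes from the message.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/zoobank-terms#"  # hypothetical vocabulary, not a real one

# Use readable prefixes in the serialized output.
ET.register_namespace("rdf", RDF)
ET.register_namespace("ex", EX)


def name_metadata(name_lsid: str, image_lsid: str) -> str:
    """Build RDF/XML describing a name that links to an image by LSID."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": name_lsid}
    )
    # The image is referenced, not embedded: the link target is just an LSID.
    ET.SubElement(
        desc, f"{{{EX}}}holotypeImage", {f"{{{RDF}}}resource": image_lsid}
    )
    return ET.tostring(root, encoding="unicode")


print(name_metadata(
    "urn:lsid:zoobank.org:name:1",  # hypothetical name LSID
    "urn:lsid:zoobank.org:image:156545",
))
```

The point of the design is that the name's metadata stays small, and whoever curates the image serves its bytes under the image's own identifier.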

...

3. Is there any standard scheme for LSID discovery? i.e. Would it be a good/bad idea to extend the LSID service to allow machine queries of LSIDs by taxon name rather than discovering them through the web interface?

I think it would be a bad idea to extend LSIDs as such, but absolutely vital to add a search interface. My vote is OpenSearch, which is simple, open, and supported. See http://ispecies.blogspot.com/2006/07/adding-sources-to-ispecies.html for some links on this topic. You could have a simple service that returned a list of LSIDs that matched a query.

The other advantage of OpenSearch is that you can display the search results using news feed readers, which means you kill two birds with one stone -- a human readable display and a computer readable format.

Regards

Rod
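To make the OpenSearch suggestion concrete, here is a sketch of a search result feed listing matching LSIDs. It assumes an RSS 2.0 response carrying the OpenSearch 1.1 `totalResults` element; the `search_feed` function and the `example.org` link are hypothetical, and the single hit reuses a record from the IPNI attachment earlier in the thread.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

OPENSEARCH = "http://a9.com/-/spec/opensearch/1.1/"
ET.register_namespace("opensearch", OPENSEARCH)


def search_feed(query: str, hits: List[Tuple[str, str]]) -> str:
    """Build an RSS 2.0 feed of (name, LSID) hits with an OpenSearch count."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"LSID search: {query}"
    # Hypothetical service address, used only to fill the required element.
    ET.SubElement(channel, "link").text = "http://example.org/search"
    ET.SubElement(channel, "description").text = "LSIDs matching a taxon name"
    ET.SubElement(channel, f"{{{OPENSEARCH}}}totalResults").text = str(len(hits))
    for name, lsid in hits:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = name
        # The LSID is an identifier, not a web address, so it is not a permalink.
        ET.SubElement(item, "guid", {"isPermaLink": "false"}).text = lsid
    return ET.tostring(rss, encoding="unicode")


print(search_feed(
    "Piper wrightii",
    [("Piper wrightii", "urn:lsid:ipni.org:names:199571-2:1.2")],
))
```

Because the output is an ordinary feed, a news reader can display it while a program reads the `guid` elements, which is the two-birds point made above.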


----------------------------------------
Professor Roderic D. M. Page
Editor, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone: +44 141 330 4778
Fax: +44 141 330 2792
email: r.page@bio.gla.ac.uk
web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
iChat: aim://rodpage1962
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org
Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
Find out what we know about a species: http://ispecies.org
Rod's rants on phyloinformatics: http://iphylo.blogspot.com
Rod's rants on ants: http://semant.blogspot.com

peter.hollas＠thomson.com

13:13

Rod and Sally,

Thanks for your posts, they are very helpful!

I agree that for ZooBank it's probably best to just return metadata instead of data, as at the moment we don't have anything useful such as holotype photos. Indeed, I thought that we could just link to them using an RDF resource anyway.

Our decision to go with TCS/RDF was largely based on the flexibility it offers. Often we don't have any information about a name other than the name itself and its classification (for example, a higher taxon like "Animalia"). TCS doesn't impose restrictions on which fields must be included, and seems to fit the datasets we have much better. I'm sure that we'll be including Dublin Core where possible too.

Once I get something up and running I'll post some example endpoints for comment.

Regards,
Peter.


Roger Hyam

13:17

Hi Peter,

Glad to hear you are moving ahead with the LSID resolver.

peter.hollas@thomson.com wrote:


1. What format should I return the metadata response in? From discussions with Roger it seems appropriate to go with RDF/TCS for the metadata response. Would this be a good way to go, and does anyone have any static examples of such a response?

Sally has sent the version we are currently working with. At the St Louis meeting next month I hope this will become part of a larger TDWG ontology and that the namespaces will stabilize. From the implementation point of view, though, any changes are likely to be minor. If you can produce this you should be able to produce the 'final' version.

...

2. If a Data service were implemented, should the response simply be the LSID or the taxon name as we don't currently hold any holotype photos etc.

I wouldn't bother returning anything for 'data' about names. This is debated, but as no one can agree on what it should definitely be (if anything), nobody is likely to call the getData method anyhow. Holotype images are likely to have LSIDs of their own and be linked to from the name metadata.

...

3. Is there any standard scheme for LSID discovery? i.e. Would it be a good/bad idea to extend the LSID service to allow machine queries of LSIDs by taxon name rather than discovering them through the web interface?

I have been proposing that a standard harvesting protocol should be adopted. OAI was the top candidate but I haven't had time to look into it and I don't think anybody else has. I just noticed Rod is suggesting OpenSearch. I'd like to see someone look into these alternatives after October and try to come up with some recommendations/applicability statements. OAI has the notion of subsets of data and it would be good to have some method of sharing subset names etc.

All the best,

Roger
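For readers unfamiliar with OAI harvesting, the "subsets" mentioned above correspond to the `set` argument of an OAI-PMH request, and incremental harvesting to the `from` argument. The sketch below only builds a `ListRecords` request URL; the base URL and the set name `names` are hypothetical, and `oai_dc` is used because it is the metadata format every OAI-PMH repository has to support.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str,
    from_date: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Only records changed on or after this date are returned.
        params["from"] = from_date
    if set_spec:
        # Restrict the harvest to one named subset of the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and set name, for illustration only.
print(list_records_url(
    "http://example.org/oai", "oai_dc", from_date="2006-09-01", set_spec="names"
))
```

A harvester would remember the date of its last run and pass it as `from_date` next time, so only changed records travel over the wire.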


--
Roger Hyam
Technical Architect
Taxonomic Databases Working Group
http://www.tdwg.org
roger@tdwg.org
+44 1578 722782

Sally Hinchcliffe

13:23

Hi all,

Roger said:

...

I have been proposing that a standard harvesting protocol should be adopted. OAI was the top candidate but I haven't had time to look into it and I don't think anybody else has. I just noticed Rod is suggesting OpenSearch. I'd like to see someone look into these alternatives after October and try to come up with some recommendations/applicability statements. OAI has the notion of subsets of data and it would be good to have some method of sharing subset names etc.

We're going to start looking at OAI in the context of GBIF using it to keep up to date with IPNI data. We have been held up for the last few months but should begin investigating this soon.

Sally

*** Sally Hinchcliffe
*** Computer section, Royal Botanic Gardens, Kew
*** tel: +44 (0)20 8332 5708
*** S.Hinchcliffe@rbgkew.org.uk

Robert Huber

25 Sep

08:48

Roger,

we use OAI to sync data from various sources on our marine science portals and it works fine. We use it for metadata in DIF XML and ISO 19136 format.

For some first tests with OAI you could try the DLESE software, which is available at: http://www.dlese.org/library/index.jsp

Well, I think Lucene will not work with metadata other than Dublin Core, but harvesting etc. should work fine with metadata in e.g. Darwin Core.

best regards,
Robert

Dr. Robert Huber
WDC-MARE / PANGAEA - www.pangaea.de, www.wdc-mare.org
Stratigraphy.net - www.stratigraphy.net
_____________________________________________
MARUM - Institute for Marine Environmental Sciences (location)
University Bremen
Leobener Strasse
POP 330 440
28359 Bremen
Phone ++49 421 218-65593, Fax ++49 421 218-65505
e-mail rhuber@wdc-mare.org, robert.huber@stratigraphy.net


Steven Perry

22 Sep

20:58

Hi Peter, peter.hollas@thomson.com wrote:

...

3. Is there any standard scheme for LSID discovery? i.e. Would it be a good/bad idea to extend the LSID service to allow machine queries of LSIDs by taxon name rather than discovering them through the web interface?

In my view, LSID is primarily a naming scheme. While it does allow for the resolution of data objects, it was not designed to support other common data access tasks such as discovery, search, query, or harvest. To support these additional data access tasks I feel we ought to look to other standard or well-established protocols. Some protocols, like the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), are well suited to harvesting. Others, like the W3C's SPARQL protocol, are well suited to search and query. Unfortunately there's no single protocol for working with RDF data that does everything we need.

The problem for data providers is that each additional protocol they are asked to support places an additional burden on them. At the same time, the easier a provider makes it to access their data, the more their data will be used. However, many data providing organizations wish to devote their resources to the creation and curation of data, rather than to the implementation of data access protocols.

We're hoping to provide an end-to-end solution for this problem with the wasabi project (formerly known as DiGIR2). While the rest of this message is a bit of a plug for wasabi, it also acts to illustrate the fact that different stakeholders in a data network have different requirements for data access.

The central idea behind wasabi is that data providing organizations generate RDF data objects, assign LSIDs to them, and drop them into a locally-installed wasabi server. The wasabi server makes data objects available through a variety of standard data access protocols. It supports LSID to metadata resolution through a simple HTTP-get protocol and through a plugin to the IBM LSID resolver. The wasabi server also allows data aggregators to use OAI-PMH to efficiently fetch data objects in bulk so that they can be indexed for search. One benefit of the OAI protocol, especially the implementation in the wasabi server, is that it allows for incremental harvesting. Because it only sends the data objects that have changed since the last harvest, OAI-PMH decreases the load on a provider's server and allows for fast indexing of their data. Finally, the wasabi server provides direct SPARQL query access to data objects.

Any time a wasabi server is queried (through OAI, SPARQL, or LSID resolution), the access is logged. This helps data providers keep track of who is using their data.

If a provider doesn't want to write a custom program that generates RDF/XML, they can use the wasabi server's synchronizer program, along with a concept mapping configuration file, to periodically connect to their database, transform its contents into RDF data objects, and load them into the wasabi server. The wasabi server will keep track of which objects are newly added, deleted, or updated.

The RDF server is only one component of the wasabi project. It also includes a library that implements the client side of the supported data access protocols. This makes it easy for people to write custom software to grab data from wasabi servers. Researchers who want to gather large amounts of data for analysis can use the client library to simplify the task.

Another important part of the wasabi project is the indexer. The indexer supports harvesting from multiple distributed wasabi servers. Harvested data are then pushed into the indexer, which can generate indices that are designed to support various types of queries. For example, the indexer can use a Google-style inverted index that is well suited to full-text queries, a database with geospatial extensions to support geographical queries, or even a triple store. The wasabi indexer could be used by large data aggregators like GBIF or by custom software developers. One example of the latter is developers of collections management software, who might want to index and cache data from TCS providers so that their users can associate specimens with taxon concepts through an easy-to-use desktop software package.

Both the client library and the indexer allow computers to access and work with data. The wasabi project also provides an extensible web portal. Built over the client library and an index of data objects harvested from wasabi servers, the portal component allows data to be accessed by people through a customizable interface. It supports browsing, searching, and downloading of data objects. One use for the portal component might be as the public face of a thematic data network like FishNet2, MaNIS, OrNIS, or HerpNet.

The wasabi project is free and open source. It is implemented in Java. The server, client library, and indexer portions of the project are now in beta and we plan to release them by the end of the year. The portal is still under active development.

I'll be presenting wasabi at the TDWG meeting in St. Louis next month.

-Steve
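To give a feel for the SPARQL route to the name-to-LSID lookup asked about at the top of the thread, here is a sketch that builds a query and the corresponding SPARQL protocol GET URL. It assumes, without any confirmation from the message, that a server exposes name records using the `tn:` vocabulary from the IPNI attachment; the endpoint URL and both function names are hypothetical.

```python
from urllib.parse import urlencode

# Namespace taken from the IPNI attachment earlier in the thread.
TN = "http://tdwg.org/2006/03/12/TaxonNames/"


def name_lookup_query(name_complete: str) -> str:
    """Return a SPARQL query selecting resources with the given complete name."""
    # Escape backslashes and double quotes so the name is a valid string literal.
    literal = name_complete.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX tn: <{TN}>\n"
        "SELECT ?lsid WHERE {\n"
        f'  ?lsid tn:nameComplete "{literal}" .\n'
        "}"
    )


def sparql_get_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a SPARQL protocol GET."""
    return f"{endpoint}?{urlencode({'query': query})}"


query = name_lookup_query("Piper wrightii")
print(query)
# Hypothetical endpoint, for illustration only.
print(sparql_get_url("http://example.org/sparql", query))
```

Whether an exact-match lookup like this is enough, or whether a full-text index of the kind described for the indexer is needed, depends on how tolerant the search has to be of spelling and authorship variants.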
