[Tdwg-guid] LSIDs for PubMed, GenBank, and NCBI Taxonomy

Benjamin H Szekely bhszekel at us.ibm.com
Mon May 8 16:30:10 CEST 2006


Hi Rod,
      The Perl-implemented authorities run on port 80 and the Java ones 
(which are NCBI etc.) run on port 9090.  However, NCBI recently updated 
their web services, causing our Java stuff to break.  We are in the midst 
of fixing it.  As for the ontologies themselves, I couldn't agree more. 
However, the task of implementing the ontologies was so daunting that, as a 
first pass, I just did a naive transliteration from their Web Service XML 
format to RDF and wrote Axis-Jastor translation code in the middle.  My 
hope is to begin replacing predicates with DC predicates where possible, 
and have domain experts suggest ontologies that we could reuse for the 
more bio-specific predicates. 
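
To make the predicate-replacement idea concrete, here is a minimal sketch 
in Python. The source-specific predicate names on the left (ArticleTitle, 
AuthorName, PubDate, MeshHeading) and the mapping itself are invented for 
illustration and are not taken from the actual ontologies; only the Dublin 
Core element namespace is real. Predicates without a generic equivalent 
are passed through unchanged, which is where a reusable bio-specific 
ontology would come in.

```python
# Dublin Core element set namespace.
DC = "http://purl.org/dc/elements/1.1/"

# ASSUMPTION: hypothetical source-specific predicates and their DC equivalents.
PREDICATE_MAP = {
    "ArticleTitle": DC + "title",
    "AuthorName": DC + "creator",
    "PubDate": DC + "date",
}


def to_generic(triples, mapping=PREDICATE_MAP):
    """Swap mapped predicates for generic ones; leave the rest untouched."""
    return [(s, mapping.get(p, p), o) for (s, p, o) in triples]


if __name__ == "__main__":
    subject = "urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:pubmed:12441807"
    record = [
        (subject, "ArticleTitle", "An example title"),
        (subject, "MeshHeading", "An example heading"),  # no DC equivalent here
    ]
    for triple in to_generic(record):
        print(triple)
```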

- Ben
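
For readers following the port 80 versus port 9090 question in the quoted 
messages below, this is a rough sketch of how an LSID breaks into its 
parts and which DNS name a client would look up to find the resolver. It 
assumes the usual urn:lsid:authority:namespace:object[:revision] layout 
and the _lsid._tcp SRV naming convention from the LSID resolution 
scheme; the function names are made up for this example, and no DNS query 
is actually performed.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split an LSID into its components, raising ValueError if malformed."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError("expected urn:lsid:authority:namespace:object[:revision]")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID: " + text)
    authority, namespace, object_id = parts[2], parts[3], parts[4]
    if not (authority and namespace and object_id):
        raise ValueError("empty component in: " + text)
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)


def srv_query_name(lsid: Lsid) -> str:
    """DNS name whose SRV record should give the resolver's host and port."""
    return "_lsid._tcp." + lsid.authority


if __name__ == "__main__":
    lsid = parse_lsid(
        "urn:lsid:ncbi.nlm.nih.gov.lsid.zoology.gla.ac.uk:pubmed:16601190"
    )
    print(lsid)
    print(srv_query_name(lsid))
```

If the port in the SRV record for that name disagrees with the port the 
authority really listens on, a client that follows the record will fail 
in the way the tester reports.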

Roderic Page <r.page at bio.gla.ac.uk> wrote on 05/08/2006 04:12:16 AM:

> Dear Ben,
> 
> Who do I contact regarding the lsid.biopathways.org resolver? It's a 
> bit flaky, and I think the DNS SRV record is out of date. My crude LSID 
> tester reports errors (e.g., 
> http://linnaeus.zoology.gla.ac.uk/~rpage/lsid/tester/?q=urn%3Alsid%3Ancbi.nlm.nih.gov.lsid.biopathways.org%3Apubmed%3A12441807
> ), which I suspect is because the SRV record states the resolver is on 
> port 9090, when it's actually on port 80 (at least, I get a WSDL from 
> http://lsid.biopathways.org:80/authority/ but not 
> http://lsid.biopathways.org:9090/authority/).
> 
> As an aside, I guess this is an interesting management issue -- how 
> does one discover who to complain to when an LSID authority is broken?
> 
> I've had a quick look at the PubMed stuff that I did manage to get out 
> of the Biopathways site. My concern about this is that it perhaps 
> models the PubMed record a little too closely (in the same way that a 
> lot of NCBI's XML is horrendous as it models the underlying ASN.1 - 
> yuck). If we're going to integrate this stuff with other providers 
> (e.g., RDF feeds from the publishing industry or projects like 
> Connotea) then I think we need to adopt generic vocabularies (such as 
> Dublin Core and PRISM), rather than object specific ones. I think the 
> last thing a user wants to do is have to learn a new vocabulary for 
> each data source. That rather defeats the dream of easy integration.
> 
> Regards
> 
> Rod
> 
> 
> On 6 May 2006, at 15:43, Benjamin H Szekely wrote:
> 
> >
> > Hi Rod,
> >      We have implementations of NCBI PubMed, GenBank, Protein, and 
> > OMIM.  Gene is on the way.  I'm in the process of updating our 
> > implementations since the NCBI upgraded their web service to 1.4.  I 
> > have created OWL ontologies for the various databases that you might 
> > find useful in your implementation.  They will be posted online soon 
> > but I could send them to you if you like.
> >
> > - Ben
> >
> > tdwg-guid-bounces at mailman.nhm.ku.edu wrote on 05/06/2006 06:12:13 AM:
> >
> >  > The web server that hosts most of my LSID work (including the Taxonomy
> >  > Search Engine) got hacked recently. Rebuilding it is taking time, but
> >  > one positive outcome is I'm trying to clean up some LSID stuff.
> >  >
> >  > I've rebuilt the LSID authority for PubMed, GenBank, and NCBI taxonomy
> >  > (by hacking Roger Hyam's PHP code - IBM's Perl stack is giving me grief
> >  > on a Fedora Core 4 box). These are, of course, experimental, but
> >  > hopefully the RDF will be of interest. I've tried to use standard
> >  > vocabularies, and link between records wherever possible (e.g., a
> >  > PubMed record for a paper will list sequences referred to in that
> >  > paper, a sequence will link back to the paper in which it was
> >  > published, etc.). I've set this up partly to support my attempt to
> >  > build a triple store for ants (see
> >  > http://iphylo.blogspot.com/2006/05/ants-rdf-and-triple-stores.html),
> >  > which I hope to have running in time for GUID2.
> >  >
> >  > You can try out the LSIDs, which have the form
> >  >
> >  > urn:lsid:ncbi.nlm.nih.gov.lsid.zoology.gla.ac.uk:(pubmed or gi or taxon):id
> >  >
> >  > e.g.
> >  >
> >  > urn:lsid:ncbi.nlm.nih.gov.lsid.zoology.gla.ac.uk:pubmed:16601190
> >  >
> >  > urn:lsid:ncbi.nlm.nih.gov.lsid.zoology.gla.ac.uk:gi:87047074
> >  >
> >  > urn:lsid:ncbi.nlm.nih.gov.lsid.zoology.gla.ac.uk:taxon:369204
> >  >
> >  > There's a lot more which could be added to the metadata, but I hope
> >  > this is of interest.
> >  >
> >  > Regards
> >  >
> >  > Rod
> >  >
> >  >
> >  > ----------------------------------------
> >  > Professor Roderic D. M. Page
> >  > Editor, Systematic Biology
> >  > DEEB, IBLS
> >  > Graham Kerr Building
> >  > University of Glasgow
> >  > Glasgow G12 8QP
> >  > United Kingdom
> >  >
> >  > Phone:    +44 141 330 4778
> >  > Fax:      +44 141 330 2792
> >  > email:    r.page at bio.gla.ac.uk
> >  > web:      http://taxonomy.zoology.gla.ac.uk/rod/rod.html
> >  > reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
> >  >
> >  > Subscribe to Systematic Biology through the Society of Systematic
> >  > Biologists Website:  http://systematicbiology.org
> >  > Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
> >  > Find out what we know about a species: http://ispecies.org
> >  > Rod's rants on phyloinformatics: http://iphylo.blogspot.com
> >  >
> >  > _______________________________________________
> >  > TDWG-GUID mailing list
> >  > TDWG-GUID at mailman.nhm.ku.edu
> >  > http://mailman.nhm.ku.edu/mailman/listinfo/tdwg-guid
> >