[tdwg-content] status of uBio

Richard Pyle deepreef at bishopmuseum.org
Fri Oct 16 21:25:21 CEST 2015

> It's somewhat clear what uBio identifiers refer to: names vs.
> something more nebulous involving taxa or ... something.
> (Not trying to push your button, Rich Pyle).

Actually, saying that uBio identifiers refer to “names” is a bit like saying uBio identifiers refer to “stuff” (which is only slightly more ambiguous than “names”).  I think what you mean (a bit more explicitly) is that uBio identifiers refer to “name-strings”, wherein each unique literal UTF-8-encoded string of characters purported to represent a scientific name receives an integer as an identifier.


GNI also assigns unique identifiers to such name-strings, but the difference is that the GNI identifier is a hash of the string itself.  Thus, given a string, you can algorithmically derive the GNI identifier for it.  There is no way to algorithmically convert the uBio name-strings into their corresponding integer uBio identifiers.
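To make the contrast concrete, here is a minimal sketch of a hash-derived identifier. It assumes the identifier is a version-5 (SHA-1 based) UUID computed in a namespace derived from the DNS name "globalnames.org"; the thread only says the identifier is a hash of the string, so the namespace and the function name `name_string_uuid` are illustrative assumptions, not confirmed details of GNI.

```python
import uuid

# Assumption: the namespace is itself a version-5 UUID of "globalnames.org"
# in the standard DNS namespace. The thread does not confirm this detail.
GN_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "globalnames.org")


def name_string_uuid(name_string: str) -> uuid.UUID:
    """Derive a deterministic identifier from the literal name-string.

    The same string always yields the same UUID, so no lookup table is needed.
    Any difference in the string (authorship, spacing, case) yields a
    different UUID.
    """
    return uuid.uuid5(GN_NAMESPACE, name_string)


if __name__ == "__main__":
    for s in ("Homo sapiens", "Homo sapiens Linnaeus, 1758"):
        print(s, "->", name_string_uuid(s))
```

A uBio integer, by contrast, can only be recovered by looking the string up in uBio's own database, which is why an explicit index is needed to connect the two.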


I have a question, and a suggestion.


Question: Are new uBio integer identifiers ever going to be minted?  Or do we just want to maintain them for legacy purposes?


Suggestion: Regardless of the answer to the question, I suggest that Dima generate an index of uBio integer identifiers and the corresponding hashed UUIDs used by GNI.  I will import these into BioGUID.org, so going forward we will always have an index to translate the uBio integers into the GN hashed UUIDs.  From there, it would be a simple step to allow GN services to process uBio identifiers natively, and do all the cool things that Steve (and others) would like uBio/GN to do.
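As a rough sketch of what such an index could look like: given an export of (uBio integer, name-string) pairs, each string is hashed and the pair (integer, UUID) is written out. The sample rows, the column names `ubio_id` and `gn_uuid`, and the UUID v5 namespace are all hypothetical; the real export format and the exact GN hashing scheme are not described in this thread.

```python
import csv
import io
import uuid

# Assumption: GN-style identifiers are version-5 UUIDs in a namespace derived
# from "globalnames.org". This is illustrative, not confirmed by the thread.
GN_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "globalnames.org")

# Hypothetical export rows: (uBio integer identifier, literal name-string).
SAMPLE_ROWS = [
    (101, "Aus bus"),
    (102, "Aus bus Smith, 1900"),
]


def build_index(rows):
    """Map each uBio integer to the UUID derived from its name-string."""
    return {
        int(ubio_id): str(uuid.uuid5(GN_NAMESPACE, name_string))
        for ubio_id, name_string in rows
    }


def write_index(index, out):
    """Write the index as CSV with a header row, sorted by uBio integer."""
    writer = csv.writer(out)
    writer.writerow(["ubio_id", "gn_uuid"])
    for ubio_id in sorted(index):
        writer.writerow([ubio_id, index[ubio_id]])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_index(build_index(SAMPLE_ROWS), buffer)
    print(buffer.getvalue())
```

Because the UUID side can always be recomputed from the string, the index only has to be generated once from the uBio database and stays valid even if no new uBio integers are minted.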






From: tdwg-content-bounces at lists.tdwg.org [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Steve Baskauf
Sent: Friday, October 16, 2015 3:09 AM
To: Dmitry Mozzherin
Cc: Chuck Miller; tdwg-content at lists.tdwg.org; Jonathan A Rees; Shorthouse, David
Subject: Re: [tdwg-content] status of uBio


Thanks all for the information and comments about the status of uBio.  I'm glad to hear that the server will probably come back up.  If not, then I hope the data will be made available to those who said they would be willing to host it.

I have been interested in using the uBio identifiers for several reasons:
1. They have managed to stick around for a long time and are stable in their format (as LSIDs and HTTP proxied LSIDs).
2. The coverage of names is really good for plants, animals, different geographic locations, etc.  I also use ITIS identifiers but it's fairly common for me to not be able to find one for the name I need, which almost never happens with uBio.
3. It's somewhat clear what uBio identifiers refer to: names vs. something more nebulous involving taxa or ... something.  (Not trying to push your button, Rich Pyle).  
4. You can actually get RDF associated with the LSID version of the uBio identifiers.  I was wanting to download some to play with in our new triplestore (http://rdf.library.vanderbilt.edu) when I discovered that the server was down.  The RDF is somewhat ad hoc, but hey, it's there.

There isn't really any other source that has all of these characteristics.  So please keep uBio going indefinitely, if at all possible.  

Dmitry Mozzherin wrote: 

I had been administering uBio for the last year, but now I am moving from MBL. The uBio machine is in bad shape, and it crashes after a few hours of work. My plan is to create Docker containers for the database, code and data, which should make the whole system much more stable and much more manageable. The good news is that I will definitely try my best to do it; the bad news is that I am spread thinner than usual with the move, transferring hardware and a grant, GN things, EOL things, figuring out what to do with the house, etc. The uBio 'code' part is about 35 GB, which makes the task more complicated, but I am quite optimistic that I will be able to make the containers and either put them on an MBL machine, run them from the University of Illinois, or give them to Naturalis, depending on what makes the most sense for Dave Remsen, MBL and everyone interested in the project.


Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.