[tdwg-content] ITIS TSNID to uBio NamebankIDs mapping

Cynthia Parr parrc at si.edu
Tue May 24 23:08:11 CEST 2011


>
> Quick note -- EOL is also aligning ITIS TSNs with Catalogue of Life, NCBI
> identifiers and many other partner sources. They don't all update with the
> same frequency on EOL but shouldn't be too stale.
>
> It might be a little complicated, but you can use the EOL API to find the
> cross-walks: http://www.eol.org/api (see especially the PAGES method)
>
> Steve, since Bioimages is an EOL partner you could even see exactly how we
> align your specific taxa with others.
>
> Cyndy
>
> Cynthia Sims Parr, parrc at si.edu
> Director, Species Pages Group
> Encyclopedia of Life http://www.eol.org
> Office: 202.633.9513, Fax: 202.633.8742
>
> Mailing address:
> National Museum of Natural History
> Smithsonian Institution
> P.O. Box 37012, MRC 106
> Washington, DC 20013-7012
>
>
>   On Tue, May 24, 2011 at 4:50 PM, Nicolson, David <NICOLSOD at si.edu> wrote:
>
>> Hi Steve (and Dave),
>>
>> [NB: After having composed the email below, just before sending it, I
>> re-read your initial email more carefully and realized that you said you
>> already had the ITIS TSNs, and were looking to add the NamebankIDs! Doh!
>> Well, in case you (or anyone else) is interested in methods of matching
>> names to get TSNs, I'll go ahead and send this anyway. But do note the
>> comments below about the ITIS "versions" and ongoing overhaul of the
>> vascular plant data in ITIS!!! -Dave]
>>
>> I noticed this just before leaving work last week, and was out yesterday,
>> but I wanted to chime in on this. I'm glad the uBio tools are meeting your
>> needs (they do have some cool stuff!), but it should be noted that those
>> tools are using a static snapshot of ITIS data from January 2009, and we
>> have added about 50,000 additional scientific names, and updated tens of
>> thousands of names beyond that (most of that in the last 6 months, as the
>> frequency of loads dropped off in 2009-2010 due to technical issues).
>>
>> I also want to note that ITIS is right in the middle of a full update of
>> the vascular plant data in ITIS, and we're loading updated families on a
>> monthly basis... and at long last we are tackling all the leftover issues
>> from several bulk loads from USDA PLANTS data that left unreconciled bits of
>> ITIS' older vascular plant data in various confusing states... so it is a
>> VAST improvement that is underway.
>>
>> There are several options for bouncing your names off the current version
>> of ITIS.
>>
>> One is to automate a matching process using the live ITIS data, based on
>> the existing ITIS Web Services. I am CC'ing Alan Hampson, our IT fellow who
>> built the Web Services ( http://www.itis.gov/web_service.html ), in case
>> you'd like to follow up with him on that option. The advantage is that once
>> you have a process in place it is completely self-serve and can always
>> utilize the current ITIS data. If you have the resources to do this I think
>> it would be greatly to your advantage to use this approach.
>>
>> You can explore some ideas for client software to use the services at:
>> http://www.itis.gov/ws_develop.html
>>
>> And for more information on ITIS web services try
>> http://www.itis.gov/ws_description.html
>> http://www.itis.gov/ITISWebService.xml
>>
>> The ability to flag multiply-matched names (as you noted) should probably
>> be considered, so that appropriate manual steps can be taken. This solution
>> will allow you to take advantage of subsequent updates to ITIS with a
>> minimum of additional effort, and given that the plant data are in the
>> middle of a major overhaul, this bears consideration!
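A minimal sketch of the flagging step described above, assuming the matching step (whatever it is: web service, local database, or report file) yields (name, TSN) pairs. The function name and the sample names and TSNs are made up for illustration and are not ITIS data.

```python
from collections import defaultdict


def split_matches(rows):
    """Separate names with exactly one TSN from names matched to several.

    rows: iterable of (scientific_name, tsn) pairs from any matching step.
    Returns (unique, ambiguous): unique maps name -> tsn, ambiguous maps
    name -> sorted list of candidate TSNs that need manual review.
    """
    tsns_by_name = defaultdict(set)
    for name, tsn in rows:
        tsns_by_name[name.strip()].add(str(tsn))

    unique, ambiguous = {}, {}
    for name, tsns in tsns_by_name.items():
        if len(tsns) == 1:
            unique[name] = next(iter(tsns))
        else:
            ambiguous[name] = sorted(tsns)
    return unique, ambiguous


# Placeholder names and TSNs, invented for this example.
rows = [
    ("Genus alpha", 101),
    ("Genus beta", 202),
    ("Genus beta", 203),
]
unique, ambiguous = split_matches(rows)
```

Keeping the ambiguous names in their own dictionary means the automated part can be re-run after each ITIS update while the manual decisions stay in a separate, smaller list.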
>>
>> Another possibility is to grab a full snapshot of the ITIS data, and load
>> it into a database so you can do what you wish. The obvious drawback is that
>> it goes out of date, as with the ITIS snapshot uBio is currently using. But
>> it puts you in the driver's seat re what to do & getting new versions of
>> ITIS. Some general information about the full exports is in the following
>> page, although conspicuously absent is any mention of the MySQL version
>> which (assuming you have the free MySQL properly installed & configured) can
>> be loaded with just a few clicks or a few command lines (depending on your
>> platform):
>> http://www.itis.gov/ftp_download.html
>> And the current ITIS data are all here for downloading:
>> http://www.itis.gov/downloads/
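As a rough illustration of the local-snapshot approach, here is a sketch that looks up names in a database table. It uses Python's built-in sqlite3 with a toy in-memory table rather than the MySQL load mentioned above, and the table name `taxa`, its columns, and the sample rows are assumptions made for the example; check the schema that ships with the actual download.

```python
import sqlite3


def match_names(conn, names):
    """Return {name: [tsn, ...]} for each input name ([] if unmatched)."""
    result = {}
    for name in names:
        cur = conn.execute(
            "SELECT tsn FROM taxa WHERE complete_name = ? ORDER BY tsn",
            (name,),
        )
        result[name] = [row[0] for row in cur]
    return result


# Toy stand-in for a loaded snapshot; names and TSNs are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxa (tsn INTEGER PRIMARY KEY, complete_name TEXT)")
conn.executemany(
    "INSERT INTO taxa VALUES (?, ?)",
    [(101, "Genus alpha"), (202, "Genus beta"), (203, "Genus beta")],
)
matches = match_names(conn, ["Genus alpha", "Genus beta", "Genus gamma"])
```

Names that come back with an empty list are unmatched, and names with more than one TSN are the multiply-matched cases that need a manual decision.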
>>
>> A third option, which I note with some trepidation, is the old "Compare
>> Nomenclature/Taxonomy" function on the ITIS site:
>> http://www.itis.gov/taxmatch_ftp.html
>> This is a VERY old function that we do plan on replacing (timeframe not
>> yet certain), and it is vulnerable to timeouts, etc., which is why it notes
>> to limit the number of names per pass. But with smaller chunks of names it
>> does work quite well. The caveat is that I would make sure to choose the 4th
>> option in Step 4, as it is at least aware (unlike the 3 other options) of
>> multiply-matched name cases, and lists them separately at the bottom of the
>> report. Just a bare listing of the scientific names, with the word "name" at
>> the top, saved as plain text, is all that is needed for input.
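The input format described above (the word "name" on the first line, then one scientific name per line, plain text, in small chunks) could be prepared along these lines. This is only a sketch: the batch size of 500 and the file naming are assumptions, since the page's actual per-pass limit is not stated here.

```python
from pathlib import Path


def batch_texts(names, batch_size=500):
    """Build one plain-text file body per batch, each headed by 'name'."""
    # Drop blank entries and duplicates while keeping first-seen order.
    cleaned = list(dict.fromkeys(n.strip() for n in names if n.strip()))
    return [
        "name\n" + "\n".join(cleaned[i:i + batch_size]) + "\n"
        for i in range(0, len(cleaned), batch_size)
    ]


def write_batches(names, out_dir, batch_size=500):
    """Write the batches to names_001.txt, names_002.txt, ... in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, text in enumerate(batch_texts(names, batch_size), start=1):
        path = out_dir / f"names_{number:03d}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

Each resulting file can then be submitted in a separate pass, which keeps every request small enough to lower the risk of a timeout.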
>>
>> A final option would be to ask someone at ITIS to handle the matching for
>> you (leaving you to decide re the multiply-matched names). This might be
>> simple from your end, but is suboptimal as it leaves you in the same
>> position as you are now should you want or need to compare names again in
>> the future (whether due to acquiring new names in your system, or wanting to
>> check against a later updated version of ITIS), and it pulls someone here
>> (probably me) off of the push to get more updates into ITIS. But in a pinch,
>> I'm certainly willing to try to help you, should it come down to that! I
>> would just ask that you seriously consider the web services option (in
>> particular) or the others above first.
>>
>> I hope this helps some. If you have already run all your matches against
>> the old "ITIS" data via uBio then you might consider re-running (against the
>> current ITIS data) at least the leftover names that you did not yet get
>> matched. Let us know if you have questions (the itiswebmaster at itis.gov
>> address goes to me, Alan, and several others, so that might be the best
>> bet for a follow-up unless you have a question specifically for me).
>>
>> Regards,
>> Dave
>>
>> David Nicolson
>> Data Development Coordinator, Integrated Taxonomic Information System
>> Biologist, USGS Core Science Systems, Biological Informatics Program
>> nicolsod at si.edu     Office 202-633-2149    Fax 202-786-2934
>> http://www.itis.gov/
>> http://www.cbif.gc.ca/itis/
>> "Nihil sumas necesse est..."
>>
>>
>> -----Original Message-----
>> Date: Fri, 20 May 2011 05:42:03 -0500
>> From: Steve Baskauf <steve.baskauf at vanderbilt.edu>
>> Subject: Re: [tdwg-content] ITIS TSNID to uBio NamebankIDs mapping
>> To: "David Remsen (GBIF)" <dremsen at gbif.org>
>> Cc: "tdwg-content at lists.tdwg.org" <tdwg-content at lists.tdwg.org>
>> Message-ID: <4DD6457B.2080204 at vanderbilt.edu>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Thanks, all, for the responses.  The "Compare to ITIS" function does
>> just what I want.  I did a test run of 1000 names and it worked like a
>> charm.  I will need to do a little massaging because sometimes two or
>> more ITIS IDs come back for each uBio ID.  But I can handle that.
>> Steve
>>
>> David Remsen (GBIF) wrote:
>> > Steve
>> >
>> > Have you tried this?
>> > http://www.ubio.org/clients/ITIS/index.php
>> >
>> > or this?
>> > http://www.ubio.org/services/mapper/index2.php
>> >
>> > All this uBio talk makes me think we were on to something.  Worth a
>> > thought about adopting the new standards and tools and making it really
>> > smooth.
>> >
>> > DR
>> >
>> >
>> > On 20 May 2011, at 04:46, Steve Baskauf wrote:
>> >
>> >
>> >> I have generated a csv spreadsheet of about 39 000 plant names for the
>> >> U.S. which has the ITIS TSNIDs for the names in a column.  I would like
>> >> to have the uBio Namebank IDs in another column of the table.  I have
>> >> been looking them up on the uBio website by typing in the names as I
>> >> need to know the IDs, but after doing about 300 of them, I'm getting
>> >> tired of it.  Does anybody have a clever idea of a way to get the other
>> >> 38 000 Namebank IDs without looking them up?  I'm sure that it would be
>> >> possible to find this out because uBio gets names from ITIS.  However, I
>> >> haven't seen any clues about how to do it in an automated fashion.  I'm
>> >> guessing that there might be some way to use the uBio web services, but
>> >> if so, it isn't obvious and I probably don't have the skills to carry it
>> >> out anyway.
>> >>
>> >> Any ideas?
>> >> Steve
>> >>
>> >> --
>> >> Steven J. Baskauf, Ph.D., Senior Lecturer
>> >> Vanderbilt University Dept. of Biological Sciences
>> >>
>> >> postal mail address:
>> >> VU Station B 351634
>> >> Nashville, TN  37235-1634,  U.S.A.
>> >>
>> >> delivery address:
>> >> 2125 Stevenson Center
>> >> 1161 21st Ave., S.
>> >> Nashville, TN 37235
>> >>
>> >> office: 2128 Stevenson Center
>> >> phone: (615) 343-4582,  fax: (615) 343-6707
>> >> http://bioimages.vanderbilt.edu
>> >>
>> >> _______________________________________________
>> >> tdwg-content mailing list
>> >> tdwg-content at lists.tdwg.org
>> >> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>> >>
>> >>
>> >
>> > .
>> >
>> >
>>
>>
>>
>
>