Hi David,

You might want to look at GeoNames - you can download the data and import it into your own app pretty easily.

Han did this pretty quickly last summer.  

Even if it is not perfect for your needs, you might still find it useful, since it includes language variations of location names.
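
To make the idea concrete, here is a rough sketch of an embedded lookup built from downloaded name variants. It is only an illustration under some assumptions: the class name CountryThesaurus and the simplified "ISO code, tab, name variant" input format are made up for this example, and the real GeoNames dump files have more columns, so you would need a small mapping step first.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not part of narwhal-processor or GeoNames.
class CountryThesaurus {
    // Normalized name variant -> ISO 3166-1 alpha-2 country code.
    private final Map<String, String> codeByName = new HashMap<>();

    // Trim, strip diacritics, and lowercase so spelling variants compare equal.
    static String normalize(String name) {
        String decomposed = Normalizer.normalize(name.trim(), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Assumed input: one "ISO_CODE<TAB>name variant" pair per line.
    // Blank lines, comment lines, and lines without a tab are skipped.
    static CountryThesaurus fromLines(List<String> lines) {
        CountryThesaurus thesaurus = new CountryThesaurus();
        for (String line : lines) {
            if (line.isBlank() || line.startsWith("#")) {
                continue;
            }
            String[] fields = line.split("\t", 2);
            if (fields.length < 2) {
                continue;
            }
            thesaurus.codeByName.put(normalize(fields[1]), fields[0].trim());
        }
        return thesaurus;
    }

    // Returns the country code for a raw name, or empty if the name is unknown.
    Optional<String> lookup(String rawName) {
        return Optional.ofNullable(codeByName.get(normalize(rawName)));
    }

    public static void main(String[] args) {
        CountryThesaurus thesaurus = fromLines(List.of(
                "# code<TAB>name variant",
                "CA\tCanada",
                "CA\tKanada",
                "CI\tC\u00f4te d'Ivoire")); // \u00f4 is o with circumflex
        System.out.println(thesaurus.lookup("  KANADA "));
        System.out.println(thesaurus.lookup("Cote d'Ivoire"));
        System.out.println(thesaurus.lookup("Atlantis"));
    }
}
```

Known misspellings could be added as extra rows pointing to the same code, and the map could be rebuilt whenever a fresh download is available.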



- Pete

On Fri, May 17, 2013 at 9:57 AM, Shorthouse, David <davidpshorthouse@gmail.com> wrote:

The Canadensys development team (http://www.canadensys.net) is looking
for efficient, low-maintenance ways to validate and reconcile data in
its national cache of occurrence data. We are working on a Java
library that initially tackles single-field Darwin Core validations:
https://github.com/Canadensys/narwhal-processor. We hope this library
is sufficiently generalized for use outside our project.

Our current challenge is to reconcile country names, which requires
access to an up-to-date, well-maintained knowledge base of country
names, their alternative representations (possibly multilingual), and
mappings to known misspellings. For performance reasons, we'd like
this thesaurus to be embedded in the library, but with the capacity to
be periodically refreshed with data pulled from external resources
such as dbpedia.org. This clearly has ties to semantic web thinking
and, because we're new to the tools and services in this space, we'd
like to solicit pointers and feedback such that we build this part of
our library with maximal benefit to other projects. We started
collecting thoughts here:


David P. Shorthouse
Christian Gendreau
tdwg mailing list

Pete DeVries
Semantic Web / Linked Open Data
Encyclopedia of Life
Marine Biological Laboratory 
7 MBL Street Woods Hole MA 02543
Email: pdevries@mbl.edu

University of Wisconsin - Madison Entomology
TaxonConcept / GeoSpecies Knowledge Bases
Email: pdevries@wisc.edu