On Aug 28, 2010, at 6:59 PM, Blum, Stan wrote:
Regarding #1, and assuming that this concerns molecular data where individuals often function as OTUs, I think it would be even more important for the long-term usefulness of the data to support the ability to reference the specimen with a resolvable GUID...
On the other hand, a full name backed up by a URL or source/GUID would be a big improvement on codes and abbreviations.
My main aim is to support data integration (rather than validation), and the two most important integrating variables for the foreseeable future (at least in my limited vision) are species name (or other taxonomic identifier) and geographic coordinates. These are important partly because the great mass of users outside of TDWG are committed to using the same kinds of species names and the same kinds of coordinates.
Most phylogeny information artefacts (e.g., files) out there don't have either one of these, so integrating phylogenetic information into the global web of data isn't going to get very far until we make it easy for users to put this information into their trees.
To the extent that the scientific community is committed in the same way to specimen identifiers, this makes the problem simpler, because the specimen source would become the integrating variable and would mediate the integration of data by species or location (because the specimen would have a species and a location). But I don't think we are there yet.
Arlin
On 8/27/10 8:27 AM, "Arlin Stoltzfus" <arlin@umd.edu> wrote:
I'm sending this reply to only the tdwg-phylo list (sending to everyone seems like overkill).
Here are two ideas based on the use of phylogenies:
1. For various reasons, it's important to be able to associate valid species sources or other universal identifiers (e.g., NCBI GIs) with the human-readable OTU identifiers used in tree files, but this typically isn't done and it's not always easy. The goal of this project is to enable ordinary phylogenetics & systematics users to use current standards (Newick, NHX, phyloXML, ...) to associate species names (possibly other tax ids) with phylogenies in their usual workflows. The focus is on developing short-term tools and strategies that might lead to better long-term solutions. In some cases, it's just a matter of knowing how to use the file format properly, possibly aided by better tools for data input. For users whose workflows rely on Newick, we would need a way to keep a separate mapping of OTU ids and tax ids, along with tools to interconvert or translate to one of the other formats (this could be as simple as an Excel spreadsheet or as complex as a web service that maintains your mapping and does the translation for you).
2. There is a huge variety of tree viewers. To some extent, users need this variety because the viewers have different feature sets. But users shouldn't have to choose a viewer based on data format restrictions. The goal of this project is to improve the usability of tree viewers: assess the interoperability (standards compatibility) of tree viewing software, develop strategies to improve it, and get started on any strategies that can be implemented. It's not possible to modify viewers whose source code is unavailable, but there may be ways to work around this with scripts and translation tools.
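To make the "separate mapping plus translation tool" idea in #1 a bit more concrete, here is a minimal sketch in Python. It is only an illustration under several assumptions: the mapping table (OTU_MAP), the OTU codes, and the function name annotate are made up for the example; the tree is plain Newick with unquoted leaf labels and no whitespace; and the output uses the NHX S= (species name) and T= (taxonomy id) tags. A real tool would want a proper Newick parser and would need to check how a given viewer handles spaces in names.

```python
import re

# Hypothetical mapping kept separately from the tree file:
# OTU label -> (species name, NCBI taxonomy id).
OTU_MAP = {
    "Hsap": ("Homo sapiens", 9606),
    "Mmus": ("Mus musculus", 10090),
}

# A leaf is an unquoted label directly after "(" or ",",
# optionally followed by a ":branch_length".
LEAF = re.compile(r"(?<=[(,])([^(),:;\[\]\s]+)(:[0-9.eE+-]+)?")


def annotate(newick, otu_map):
    """Return the tree with NHX species/taxid tags added to known leaves."""

    def repl(match):
        label = match.group(1)
        length = match.group(2) or ""
        if label not in otu_map:
            # Unknown OTUs are left exactly as they were.
            return match.group(0)
        species, taxid = otu_map[label]
        return f"{label}{length}[&&NHX:S={species}:T={taxid}]"

    return LEAF.sub(repl, newick)


if __name__ == "__main__":
    print(annotate("(Hsap:0.1,(Mmus:0.2,Xxxx:0.3):0.05);", OTU_MAP))
```

The mapping could just as well be read from a spreadsheet export; the point is only that the user keeps working in Newick and the tax ids are attached in a separate translation step.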
Arlin
-------
Arlin Stoltzfus (arlin@umd.edu)
Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
IBBR, 9600 Gudelsky Drive, Rockville, MD
tel: 240 314 6208; web: www.molevol.org