[tdwg-guid] Testing TCS and some ideas on GUIDs and ontologies: ranking synonymies
Dear all,
I have just finished a TCS XML demo page at http://taxonconcept.stratigraphy.net/taxon_tcs?taxid=793 . It is a TCS representation of the synonym list (concepts) data on the taxon 'Subbotina patagonica' published by Berggren & Norris (1997) and Pearson et al. (2006); see the HTML output here: http://taxonconcept.stratigraphy.net/taxon_details?taxid=793
The XML validates against the schema, but of course what I have implemented is an 'interpretation' of the standard. For example, I have merged the two synonymies into one DataSet, and I am not sure whether this granularity makes sense. So I would like to ask you to check whether the output looks reasonable.
I was really happy to see how well TCS fits my data, and after seeing the first results I realized how well it fits an experiment I tried some months ago:
I tried to apply the Google PageRank algorithm to synonymies, and as a first attempt I calculated what I called TaxonRank. I have provided some more info on this here: http://stratigraphynet.blogspot.com/2008/01/how-taxonrank-works.html (My geoinformatics community has also prepared a special volume on ontologies etc., and I have submitted a paper to Computers & Geosciences which will be published soon.)
After seeing the TCS XML output and reading many RDF- and ontology-related mailing list contributions, I wonder whether anybody has thought about doing something like this before. My tests are on synonymy lists (concepts) only, but I think similar approaches could also lead to interesting results if applied to, e.g., genetics. And wouldn't ranking based on such ontologies be a killer application for GUIDs? Apologies for cross-posting.
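To make the idea concrete, here is a minimal sketch of plain PageRank (power iteration) on a toy synonymy graph. It is not the TaxonRank calculation itself, which is described in the blog post above. The concept identifiers, the edge direction (a concept points to the concepts it lists as synonyms) and the damping factor of 0.85 are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Plain PageRank by power iteration.

    `links` maps each node to the list of nodes it points to.
    Returns a dict node -> rank; the ranks sum to 1.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }
        # Every node passes an equal share of its rank along each link.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical concepts: an edge X -> Y means "concept X cites concept Y
# in its synonymy list". These identifiers are invented for this sketch.
synonymy_links = {
    "concept:C": ["concept:A", "concept:B"],
    "concept:B": ["concept:A"],
    "concept:D": ["concept:A"],
    "concept:A": [],
}

ranks = pagerank(synonymy_links)
for concept in sorted(ranks, key=ranks.get, reverse=True):
    print(concept, round(ranks[concept], 4))
```

In this toy graph the concept cited by all the others comes out on top, and a concept cited once ranks above concepts that are never cited. Whether that ordering is taxonomically meaningful depends entirely on how the links are defined.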
Robert
On 28 Feb 2008, at 11:12, Robert Huber wrote:
> I tried to apply the Google PageRank algorithm to synonymies, and as a first attempt I calculated what I called TaxonRank. I have provided some more info on this here: http://stratigraphynet.blogspot.com/2008/01/how-taxonrank-works.html (My geoinformatics community has also prepared a special volume on ontologies etc., and I have submitted a paper to Computers & Geosciences which will be published soon.)
> After seeing the TCS XML output and reading many RDF- and ontology-related mailing list contributions, I wonder whether anybody has thought about doing something like this before. My tests are on synonymy lists (concepts) only, but I think similar approaches could also lead to interesting results if applied to, e.g., genetics. And wouldn't ranking based on such ontologies be a killer application for GUIDs? Apologies for cross-posting.
I've argued in the past (e.g., http://iphylo.blogspot.com/2006/01/finding-good-phylogenies-using.html ) that PageRank-style algorithms can be applied to all manner of links, and there is research suggesting that even incomplete links can generate a useful ranking (http://iphylo.blogspot.com/2008/02/incomplete-citation-and-ranking.html ).
I've a manuscript in review for Briefings in Bioinformatics that applies PageRank to rank some ant specimens, based on links to images, sequences, and publications. I think there's enormous scope for exploring link graphs, and this is yet another reason for GUIDs -- the trick is to reuse them across multiple data sets. If we don't have shared identifiers then we won't be able to create the links. Hence, it is vital that we reuse GUIDs for taxonomic names and literature, or at a minimum have a mapping service that links alternative GUIDs for the same object. If each biodiversity project issues its own GUIDs for the same journal articles, for example, then there won't be many "killer apps".
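As a rough illustration of the mapping idea, the sketch below collapses alternative identifiers for the same object onto one representative before any links are counted. It is only a sketch under stated assumptions: the identifiers are invented placeholders, the "same as" pairs would have to come from some mapping service, and the function names `build_canonical_map` and `canonicalize_links` are hypothetical.

```python
def build_canonical_map(same_as_pairs):
    """Group identifiers declared equivalent (union-find) and map each
    one to the lexicographically smallest identifier in its group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller identifier as the group's representative.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a
    return {x: find(x) for x in list(parent)}


def canonicalize_links(links, canonical):
    """Rewrite a link graph so that every identifier is replaced by its
    canonical representative; unknown identifiers are kept as they are."""
    merged = {}
    for source, targets in links.items():
        bucket = merged.setdefault(canonical.get(source, source), set())
        for target in targets:
            bucket.add(canonical.get(target, target))
    return {source: sorted(targets) for source, targets in merged.items()}


# Invented placeholders: three identifiers that different projects
# might have issued for one and the same journal article.
same_as = [
    ("doi:10.0000/example.1", "hdl:0000/example-1"),
    ("hdl:0000/example-1", "urn:lsid:example.org:pub:1"),
]
canonical = build_canonical_map(same_as)

raw_links = {
    "specimen:1": ["doi:10.0000/example.1"],
    "specimen:2": ["hdl:0000/example-1"],
    "specimen:3": ["urn:lsid:example.org:pub:1"],
}
links = canonicalize_links(raw_links, canonical)
print(links)
```

Without the mapping, the three specimens appear to cite three different articles and share no links. After it, they all point at the same node, which is what any link-based ranking needs.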
Regards
Rod
----------------------------------------
Professor Roderic D. M. Page
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone: +44 141 330 4778
Fax: +44 141 330 2792
email: r.page@bio.gla.ac.uk
web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
iChat: aim://rodpage1962
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org
Find out what we know about a species: http://ispecies.org
Rod's rants on phyloinformatics: http://iphylo.blogspot.com
Rod's rants on ants: http://semant.blogspot.com