Hi Rod,
That's a great question, and a good challenge. To be fair, I think both the question and the challenge apply as much to the life sciences as a whole. I'll admit that I'm rather skeptical about what has been accomplished so far, if we ask which of the things that have been done could not also have been accomplished with "traditional" data integration approaches, possibly in less time. So with biodiversity science lagging behind biomedicine and other fields in this regard, it isn't a surprise that there aren't glaring success stories here yet.
That notwithstanding, there has been quite a bit of work in other fields (especially biomedicine) on using linked data and semantic web technologies for data integration, with the aim of enabling scientific discovery (see refs below, and also http://www.w3.org/wiki/HCLSIG/LODD), and these may help in imagining what could be achievable in our own domain. I'll also say, although I haven't been getting much traction on this among biodiversity informatics folks, that the most impactful of these efforts relied critically on the ability to reason over the data. That's a step beyond RDF.
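To make the "step beyond RDF" concrete, here is a minimal sketch of what reasoning adds on top of stored triples. It is plain Python rather than a real triple store or reasoner, the `ex:` identifiers are invented for illustration, and treating taxa as classes with a species as an instance is only one possible modelling assumption. The two rules are simplified versions of the standard RDFS subclass entailments.

```python
# Toy triples: (subject, predicate, object). All ex: names are made up.
triples = {
    ("ex:Puma_concolor", "rdf:type", "ex:Felidae"),
    ("ex:Felidae", "rdfs:subClassOf", "ex:Carnivora"),
    ("ex:Carnivora", "rdfs:subClassOf", "ex:Mammalia"),
}

def infer(triples):
    """Apply two RDFS-style rules until no new triples appear."""
    closed = set(triples)
    while True:
        new = set()
        for (a, p1, b) in closed:
            for (c, p2, d) in closed:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p1 == "rdfs:subClassOf" and p2 == "rdfs:subClassOf":
                    new.add((a, "rdfs:subClassOf", d))
                # Rule 2: an instance of a class is an instance of its superclasses.
                if p1 == "rdf:type" and p2 == "rdfs:subClassOf":
                    new.add((a, "rdf:type", d))
        if new <= closed:
            return closed
        closed |= new

def instances_of(cls, triples):
    """List subjects typed as cls in the given set of triples."""
    return sorted(s for (s, p, o) in triples if p == "rdf:type" and o == cls)

asserted = instances_of("ex:Mammalia", triples)         # what the raw data says
entailed = instances_of("ex:Mammalia", infer(triples))  # what follows from it
```

Querying the raw triples for instances of `ex:Mammalia` finds nothing, because no source ever stated that fact directly; only after the rules are applied does `ex:Puma_concolor` show up. A production setup would use an RDFS or OWL reasoner instead of this hand-rolled loop.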
-hilmar
Belleau, François, Marc-Alexandre Nolin, Nicole Tourigny, Philippe Rigault, and Jean Morissette. 2008. “Bio2RDF: towards a mashup to build bioinformatics knowledge systems.” Journal of Biomedical Informatics 41 (5) (October): 706-16. doi:10.1016/j.jbi.2008.03.004.
Belleau, François, Nicole Tourigny, Benjamin Good, and Jean Morissette. 2008. “Bio2RDF: A Semantic Web Atlas of Post Genomic Knowledge about Human and Mouse.” In Data Integration in the Life Sciences (DILS), ed. Amos Bairoch, Sarah Cohen-Boulakia, and Christine Froidevaux, 153-160. Springer Berlin / Heidelberg. doi:10.1007/978-3-540-69828-9_15.
Cheung, Kei-Hoi, Ernest Lim, Matthias Samwald, Huajun Chen, Luis Marenco, Matthew E Holford, Thomas M Morse, Pradeep Mutalik, Gordon M Shepherd, and Perry L Miller. 2009. “Approaches to neuroscience data integration.” Briefings in Bioinformatics 10 (4) (July): 345-53. doi:10.1093/bib/bbp029.
On Sep 20, 2011, at 4:40 AM, Roderic Page wrote:
It's morning and the coffee hasn't quite kicked in yet, but reading through recent TDWG TAG posts, and mindful of the upcoming meeting in New Orleans (which sadly I won't be attending) I'm seeing a mismatch between the amount of effort being expended on discussions of vocabularies, ontologies, etc. and the concrete results we can point to.
Hence, a challenge:
"What new things have we learnt about biodiversity by converting biodiversity data into RDF?"
I'm not saying we can't learn new things, I'm simply asking what have we learnt so far?
Since around 2006 we have had literally millions of triples in the wild (uBio, ION, Index Fungorum, IPNI, Catalogue of Life, more recently Biodiversity Collections Index, Atlas of Living Australia, World Register of Marine Species, etc.), most of these using the same vocabulary. What new inferences have we made?
Let's make the challenge more concrete. Load all these data sources into a triple store (subchallenge - is this actually possible?). Perhaps add other RDF sources (DBpedia, Bio2RDF, CrossRef). What novel inferences can we make?
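As a rough illustration of what this challenge involves at the smallest possible scale, the sketch below merges triples from two sources and answers a question that neither source can answer alone. It is plain Python standing in for a triple store. The two "sources", the `ex:` predicates, the identifiers, and the DOI are all invented for the example and do not reflect the actual vocabularies or data of the providers listed above.

```python
# Hypothetical extracts from two sources; identifiers and predicates are invented.
names_source = {
    ("name:1", "ex:nameComplete", "Aus bus"),
    ("name:1", "ex:publishedIn", "doi:10.0000/example"),
}
occurrence_source = {
    ("occ:9", "ex:identifiedAs", "name:1"),
    ("occ:9", "ex:country", "Australia"),
}

# "Loading into a triple store" is reduced here to a set union of triples.
store = names_source | occurrence_source

def match(store, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def countries_for_publication(store, doi):
    """Join across sources: publication -> name -> occurrence -> country."""
    found = set()
    for name, _, _ in match(store, p="ex:publishedIn", o=doi):
        for occ, _, _ in match(store, p="ex:identifiedAs", o=name):
            for _, _, country in match(store, s=occ, p="ex:country"):
                found.add(country)
    return sorted(found)

merged_answer = countries_for_publication(store, "doi:10.0000/example")
single_answer = countries_for_publication(names_source, "doi:10.0000/example")
```

The join succeeds only because both toy sources use the same identifier, `name:1`, for the same name. If each source minted its own identifier for that name, the merged store would hold all the triples and still return nothing for this query.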
I may, of course, simply be in "grumpy old arse" mode, but we have millions of triples in the wild and nothing to show for it. I hope I'm not alone in wondering why...
Regards
Rod
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine
College of Medical, Veterinary and Life Sciences
Graham Kerr Building
University of Glasgow
Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk
Tel: +44 141 330 4778
Fax: +44 141 330 2792
AIM: rodpage1962@aim.com
Facebook: http://www.facebook.com/profile.php?id=1112517192
Twitter: http://twitter.com/rdmpage
Blog: http://iphylo.blogspot.com
Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag