On Oct 18, 2010, at 12:32 PM, Arlin Stoltzfus wrote:

Does anyone out there want to explore this idea and contribute the  
results to the TDWG project?   If so, please let me know.

Apropos of this, a fascinating report on the LOD cloud was just released, as described in the message below. Life science resources represent a sizable chunk of the cloud.


---------- Forwarded message ----------
From: Anja Jentzsch <anja@anjeve.de>
Date: Tue, Oct 19, 2010 at 10:33 AM
Subject: ANN: LOD Cloud - Statistics and compliance with best practices
To: "public-lod@w3.org" <public-lod@w3.org>

Hi all,

In the last few weeks, we have analyzed which data sources in the new version of the LOD cloud comply with various best practices that are recommended by W3C or have emerged within the LOD community.

We have checked the implementation of the following nine best practices:

1. Provide dereferenceable URIs
2. Set RDF links pointing at other data sources
3. Use terms from widely deployed vocabularies
4. Make proprietary vocabulary terms dereferenceable
5. Map proprietary vocabulary terms to other vocabularies
6. Provide provenance metadata
7. Provide licensing metadata
8. Provide data-set-level metadata
9. Refer to additional access methods

Compliance with the best practices was checked either manually or by scripts that downloaded and analyzed some data from the data sources.
We have added the results of the evaluation in the form of tags to the LOD data set catalog on CKAN [1].

We are now happy to release the first statistics about the structure of the LOD cloud as well as the compliance of the datasets with the best practices.
The statistics can be found here:


The document contains an initial, preliminary release of the statistics. If you spot any errors in the data describing the LOD data sets on CKAN, it would be great if you would correct them directly on CKAN.

For information on how to describe datasets on CKAN, please refer to the Guidelines for Collecting Metadata on Linked Datasets in CKAN [2].

After your feedback and corrections, we will then move the corrected version of the statistics to http://www.lod-cloud.net/ (around October 24th).

Have fun with the statistics and the encouraging as well as disappointing insights that they provide.


Chris Bizer, Anja Jentzsch and Richard Cyganiak

[1] http://www.ckan.net/group/lodcloud
[2] http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation

Arlin Stoltzfus (arlin@umd.edu)
Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
IBBR, 9600 Gudelsky Drive, Rockville, MD
tel: 240 314 6208; web: www.molevol.org