Two Scenarios

Sat Dec 3 18:45:40 CET 2005

Dear Arthur,

tracing the transfer of biological specimens among collections at different
locations and, by extension, tracking all the downstream information
that different people have collected on a given specimen gives rise to
many scenarios that can be neatly resolved with a solid UID/ontology
framework. Here is a use case with some examples, from the microbial
domain among others, that comes pretty close to the two scenarios that
you sketched.

Over 500 collections around the world (bacteria, archaea, filamentous
fungi, yeast) have been registered at the World Data Centre for
Microorganisms
(http://lmg.ugent.be/~pdawyndt/maps/world_collections.html), which is the
data center of the World Federation of Culture Collections. The
StrainInfo.net portal system (http://www.straininfo.net) enables simple
navigation through all collections from a single entry point (access to
the portal is currently restricted, use 'guest' as login and 'straininfo'
as password).

For each specimen (e.g. try "LMG 6923", a bacterial specimen), the portal
system offers the following services to the user:

1.      a list of cultures of the same specimen in other collections
around the world. Each culture resolves to the related entry in the
online catalog of its culture collection, which provides further
background information on the specimen
2.      a complete history of the specimen, showing its transfer among
different collections ever since its original isolation
(http://lmg.ugent.be/~pdawyndt/maps/history.pdf)
3.      a list of sequences in EMBL/GenBank/DDBJ. Note that the public
sequence databases themselves offer poor support for searching the
sequences of a given specimen
4.      a list of literature references of studies that use the given
specimen
5.      geographic distribution of the collections where the given
specimen can be ordered
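
Services 1 and 2 can be pictured as queries on a graph of transfers between
collections. The following is only a minimal sketch of that idea, not the
portal's actual implementation: the collection acronyms and numbers are made
up, the function names are hypothetical, and each culture is assumed to have
been received from at most one source, with no cycles.

```python
from collections import defaultdict

# Hypothetical transfer records: (source culture, receiving culture).
# The collection acronyms and numbers are invented for illustration.
TRANSFERS = [
    ("AAA 1", "BBB 22"),
    ("BBB 22", "CCC 333"),
    ("AAA 1", "DDD 4"),
]


def build_parent_map(transfers):
    """Map each culture to the culture it was received from."""
    return {child: parent for parent, child in transfers}


def history(culture, parent_of):
    """Return the chain from the original isolate down to `culture`.

    Assumes the transfer records contain no cycles.
    """
    chain = [culture]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return list(reversed(chain))


def equivalent_cultures(culture, transfers):
    """Return every other culture derived from the same original isolate."""
    parent_of = build_parent_map(transfers)
    root = history(culture, parent_of)[0]
    children = defaultdict(list)
    for parent, child in transfers:
        children[parent].append(child)
    # Walk the whole transfer tree starting from the original isolate.
    seen, stack = [], [root]
    while stack:
        node = stack.pop()
        seen.append(node)
        stack.extend(children[node])
    return sorted(c for c in seen if c != culture)


parent_of = build_parent_map(TRANSFERS)
print(history("CCC 333", parent_of))
print(equivalent_cultures("CCC 333", TRANSFERS))
```

With the made-up records above, the history of "CCC 333" runs from "AAA 1"
via "BBB 22", and its equivalent cultures are the three other entries in the
same tree.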

All these services and the information system that sits behind the
StrainInfo.net portal build heavily on unique identifiers and an extensive
network of object relationships. To keep pace with updated information
sources, the information system continually finds new objects (and
automatically assigns unique identifiers to them if they are not in place,
i.e. accumulative learning) and discovers all sorts of relationships between
these objects (built on top of the unique identifiers). This requires an
expert system that is tightly coupled to an extensible domain-related
ontology (what is what, and how should it be recognized in the light of
synonymy, homology, syntactical variation). The good thing is that all of
this complexity remains pretty much hidden from the end user.
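
To make the idea of accumulative learning a bit more concrete, here is a
rough sketch of a registry that assigns an identifier to every object it has
not seen before, treats syntactic variants of a strain number as the same
object, and merges identifiers once two names are known to be synonyms. This
is an illustration under my own assumptions, not the StrainInfo.net code: the
class and function names are hypothetical, the normalization rule is a
deliberately naive one, random UUIDs stand in for whatever identifier scheme
is actually used, and "XYZ 1" is an invented strain number.

```python
import re
import uuid


def normalize(strain_number):
    """Collapse syntactic variants such as 'lmg-6923' and 'LMG  6923'.

    Naive rule (an assumption): drop spaces, hyphens and underscores,
    upper-case, and re-insert one space between acronym and number.
    """
    compact = re.sub(r"[\s\-_]+", "", strain_number).upper()
    match = re.fullmatch(r"([A-Z]+)(\d+)", compact)
    if not match:
        return compact
    return f"{match.group(1)} {match.group(2)}"


class Registry:
    """Hypothetical registry mapping strain numbers to unique identifiers."""

    def __init__(self):
        self._uid_of = {}    # normalized name -> UID
        self._names_of = {}  # UID -> set of normalized names

    def resolve(self, name):
        """Return the UID for `name`, assigning a new one if it is unknown."""
        key = normalize(name)
        if key not in self._uid_of:
            uid = str(uuid.uuid4())
            self._uid_of[key] = uid
            self._names_of[uid] = {key}
        return self._uid_of[key]

    def declare_synonyms(self, name_a, name_b):
        """Record that two names denote the same object; keep the first UID."""
        keep, drop = self.resolve(name_a), self.resolve(name_b)
        if keep != drop:
            for key in self._names_of.pop(drop):
                self._uid_of[key] = keep
                self._names_of[keep].add(key)
        return keep

    def names(self, name):
        """Return all known names that share the UID of `name`."""
        return sorted(self._names_of[self.resolve(name)])


registry = Registry()
uid = registry.resolve("LMG 6923")
assert registry.resolve("lmg-6923") == uid      # syntactic variant
registry.declare_synonyms("LMG 6923", "XYZ 1")  # invented synonym
print(registry.names("xyz1"))
```

The point of the sketch is only that relationships (here: synonymy) are
recorded between identifiers rather than between raw strings, so that new
spellings and new synonyms can be absorbed without touching existing links.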

Hopefully, this gives some idea of the possibilities of the GUID/ontology
frameworks that are envisioned on this mailing list.

Cheers,

Peter

Further reading:

Dawyndt P, Vancanneyt M, De Meyer H, Swings J. Knowledge Accumulation and
Resolution of Data Inconsistencies during the Integration of Microbial
Information Sources. IEEE Transactions on Knowledge and Data Engineering,
17(8), pp. 1111-1126, 2005.

Romano P, Dawyndt P, Piersigilli F, Swings J. Improving interoperability
between microbial information and sequence databases. BMC Bioinformatics,
6(4), S23, 2005. doi:10.1186/1471-2105-6-S4-S23.

-----------------------------------------------------------
Dr. Peter Dawyndt

Laboratory of Microbiology
Ghent University
Ledeganckstraat 35
B-9000 Gent, Belgie
Tel.: (32)9.264.51.32
Fax.: (32)9.264.50.92

KERMIT -- Research Unit "Knowledge-based Systems"
Ghent University
Coupure links 653
B-9000 Gent, Belgie
Tel.: (32)9.264.60.18

Email: Peter.Dawyndt at UGent.be
-----------------------------------------------------------