These relationships would be specified in the metadata attached to the GUIDs, not the GUIDs themselves (they are simply unique identifiers).
For example, if we think of your tax number/Social Security Number/National Insurance Number (insert whatever identifier your government attaches to you here) as a GUID, then two people could have GUIDs such as
JE 5679434A
and
JH 5679434B
The metadata for JE 5679434A could contain a statement that the individuals are related, e.g. something like
<rdf:Description rdf:about="JE 5679434A">
  <isMarriedTo rdf:resource="JH 5679434B" />
</rdf:Description>
In other words, the person identified by "JE 5679434A" is married to the person identified by "JH 5679434B".
One can develop ontologies that specify these relationships and enable us to deduce other facts. For example, if X is married to Y, then Y is married to X; and if Z is a child of Y, then Y is a parent of Z, and so on. What is nice is that you wouldn't have to state explicitly in the metadata for Y that Y is a parent of Z; it can be inferred from the statement that Z is a child of Y.
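To make the inference step concrete, here is a minimal Python sketch. It is not a real RDF reasoner: the property names (isMarriedTo, isChildOf, isParentOf), the rule tables, and the identifier "Z" are assumptions made up for illustration. It only shows how symmetric and inverse relationships let you derive statements nobody wrote down.

```python
# Illustrative only: these property names and rule tables are assumptions,
# not taken from any published ontology.
SYMMETRIC = {"isMarriedTo"}                       # X p Y implies Y p X
INVERSE = {"isChildOf": "isParentOf",             # X p Y implies Y q X
           "isParentOf": "isChildOf"}

def infer(triples):
    """Return the stated triples plus those implied by the two rule tables."""
    closed = set(triples)
    for subject, prop, obj in triples:
        if prop in SYMMETRIC:
            closed.add((obj, prop, subject))
        if prop in INVERSE:
            closed.add((obj, INVERSE[prop], subject))
    return closed

stated = {
    ("JE 5679434A", "isMarriedTo", "JH 5679434B"),
    ("Z", "isChildOf", "JH 5679434B"),            # "Z" is a placeholder identifier
}
facts = infer(stated)

for fact in sorted(facts - stated):
    print("inferred:", fact)
```

A single pass is enough here because applying either rule to an inferred triple only gives back a triple that was already stated. Rules that chain (e.g. "ancestor of") would need repeated passes.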
I use RDF here because these are the kind of things it handles nicely. All (!) you'd need is a consistent vocabulary to describe the relationships. RDF Schema and OWL already define some basic ones ("rdfs:subPropertyOf", "owl:sameAs", etc.). For the examples you provide, I guess you'd want "part of", "extracted from", "hosted by", "parent of", "mother of", etc.
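As a rough sketch of how such a vocabulary could be used, the Python below stores a few relationships between identifiers borrowed from the plant scenario in the quoted message and walks them back to the original collection. The relationship names (duplicateOf, wetCollectionOf, propagatedFrom, collectedFrom) are hypothetical, and the sketch assumes each record points to at most one source record.

```python
# Hypothetical relationship names; the identifiers come from the plant
# scenario in the quoted message. Each triple is (record, relation, source).
LINKS = [
    ("IH-NY65432", "duplicateOf", "IH-CANB12345"),
    ("IH-NY-wet-65432", "wetCollectionOf", "IH-NY65432"),
    ("BotGard-ANBG99-24-12-18", "propagatedFrom", "IH-CANB12345"),
    ("IH-CANB35556", "collectedFrom", "BotGard-ANBG99-24-12-18"),
]

def lineage(guid, links):
    """Follow links from guid back towards the record it ultimately derives from."""
    source_of = {record: (relation, source) for record, relation, source in links}
    chain, seen = [], {guid}
    while guid in source_of:
        relation, guid = source_of[guid]
        if guid in seen:          # guard against accidental cycles in the metadata
            break
        seen.add(guid)
        chain.append((relation, guid))
    return chain

for relation, source in lineage("IH-CANB35556", LINKS):
    print(relation, source)
```

The point is that the identifiers stay opaque: the pedigree lives entirely in the links, so new kinds of relationship can be added without changing any GUID.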
Does this help?
Regards
Rod
On 25 Nov 2005, at 11:18, Arthur Chapman wrote:
Below I have placed two scenarios that show some of the cross-discipline problems I believe we face with GUIDs. They don't provide the answers, alas!
It would appear to me that each of these separate entities needs a GUID, but that each also needs to show some relationship (nearly a genealogy or pedigree line): child of (i.e. derived from), brother of (duplicate collection), sister of (wet collection), part of (genetic study), etc. Can these be built into a GUID?
Take just the simplest problem, where a herbarium makes a collection and sends out duplicates to other herbaria. More often than not, the duplicates are distributed before they receive a catalogue number in the originating institution. We can thus only identify duplicates using collector name and number, but these are not always unique, and not all collectors use numbers. We can't use the lat/long coordinates, as these are often added after distribution and often differ (one collection I looked at in 5 different herbaria was given 4 different lat/longs). Resolving many of these duplicates will need human judgement, possibly helped by parsing routines similar to those being developed for location information in the BioGeomancer project, and possibly some artificial intelligence (to sort out collectors' names written in different ways: initials first, surname first, etc.).
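As a very rough illustration of the kind of parsing routine meant here, the Python sketch below builds a matching key from a collector name and number so that "initials first" and "surname first" forms compare equal. The function name collector_key and the example names are hypothetical, and the rule that the surname is the longest alphabetic token is a deliberate simplification that will fail for some names. A matching key only proposes candidate duplicates, which a person would still need to confirm.

```python
import re

def collector_key(name, number):
    """Build a rough matching key from a collector name and collector number.

    Assumption: the surname is the longest alphabetic token in the name.
    """
    tokens = re.findall(r"[A-Za-z]+", name)
    if not tokens:
        return None
    surname = max(tokens, key=len).lower()
    # Remaining tokens are treated as given names or initials; order is ignored.
    initials = sorted(t[0].lower() for t in tokens if t.lower() != surname)
    return (surname, "".join(initials), str(number).strip())

# Hypothetical labels from two herbaria for what may be the same collection.
a = collector_key("J. R. Smith", 123)
b = collector_key("Smith, J.R.", "123 ")
print(a == b)
```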
I wish I could supply the answers!
These scenarios don't show up all that well in text, so I have also attached a Word document.
PLANT
- Collector Makes collection
  a. Provides collector number (not always Unique) <Fred 123>
    i. Submits collection to Herbarium
      - Herbarium supplies collection number <Index Herbarium-CANB12345>
      - and a name <TCS-123454>
        a. Herbarium distributes collections to other herbaria
          i. New herbaria supply collection numbers <IH-NY65432; IH-MO34562; IH-K98765>
          ii. New herbarium creates wet collection in alcohol <IH-NY-wet-65432>
        b. Herbarium takes tissue samples and cultivates and propagates in living collection
          i. Living collections supplies a number <BotGard-ANBG99-24-12-18>
            1. Living collection makes a new collection for herbarium <IH-CANB35556>
        c. University researcher takes pollen sample and carries out analysis.
          i. Stores microscope slide of pollen grains <ANUxxxx342567587>
            1. carries out genetic study
              a. sends collection to GenBank <Genbank 456783>
          ii. Submits seeds to Seed Bank
            1. Seed bank supplies a number <CSIRO-Euc34547>
              a. Seed bank supplies seed to China where it is cultivated.
                i. China takes sample for chemical assay <XXXX3435646763>
                  1. Develops drug and applies for patent <USA Pat xxxxxxxx>
-------------------
ZOO ANIMAL
2. Collector catches two animals
  a. Probably doesn't give a number
  b. Sells animals to two different zoos
    i. Zoo A documents and gives animal a number <ZOOA-xxxxx22323>
      - Animal was pregnant and gives birth to 2 young (parent unknown)
      - Animals given numbers <ZOOA-xxxxx22323-1; ZOOA-xxxxx22323-2>
        a. Young (1) sold to Zoo B
          i. Zoo B gives a number <ZOOB-yyy34562>
          ii. Young (1) develops a parasite
            1. Parasite sent to Lab 1 <LAB1-2222342>
              a. Lab identifies parasite and sends specimen to Museum <Museum-FL12322>
        b. Young (2) dies
          i. Specimen sent to Museum <Museum XY2222234>
            1. Tissue sample taken and stored <Museum XY2222234-a>
            2. Skin sample sent to different museum <Museum BT4563721>
    ii. Zoo B documents and gives animal number <ZOOB-yyy12345>
      1. Zoo B crosses ZOOB-yyy12345 with ZOOB-yyy34562
        a. Offspring born and given numbers <ZOOB-yyy673221; ZOOB-yyy673222>
          i. Animal ZOOB-yyy673221 sold to Zoo C
            1. Zoo C supplies number <ZOOC-22245>; needs to link pedigree.
          ii. Animal ZOOB-yyy673222 etc. etc.
Hope this helps and doesn't just confuse the issue even more.
Arthur Chapman Toowoomba, Australia
------------------------------------------------------------------------
Professor Roderic D. M. Page
Editor, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone: +44 141 330 4778
Fax: +44 141 330 2792
email: r.page@bio.gla.ac.uk
web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html