Two Scenarios

Roderic Page r.page at BIO.GLA.AC.UK
Fri Nov 25 15:45:15 CET 2005


On 25 Nov 2005, at 15:34, Roger Hyam wrote:

>
>  But wait... there is more. As Arthur points out we already have most  
> of this stuff defined. TCS encapsulates a whole load of semantics  
> about nomenclatural relationship (types of type etc) and TaxonConcept  
> relationship (child taxon of, hybrid parent of etc) and ABCD has  
> similar knowledge about collections.  A great deal of re-engineering  
> and transition is involved. We mustn't go throwing any babies out with  
> the bath water.

Maybe I'm missing something (since I've avoided XML schema like the  
plague), but given that RDF is (usually) expressed in XML, shouldn't we  
be able with a minimum of fuss to "port" the relevant bits from an XML  
schema to an RDF schema? Maybe I'm being naive but I don't see why this  
is such an engineering nightmare.

>
>  Also serving this stuff may be problematic....

Serving what stuff?


>
>  So yes GUID+RDF seems to solve most every problem just at the moment.
>
>  Roger
>
>
>
>  Arthur Chapman wrote:
>> Rod
>>
>> This is a neat solution and may well work.  I like it!
>>
>> It is somewhat akin to the "Relation" element in Dublin Core, which
>> has generally not been implemented with a controlled vocabulary as
>> was recommended at the Canberra meeting of Dublin Core in about
>> 1996 or 1997.
>>
>> It was implemented in the Australian Government Locator Service
>> (AGLS) as Australian Standard AS5044 with a controlled vocabulary.
>> The vocabulary is not what we would need, but gives a parallel
>> example:
>>
>> isVersionOf
>> hasVersion
>> isReplacedBy
>> replaces
>> isRequiredBy
>> requires
>> isPartOf
>> isReferencedBy
>> isFormatOf
>> hasFormat
>> isBasisFor
>> isBasedOn
>>
>> http://www.naa.gov.au/recordkeeping/gov_online/agls/AGLS_reference_description.pdf
>>
>> Cheers
>>
>> Arthur
>>
>> From Roderic Page <r.page at BIO.GLA.AC.UK> on 25 Nov 2005:
>>
>>
>>> These relationships would be specified in the metadata attached to
>>> the GUIDs, not in the GUIDs themselves (they are simply unique
>>> identifiers).
>>>
>>> For example, if we think of your tax number/Social Security
>>> Number/National Insurance Number (insert whatever identifier your
>>> government attaches to you here), then you could have two GUIDs
>>> such as
>>>
>>> JE 5679434A
>>>
>>> and
>>>
>>> JH 5679434B
>>>
>>> The metadata for JE 5679434A could contain a statement that the
>>> individuals are related, e.g. something like
>>>
>>> <rdf:Description rdf:about="JE 5679434A">
>>>      <isMarriedTo rdf:resource ="JH 5679434B" />
>>> </rdf:Description>
>>>
>>> In other words, the person identified by "JE 5679434A" is married to
>>> the person identified by "JH 5679434B".
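
As a rough illustration of how such a statement reads as a (subject, predicate, object) triple, here is a minimal Python sketch using only the standard library. The snippet from the message is wrapped in an rdf:RDF element so that it parses on its own; the namespace http://example.org/rel# for isMarriedTo is a made-up placeholder, not an existing vocabulary, and the helper name triples is likewise just for this sketch.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL_NS = "http://example.org/rel#"  # hypothetical namespace for isMarriedTo

# The fragment from the message, wrapped so it is a complete XML document.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{REL_NS}">
  <rdf:Description rdf:about="JE 5679434A">
    <isMarriedTo rdf:resource="JH 5679434B" />
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (subject, predicate, object) for resource-valued properties only."""
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is not None:
                predicate = prop.tag.split("}", 1)[1]  # drop the namespace part
                yield (subject, predicate, obj)


print(list(triples(DOC)))
# [('JE 5679434A', 'isMarriedTo', 'JH 5679434B')]
```

This handles only the one RDF/XML pattern used in the message; a real RDF parser covers many more forms.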
>>>
>>> One can develop ontologies that specify these relationships, and
>>> enable us to deduce other facts. For example, if X is married to Y,
>>> then Y is married to X, but if Z is a child of Y, then Y is the
>>> parent of Z, and so on. What is nice is that you wouldn't have to
>>> explicitly state that Y is the parent of Z in the metadata for Y;
>>> it can be inferred from the relationship Z is a child of Y.
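
The kind of inference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the property names isMarriedTo, isChildOf and isParentOf are hypothetical, and the two lookup tables stand in for what an ontology language such as OWL would declare with owl:SymmetricProperty and owl:inverseOf.

```python
# Hypothetical vocabulary: which properties are symmetric, which are inverses.
SYMMETRIC = {"isMarriedTo"}
INVERSE = {"isChildOf": "isParentOf", "isParentOf": "isChildOf"}


def infer(stated):
    """Return the stated triples plus those implied by symmetry and inverses."""
    facts = set(stated)
    for s, p, o in stated:
        if p in SYMMETRIC:
            facts.add((o, p, s))            # X married to Y  =>  Y married to X
        if p in INVERSE:
            facts.add((o, INVERSE[p], s))   # Z child of Y    =>  Y parent of Z
    return facts


stated = {("X", "isMarriedTo", "Y"), ("Z", "isChildOf", "Y")}
print(sorted(infer(stated) - stated))
# [('Y', 'isMarriedTo', 'X'), ('Y', 'isParentOf', 'Z')]
```

Only the two stated triples need to be stored; the other two are derived on demand, which is the point made in the paragraph above.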
>>>
>>> I use RDF here because these are the kind of things it handles
>>> nicely. All (!) you'd need is a consistent vocabulary to describe
>>> the relationships. RDF and its companion vocabularies (RDFS, OWL)
>>> already have some basic ones ("sameAs", "subPropertyOf", etc.). In
>>> the examples you provide, I guess you'd want "part of", "extracted
>>> from", "hosted by", "parent of", "mother of", etc.
>>>
>>> Does this help?
>>>
>>> Regards
>>>
>>> Rod
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 25 Nov 2005, at 11:18, Arthur Chapman wrote:
>>>
>>>
>>>> Below I have placed two scenarios that show some of the
>>>> cross-discipline problems I believe we face with GUIDs. They don't
>>>> provide the answers, alas!
>>>>
>>>> It would appear to me that each of these separate entities needs a
>>>> GUID, but that each needs to show some relationship (nearly a
>>>> genealogy or pedigree line): child of (i.e. derived from); brother
>>>> of (duplicate collection); sister of (wet collection); part of
>>>> (genetic study) etc.  Can these be built into a GUID?
>>>>
>>>> Take the simplest problem, where a herbarium makes a collection
>>>> and sends out duplicates to other herbaria.  More often than not,
>>>> the duplicates are distributed prior to receiving a catalogue
>>>> number in the originating institution.  We can thus only identify
>>>> duplicates using collector name and number, but these are not
>>>> always unique, and not all collectors use numbers.  We can't use
>>>> the lat/long coordinates as these are often put on after
>>>> distribution and are often different (one collection I looked at in
>>>> 5 different herbaria was given 4 different lat/longs).  The
>>>> resolution of many of these duplicates will need to be a human
>>>> problem, possibly helped by parsing routines similar to those being
>>>> developed for location information in the BioGeomancer project, and
>>>> possibly some artificial intelligence (to sort out collectors'
>>>> names used in different ways, e.g. initials first/surname first).
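
To make the matching problem concrete, here is a deliberately crude Python sketch of grouping candidate duplicates by collector surname and number. Everything in it is an assumption for illustration: the collector names are invented, the catalogue identifiers are the placeholder ones from the scenario, and the surname rule (text before a comma, otherwise the last token) will fail on forms such as "Bloggs FJ". It only proposes candidates for a human to check, as the paragraph above suggests.

```python
from collections import defaultdict


def collector_key(name, number):
    """Reduce a collector name and number to a crude (surname, number) key."""
    if "," in name:
        surname = name.split(",", 1)[0]          # "Bloggs, F."  -> "Bloggs"
    else:
        tokens = name.replace(".", " ").split()  # "F. Bloggs"   -> ["F", "Bloggs"]
        surname = tokens[-1] if tokens else ""
    return (surname.strip().lower(), number.strip().lstrip("0"))


def candidate_duplicates(records):
    """Group catalogue ids that share a key; keep groups with more than one id."""
    groups = defaultdict(list)
    for catalogue_id, name, number in records:
        if not number.strip():
            continue  # no collector number: leave for human review
        groups[collector_key(name, number)].append(catalogue_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Illustrative records only: (catalogue id, collector as written, collector number)
records = [
    ("CANB12345", "F. Bloggs", "123"),
    ("NY65432", "Bloggs, F.", "123"),
    ("MO34562", "Bloggs, F. J.", "0123"),
    ("K98765", "J. Smith", "123"),
]

print(candidate_duplicates(records))
# {('bloggs', '123'): ['CANB12345', 'NY65432', 'MO34562']}
```

Because collector name and number are not always unique, a shared key is a hint rather than proof that two sheets are duplicates.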
>>>>
>>>> I wish I could supply the answers!
>>>>
>>>> These scenarios don't show up all that well in text, so I have also
>>>> attached a Word document.
>>>>
>>>> ---------------------
>>>> PLANT
>>>> 1.  Collector makes collection
>>>>  a.  Provides collector number (not always unique) <Fred 123>
>>>>   i.  Submits collection to Herbarium
>>>>    1.  Herbarium supplies collection number <Index Herbarium-CANB12345>
>>>>    2.  and a name <TCS-123454>
>>>>     a.  Herbarium distributes collections to other herbaria
>>>>      i.  New herbaria supply collection numbers <IH-NY65432;
>>>>          IH-MO34562; IH-K98765>
>>>>
>> === message truncated ==
>>
>>
>
> --
>
> -------------------------------------
>  Roger Hyam
>  Technical Architect
>  Taxonomic Databases Working Group
> -------------------------------------
>  http://www.tdwg.org
>  roger at tdwg.org
>  +44 1578 722782
> -------------------------------------
>
>
------------------------------------------------------------------------
Professor Roderic D. M. Page
Editor, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone:    +44 141 330 4778
Fax:      +44 141 330 2792
email:    r.page at bio.gla.ac.uk
web:      http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic
Biologists Website:  http://systematicbiology.org
Search for taxon names at http://darwin.zoology.gla.ac.uk/~rpage/portal/
Find out what we know about a species at http://ispecies.org




More information about the tdwg-tag mailing list