Index Fungorum LSID server

Roderic Page r.page at BIO.GLA.AC.UK
Tue Apr 25 10:10:00 CEST 2006


Maybe I'm missing something, but isn't the point of inverseOf that we
don't have to include the link, since it follows from the property? Hence, if
I say specimen x is a part of a taxon concept TC, then it follows that
TC includes x, but I don't need an actual triple saying that.
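
A minimal sketch of that inference, assuming two made-up predicate names
("isPartOf" and "includes") that an ontology declares to be inverses of each
other. It is not a real OWL reasoner, just an illustration that the second
triple can be derived rather than stored:

```python
# Hypothetical predicate names, standing in for a pair of properties that
# an ontology declares with owl:inverseOf.
INVERSE_OF = {"isPartOf": "includes"}


def with_inverses(triples, inverse_of):
    """Return the asserted triples plus those implied by inverse properties."""
    # An inverse declaration works in both directions.
    both = dict(inverse_of)
    both.update({v: k for k, v in inverse_of.items()})

    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in both:
            inferred.add((obj, both[predicate], subject))
    return inferred


# Only one triple is actually asserted by the provider.
asserted = {("specimen:x", "isPartOf", "taxonConcept:TC")}
closure = with_inverses(asserted, INVERSE_OF)

for triple in sorted(closure):
    print(triple)
```

The catch is that the client needs the ontology (here, the INVERSE_OF
table) and all the relevant triples in hand before it can draw the
inference.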

Hence, surely the point of ontologies is to free us from making the
direct link? I've only advocated making the link within metadata for a
single provider so we can do something without relying on ontologies at
this stage.

In terms of the cost of adding triples, I don't think it will be too
large overall. For example if we have reciprocal links between
specimens and taxa within a source database (i.e., actual triples), I
suspect it won't add too much -- some taxa will have lots of specimens,
but most will have very few (a power law kind of thing).
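
As a back-of-the-envelope sketch (the specimen counts below are invented
purely for illustration, not taken from any database): storing the
reciprocal link adds exactly one extra triple per existing specimen-taxon
link, so the link triples double at worst.

```python
def link_triple_counts(specimens_per_taxon):
    """Count link triples with and without stored reciprocal links."""
    one_way = sum(specimens_per_taxon.values())
    return {"one_way": one_way, "reciprocal": 2 * one_way}


# Invented, skewed counts: one taxon with many specimens, most with few.
toy_counts = {
    "taxon:A": 500,
    "taxon:B": 3,
    "taxon:C": 1,
    "taxon:D": 1,
    "taxon:E": 0,
}

counts = link_triple_counts(toy_counts)
print(counts)
```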

A good example of this is NCBI, where each sequence may have a link to
the PubMed record for the paper in which it was published, and each
PubMed record may list all the sequences published in that paper.
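
A toy sketch of why the links in both directions matter when a client can
only follow metadata record by record. The identifiers are made up and the
dictionary stands in for metadata fetched one record at a time; this is
not the NCBI API.

```python
from collections import deque

# Made-up records: each key is a record, each value the records it links to.
LINKS = {
    "sequence:S1": ["pubmed:P1"],
    "sequence:S2": ["pubmed:P1"],
    "pubmed:P1": ["sequence:S1", "sequence:S2"],
}


def reachable(start, links):
    """Breadth-first walk over outgoing links, returning every record reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in links.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# With links in both directions, S2 can be found starting from S1.
print(sorted(reachable("sequence:S1", LINKS)))

# Drop the PubMed -> sequence links and S2 is no longer reachable from S1.
one_way = {k: v for k, v in LINKS.items() if k.startswith("sequence:")}
print(sorted(reachable("sequence:S1", one_way)))
```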

Given that the community hasn't actually built much yet, I'm not
worried about "splits". Let's make some stuff and see what happens --
"suck it and see" is my maxim.

Rod


On 25 Apr 2006, at 09:35, Roger Hyam wrote:

> Hi Rod,
>
> (Here I have switched from using "reciprocal", which is confusing, to
> using the OWL terminology.)
>
> I think my point is where do we draw the line in developing workable
> ontologies. What rules do we have as to formally defined relationships?
> Some properties should have inverseOf
> (http://www.w3.org/TR/2004/REC-owl-guide-20040210/#inverseOf)
> properties and some properties should not - but how do we decide
> which is which?
>
> Unless we have rules to apply then every time we want to create a new
> relationship the community will split over whether the inverseOf
> property should be defined or not.
>
> Any suggestions as to what the rules should be?
>
> Cheers,
>
> Roger
>
>
> Roderic Page wrote:
>> On 24 Apr 2006, at 16:42, Roger Hyam wrote:
>>
>>>
>>>  The isBasionymOf property is an interesting one. This does not exist
>>> in the current vocabulary because it is based on the schema version
>>> of TCS. The thinking in TCS went that a name that is a basionym does
>>> not 'know' that it is a basionym and therefore it would be wrong to
>>> model this as a property of the object. There are many places where
>>> this might be an issue. Does a specimen know that it is the type of a
>>> name? Without the name it isn't a type.
>>
>> What about the relationship? From my perspective, it's useful to know
>> that the original name for Eutypella ventricosa is Valsa ventricosa,
>> and this is what I model using the predicate "hasBasionym". So it's a
>> relationship, not a property of an object.
>>
>>>
>>>  If you called an LSID for a specimen would you expect to be told
>>> that the specimen was the type of a name?  If we use Concise Bounded
>>> Descriptions ( http://swdev.nokia.com/uriqa/CBD.html) then we won't
>>> know unless there is a triple with a subject of the specimen and an
>>> object of the name (i.e. we have an isTypeSpecimenOf property or
>>> similar). But logically where does this stop? We can't add reciprocal
>>> properties to the object definition for everything that anyone may
>>> say of it. If I define a taxon with a list of specimens I wouldn't
>>> expect all the specimens to have reciprocal includedInTaxonConcept
>>> links back to my object. It would be impossible in an open system
>>> where someone else may own the record.
>>
>> No, we don't want reciprocal records for everything, but there are
>> cases where it is useful, especially WITHIN a single source. For
>> example, if I have access to all of IndexFungorum then I can run a
>> query that discovers all the names for which Valsa ventricosa is the
>> basionym. But if I don't have a copy of IndexFungorum, and I'm relying
>> on the metadata attached to a LSID, then if the metadata for Valsa
>> ventricosa has "isBasionymOf" tags connecting it to all names for
>> which it is a basionym, I can discover those names. More importantly,
>> I can infer whether two names are synonyms (e.g., if two names share
>> Valsa ventricosa as a basionym, then those names are synonyms).
>> Without this, I'm stuck. I think what we need to consider is whether
>> two names that are synonyms (or whatever relationship we are
>> interested in) are "reachable", that is given the metadata for the
>> names we can go from name A to B and vice versa. This is related to
>> the concept of a "scutter" :
>>
>> "a scutter is simply a computer program that loads, parses, interprets
>> and acts upon the contents of a Web of interconnected RDF/XML
>> documents. In this sense it is just a Semantic Web variant on the old
>> theme of distributed Web indexing, sometimes called a 'harvester',
>> 'spider', or 'robot'. The links between RDF documents are usually, but
>> not necessarily, expressed using RDF's 'rdfs:seeAlso' property." (see
>> http://rdfweb.org/topic/Scutter)
>>
>> So, I think we gain a lot of power if our metadata is sufficiently
>> linked to support a scutter. For example, given metadata for a PubMed
>> publication, we could get to sequences, via that to taxa (including
>> names in numerous databases via LinkOut), to specimens, and so on, all
>> via metadata. Indeed, one could let a scutter loose and aggregate data
>> a la Google -- who needs GBIF anyway ;-).
>>
>> In fact, this would be a cool challenge. Start a scutter and see what
>> can be retrieved.
>>
>>>
>>>  I am thinking aloud here but we have to be very careful in adding
>>> things to vocabularies - even when they seem really useful.
>>> Ultimately if a client wants to know everything about a object that a
>>> data source has it will have to ask the "Give me all the things that
>>> refer to X" question. Either that or we have to guarantee that all
>>> links are always reciprocal - which we can't.
>>
>> No, you don't have to guarantee links are reciprocal, but you do want
>> some degree of reachability -- that I can get from one object to
>> another. If we aggregate everything into one central repository (a la
>> Google indexing the web) then this isn't an issue, but it is if we
>> don't. I agree that for some things we don't want to have reciprocal
>> links -- but I'd suggest we'd need to think seriously about supporting
>> basic search. As you point out, we need to support the "Give me all
>> the things that refer to X" question.
>>
>> Ultimately, I think we can't ignore search, or perhaps more generally
>> "findability" (which depends on things being linked). See the wonderful
>> book "Ambient Findability" by Peter Morville
>> (http://www.oreilly.com/catalog/ambient/). If we don't make our stuff
>> findable, we are wasting our time.
>>
>> Rod
>>
>>>
>>>  On the other hand things that are in people's data bases that are
>>> easy to pass should perhaps be represented in an ontology - if they
>>> are useful.
>>>
>>>  Well done for another LSID authority Kevin.
>>>
>>>  All the best,
>>>
>>>  Roger
>>>
>>>
>>>
>>>  Kevin Richards wrote:
>>>> Thanks for those comments Rod.
>>>> As you have seen this is an initial attempt.
>>>>
>>>>
>>>>>> The syntax
>>>>>>
>>>>>>      <TaxonNames:hasBasionym>
>>>>>>        <rdf:Description
>>>>>> rdf:about="urn:lsid:indexfungorum.org:Names:148860" />
>>>>>>      </TaxonNames:hasBasionym>
>>>>>>
>>>>>> strikes me as odd.
>>>>>>
>>>> This is due to an accidental omission of the RDF entity type of the
>>>> basionym object.  Will fix this.
>>>>
>>>>
>>>>>> I also suggest that urn:lsid:indexfungorum.org:Names:148860 has a
>>>>>> complementary tag such as
>>>>>>
>>>>>>      <TaxonNames:isBasionymOf
>>>>>> rdf:resource="urn:lsid:indexfungorum.org:Names:213649" />
>>>>>>
>>>> Good idea.  The fields are based on the initial implementation of
>>>> TCS-RDF that Roger completed, and as he said, it is not a complete
>>>> schema at this stage.  BTW the reverse RDF pointers can be viewed
>>>> using launchpad by going into the launchpad settings and turning on
>>>> 'Show back links'.
>>>>
>>>>
>>>>
>>>>>> The attribute
>>>>>> TaxonNames:nomenclaturalCode="http://tdwg.org/2006/03/12/TaxonNames/NomenclaturalCode/#botanical"
>>>>>> of the tag <TaxonNames:TaxonName> is problematic. Firstly, I don't
>>>>>> know why this is an attribute rather than just another tag,
>>>>>>
>>>>>>
>>>> Due to my lack of understanding of RDF and when to use attributes as
>>>> opposed to tags - I was blindly following an example.
>>>>
>>>>
>>>>
>>>>>> and the URI
>>>>>> http://tdwg.org/2006/03/12/TaxonNames/NomenclaturalCode/#botanical
>>>>>> returns a 404. If this is just a made up URI then this is bad --
>>>>>> EVERY URI in an RDF document must be real -- unlike XML schema where
>>>>>> any old rubbish can be used.
>>>>>>
>>>>>>
>>>> Also due to the prototyping stage of this 'project'.  Will be fixed
>>>> by online TDWG ontologies at some stage I assume?
>>>>
>>>>
>>>>>>     <TaxonNames:publishedIn><i>Syll. fung.</i> (Abellini)
>>>>>> <b>1</b>: 148 (1882) (1882)</TaxonNames:publishedIn>
>>>>>> has formatting information (the <i></i> and <b></b> tags). I think
>>>>>> this is in principle a bad thing(TM)
>>>>>>
>>>>>>
>>>> We debated this a little and decided to leave the field text the
>>>> same as has been returned by other services of IndexFungorum.  But
>>>> you have a good point and it is something we will need to discuss
>>>> further in future.
>>>>
>>>> Kevin
>>>>
>>>> Landcare Research
>>>> http://www.landcareresearch.co.nz
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> -------------------------------------
>>>  Roger Hyam
>>>  Technical Architect
>>>  Taxonomic Databases Working Group
>>> -------------------------------------
>>>  http://www.tdwg.org
>>>  roger at tdwg.org
>>>  +44 1578 722782
>>> -------------------------------------
>>>
>>>
>> ----------------------------------------------------------------------
>> ------------------------------------------
>>
>> Professor Roderic D. M. Page
>> Editor, Systematic Biology
>> DEEB, IBLS
>> Graham Kerr Building
>> University of Glasgow
>> Glasgow G12 8QP
>> United Kingdom
>>
>> Phone:    +44 141 330 4778
>> Fax:      +44 141 330 2792
>> email:    r.page at bio.gla.ac.uk
>> web:      http://taxonomy.zoology.gla.ac.uk/rod/rod.html
>> reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
>>
>> Subscribe to Systematic Biology through the Society of Systematic
>> Biologists Website:  http://systematicbiology.org
>> Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
>> Find out what we know about a species: http://ispecies.org
>> Rod's rants on phyloinformatics: http://iphylo.blogspot.com
>>
>>
>>
>>
>>
>>
>>
>
>
>
>








More information about the tdwg-tag mailing list