Maybe I'm missing something, but isn't the point of inverseOf that we don't have to include the link, because it follows from the property? Hence, if I say specimen x is a part of a taxon concept TC, then it follows that TC includes x, but I don't need an actual triple saying that.
Hence, surely the point of ontologies is to free us from making the direct link? I've only advocated making the link within metadata for a single provider so we can do something without relying on ontologies at this stage.
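To make the inverseOf point concrete, here is a minimal sketch of how a client could derive the reverse link instead of storing it. It uses plain Python tuples rather than an RDF library, and the property names isPartOfTaxonConcept and includesSpecimen are hypothetical, chosen only to mirror the specimen/taxon concept example above.

```python
# Hypothetical property pair declared as inverses of each other.
# These names are assumptions for illustration, not part of any TDWG vocabulary.
INVERSE_OF = {
    "isPartOfTaxonConcept": "includesSpecimen",
}


def with_inverses(triples, inverse_of=INVERSE_OF):
    """Return the asserted triples plus those entailed by the inverse pairs."""
    # inverseOf works in both directions, so build a two-way lookup.
    both_ways = dict(inverse_of)
    both_ways.update({v: k for k, v in inverse_of.items()})

    closure = set(triples)
    for subject, predicate, obj in triples:
        if predicate in both_ways:
            # Swap subject and object and use the inverse property.
            closure.add((obj, both_ways[predicate], subject))
    return closure


# Only one direction is actually asserted by the provider.
asserted = {("specimen:x", "isPartOfTaxonConcept", "concept:TC")}
closure = with_inverses(asserted)
```

The reverse triple appears in closure without anyone having published it, which is the saving inverseOf is meant to buy. The catch is that the client must have the ontology and do the inference itself.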
In terms of the cost of adding triples, I don't think it will be too large overall. For example, if we have reciprocal links between specimens and taxa within a source database (i.e., actual triples), I suspect it won't add too much -- some taxa will have lots of specimens, but most will have very few (a power law kind of thing).
A good example of this is NCBI, where each sequence may have a link to the PubMed record for the paper in which it was published, and each PubMed record may list all the sequences published in that paper.
Given that the community hasn't actually built much yet, I'm not worried about "splits". Let's make some stuff and see what happens -- "suck it and see" is my maxim.
Rod
On 25 Apr 2006, at 09:35, Roger Hyam wrote:
Hi Rod,
(Here I have switched from using "reciprocal", which is confusing, to using the OWL terminology.)
I think my point is where do we draw the line in developing workable ontologies. What rules do we have as to formally defined relationships? Some properties should have inverseOf (http://www.w3.org/TR/2004/REC-owl-guide-20040210/#inverseOf) properties and some properties should not - but how do we decide which is which?
Unless we have rules to apply then every time we want to create a new relationship the community will split over whether the inverseOf property should be defined or not.
Any suggestions as to what the rules should be?
Cheers,
Roger
Roderic Page wrote:
On 24 Apr 2006, at 16:42, Roger Hyam wrote:
The isBasionymOf property is an interesting one. This does not exist in the current vocabulary because it is based on the schema version of TCS. The thinking in TCS went that a name that is a basionym does not 'know' that it is a basionym and therefore it would be wrong to model this as a property of the object. There are many places where this might be an issue. Does a specimen know that it is the type of a name? Without the name it isn't a type.
What about the relationship? From my perspective, it's useful to know that the original name for Eutypella ventricosa is Valsa ventricosa, and this is what I model using the predicate "hasBasionym". So it's a relationship, not a property of an object.
If you called an LSID for a specimen would you expect to be told that the specimen was the type of a name? If we use Concise Bounded Descriptions (http://swdev.nokia.com/uriqa/CBD.html) then we won't know unless there is a triple with a subject of the specimen and an object of the name (i.e. we have an isTypeSpecimenOf property or similar). But logically where does this stop? We can't add reciprocal properties to the object definition for everything that anyone may say of it. If I define a taxon with a list of specimens I wouldn't expect all the specimens to have reciprocal includedInTaxonConcept links back to my object. It would be impossible in an open system where someone else may own the record.
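The Concise Bounded Description point can be illustrated with a small sketch. It is a simplification: the full CBD definition also follows blank nodes and reifications, which are ignored here. The identifiers and the hasTypeSpecimen and collector property names are hypothetical.

```python
def bounded_description(triples, resource):
    """Simplified CBD: keep only the triples whose subject is the resource.

    The full definition also expands blank nodes and reifications;
    this sketch deliberately leaves both out.
    """
    return {t for t in triples if t[0] == resource}


# Hypothetical store: the name points at its type specimen,
# but the specimen carries no triple pointing back at the name.
store = {
    ("name:N", "hasTypeSpecimen", "specimen:S"),
    ("specimen:S", "collector", "A. Collector"),
}

specimen_view = bounded_description(store, "specimen:S")
name_view = bounded_description(store, "name:N")
```

Here specimen_view contains only the collector triple, so a client resolving the specimen never learns that it is a type. That fact is visible only in name_view, unless a triple with the specimen as subject is added.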
No, we don't want reciprocal records for everything, but there are cases where it is useful, especially WITHIN a single source. For example, if I have access to all of IndexFungorum then I can run a query that discovers all the names for which Valsa ventricosa is the basionym. But if I don't have a copy of IndexFungorum, and I'm relying on the metadata attached to an LSID, then if the metadata for Valsa ventricosa has "isBasionymOf" tags connecting it to all names for which it is a basionym, I can discover those names. More importantly, I can infer whether two names are synonyms (e.g., if two names share Valsa ventricosa as a basionym, then those names are synonyms). Without this, I'm stuck. I think what we need to consider is whether two names that are synonyms (or whatever relationship we are interested in) are "reachable", that is, given the metadata for the names we can go from name A to B and vice versa. This is related to the concept of a "scutter":
"a scutter is simply a computer program that loads, parses, interprets and acts upon the contents of a Web of interconnected RDF/XML documents. In this sense it is just a Semantic Web variant on the old theme of distributed Web indexing, sometimes called a 'harvester', 'spider', or 'robot'. The links between RDF documents are usually, but not necessarily, expressed using RDF's 'rdfs:seeAlso' property." (see http://rdfweb.org/topic/Scutter)
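Before turning to the scutter itself, the synonym inference described just before the quote (two names sharing a basionym are synonyms) can be sketched as a grouping over hasBasionym triples. This is only an illustration of the rule as stated in this thread; the second combination below is a made-up placeholder, not a real name.

```python
from collections import defaultdict


def synonym_groups(triples):
    """Group names by shared basionym, using only hasBasionym triples."""
    by_basionym = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "hasBasionym":
            by_basionym[obj].add(subject)
    return dict(by_basionym)


triples = {
    # From the thread: the original name for Eutypella ventricosa.
    ("Eutypella ventricosa", "hasBasionym", "Valsa ventricosa"),
    # Hypothetical second combination, included only to show the grouping.
    ("Placeholder combination", "hasBasionym", "Valsa ventricosa"),
}

groups = synonym_groups(triples)
```

This works when all the hasBasionym triples are in hand. When the only entry point is the metadata for Valsa ventricosa, the same groups are reachable only if that record carries isBasionymOf links.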
So, I think we gain a lot of power if our metadata is sufficiently linked to support a scutter. For example, given metadata for a PubMed publication, we could get to sequences, via that to taxa (including names in numerous databases via LinkOut), to specimens, and so on, all via metadata. Indeed, one could let a scutter loose and aggregate data a la Google -- who needs GBIF anyway ;-).
In fact, this would be a cool challenge. Start a scutter and see what can be retrieved.
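As a rough idea of what such a scutter involves, here is a minimal sketch that crawls an in-memory "web" of documents by following rdfs:seeAlso links. Everything in it is an assumption made for illustration: the document URIs, the property names other than rdfs:seeAlso, and the fetch function, which stands in for resolving an LSID or URL and parsing the RDF that comes back.

```python
from collections import deque

SEE_ALSO = "rdfs:seeAlso"


def scutter(start, fetch, max_docs=100):
    """Breadth-first crawl: load a document, keep its triples, follow seeAlso."""
    seen = {start}
    queue = deque([start])
    harvested = set()
    fetched = 0
    while queue and fetched < max_docs:
        uri = queue.popleft()
        fetched += 1
        for subject, predicate, obj in fetch(uri):
            harvested.add((subject, predicate, obj))
            # Queue each linked document once, so cycles do not loop forever.
            if predicate == SEE_ALSO and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return harvested


# Hypothetical documents standing in for publication, sequence and specimen metadata.
WEB = {
    "doc:publication": [("pub:1", SEE_ALSO, "doc:sequence")],
    "doc:sequence": [
        ("seq:1", "sourceOrganism", "taxon:1"),
        ("seq:1", SEE_ALSO, "doc:specimen"),
    ],
    "doc:specimen": [
        ("specimen:1", "identifiedAs", "taxon:1"),
        ("specimen:1", SEE_ALSO, "doc:publication"),
    ],
}

harvest = scutter("doc:publication", lambda uri: WEB.get(uri, []))
```

Starting from the publication alone, the crawl reaches the specimen only because each document links onward. Remove one seeAlso triple and everything downstream of it becomes unreachable.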
I am thinking aloud here but we have to be very careful in adding things to vocabularies - even when they seem really useful. Ultimately if a client wants to know everything about an object that a data source has it will have to ask the "Give me all the things that refer to X" question. Either that or we have to guarantee that all links are always reciprocal - which we can't.
No, you don't have to guarantee links are reciprocal, but you do want some degree of reachability -- that I can get from one object to another. If we aggregate everything into one central repository (a la Google indexing the web) then this isn't an issue, but it is if we don't. I agree that for some things we don't want to have reciprocal links -- but I'd suggest we'd need to think seriously about supporting basic search. As you point out, we need to support the "Give me all the things that refer to X" question.
Ultimately, I think we can't ignore search, or perhaps more generally "findability" (which depends on things being linked). See the wonderful book "Ambient Findability" by Peter Morville (http://www.oreilly.com/catalog/ambient/). If we don't make our stuff findable, we are wasting our time.
Rod
On the other hand things that are in people's data bases that are easy to pass should perhaps be represented in an ontology - if they are useful.
Well done for another LSID authority Kevin.
All the best,
Roger
Kevin Richards wrote:
Thanks for those comments Rod.
As you have seen this is an initial attempt.
The syntax
<TaxonNames:hasBasionym> <rdf:Description
rdf:about="urn:lsid:indexfungorum.org:Names:148860" /> </TaxonNames:hasBasionym>
strikes me as odd.
This is due to an accidental omission of the RDF entity type of the basionym object. Will fix this.
I also suggest that urn:lsid:indexfungorum.org:Names:148860 has a complementary tag such as
<TaxonNames:isBasionymOf rdf:resource="urn:lsid:indexfungorum.org:Names:213649" />
Good idea. The fields are based on the initial implementation of TCS-RDF that Roger completed, and as he said, it is not a complete schema at this stage. BTW the reverse RDF pointers can be viewed using launchpad by going into the launchpad settings and turning on 'Show back links'.
The attribute TaxonNames:nomenclaturalCode="http://tdwg.org/2006/03/12/TaxonNames/NomenclaturalCode/#botanical" of the tag TaxonNames:TaxonName is problematic. Firstly, I don't know why this is an attribute rather than just another tag,
Due to my lack of understanding of RDF and when to use attributes as opposed to tags - I was blindly following an example.
and the URI http://tdwg.org/2006/03/12/TaxonNames/NomenclaturalCode/#botanical returns a 404. If this is just a made-up URI then this is bad -- EVERY URI in an RDF document must be real -- unlike XML schema where any old rubbish can be used.
Also due to the prototyping stage of this 'project'. Will be fixed by online TDWG ontologies at some stage I assume?
<TaxonNames:publishedIn><i>Syll. fung.</i> (Abellini) <b>1</b>: 148 (1882) (1882)</TaxonNames:publishedIn>
has formatting information (the <i></i> and <b></b> tags). I think this is in principle a bad thing(TM)
We debated this a little and decided to leave the field text the same as has been returned by other services of IndexFungorum. But you have a good point and it is something we will need to discuss further in future.
Kevin
--
Roger Hyam Technical Architect Taxonomic Databases Working Group
http://www.tdwg.org roger@tdwg.org
+44 1578 722782
Professor Roderic D. M. Page Editor, Systematic Biology DEEB, IBLS Graham Kerr Building University of Glasgow Glasgow G12 8QP United Kingdom
Phone: +44 141 330 4778 Fax: +44 141 330 2792 email: r.page@bio.gla.ac.uk web: http://taxonomy.zoology.gla.ac.uk/rod/rod.html reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html
Subscribe to Systematic Biology through the Society of Systematic Biologists Website: http://systematicbiology.org Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/ Find out what we know about a species: http://ispecies.org Rod's rants on phyloinformatics: http://iphylo.blogspot.com