[Tdwg-tag] RDF instead of xml schema

Roderic Page r.page at BIO.GLA.AC.UK
Fri Mar 24 23:31:37 CET 2006


Dear Gregor,

Various comments:

Relationship to ER
-------------------------

In many ways RDF is very like ER-modelling (e.g.,
http://bit.csc.lsu.edu/~chen/pdf/Chen_Pioneers.pdf); indeed, the W3C
states that "RDF is a member of the Entity-Relationship modelling
family." One could think of modelling the link between two entities
as a triple (subject, predicate, object).

There's also a comment on this on Danny Ayers' blog  
(http://dannyayers.com/2005/03/16/xml-andor-rdf/) - embedded in some  
other extraneous stuff.

You might also want to look at D2R Map  
(http://www.wiwiss.fu-berlin.de/suhl/bizer/d2rmap/D2Rmap.htm) for  
mapping between database schema and RDF.
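
To give a flavour of what such a mapping involves, here is a minimal
Python sketch. It is not D2R Map's actual syntax; the base URI
(example.org), the table and column names, the sample row, and the
helper row_to_triples are all made up for illustration. The primary key
becomes the subject URI, each column becomes a predicate, and a foreign
key becomes a link to another entity's URI.

```python
# Hypothetical sketch: turn one relational row into RDF-style triples.
# BASE, the table/column names, and the sample row are invented.

BASE = "http://example.org/"

def row_to_triples(table, key_column, row, foreign_keys=None):
    """Map a row (dict of column -> value) to (subject, predicate, object) triples.

    foreign_keys maps a column name to the table it points at, so the
    value can be rewritten as the URI of the referenced entity.
    """
    foreign_keys = foreign_keys or {}
    subject = f"{BASE}{table}/{row[key_column]}"
    triples = []
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # the key is already in the subject URI; NULLs yield no triple
        if column in foreign_keys:
            # An ER relationship: the object is another entity, not a literal.
            value = f"{BASE}{foreign_keys[column]}/{value}"
        triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples

row = {"id": 42, "name": "Aus bus", "genus_id": 7}
triples = row_to_triples("species", "id", row, foreign_keys={"genus_id": "genus"})
for triple in triples:
    print(triple)
```

The second triple is the ER "link between two entities": both its
subject and its object are URIs.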

One way I like to think of this is that RDBMS are geared towards
locally held data, hence the emphasis on primary keys and
normalisation. If, however, much of the data is elsewhere (as I believe
it is in biodiversity informatics) then you need ways to refer to
remotely held objects (i.e., URIs, which is another way of saying
GUIDs). Hence, instead of locally defined primary keys you start to
store GUIDs (which, by definition, are unique). If you then start to
think about relationships between the objects identified by GUIDs, then
you've pretty much got RDF. I would also say that in terms of managing
local data, RDBMS are not going away. RDF, in my opinion, only really
becomes relevant once you want to make the data available to others,
and once you want to enable people to aggregate that data.
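
As a rough illustration of why shared GUIDs make aggregation easy, here
is a small Python sketch. The LSIDs, the predicate names, and the two
"providers" are invented for the example. The point is only that if two
sources use the same GUID for an object, merging their statements needs
no key reconciliation.

```python
from collections import defaultdict

# Made-up LSID standing in for a globally unique identifier.
GUID = "urn:lsid:example.org:names:12345"

# Two hypothetical providers make statements about the same object.
provider_a = [(GUID, "dc:title", "Aus bus")]
provider_b = [(GUID, "ex:publishedIn", "urn:lsid:example.org:pubs:678")]

def aggregate(*sources):
    """Merge triples from several sources, grouped by subject GUID."""
    merged = defaultdict(set)
    for source in sources:
        for subject, predicate, obj in source:
            merged[subject].add((predicate, obj))
    return merged

merged = aggregate(provider_a, provider_b)
for predicate, obj in sorted(merged[GUID]):
    print(predicate, obj)
```

With locally defined primary keys, the same merge would first require
working out which row in one database corresponds to which row in the
other.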

RDBMS and triple stores
-------------------------------

Regarding triple stores, most of the ones that I'm aware of are actually
built on top of an RDBMS. For example, I use 3store, which uses MySQL as
the backend.

Would we be first in line?
----------------------------------
Well, I've been serving RDF for taxonomic names since mid-2005, uBio  
are serving their own LSIDs, and DiGIR is moving that way. So, there  
are functioning examples to play with, and people making a start in  
this area.

Does bioinformatics use it?
------------------------------------

The biggest use I've seen is the port of UniProt to RDF
(http://expasy3.isb-sib.ch/~ejain//rdf/) by Eric Jain. The EBI has some
interest in RDF via the BioMoby project. Another project is BioPax (see
commentary at Nodalpoint -
http://www.nodalpoint.org/2006/03/12/integrating_biopax_compliant_pathway_data).

GML/SVG not RDFS
--------------------------
(no idea)

Importing RDF
-------------------
Well, this is sort of what triple stores do. Given that RDF is composed
of triples, the trivial mapping to a relational database is to store the
triples in one table with three columns (subject, predicate, object). I
guess for most people the issue is getting data out of an RDBMS into
RDF, not the other way around.
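
The one-table mapping can be sketched in a few lines of Python using
the standard sqlite3 module. This is only an illustration of the idea,
not the schema that 3store or any other triple store actually uses; the
table name, the LSIDs, the predicates, and the helper describe are made
up.

```python
import sqlite3

# In-memory database with the trivial three-column triple table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")

# Invented sample triples.
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("urn:lsid:example.org:names:1", "rdf:type", "ex:TaxonName"),
        ("urn:lsid:example.org:names:1", "dc:title", "Aus bus"),
        ("urn:lsid:example.org:names:2", "rdf:type", "ex:TaxonName"),
    ],
)

def describe(subject):
    """Return every (predicate, object) pair stored for one subject."""
    rows = conn.execute(
        "SELECT predicate, object FROM triples "
        "WHERE subject = ? ORDER BY predicate",
        (subject,),
    )
    return rows.fetchall()

print(describe("urn:lsid:example.org:names:1"))
```

Everything ends up in one table, so queries that follow several links
become self-joins on that table.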

Tools
-------

As one who positively hates the diagrams XMLSpy produces, I'm biased,
but when I played with Altova SemanticWorks I was distinctly
underwhelmed.

Hope this helps,

Regards

Rod


On 24 Mar 2006, at 17:36, Gregor Hagedorn wrote:

> Hi all,
>
> RDF to me appears on a level of abstraction making it very hard for me
> to follow the documentation and discussion. Most of the examples are
> embedded in artificial intelligence / reasoning use cases that I have
> no experience with.
>
> I am a biologist and I feel comfortable with UML, ER-modeling,
> xml-schema-modeling, and - surprise - relational databases. I believe
> many others are as well - how many datastores are actually built upon
> RDBMS technology?
>
> To me xml-schema maps nicely to both UML-like OO-modeling and
> Relational DBMS. I can guess about the advantages of opening this all
> up and seeing the world as a huge set of unstructured statement tuples.
> But it also scares me.
>
> Angst is a bad advisor. But then if only a minority of the current few
> people involved can follow on the RDF abstraction level. A few
> questions I have:
>
> * Would we be first in line to try rdf for such complex models as
> biodiversity informatics?
>
> * Do Genbank/EMBL with their hundreds of employees and programmers use
> rdf? Internally/externally? Molecular bioinformatics is probably 1000
> times larger than our biodiversity informatics.
>
> * Why are GML, SVG etc. based on xml schema and not RDFS? Is this just
> historical?
>
> * Are there any tools around that let me import RDF into a relational
> database (simple tools for xml-schema-based import/export are almost
> standard part of databases now, or you can use comfortable graphical
> tools like Altova MapForce).
>
> -- I am just trying to test some tools to help me to visualize RDFS
> productions (like Roger has sent around) on a level comparable with the
> UML-like xml-schema editors (Spy, Stylus, Oracle, etc.) I will try
> Altova SemanticWorks and Protege over the next week. The screenshots
> seem to be about AI and semantic web much more than about information
> models (those creatures where you try to simplify the world to make it
> manageable...).
>
> Gregor
> ----------------------------------------------------------
> Gregor Hagedorn (G.Hagedorn at bba.de)
> Institute for Plant Virology, Microbiology, and Biosafety
> Federal Research Center for Agriculture and Forestry (BBA)
> Königin-Luise-Str. 19           Tel: +49-30-8304-2220
> 14195 Berlin, Germany           Fax: +49-30-8304-2203
>
------------------------------------------------------------------------
Professor Roderic D. M. Page
Editor, Systematic Biology
DEEB, IBLS
Graham Kerr Building
University of Glasgow
Glasgow G12 8QP
United Kingdom

Phone:    +44 141 330 4778
Fax:      +44 141 330 2792
email:    r.page at bio.gla.ac.uk
web:      http://taxonomy.zoology.gla.ac.uk/rod/rod.html
reprints: http://taxonomy.zoology.gla.ac.uk/rod/pubs.html

Subscribe to Systematic Biology through the Society of Systematic
Biologists Website:  http://systematicbiology.org
Search for taxon names: http://darwin.zoology.gla.ac.uk/~rpage/portal/
Find out what we know about a species: http://ispecies.org
Rod's rants on phyloinformatics: http://iphylo.blogspot.com





