[tdwg-content] Why we may need both a more XML-like DarwinCore and a to be developed truly semantic one later.

Peter DeVries pete.devries at gmail.com
Mon Oct 11 21:29:16 CEST 2010


I am starting to see some reasons why there might be a need for a more
XML-type version of the DarwinCore and a separate, still undefined but truly
"semantic" version that still needs to be worked out.

Part of this is that I think most people don't understand the semantic
issues and will not for some time.

So for those who are submitting their data to GBIF, I don't know if they
have to understand all the somewhat counterintuitive semantic issues.

Here is an example from one of our earlier discussions - I realized only
later that I should have brought this up at the time.

For XML using "scientificName" makes a lot of sense.

However:

1) The current semantic web cannot have literals as subjects.

2) Also, if you represent knowledge as triples without ontology-based
inferencing, you generally need to define predicates for both "directions"
of a particular relationship.

So if we had a URI for each scientific name we would need to make both of
the following kinds of statements.

<concept> *hasScientificNameURI*      <name_uri>
<name_uri> *isScientificNameURI_Of*  <concept>

In the LOD cloud, you cannot assume that everyone will load and be able to
infer against your specific vocabulary.

This is, in part, the result of vocabularies not always playing well
together.

Also, it is not clear to what extent inferencing will work well on really
large data sets or for particular projects.

Eventually this will get figured out, but for now we need predicates that
can be used to represent both directions. (most of the time)

Without them you can only query in one direction.

This was not completely obvious to me until I had a test set and realized
why I was not able to query it in the ways I wanted.
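
A minimal sketch of the problem, in Python rather than a real triple store.
It assumes a consumer that can only look up statements by their subject and
does no inferencing. The class SubjectIndexedStore and the placeholder
resources "<concept>" and "<name_uri>" are hypothetical, made up for
illustration only.

```python
from collections import defaultdict

# Predicate names taken from the example above; the resources are placeholders.
HAS_NAME_URI = "hasScientificNameURI"
IS_NAME_URI_OF = "isScientificNameURI_Of"


class SubjectIndexedStore:
    """Toy triple store that can only look up statements by subject."""

    def __init__(self):
        self._by_subject = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].append((predicate, obj))

    def describe(self, subject):
        """Return the (predicate, object) pairs stated about `subject`."""
        return list(self._by_subject.get(subject, []))


concept = "<concept>"
name_uri = "<name_uri>"

# Only the forward statement is asserted.
one_way = SubjectIndexedStore()
one_way.add(concept, HAS_NAME_URI, name_uri)

# Both directions are asserted explicitly.
both_ways = SubjectIndexedStore()
both_ways.add(concept, HAS_NAME_URI, name_uri)
both_ways.add(name_uri, IS_NAME_URI_OF, concept)

print(one_way.describe(name_uri))    # nothing is stated about the name
print(both_ways.describe(name_uri))  # the concept is reachable from the name
```

Starting from the name, the first store has nothing to say, while the second
one leads back to the concept without any inferencing.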

So for the more XML-centric DarwinCore there is no problem with using
*scientificName*.

For the fully semantic web version, the following pattern might be easiest
to clearly interpret.

hasScientificName "Puma concolor (Linnaeus 1771)"
hasScientificNameURI <http://gni.globalnames.org/name_strings/6c3dc35f-d901-5cc5-b9c8-ad241069b9f8>

This makes it clear that the first form is to be used for a literal while
the second form is to be used for a conventionally resolvable URI.
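
The literal/URI split can be sketched in Python as below. This is only an
illustration, not a proposed implementation: the URI, Literal and Triple
classes, the concept URI under example.org, and the predicate
"isScientificNameOf" are all hypothetical. The point is that the name string
can appear only as an object, while the name URI can also be a subject.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


Term = Union[URI, Literal]


@dataclass(frozen=True)
class Triple:
    subject: URI
    predicate: str
    obj: Term

    def __post_init__(self):
        # Mirror the restriction discussed above: no literals as subjects.
        if not isinstance(self.subject, URI):
            raise TypeError("a literal cannot be the subject of a triple")


# Hypothetical concept URI, for illustration only.
concept = URI("http://example.org/concept/puma-concolor")
name = URI(
    "http://gni.globalnames.org/name_strings/"
    "6c3dc35f-d901-5cc5-b9c8-ad241069b9f8"
)

triples = [
    Triple(concept, "hasScientificName", Literal("Puma concolor (Linnaeus 1771)")),
    Triple(concept, "hasScientificNameURI", name),
    Triple(name, "isScientificNameURI_Of", concept),
]


def literal_subject_rejected():
    """Return True if using the name string as a subject is refused."""
    try:
        Triple(Literal("Puma concolor (Linnaeus 1771)"), "isScientificNameOf", concept)
    except TypeError:
        return True
    return False
```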

Why do we need these name URIs at all?

Because some groups will need to relate name strings to each other and name
strings to concepts etc.

Hard to do when you can't use a literal as a subject. :-)

Respectfully,

- Pete
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------