[tdwg-content] Potential predicates to relate species concepts to GNI namestring URI's

Peter DeVries pete.devries at gmail.com
Tue Sep 28 17:25:25 CEST 2010


I have been working toward some consensus on a vocabulary for relating
the namestrings in the Global Names Index to some sort of species
concept.

The goal is to have a vocabulary that works with a number of different
"types" of species concept.

It was suggested to me that I just go ahead and make one.

Here is what I am proposing and hope to demonstrate at the Nomina section on
Thursday at TDWG.

One of the first things to decide is what should be the "best" form of a
scientific name string.

After looking through the GBIF info, it seems that the following form, which
has no comma between the author and year, might be the preferred form.

For example Puma concolor (Linnaeus 1771)
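
As a rough illustration of the comma-free form, here is a minimal Python
sketch that rewrites a name string into it. The function name
normalize_name_string and the regular expression are my own assumptions for
illustration; they are not part of GNI or GBIF, and real name strings
(multiple authors, "in" citations, etc.) would need a proper name parser.

```python
import re

# A comma (plus optional spaces) that sits directly before a 4-digit year.
_AUTHOR_YEAR_COMMA = re.compile(r",\s*(?=\d{4}\b)")


def normalize_name_string(name: str) -> str:
    """Drop the comma between author and year, then collapse whitespace."""
    no_comma = _AUTHOR_YEAR_COMMA.sub(" ", name)
    return " ".join(no_comma.split())


print(normalize_name_string("Puma concolor (Linnaeus, 1771)"))
print(normalize_name_string("Puma concolor (Linnaeus 1771)"))
```

Both calls should yield the same comma-free string, so variants of the
"same" name collapse onto one preferred form.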

The other various forms of the "same" name can then be related to each other
using semantic web predicates in addition to the relationship between a name
string and species concept.

Using this as a model, I thought the following triple structure might work
well to map the concept to Accepted, Synonym etc.

Since the GNI namestrings are in the form of URIs, they can serve as both
subjects and objects.

Literals cannot be used as subjects.

I use the form hasAcceptedScientificNameURI to distinguish between the URI
form and the literal form txn:hasAcceptedScientificName.

I was a little reluctant to do this within the txn namespace since it might
be something for the TDWG vocabulary.

GUID Forms
<concept> txn:hasAcceptedScientificNameURI <http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9>
<concept> txn:hasSynonymNameURI <http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51>
<concept> txn:hasBasionymNameURI <http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51>

<http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9> txn:isAcceptedScientificNameURI_Of <concept>
<http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51> txn:isSynonymNameURI_Of <concept>
<http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51> txn:isBasionymNameURI_Of <concept>
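
The inverse statements can be derived mechanically from the forward ones.
The following is only a sketch in plain Python, using tuples rather than a
real RDF library. The name txn:isSynonymNameURI_Of is an assumption made by
analogy with the other two inverse predicates, and the INVERSE table and
invert function are hypothetical helpers.

```python
GNI = "http://gni.globalnames.org/name_strings/"

# Forward predicate -> inverse predicate. The synonym inverse is an assumed
# name, formed by analogy with the accepted-name and basionym predicates.
INVERSE = {
    "txn:hasAcceptedScientificNameURI": "txn:isAcceptedScientificNameURI_Of",
    "txn:hasSynonymNameURI": "txn:isSynonymNameURI_Of",
    "txn:hasBasionymNameURI": "txn:isBasionymNameURI_Of",
}


def invert(triples):
    """Swap subject and object for every triple whose predicate has an inverse."""
    return [(o, INVERSE[p], s) for (s, p, o) in triples if p in INVERSE]


forward = [
    ("<concept>", "txn:hasAcceptedScientificNameURI",
     GNI + "772d5162-f5aa-596c-98e0-a1c6c5a29bb9"),
    ("<concept>", "txn:hasSynonymNameURI",
     GNI + "35da7f30-25ff-5111-ab29-1a4f9988ef51"),
    ("<concept>", "txn:hasBasionymNameURI",
     GNI + "35da7f30-25ff-5111-ab29-1a4f9988ef51"),
]

inverse = invert(forward)
for name_uri, predicate, concept in inverse:
    print(f"<{name_uri}> {predicate} {concept}")
```

Generating the inverses this way keeps the two directions from drifting
apart when a triple is edited by hand.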

For human readers: the RDF representation of a GNI namestring can be seen
by adding ".rdf" to the end of its URI.
Semantic Web applications already understand the distinction between the
HTML and RDF representations.

For example
http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9.rdf

It is also important to discuss and make a decision about the Domain and
Range for each of these predicates.

txn:hasAcceptedScientificNameURI

Domain - Some sort of TaxonConcept but may be best to leave the Domain
unspecified except that it should be a URI.

Range - Some sort of Namestring represented in RDF.

Within a typical browser, the RDF form of the URI

It might make some sense to have a range that follows some standard RDF
representation, but they do not necessarily have to be namestrings in the
GNI.

I was thinking that those RDF representations that follow a specific
structure could be seen as instances of the class "NameString"

If these are namestrings provided only by the GNI, then they might be
modeled as instances of the class "GNI_NameString"

==========
Literal Forms
txn:hasAcceptedScientificName "Ochlerotatus triseriatus (Say 1823)"
txn:hasAcceptedScientificName "Aedes triseriatus (Say 1823)"
txn:hasBasionymName "Culex triseriatus Say 1823"

GUID Forms

<concept> txn:hasAcceptedScientificNameURI <http://gni.globalnames.org/name_strings/3f418cbf-358a-5174-ade1-25ae846a219e>
<concept> txn:hasAcceptedScientificNameURI <http://gni.globalnames.org/name_strings/e9f5e126-5cdb-52bf-9050-349df4a9dd18>
<concept> txn:hasBasionymNameURI <http://gni.globalnames.org/name_strings/6ce63b2c-58be-5335-826b-622b7b4bc328>

<http://gni.globalnames.org/name_strings/3f418cbf-358a-5174-ade1-25ae846a219e> txn:isAcceptedScientificNameURI_Of <concept>
<http://gni.globalnames.org/name_strings/e9f5e126-5cdb-52bf-9050-349df4a9dd18> txn:isAcceptedScientificNameURI_Of <concept>
<http://gni.globalnames.org/name_strings/6ce63b2c-58be-5335-826b-622b7b4bc328> txn:isBasionymNameURI_Of <concept>

==========

Do we also want to include information like that described below? In some
cases, each of the databases has its own version of the "Accepted Name".

This would allow one to determine which DBs have different "Accepted" names:
<concept> txn:hasCOL2010_ScientificNameURI <>
<concept> txn:hasNCBI_ScientificNameURI <>
<concept> txn:hasITIS_ScientificNameURI <>
<concept> txn:hasGBIF_ScientificNameURI <>
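
To show how such per-database predicates could expose disagreement, here is
a small Python sketch that groups the source predicates by the name URI they
point at. The urn:example:... values are placeholders only, since the
objects above are left empty, and group_sources_by_name is a hypothetical
helper, not part of any existing tool.

```python
from collections import defaultdict


def group_sources_by_name(source_names):
    """Map each name URI to the sorted list of source predicates pointing at it."""
    groups = defaultdict(list)
    for predicate, name_uri in source_names.items():
        groups[name_uri].append(predicate)
    return {uri: sorted(preds) for uri, preds in groups.items()}


# Placeholder objects: these are NOT real GNI URIs or real database contents.
example = {
    "txn:hasCOL2010_ScientificNameURI": "urn:example:name-a",
    "txn:hasNCBI_ScientificNameURI": "urn:example:name-a",
    "txn:hasITIS_ScientificNameURI": "urn:example:name-b",
    "txn:hasGBIF_ScientificNameURI": "urn:example:name-a",
}

groups = group_sources_by_name(example)
sources_disagree = len(groups) > 1  # more than one distinct name URI

for name_uri, predicates in groups.items():
    print(name_uri, predicates)
print("sources disagree:", sources_disagree)
```

More than one group for a concept would mean that at least two databases
use different "Accepted" names for it.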

Tracking the Catalog of Life LSID's over time. Since there is a new COL LSID
each year, you will need to create some mapping between these.
<concept> txn:hasCol2009LSID <urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2009>
<concept> txn:hasCol2010LSID <urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010>
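
One possible way to build that mapping is to split each LSID into its
colon-separated parts and group by the object identifier. This Python sketch
assumes that the object part stays the same across annual editions and only
the trailing revision (ac2009, ac2010) changes, as in the two examples above.
I have not checked that this holds for every COL record. The helper names
parse_lsid and group_by_object are hypothetical.

```python
from collections import defaultdict


def parse_lsid(lsid: str) -> dict:
    """Split urn:lsid:<authority>:<namespace>:<object>[:<revision>] into parts."""
    parts = lsid.split(":")
    if (len(parts) not in (5, 6)
            or parts[0].lower() != "urn"
            or parts[1].lower() != "lsid"):
        raise ValueError(f"not an LSID: {lsid!r}")
    return {
        "authority": parts[2],
        "namespace": parts[3],
        "object": parts[4],
        "revision": parts[5] if len(parts) == 6 else None,
    }


def group_by_object(lsids):
    """Group LSIDs that share an object id, keyed by their revision."""
    groups = defaultdict(dict)
    for lsid in lsids:
        parsed = parse_lsid(lsid)
        groups[parsed["object"]][parsed["revision"]] = lsid
    return dict(groups)


lsids = [
    "urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2009",
    "urn:lsid:catalogueoflife.org:taxon:24e7d624-60a7-102d-be47-00304854f810:ac2010",
]

mapping = group_by_object(lsids)
for object_id, revisions in mapping.items():
    print(object_id, sorted(revisions))
```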

These LSIDs will work within Virtuoso only because that quadstore has
extensions that allow them. LSIDs themselves are not really true LOD
identifiers.

Most semantic web tools do not know how to deal with them.

The relationships between these "properly formed" Accepted Names, Synonyms,
Basionyms can then be represented with additional predicates depending on
the relationship between the concept and the namestring.

I have chosen not to emphasize what kind of taxon concept we are using
since these should be usable with many "kinds" of taxon concepts.

The advantage of representing these namestrings as URI's is that once loaded
into a triplestore / quadstore one will be able to see and query what
namestrings are related to a specific concept and what namestrings are
related to various concepts.
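
To illustrate the two kinds of lookup, here is a plain-Python stand-in for a
triplestore that indexes the example triples in both directions. A real
store would answer these questions with SPARQL; the build_indexes helper is
hypothetical and only meant to show the idea.

```python
from collections import defaultdict

GNI = "http://gni.globalnames.org/name_strings/"

triples = [
    ("<concept>", "txn:hasAcceptedScientificNameURI",
     GNI + "3f418cbf-358a-5174-ade1-25ae846a219e"),
    ("<concept>", "txn:hasAcceptedScientificNameURI",
     GNI + "e9f5e126-5cdb-52bf-9050-349df4a9dd18"),
    ("<concept>", "txn:hasBasionymNameURI",
     GNI + "6ce63b2c-58be-5335-826b-622b7b4bc328"),
]


def build_indexes(triples):
    """Index triples by concept (subject) and by namestring URI (object)."""
    names_by_concept = defaultdict(set)
    concepts_by_name = defaultdict(set)
    for concept, predicate, name_uri in triples:
        names_by_concept[concept].add((predicate, name_uri))
        concepts_by_name[name_uri].add((predicate, concept))
    return names_by_concept, concepts_by_name


names_by_concept, concepts_by_name = build_indexes(triples)

# Which namestrings are related to this concept, and how?
for predicate, name_uri in sorted(names_by_concept["<concept>"]):
    print(predicate, name_uri)

# Which concepts use this namestring, and in what role?
basionym_uri = GNI + "6ce63b2c-58be-5335-826b-622b7b4bc328"
print(sorted(concepts_by_name[basionym_uri]))
```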

Unless there is some disagreement, I plan to modify my ontology and upload
the example triples to the cloud so the namestring relationships are
viewable in Sindice etc. by the time of the Nomina meeting on Thursday.

Respectfully,

- Pete

-- 
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base <http://www.taxonconcept.org/> / GeoSpecies
Knowledge Base <http://lod.geospecies.org/>
About the GeoSpecies Knowledge Base <http://about.geospecies.org/>
------------------------------------------------------------