Hi Markus,

I expect them to copy and paste the URI in, or import the list as a field in their database.

Current Name   Mapped to Concept
Puma concolor   se:v6n7p

That way, if the name changes in the future without a change in the concept:

Eupuma concolor se:v6n7p

the data stays linked.
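
A minimal sketch of the idea in Python, assuming each occurrence record simply stores the concept URI in a field next to the name string. The record layout and the function name are made up for illustration and are not part of any existing tool:

```python
# Hypothetical occurrence records: the concept URI sits in its own field,
# next to whatever name string the data supplier used.
records = [
    {"id": 1, "name": "Puma concolor", "concept": "se:v6n7p"},
    {"id": 2, "name": "Eupuma concolor", "concept": "se:v6n7p"},
    {"id": 3, "name": "Urosalpinx cinerea", "concept": "se:Zom2X"},
]


def records_for_concept(records, concept):
    """Return every record mapped to the concept, whatever name it carries."""
    return [r for r in records if r["concept"] == concept]


linked = records_for_concept(records, "se:v6n7p")
print([r["name"] for r in linked])
```

Records 1 and 2 come back together even though their name strings differ, because the lookup never touches the name.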

Eventually se:v6n7p will be linked to the name variants in the GNI (it is to some extent already).

We have not done this yet except for the TDWG meeting example (http://lod.taxonconcept.org/tdwg_nomina.rdf), but eventually these will have something like:

se:v6n7p hasAcceptedScientificNameURI <http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9>
se:v6n7p hasSynonymNameURI <http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51>
se:v6n7p hasBasionymNameURI <http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51>

etc.

And the following relations can be made from these to Rich's "Name Uses"

<se:v6n7p> hasRelatedGNUB <GNUB ID>  - basically ties the concept to a particular name use without stating the nature of the relationship.

Something like this would show the nature of the relationships (the literals below would be replaced with resolvable GNUB IDs):

<se:v6n7p> equivalentGNUB  <"Puma concolor (Linnaeus 1771) sec. Brown" >
<se:v6n7p> narrowerGNUB    <"Puma concolor (Linnaeus 1771)" >

* Probably one of Rich Pyle's fish might make a better example.

In a sense, you are not only linking your occurrence record to a particular concept, you are also linking it to other data in a number of related data sets.

Since each data set, in fact each triple, can be assigned to its own graph, you can choose to ignore triples or data sets that you don't think are appropriate.
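
As a rough sketch of what that choice looks like, here is the same idea in plain Python, with each statement carrying the name of the graph it was assigned to. The graph names ("graph:names", "graph:name_uses") and the helper function are assumptions for illustration only; on a real endpoint you would restrict the query to particular named graphs instead:

```python
from collections import namedtuple

# A triple plus the graph it was assigned to.
Quad = namedtuple("Quad", "subject predicate object graph")

# Graph names are hypothetical; "GNUB ID" is the same placeholder used above.
quads = [
    Quad("se:v6n7p", "hasAcceptedScientificNameURI",
         "http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9",
         "graph:names"),
    Quad("se:v6n7p", "hasSynonymNameURI",
         "http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51",
         "graph:names"),
    Quad("se:v6n7p", "hasRelatedGNUB", "GNUB ID", "graph:name_uses"),
]


def keep_trusted(quads, trusted_graphs):
    """Keep only the statements that live in a graph the user accepts."""
    return [q for q in quads if q.graph in trusted_graphs]


trusted = keep_trusted(quads, {"graph:names"})
for q in trusted:
    print(q.predicate, q.object)
```

A user who does not want the name-use links just leaves that graph out of the trusted set; nothing has to be deleted from the data itself.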

I have a limited set of related data on my endpoint for this species, with the various data sets assigned to different graphs.

For example the species Puma concolor http://lod.taxonconcept.org/ses/v6n7p.html

Browsable Endpoint View http://lsd.taxonconcept.org/describe/?url=http://lod.taxonconcept.org/ses/v6n7p%23Species  bit.ly http://bit.ly/gMFqR1
Similar to what is in the Sig.ma example http://sig.ma/search?pid=f986676a3b575b165e1498fd25927eaf

The SPARQL endpoints are described here: http://www.taxonconcept.org/sparql-endpoint/

This is in contrast to all the info in the LOD about Puma concolor via Sig.ma

http://sig.ma/search?pid=a8365fad23aa42aff4c8f3078f30028c

I guess one of the differences is that you are giving the data suppliers a choice as to which species concept, if any, they think is most appropriate for their data, vs. having an algorithm assign one for them. A side effect is that their data would then be linked to other data and more easily findable by potential users.

Respectfully,

- Pete






I guess my point is that higher classifications also vary, so you need to have some smarts behind the disambiguation.

This also allows groups to link their data even between potentially contentious names like Aedes triseriatus ?= Ochlerotatus triseriatus
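
To make the first point concrete, here is a small Python sketch that compares a few of the Mytilus edulis classifications quoted further down in this thread (Wikipedia, ITIS and EUNIS, trimmed to four ranks). The function name and the dictionary layout are my own and purely illustrative:

```python
# Ranks copied from the Mytilus edulis classifications quoted below.
classifications = {
    "Wikipedia": {"Class": "Bivalvia", "Subclass": "Heterodonta",
                  "Order": "Mytiloida", "Family": "Mytilidae"},
    "ITIS": {"Class": "Bivalvia", "Subclass": "Pteriomorphia",
             "Order": "Mytiloida", "Family": "Mytilidae"},
    "EUNIS": {"Class": "Bivalvia", "Order": "Anisomyaria",
              "Family": "Mytilidae"},
}


def disagreements(classifications):
    """Return, per rank, the values by source when the sources do not agree."""
    ranks = sorted({rank for c in classifications.values() for rank in c})
    result = {}
    for rank in ranks:
        # Only sources that state this rank take part in the comparison.
        values = {src: c[rank] for src, c in classifications.items() if rank in c}
        if len(set(values.values())) > 1:
            result[rank] = values
    return result


diff = disagreements(classifications)
for rank, values in diff.items():
    print(rank, values)
```

Even for this one well-known species the sources disagree at the Subclass and Order ranks, so a matcher that leans on the higher classification has to tolerate that kind of variation.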

On Thu, Nov 25, 2010 at 7:42 AM, "Markus Döring (GBIF)" <mdoering@gbif.org> wrote:
Peter,
if people misspell scientific names and get authorships wrong, would you expect them to use URLs more carefully?

Whatever terms we define, there will always be wrong data to deal with. The best guess we currently have for understanding the meaning of weird, often wrongly spelled names in occurrence records is to take the higher classification into the picture. Of course there are also synonyms, homonyms, misspellings and some names with authorships but most without.

Markus


PS: if anyone is interested in a rich variety of names we maintain a test file of scientific names for our parser here:
http://gbif-ecat.googlecode.com/svn/trunk/ecat-common/src/test/resources/scientific_names.txt
All names have been spotted in the wild...


On Nov 25, 2010, at 14:21, Peter DeVries wrote:

> This is in response to David's example here: http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
>
> It demonstrates a number of issues, the main one being it is not clear what the intent of the GBIF submitter was.
>
> Mytilus edulis Linnaeus 1758  <= Appears to be the correct name
> Mytilus edulis Linné 1758       <= Appears to be a lexical variant of the correct name
> Mytilus edulis d' Orbigny        <= Is this an error, a different description or a different species?
> Mytilus edulis Pennant           <= Is this an error, a different description or a different species?
> Mytilus edulis                        <= Is this an omission, an error, a different description or a different species?
>
>
> The intent of the GBIF contributor is in some cases not clear from the list above.
>
> It is unclear what species they actually mean, and it is unclear whether they intend to entail a specific classification or not.
>
> Was it clear that they thought Mytilus edulis Linnaeus 1758 = Mytilus edulis Linné 1758 or not?
>
> How do you determine their intent from the simple name string?
>
> I think the intent would be clearer if the submitter included a species concept URI with their submission.
>
> Such a URI would show that a species has multiple, often dynamic, classifications, some of which are linkable via semantic web URIs.
>
> For example:
>
> Wikipedia (http://en.wikipedia.org/wiki/Blue_mussel) => http://dbpedia.org/resource/Blue_mussel
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Subclass Heterodonta
> Order Mytiloida
> Family Mytilidae
> Subfamily Mytilinae
> Genus Mytilus
> Species Mytilus edulis
>
> ITIS (79454)
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Subclass Pteriomorphia
> Order Mytiloida
> Family Mytilidae
> Genus Mytilus
> Species Mytilus edulis
>
> WORMS (140480)
> Biota
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Subclass Pteriomorphia
> Order Mytiloida
> Family Mytilidae
> Genus Mytilus
> Species Mytilus edulis
>
> CoL (6979242)
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Order Mytiloida
> Family Mytilidae
> Genus Mytilus
> Species Mytilus edulis
>
> NCBI (6550) => http://purl.uniprot.org/taxonomy/6550
> cellular organisms
> Eukaryota
> Fungi/Metazoa group
> Metazoa
> Eumetazoa
> Bilateria
> Coelomata
> Protostomia
> Mollusca
> Bivalvia
> Pteriomorphia
> Mytiloida
> Mytiloidea
> Mytilidae
> Mytilinae
> Mytilus
> Mytilus edulis
>
> EUNIS
> Kingdom Animalia
> Phylum MOLLUSCA
> Class Bivalvia
> Order Anisomyaria
> Family Mytilidae
> Genus Mytilus
> Mytilus edulis Linné, 1758
>
>
> The example below is not the same species as above but it shows how the intent might be made clearer.
>
> Urosalpinx cinerea (Say 1822) => "by this I mean the same species concept as se:Zom2X" => http://lod.taxonconcept.org/ses/Zom2X.html
>
> * This would be better if the concept description had more data like photos, type specimens etc., but it does link to them and DNA barcodes.
>   The next version will also link to GBIF => http://data.gbif.org/species/13750666/
>
> HTML  http://lod.taxonconcept.org/ses/Zom2X.html
> RDF    http://lod.taxonconcept.org/ses/Zom2X.rdf
> TXNburner* (http://lsd.taxonconcept.org/describe/?url=http%3A%2F%2Flod.taxonconcept.org%2Fses%2FZom2X%23Species)
> Sindice      http://sig.ma/search?pid=7a8281129fa5ae55af919c7b90e97544
>
> * If this did not go through your email correctly you can get to it from the  http://lod.taxonconcept.org/ses/Zom2X.html or http://bit.ly/hyTnXf
>
> These are not perfect but they already make it more clear what data sets relate to the same species and what data sets relate to different species.
> By linking to one of these, you are accepting that a species can have multiple classifications and that there can be multiple names for your intended concept.
>
> Respectfully,
>
> - Pete
> ---------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
> _______________________________________________
> tdwg-content mailing list
> tdwg-content@lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content



