Hi Markus,
Peter,
if people misspell scientific names and get authorships wrong, would you expect them to use URLs more carefully?
Whatever terms we define, there will always be wrong data to deal with. The best approach we currently have for interpreting odd, often misspelled names in occurrence records is to take the higher classification into account. Of course there are also synonyms, homonyms and misspellings to handle, and some names come with authorships but most come without.
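
To illustrate the idea of using the higher classification, here is a minimal sketch in Python. It is not the GBIF implementation; the function name, the record layout and the scoring rule (count shared higher taxa, refuse to guess on a tie) are assumptions made for illustration only. It uses the genus name Oenanthe, which is a homonym shared by a bird genus and a plant genus.

```python
# Hypothetical candidate list: the same name string in two different groups.
CANDIDATES = [
    {"id": "bird", "name": "Oenanthe", "higher": ["Animalia", "Aves", "Passeriformes"]},
    {"id": "plant", "name": "Oenanthe", "higher": ["Plantae", "Apiales", "Apiaceae"]},
]


def pick_by_classification(candidates, record_higher_taxa):
    """Return the candidate that shares the most higher taxa with the record.

    Returns None when nothing overlaps or when the best score is tied,
    because in both cases the classification does not settle the question.
    """
    wanted = {taxon.lower() for taxon in record_higher_taxa}
    scored = sorted(
        (
            (len(wanted & {taxon.lower() for taxon in cand["higher"]}), cand)
            for cand in candidates
        ),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if not scored or scored[0][0] == 0:
        return None
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None
    return scored[0][1]


if __name__ == "__main__":
    # An occurrence record that says "Oenanthe" and also gives a family.
    match = pick_by_classification(CANDIDATES, ["Plantae", "Apiaceae"])
    print(match["id"] if match else "undecided")
```

A record without any usable higher taxa stays undecided in this sketch, which is the case where the name string alone is not enough.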
Markus
PS: if anyone is interested in a rich variety of names we maintain a test file of scientific names for our parser here:
http://gbif-ecat.googlecode.com/svn/trunk/ecat-common/src/test/resources/scientific_names.txt
All names have been spotted in the wild...
On Nov 25, 2010, at 14:21, Peter DeVries wrote:
> This is in response to David's example here; http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
>
> It demonstrates a number of issues, the main one being that it is not clear what the intent of the GBIF submitter was.
>
> Mytilus edulis Linnaeus 1758 <= Appears to be the correct name
> Mytilus edulis Linné 1758 <= Appears to be a lexical variant of the correct name
> Mytilus edulis d' Orbigny <= Is this an error, a different description or a different species?
> Mytilus edulis Pennant <= Is this an error, a different description or a different species?
> Mytilus edulis <= Is this an omission, an error, a different description or a different species?
>
>
> The intent of the GBIF contributor is in some cases not clear from the list above.
>
> It is unclear what species they actually mean, and it is unclear whether they intend to entail a specific classification or not.
>
> Was it clear that they thought Mytilus edulis Linnaeus 1758 = Mytilus edulis Linné 1758 or not?
>
> How do you determine their intent from the simple name string?
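
A minimal sketch of why the name string alone does not settle this, using the five Mytilus edulis strings listed above. The regular expression and the function names are assumptions for illustration; this is a deliberately naive split, not the GBIF name parser, and it only handles simple binomials.

```python
import re
from collections import defaultdict

# The five variants quoted in this message.
NAMES = [
    "Mytilus edulis Linnaeus 1758",
    "Mytilus edulis Linné 1758",
    "Mytilus edulis d' Orbigny",
    "Mytilus edulis Pennant",
    "Mytilus edulis",
]

# Naive pattern: capitalised genus, lower-case epithet, rest is authorship.
BINOMIAL = re.compile(r"^\s*([A-Z][a-z]+)\s+([a-z-]+)\s*(.*)$")


def split_name(name):
    """Split a simple binomial into (canonical name, authorship string)."""
    match = BINOMIAL.match(name)
    if match is None:
        return None
    genus, epithet, authorship = match.groups()
    return f"{genus} {epithet}", authorship.strip()


def group_by_canonical(names):
    """Map each canonical name to the set of authorship strings seen with it."""
    groups = defaultdict(set)
    for name in names:
        parsed = split_name(name)
        if parsed is not None:
            canonical, authorship = parsed
            groups[canonical].add(authorship or "(no authorship)")
    return dict(groups)


if __name__ == "__main__":
    for canonical, authorships in group_by_canonical(NAMES).items():
        print(canonical, sorted(authorships))
```

All five strings collapse to one canonical name with five different authorship values. The code can report that they differ, but it cannot say whether a difference is a spelling variant, an error or a different concept.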
>
> I think the intent would be clearer if the submitter included a species concept URI with their submission.
>
> A URI that showed that a species has multiple, often dynamic, classifications, some of which are linkable via semantic web URIs.
>
> For example:
>
> Wikipedia (http://en.wikipedia.org/wiki/Blue_mussel) => http://dbpedia.org/resource/Blue_mussel
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Subclass Heterodonta
> Order Mytiloida
> Family Mytilidae
> Subfamily Mytilinae
> Genus Mytilus
> Species Mytilus edulis
>
> ITIS (79454)
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Subclass Pteriomorphia
> Order Mytiloida
> Family Mytilidae
> Genus Mytilus
> Species Mytilus edulis
>
> WORMS (140480)
> Biota
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Subclass Pteriomorphia
> Order Mytiloida
> Family Mytilidae
> Genus Mytilus
> Species Mytilus edulis
>
> CoL (6979242)
> Kingdom Animalia
> Phylum Mollusca
> Class Bivalvia
> Order Mytiloida
> Family Mytilidae
> Genus Mytilus
> Species Mytilus edulis
>
> NCBI (6550) => http://purl.uniprot.org/taxonomy/6550
> cellular organisms
> Eukaryota
> Fungi/Metazoa group
> Metazoa
> Eumetazoa
> Bilateria
> Coelomata
> Protostomia
> Mollusca
> Bivalvia
> Pteriomorphia
> Mytiloida
> Mytiloidea
> Mytilidae
> Mytilinae
> Mytilus
> Mytilus edulis
>
> EUNIS
> Kingdom Animalia
> Phylum MOLLUSCA
> Class Bivalvia
> Order Anisomyaria
> Family Mytilidae
> Genus Mytilus
> Mytilus edulis Linné, 1758
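
The differences between the ranked classifications listed above can be found mechanically. The sketch below is one way to do it, with the rank values copied from this message; the dictionary layout and the function name are assumptions for illustration. The NCBI lineage is left out because its ranks are not labelled in the listing, and taxon names are compared case-insensitively so that "MOLLUSCA" and "Mollusca" count as the same value.

```python
# Rank values as quoted in this message (NCBI omitted: ranks are unlabelled).
CLASSIFICATIONS = {
    "Wikipedia": {
        "kingdom": "Animalia", "phylum": "Mollusca", "class": "Bivalvia",
        "subclass": "Heterodonta", "order": "Mytiloida", "family": "Mytilidae",
        "subfamily": "Mytilinae", "genus": "Mytilus",
    },
    "ITIS": {
        "kingdom": "Animalia", "phylum": "Mollusca", "class": "Bivalvia",
        "subclass": "Pteriomorphia", "order": "Mytiloida",
        "family": "Mytilidae", "genus": "Mytilus",
    },
    "WoRMS": {
        "kingdom": "Animalia", "phylum": "Mollusca", "class": "Bivalvia",
        "subclass": "Pteriomorphia", "order": "Mytiloida",
        "family": "Mytilidae", "genus": "Mytilus",
    },
    "CoL": {
        "kingdom": "Animalia", "phylum": "Mollusca", "class": "Bivalvia",
        "order": "Mytiloida", "family": "Mytilidae", "genus": "Mytilus",
    },
    "EUNIS": {
        "kingdom": "Animalia", "phylum": "MOLLUSCA", "class": "Bivalvia",
        "order": "Anisomyaria", "family": "Mytilidae", "genus": "Mytilus",
    },
}


def conflicting_ranks(classifications):
    """Return {rank: {taxon: [sources]}} for ranks where sources disagree.

    A source that does not supply a rank is ignored for that rank.
    """
    seen = {}
    for source, ranks in classifications.items():
        for rank, taxon in ranks.items():
            seen.setdefault(rank, {}).setdefault(taxon.lower(), []).append(source)
    return {rank: by_taxon for rank, by_taxon in seen.items() if len(by_taxon) > 1}


if __name__ == "__main__":
    for rank, by_taxon in conflicting_ranks(CLASSIFICATIONS).items():
        print(rank, by_taxon)
```

On these five sources the disagreements are at subclass and order, while kingdom, phylum, class, family and genus agree wherever they are given.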
>
>
> The example below is not the same species as above but it shows how the intent might be made clearer.
>
> Urosalpinx cinerea (Say 1822) => "by this I mean the same species concept as se:Zom2X" => http://lod.taxonconcept.org/ses/Zom2X.html
>
> * This would be better if the concept description had more data such as photos, type specimens, etc., but it does link to them and to DNA barcodes.
> The next version will also link to GBIF => http://data.gbif.org/species/13750666/
>
> HTML http://lod.taxonconcept.org/ses/Zom2X.html
> RDF http://lod.taxonconcept.org/ses/Zom2X.rdf
> TXNburner* (http://lsd.taxonconcept.org/describe/?url=http%3A%2F%2Flod.taxonconcept.org%2Fses%2FZom2X%23Species)
> Sindice http://sig.ma/search?pid=7a8281129fa5ae55af919c7b90e97544
>
> * If this did not go through your email correctly you can get to it from the http://lod.taxonconcept.org/ses/Zom2X.html or http://bit.ly/hyTnXf
>
> These are not perfect, but they already make it clearer which data sets relate to the same species and which relate to different species.
> By linking to one of these, you are accepting that a species can have multiple classifications and that there can be multiple names for your intended concept.
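
As a rough sketch of what a concept URI buys a data consumer: records can be grouped by the URI rather than by the name string. The record layout, the field names and the second and third records are hypothetical; only the first name and the concept URI come from this message.

```python
from collections import defaultdict

# Concept URI from the TXNburner link above (URL-decoded).
CONCEPT = "http://lod.taxonconcept.org/ses/Zom2X#Species"

# Hypothetical submitted records; "concept_uri" is an assumed field name.
RECORDS = [
    {"name": "Urosalpinx cinerea (Say 1822)", "concept_uri": CONCEPT},
    {"name": "Urosalpinx cinerea", "concept_uri": CONCEPT},
    {"name": "Mytilus edulis", "concept_uri": None},
]


def group_by_concept(records):
    """Group names by concept URI; collect records without one separately."""
    groups = defaultdict(list)
    unresolved = []
    for record in records:
        uri = record.get("concept_uri")
        if uri:
            groups[uri].append(record["name"])
        else:
            unresolved.append(record["name"])
    return dict(groups), unresolved


if __name__ == "__main__":
    groups, unresolved = group_by_concept(RECORDS)
    print(groups)
    print("no concept URI:", unresolved)
```

Two different name strings end up under one concept here, and the record without a URI is set aside for name-based matching instead of being guessed at.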
>
> Respectfully,
>
> - Pete
> ---------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
> tdwg-content mailing list
> tdwg-content@lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content