Peter, if people misspell scientific names and get authorships wrong, would you expect them to use URLs more carefully?
Whatever terms we define, there will always be wrong data to deal with. The best approach we currently have for working out the meaning of weird, often wrongly spelled names in occurrence records is to take the higher classification into account. Of course there are also synonyms, homonyms and misspellings, and some names come with authorships but most come without.
Markus
PS: if anyone is interested in a rich variety of names we maintain a test file of scientific names for our parser here: http://gbif-ecat.googlecode.com/svn/trunk/ecat-common/src/test/resources/sci... All names have been spotted in the wild...
On Nov 25, 2010, at 14:21, Peter DeVries wrote:
This is in response to David's example here: http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
It demonstrates a number of issues, the main one being that it is not clear what the intent of the GBIF submitter was.
Mytilus edulis Linnaeus 1758 <= Appears to be the correct name
Mytilus edulis Linné 1758 <= Appears to be a lexical variant of the correct name
Mytilus edulis d' Orbigny <= Is this an error, a different description or a different species?
Mytilus edulis Pennant <= Is this an error, a different description or a different species?
Mytilus edulis <= Is this an omission, an error, a different description or a different species?
The intent of the GBIF contributor is in some cases not clear from the list above.
It is unclear which species they actually mean, and it is unclear whether they intend to entail a specific classification or not.
Was it clear that they thought Mytilus edulis Linnaeus 1758 = Mytilus edulis Linné 1758 or not?
How do you determine their intent from the simple name string?
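As a rough illustration of how far the name string alone gets you, here is a minimal Python sketch that splits each of the five strings above into a canonical name, an author and a year, and then groups them. It is not the GBIF parser. The regular expression, the function names and the one-entry AUTHOR_VARIANTS table are assumptions made up for this example.

```python
import re
import unicodedata
from collections import defaultdict

# Hypothetical lookup of author spelling variants. Only this one pair
# (Linné / Linnaeus) comes from the example list above.
AUTHOR_VARIANTS = {"linne": "linnaeus"}

# Assumes a simple binomial: capitalised genus, lower-case epithet, then
# whatever is left over (authorship, year, or nothing).
NAME_PATTERN = re.compile(r"^(?P<canonical>[A-Z][a-z]+ [a-z]+)\s*(?P<rest>.*)$")


def strip_accents(text):
    """Remove combining accents, so that 'Linné' becomes 'Linne'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_name(name_string):
    """Return (canonical name, normalised author or None, year or None)."""
    match = NAME_PATTERN.match(name_string.strip())
    if match is None:
        return None
    rest = match.group("rest").strip()
    year_match = re.search(r"\b(\d{4})\b", rest)
    year = year_match.group(1) if year_match else None
    author = re.sub(r"\b\d{4}\b", "", rest).strip(" ,()")
    author_key = strip_accents(author).lower()
    author_key = AUTHOR_VARIANTS.get(author_key, author_key)
    return match.group("canonical"), author_key or None, year


def group_by_authorship(name_strings):
    """Group the raw strings by (canonical name, normalised author)."""
    groups = defaultdict(list)
    for name_string in name_strings:
        parsed = split_name(name_string)
        key = parsed[:2] if parsed else (None, None)
        groups[key].append(name_string)
    return dict(groups)


NAMES = [
    "Mytilus edulis Linnaeus 1758",
    "Mytilus edulis Linné 1758",
    "Mytilus edulis d' Orbigny",
    "Mytilus edulis Pennant",
    "Mytilus edulis",
]

for key, members in group_by_authorship(NAMES).items():
    print(key, members)
```

The two Linnaeus spellings collapse into one group, but the d' Orbigny, Pennant and author-less strings each stay separate. Nothing in the strings says whether those three groups refer to the same species as the first, which is the question being asked here.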
I think the intent would be clearer if the submitter included a species concept URI with their submission.
That is, a URI showing that a species has multiple, often dynamic, classifications, some of which are linkable via semantic web URIs.
For example:
Wikipedia (http://en.wikipedia.org/wiki/Blue_mussel) => http://dbpedia.org/resource/Blue_mussel
  Kingdom Animalia
  Phylum Mollusca
  Class Bivalvia
  Subclass Heterodonta
  Order Mytiloida
  Family Mytilidae
  Subfamily Mytilinae
  Genus Mytilus
  Species Mytilus edulis

ITIS (79454)
  Kingdom Animalia
  Phylum Mollusca
  Class Bivalvia
  Subclass Pteriomorphia
  Order Mytiloida
  Family Mytilidae
  Genus Mytilus
  Species Mytilus edulis

WORMS (140480)
  Biota
  Kingdom Animalia
  Phylum Mollusca
  Class Bivalvia
  Subclass Pteriomorphia
  Order Mytiloida
  Family Mytilidae
  Genus Mytilus
  Species Mytilus edulis

CoL (6979242)
  Kingdom Animalia
  Phylum Mollusca
  Class Bivalvia
  Order Mytiloida
  Family Mytilidae
  Genus Mytilus
  Species Mytilus edulis

NCBI (6550) => http://purl.uniprot.org/taxonomy/6550
  cellular organisms
  Eukaryota
  Fungi/Metazoa group
  Metazoa
  Eumetazoa
  Bilateria
  Coelomata
  Protostomia
  Mollusca
  Bivalvia
  Pteriomorphia
  Mytiloida
  Mytiloidea
  Mytilidae
  Mytilinae
  Mytilus
  Mytilus edulis

EUNIS
  Kingdom Animalia
  Phylum MOLLUSCA
  Class Bivalvia
  Order Anisomyaria
  Family Mytilidae
  Genus Mytilus
  Mytilus edulis Linné, 1758
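To make the differences between these classifications easier to see, here is a small Python sketch that lists the ranks at which the sources disagree. The rank and taxon values are copied from the rank-labelled classifications above. The NCBI lineage and the WORMS "Biota" entry are left out because they carry no rank labels here. The dictionary layout and the function name are assumptions made for this example only.

```python
# Rank-labelled classifications of Mytilus edulis as quoted above.
CLASSIFICATIONS = {
    "Wikipedia": {
        "Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
        "Subclass": "Heterodonta", "Order": "Mytiloida",
        "Family": "Mytilidae", "Subfamily": "Mytilinae", "Genus": "Mytilus",
    },
    "ITIS": {
        "Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
        "Subclass": "Pteriomorphia", "Order": "Mytiloida",
        "Family": "Mytilidae", "Genus": "Mytilus",
    },
    "WORMS": {
        "Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
        "Subclass": "Pteriomorphia", "Order": "Mytiloida",
        "Family": "Mytilidae", "Genus": "Mytilus",
    },
    "CoL": {
        "Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
        "Order": "Mytiloida", "Family": "Mytilidae", "Genus": "Mytilus",
    },
    "EUNIS": {
        "Kingdom": "Animalia", "Phylum": "MOLLUSCA", "Class": "Bivalvia",
        "Order": "Anisomyaria", "Family": "Mytilidae", "Genus": "Mytilus",
    },
}


def conflicting_ranks(classifications):
    """Return {rank: {taxon: [sources]}} for ranks where sources disagree."""
    by_rank = {}
    for source, lineage in classifications.items():
        for rank, taxon in lineage.items():
            # Normalise case so that MOLLUSCA and Mollusca count as one taxon.
            taxa = by_rank.setdefault(rank, {})
            taxa.setdefault(taxon.capitalize(), []).append(source)
    # A rank is in conflict only if more than one distinct taxon was given.
    return {rank: taxa for rank, taxa in by_rank.items() if len(taxa) > 1}


for rank, taxa in conflicting_ranks(CLASSIFICATIONS).items():
    print(rank, taxa)
```

A rank that only one source supplies (Subfamily here) is not reported as a conflict, since a missing rank is not the same as a disagreement.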
The example below is not the same species as above but it shows how the intent might be made clearer.
Urosalpinx cinerea (Say 1822) => "by this I mean the same species concept as se:Zom2X" => http://lod.taxonconcept.org/ses/Zom2X.html
- This would be better if the concept description had more data like photos, type specimens etc., but it does link to them and DNA barcodes. The next version will also link to GBIF => http://data.gbif.org/species/13750666/
HTML http://lod.taxonconcept.org/ses/Zom2X.html
RDF http://lod.taxonconcept.org/ses/Zom2X.rdf
TXNburner* (http://lsd.taxonconcept.org/describe/?url=http%3A%2F%2Flod.taxonconcept.org%...)
Sindice http://sig.ma/search?pid=7a8281129fa5ae55af919c7b90e97544
- If this did not come through your email correctly, you can get to it from http://lod.taxonconcept.org/ses/Zom2X.html or http://bit.ly/hyTnXf
These are not perfect, but they already make it clearer which data sets relate to the same species and which relate to different species. By linking to one of these, you are accepting that a species can have multiple classifications and that there can be multiple names for your intended concept.
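A minimal Python sketch of what that buys you: if each record carries a concept link next to its name string, records can be grouped by the link instead of by spelling. The field names "name" and "concept_uri" and the sample records are hypothetical and are not taken from any GBIF or Darwin Core schema. The HTML address from the example above stands in for the concept identifier.

```python
from collections import defaultdict

# The HTML address quoted above, used here as a stand-in identifier.
ZOM2X = "http://lod.taxonconcept.org/ses/Zom2X.html"

# Hypothetical submitted records; the second name is an illustrative
# author-less variant of the first.
RECORDS = [
    {"name": "Urosalpinx cinerea (Say 1822)", "concept_uri": ZOM2X},
    {"name": "Urosalpinx cinerea", "concept_uri": ZOM2X},
    {"name": "Mytilus edulis", "concept_uri": None},
]


def group_by_concept(records):
    """Split records into {concept URI: [names]} and a list of unlinked names."""
    by_concept = defaultdict(list)
    unlinked = []
    for record in records:
        uri = record.get("concept_uri")
        if uri:
            by_concept[uri].append(record["name"])
        else:
            # No stated concept: intent has to be guessed from the string.
            unlinked.append(record["name"])
    return dict(by_concept), unlinked


linked, unlinked = group_by_concept(RECORDS)
print(linked)
print(unlinked)
```

The two spellings land in the same group because the submitter said so, not because a parser guessed it. The record without a link is left where it is today.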
Respectfully,
- Pete
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base / GeoSpecies Knowledge Base
About the GeoSpecies Knowledge Base
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content