Does a species entail a specific classification or does it have many classifications.
This is in response to David's example here; http://code.google.com/p/gbif-ecat/wiki/Nom5ExampleMytilusedulis
It demonstrates a number of issues, the main one being it is not clear what the intent of the GBIF submitter was.
Mytilus edulis Linnaeus 1758 <= Appears to be the correct name
Mytilus edulis Linné 1758 <= Appears to be a lexical variant of the correct name
Mytilus edulis d' Orbigny <= Is this an error, a different description or a different species?
Mytilus edulis Pennant <= Is this an error, a different description or a different species?
Mytilus edulis <= Is this an omission, an error, a different description or a different species?
The intent of the GBIF contributor is in some cases not clear from the list above.
It is unclear what species they actually mean, and it is unclear whether they intend to entail a specific classification or not.
Was it clear that they thought *Mytilus edulis* Linnaeus 1758 = *Mytilus edulis* Linné 1758 or not?
How do you determine their intent from the simple name string?
I think the intent would be clearer if the submitter included a species concept URI with their submission.
That is, a URI that shows that a species has multiple, often dynamic, classifications, some of which are linkable via semantic web URIs.
For example:
Wikipedia (http://en.wikipedia.org/wiki/Blue_mussel) => http://dbpedia.org/resource/Blue_mussel Kingdom Animalia Phylum Mollusca Class Bivalvia Subclass Heterodonta Order Mytiloida Family Mytilidae Subfamily Mytilinae Genus Mytilus Species Mytilus edulis
ITIS (79454) Kingdom Animalia Phylum Mollusca Class Bivalvia Subclass Pteriomorphia Order Mytiloida Family Mytilidae Genus Mytilus Species Mytilus edulis
WORMS (140480) Biota Kingdom Animalia Phylum Mollusca Class Bivalvia Subclass Pteriomorphia Order Mytiloida Family Mytilidae Genus Mytilus Species Mytilus edulis
CoL (6979242) Kingdom Animalia Phylum Mollusca Class Bivalvia Order Mytiloida Family Mytilidae Genus Mytilus Species Mytilus edulis
NCBI (6550) => http://purl.uniprot.org/taxonomy/6550 cellular organisms Eukaryota Fungi/Metazoa group Metazoa Eumetazoa Bilateria Coelomata Protostomia Mollusca Bivalvia Pteriomorphia Mytiloida Mytiloidea Mytilidae Mytilinae Mytilus Mytilus edulis
EUNIS Kingdom Animalia Phylum MOLLUSCA Class Bivalvia Order Anisomyaria Family Mytilidae Genus Mytilus Mytilus edulis Linné, 1758
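To make the point about multiple classifications concrete, here is a minimal Python sketch that compares the ranked classifications listed above and reports the ranks on which the sources disagree. The dictionary layout and the function name `disagreements` are my own assumptions for illustration. The NCBI lineage is left out because it is listed without rank labels, and the unranked "Biota" entry from WORMS is omitted for the same reason.

```python
# Ranked classifications of Mytilus edulis, copied from the list above.
CLASSIFICATIONS = {
    "Wikipedia": {"Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
                  "Subclass": "Heterodonta", "Order": "Mytiloida", "Family": "Mytilidae",
                  "Subfamily": "Mytilinae", "Genus": "Mytilus"},
    "ITIS": {"Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
             "Subclass": "Pteriomorphia", "Order": "Mytiloida", "Family": "Mytilidae",
             "Genus": "Mytilus"},
    "WORMS": {"Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
              "Subclass": "Pteriomorphia", "Order": "Mytiloida", "Family": "Mytilidae",
              "Genus": "Mytilus"},
    "CoL": {"Kingdom": "Animalia", "Phylum": "Mollusca", "Class": "Bivalvia",
            "Order": "Mytiloida", "Family": "Mytilidae", "Genus": "Mytilus"},
    "EUNIS": {"Kingdom": "Animalia", "Phylum": "MOLLUSCA", "Class": "Bivalvia",
              "Order": "Anisomyaria", "Family": "Mytilidae", "Genus": "Mytilus"},
}


def disagreements(classifications):
    """Return {rank: {value: [sources]}} for ranks where sources give different values."""
    by_rank = {}
    for source, ranks in classifications.items():
        for rank, value in ranks.items():
            # Ignore letter case so "MOLLUSCA" and "Mollusca" count as the same value.
            by_rank.setdefault(rank, {}).setdefault(value.capitalize(), []).append(source)
    # A rank is contested only if at least two distinct values were reported for it.
    return {rank: values for rank, values in by_rank.items() if len(values) > 1}


for rank, values in disagreements(CLASSIFICATIONS).items():
    print(rank, values)
```

With the data as listed, only Subclass and Order are contested, so a record that carries just the name string gives no hint which of these classifications, if any, the submitter had in mind.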
The example below is not the same species as above but it shows how the intent might be made clearer.
Urosalpinx cinerea (Say 1822) => "by this I mean the same species concept as se:Zom2X" => http://lod.taxonconcept.org/ses/Zom2X.html
* This would be better if the concept description had more data like photos, type specimens, etc., but it does link to them and to DNA barcodes. The next version will also link to GBIF => http://data.gbif.org/species/13750666/
HTML http://lod.taxonconcept.org/ses/Zom2X.html
RDF http://lod.taxonconcept.org/ses/Zom2X.rdf
TXNburner* (http://lsd.taxonconcept.org/describe/?url=http%3A%2F%2Flod.taxonconcept.org%...)
Sindice http://sig.ma/search?pid=7a8281129fa5ae55af919c7b90e97544
* If this did not go through your email correctly you can get to it from the http://lod.taxonconcept.org/ses/Zom2X.html or http://bit.ly/hyTnXf
These are not perfect, but they already make it clearer which data sets relate to the same species and which relate to different species. By linking to one of these, you are accepting that a species can have multiple classifications and that there can be multiple names for your intended concept.
Respectfully,
- Pete
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base http://www.taxonconcept.org/
GeoSpecies Knowledge Base http://lod.geospecies.org/
About the GeoSpecies Knowledge Base http://about.geospecies.org/
Peter, if people misspell scientific names and get authorships wrong, would you expect them to use URLs more carefully?
Whatever terms we define, there will always be wrong data to deal with. The best guess we currently have for understanding the meaning of weird, often wrongly spelled names in occurrence records is to take the higher classification into the picture. Of course there are also synonyms, homonyms, misspellings and some names with authorships but most without.
Markus
PS: if anyone is interested in a rich variety of names we maintain a test file of scientific names for our parser here: http://gbif-ecat.googlecode.com/svn/trunk/ecat-common/src/test/resources/sci... All names have been spotted in the wild...
On Nov 25, 2010, at 14:21, Peter DeVries wrote:
[...]
Hi Markus,
I expect them to copy and paste the URI in, or import the list as a field in their database.
Current Name: Puma concolor
Mapped to Concept: se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html
That way, if the name changes in the future without a change in the concept:
Eupuma concolor se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html
the data stays linked.
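As a rough illustration of why this keeps the data linked, here is a small Python sketch. The record layout and the `group_by` helper are assumptions of mine, not part of any existing system; the two names and the concept URI are the ones used above.

```python
CONCEPT_URI = "http://lod.taxonconcept.org/ses/v6n7p.html"

# Hypothetical occurrence records: each carries the name in use when it was
# entered, plus the concept URI that the submitter pasted in.
records = [
    {"id": 1, "name": "Puma concolor", "concept": CONCEPT_URI},
    {"id": 2, "name": "Eupuma concolor", "concept": CONCEPT_URI},
]


def group_by(records, key):
    """Group record ids by the value each record has for `key`."""
    groups = {}
    for record in records:
        groups.setdefault(record[key], []).append(record["id"])
    return groups


by_name = group_by(records, "name")        # splits into two groups after the rename
by_concept = group_by(records, "concept")  # stays one group

print(by_name)
print(by_concept)
```

Grouping by name string separates the two records as soon as the name changes, while grouping by the concept URI does not.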
Eventually se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html will be linked to the name variants in the GNI (it is to some extent already).
We have not done this yet except for the TDWG meeting example (http://lod.taxonconcept.org/tdwg_nomina.rdf), but eventually these will have something like:
se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html hasAcceptedScientificNameURI <http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9>
se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html hasSynonymNameURI <http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51>
se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html hasBasionymNameURI <http://gni.globalnames.org/name_strings/35da7f30-25ff-5111-ab29-1a4f9988ef51>
etc.
And the following relations can be made from these to Rich's "Name Uses"
<se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html> hasRelatedGNUB <GNUB ID> - basically ties the concept to a particular name use without stating the nature of the relationship.
With something like this to show the nature of the relationships (the literals below would be replaced with resolvable GNUB IDs):
<se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html> equivalentGNUB <"Puma concolor (Linnaeus 1771) sec. Brown">
<se:v6n7p http://lod.taxonconcept.org/ses/v6n7p.html> narrowerGNUB <"Puma concolor (Linnaeus 1771)">
* Probably one of Rich Pyle's fish might make a better example.
In a sense you are not only linking your occurrence record to a particular concept; you are also linking it to other data in a number of other related data sets.
Since each data set, in fact each triple, can be assigned to its own graph, you can choose to ignore triples or data sets that you don't think are appropriate.
I have a limited set of related data on my endpoint for this species, with the various data sets assigned to different graphs.
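The idea of ignoring whole graphs can be sketched in a few lines of Python. This is only a toy model of a quad store, not how my endpoint is implemented; the predicate names, object URIs and graph names below are all made up for illustration.

```python
# Hypothetical quads: (subject, predicate, object, graph).
# Predicates, objects and graph names are illustrative only.
QUADS = [
    ("se:v6n7p", "ex:label", "Puma concolor", "graph:concepts"),
    ("se:v6n7p", "ex:seeAlso", "http://example.org/encyclopedia/cougar", "graph:encyclopedia"),
    ("se:v6n7p", "ex:hasOccurrence", "http://example.org/occurrence/1", "graph:occurrences"),
]


def visible_triples(quads, ignored_graphs=()):
    """Return (s, p, o) triples, skipping any graph the consumer chose to ignore."""
    ignored = set(ignored_graphs)
    return [(s, p, o) for (s, p, o, g) in quads if g not in ignored]


everything = visible_triples(QUADS)
without_encyclopedia = visible_triples(QUADS, ignored_graphs=["graph:encyclopedia"])

print(len(everything), len(without_encyclopedia))
```

A consumer who distrusts one data set drops its graph and still keeps every other statement about the same concept.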
For example the species Puma concolor http://lod.taxonconcept.org/ses/v6n7p.html
Browsable Endpoint View http://lsd.taxonconcept.org/describe/?url=http://lod.taxonconcept.org/ses/v6...
bit.ly http://bit.ly/gMFqR1
Similar to what is in the Sig.ma example http://sig.ma/search?pid=f986676a3b575b165e1498fd25927eaf
The SPARQL endpoints are described here: http://www.taxonconcept.org/sparql-endpoint/
This is in contrast to all the info in the LOD about Puma concolor via Sig.ma
http://sig.ma/search?pid=a8365fad23aa42aff4c8f3078f30028c
I guess one of the differences is that you are giving the data suppliers a choice as to what if any species concept they think is most appropriate for their data vs. having an algorithm assign one for them. A side effect is that their data would then be linked to other data and more easily findable by potential users.
Respectfully,
- Pete
I guess my point is that higher classifications also vary so you need to have some smarts behind the disambiguation.
This also allows groups to link their data even between potentially contentious names like Aedes triseriatus ?= Ochlerotatus triseriatus
On Thu, Nov 25, 2010 at 7:42 AM, "Markus Döring (GBIF)" mdoering@gbif.org wrote:
[...]
A side remark, about where I believe the whole discussion is misleading:
Puma concolor se:v6n7p
That way in the future if the name changes without a change in the concept.
Eupuma concolor se:v6n7p
The data stays linked.
This always looks nice... However, with such proposals we, the computer guys, make the concept assessment someone else's problem (i.e. the taxonomists, ecologists, pathologists, etc.), and, at the same time, do not provide them the means to communicate. The assumption is that a scientist or applied worker would know whether to add se:v6n7p to a given taxon name or not.
With my taxonomer/pathologist hat on: I mostly have no clue which concept XXX concolor is, or whether it has changed or not. Puma concolor may be a different concept than Puma concolor. We are, of course, guilty of communicating in a shamefully loose way (s.str., s. lat. etc.), which could and should be improved by citing a secundum, but beyond that: mapping concepts is a taxonomic opinion, not an objective truth.
So given that any trivial mapping mechanism can map multiple IDs (Puma concolor, Eupuma concolor) to a single concept - the proposal saves this trivial processing time, but does not contribute to the problem of communicating in a way that is suitable to assess taxon concepts.
----
Aside: Please compare the highly linked, and generally correctly linked, Wikipedias with other content management systems to see the advantage of human-legible IDs like [[Puma concolor]] over links like http://x.y.net/node/234872561. My own observation is that in the latter case only a fraction of the desirable links are created, and that these quite often go to wrong, or perhaps obsolete, places.
I therefore think: se:Puma_concolor_sec._Smith would be a much more useful mechanism than all the computer-scientists-only proposals like se:v6n7p or http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9
Gregor
What you propose works except that it is not stable to changes in nomenclature.
It is really no better than using the name.
You are still left with the question: is *Aedes triseriatus* the same as *Ochlerotatus triseriatus*?
Data linked to http://example.org/Aedes_triseriatus is lost when the name is changed to http://example.org/Ochlerotatus_triseriatus.
by citing a secundum
Yes, if that publication is generally available and contains enough information that a set of general taxonomists will assign the same *secundum* to the same set of specimens, and if everyone uses the exact same string of characters when citing that *secundum*.
the proposal saves this trivial processing time, but does not contribute to the problem
If this is so trivial then why does GBIF have one map for *Felis concolor* and one map for *Puma concolor*?
Also, why does the Barcode of Life not have all their *Aedes triseriatus* and *Ochlerotatus triseriatus* mapped to one id rather than one for each name and misspelling?
se:Puma_concolor_sec._Smith
How many lexical variants of the string above are there likely to be?
The GNI name string URIs are simply a way to create a correctly formed URI from a name.
I have the name string "Puma concolor (Linnaeus 1771)" in my database, and the GNI has the same name string.
We share a common algorithm to convert this into a standard UUID => '772d5162-f5aa-596c-98e0-a1c6c5a29bb9'
Now as long as the GNI has the same namestring I have, I can link our two databases together using the following URI.
http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9
These URIs are simply to allow different databases to link to each other.
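A minimal sketch of the idea in Python, assuming a name-based (version 5) UUID computed under a namespace that both databases agree on. The version digit of the UUID quoted above is 5, which is consistent with that kind of UUID, but the namespace and base URI below are placeholders of mine, so the output is not expected to equal the GNI value.

```python
import uuid

# Assumption: both databases derive a version 5 (SHA-1, name-based) UUID from
# the exact name string under a shared namespace. This namespace is a
# placeholder, not the one the GNI actually uses.
SHARED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")


def name_string_uuid(name_string):
    """Derive a deterministic UUID from the exact characters of a name string."""
    return uuid.uuid5(SHARED_NAMESPACE, name_string)


def name_string_uri(name_string, base="http://example.org/name_strings/"):
    """Build a linkable URI; `base` is a hypothetical stand-in for the real service."""
    return base + str(name_string_uuid(name_string))


mine = name_string_uri("Puma concolor (Linnaeus 1771)")
theirs = name_string_uri("Puma concolor (Linnaeus 1771)")
variant = name_string_uri("Puma concolor (Linnaeus, 1771)")

print(mine == theirs)   # identical strings give identical URIs
print(mine == variant)  # a single added comma gives a different URI
```

No lookup service is needed to agree on the identifier, but any lexical variant of the name string produces a different one, which is why these URIs identify name strings and not concepts.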
Names as they are structured do not make good URIs, since certain characters need to be encoded, so (Schmoe) becomes something like %28Schmoe%29 in a URL.
What you see as http://en.wikipedia.org/wiki/Cassia_(legume) is actually http://en.wikipedia.org/wiki/Cassia_%28legume%29
Also what about heterotypic synonyms?
Things like & need to be replaced with &amp; Note that if you are not careful this can become &amp;amp;amp;amp;amp; as it is re-encoded.
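Both effects are easy to reproduce with the Python standard library. This is only a demonstration of generic percent-encoding and HTML escaping, not of what any particular site does internally.

```python
import html
from urllib.parse import quote

# Percent-encoding: parentheses are not left as-is by quote().
encoded_author = quote("(Schmoe)")
encoded_title = quote("Cassia_(legume)")

# Encoding an already encoded string encodes the "%" signs again.
double_encoded = quote(encoded_author)

# HTML escaping: each extra pass re-escapes the ampersand it produced before.
text = "&"
passes = []
for _ in range(3):
    text = html.escape(text)
    passes.append(text)

print(encoded_author)
print(encoded_title)
print(double_encoded)
print(passes)
```

The usual safeguard is to escape exactly once, at the point of output, and to store the unescaped name everywhere else.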
The GNI URIs are not for human consumption, and if you view the entity above in a semantic web browser (see link below) it will display the human label in place of the URI.
Note that this is what Sig.ma has done with both the GNI URI's and the Geonames URI's.
The difference is that in this case the RDF has made it clear that we mean "*the geospatial entity that is the state of Texas*."
http://sig.ma/search?pid=013030d4a93abdd6206234b683c51b31
So these have a human label, unfortunately your current browser does not know to display it.
Also aren't TDWG URI's supposed to be opaque?
To an extent the TaxonConcept URIs are designed for humans, in that it is easier to copy and paste or type *V2ldt* or http://lod.taxonconcept.org/ses/V2ldt.html
than *Centrocercus minimus* J. R. Young, C. E. Braun, S. J. Oyler-McCance, J. R. Hupp and T. W. Quinn 2000
and then recognize that this is the same thing as
*Centrocercus minimus* J. R. Young, C. E. Braun, S. J. Oyler-McCance, J. R. Hupp and T. W. Quinn, 2000
*Centrocercus minimus* J. R. Young, C. E. Braun, S. J. Oyler-McCance, J. R. Hupp & T. W. Quinn, 2000
*Centrocercus minimus* J. R. Young, C. E. Braun, S. J. Oyler-McCance, J. R. Hupp & T. W. Quinn 2000
*Centrocercus minimus* J. R. Young, C. E. Braun, S. J. Oyler-McCance, J. R. Hupp et T. W. Quinn 2000
*Centrocercus minimus* J. R. Young, C. E. Braun, S. J. Oyler-McCance, J. R. Hupp et T. W. Quinn, 2000
*Centrocercus minimus* Young et al., 2000
It is also clear (or will be) that http://lod.taxonconcept.org/ses/V2ldt.html is about a species that is known by all those names and other synonyms through the GNI.
In contrast, it is not clear that *Centrocercus minimus* Young et al., 2000 is the same as all the other name variants listed above, or that it may have other synonyms.
This is without getting into the variations in *secundum*.
So are people going to include all the possible name variants in their data sets, or simply choose a URI that links to those variants for them?
The current species concepts are not complete in the sense that they do not make it clear as to what is and what is not an instance of a given species concept.
However, they are a start and since they contain links to both Wikipedia and Wikispecies, as well as a number of other annotative information sources, they will provide humans a simple way to determine if what they have in their records is the same "kind of thing" as what is represented by the species concept URI.
Also there is nothing that says that taxonconcept.org could not be moved to some other institution in the future.
Respectfully,
- Pete
On Thu, Nov 25, 2010 at 10:06 AM, Gregor Hagedorn g.m.hagedorn@gmail.com wrote:
[...]
If this is so trivial then why does GBIF have one map for Felis concolor and one map for Puma concolor? also why does the Barcode of life not have all their Aedes triseriatus and Ochlerotatus triseriatus mapped to one id rather than one for each name and misspelling?
I think it shows that the real problem is not URLs versus strings, the problem is the knowledge behind these strings.
se:Puma_concolor_sec._Smith
How many lexical variants of the string above are there likely to be?
This is a misunderstanding: my example is a URI, with owl:sameAs etc. behind it.
What you see as http://en.wikipedia.org/wiki/Cassia_(legume) is actually http://en.wikipedia.org/wiki/Cassia_%28legume%29
Things like & need to be replaced with &amp; Note that if you are not careful this can become &amp;amp;amp;amp;amp; as it is re-encoded.
It certainly happens that bugs appear even after 20 years of using URLs, but it is a class of highly generic bugs in any web-software. It does not appear a good argument to me.
I am not aware that people have problems linking to Wikipedia or DBpedia, which happens to use exactly these human-proofreadable URIs.
-------------
Also aren't TDWG URI's supposed to be opaque?
Why do we use dwc:scientificName instead of dwc:entity013030d4a93abdd6206234b683c51b31 ?
I am sure the semantic vocabulary management system for DarwinCore would show the proper label for the opaque URI... :-)
Basically, programmers demand human readability for their own domain, but deny it to the biodiversity domain itself...
I fully believe you and all who are doing it do it with careful consideration of the needs as they see it. I just believe that those taking these decisions have a specific perspective and use case scenarios, that involves biologists only after the perfect software user interface system is finished. I challenge the last assumption ...
Redesign tdwg vocabularies and Darwincore with opaque dwc:concept013030d4a93abdd6206234b683c51b31 URIs instead of dwc:commonName (where I really prefer the synonym vernacularName - or is it the other way round?) and prove that it works well for communication and discussion.
I believe Opaque IDs work OK if they can be systematically and unambiguously assigned. Taxon names and concepts can not, they need to be discussed and "debugged" probably over decades. Just like tdwg vocabularies -- just 6 orders of magnitude greater scope.
Gregor
On Thu, Nov 25, 2010 at 3:13 PM, Gregor Hagedorn g.m.hagedorn@gmail.com wrote:
If this is so trivial then why does GBIF have one map for Felis concolor and one map for Puma concolor? also why does the Barcode of life not have all their Aedes triseriatus and Ochlerotatus triseriatus mapped to one id rather than one for each name and misspelling?
I think it shows that the real problem is not URLs versus strings, the problem is the knowledge behind these strings.
Actually I think the problem lies with the co-mingled use of a scientific name. It is serving as both a stable identifier and as a phylogenetic hypothesis.
The concept of a "species" should have been separated from the hypothesis about where in the phylogenetic tree that concept fits.
The mosquito example above also demonstrates the conflict between those who want to use names as stable identifiers and those who want them to reflect the latest phylogenetic thinking.
The mosquito community seems split on this, so I would say that there is either no accepted name or two accepted names.
It certainly happens that bugs appear even after 20 years of using URLs, but it is a class of highly generic bugs in any web-software. It does not appear a good argument to me.
Yes, it would have been better if things like & or () did not need to be encoded for use in URLs, but as a community we could simply decide to use some other characters, at least for URLs, if we wanted to use full names in them.
I am not aware that people have problems linking to Wikipedia or DBpedia, which happens to use exactly these human-proofreadable URIs.
The URLs for different Wikipedia and DBpedia entries change all the time.
I have had problems: the URLs for several mosquitoes have changed, along with the URL for Carl Linnaeus, since I started this.
Also aren't TDWG URI's supposed to be opaque?
Why do we use dwc:scientificName instead of dwc:entity013030d4a93abdd6206234b683c51b31 ?
I think there are LSID'ers who are doing something like that right now.
I am sure the semantic vocabulary management system for DarwinCore would show the proper label for the opaque URI... :-)
This is a feature of the TripleStore (actually a Quadstore) you just have to tweak the settings to get it to work.
Basically, programmers demand human readability for their own domain, but deny it to the biodiversity domain itself...
I think you might be mischaracterizing me. I started down this road while trying to solve biological problems.
I argued years ago on TDWG for some identifier for the species concept as opposed to an identifier for a name.
When that failed I started creating my own.
Intuitively you know that records for *Felis concolor* and records for *Puma concolor* should appear on the same map, but without some concept id, it can't happen.
In an ideal world, the name should be for the concept and the changing phylogeny should simply be new assertions about that concept.
Then you would not have millions of collection labels with the wrong name on them.
I would argue that the current nomenclatural system with all its codes and rules has little to do with real biology.
Many of the problems we are now dealing with would go away if a new system was created that was based on the biology we know today rather than trying to work around a system that was based on what people thought 250 years ago.
I don't see that happening except in a few areas like Bacteriology, so for now we have to devise complex informatics systems to work around a fundamentally flawed system.
*(My opinion)
I just believe that those taking these decisions have a specific perspective and use case scenarios, that involves biologists only after the perfect software user interface system is finished. I challenge the last assumption
I also wish that the TDWG standards would include specific semantic use cases and test data so that people could actually see if it works for them.
That is what I am trying to do with my examples, which are constantly being revised; but I am clearly not a member of the mysterious TDWG Illuminati.
Respectfully,
- Pete
I fully believe you and all who are doing it do it with careful consideration of the needs as they see it. I just believe that those taking these decisions have a specific perspective and use case scenarios, that involves biologists only after the perfect software user interface system is finished. I challenge the last assumption ...
Redesign tdwg vocabularies and DarwinCore with opaque dwc:concept013030d4a93abdd6206234b683c51b31 URIs instead of dwc:commonName (where I really prefer the synonym vernacularName - or is it the other way round?) and prove that it works well for communication and discussion.
I believe Opaque IDs work OK if they can be systematically and unambiguously assigned. Taxon names and concepts can not, they need to be discussed and "debugged" probably over decades. Just like tdwg vocabularies -- just 6 orders of magnitude greater scope.
Gregor
Hi Pete,
To answer your question:
Does a species entail a specific classification or does it have many classifications.
A species "concept" (or the various applicable names) is independent of classifications, one or many, which may be applied to it, with one exception: the name that is applicable at any time will depend on the genus within which it is placed in a particular preferred classification scheme. Above that level, it is truly independent. In other words, the task of delimiting and naming species (taxonomy, also nomenclature) and the task of arranging taxa in groups, at least above genus level (systematics) are separate and almost completely independent of one another.
Actually there is one other exception - if a species is moved between codes (e.g. something described as a protist turns out to be a bacterium or an animal fossil turns out to be a plant), the applicable name may also need to be changed because of code-specific nomenclatural provisions. However again, the species concept (taxonomy) has not changed, just the name (nomenclature).
So as a consequence of the above, one is free to apply either a single preferred higher classification, or multiple ones, or none (so long as the placement in a particular genus is clearly indicated) to a particular species concept, or change these through time, without affecting the latter in the main.
Not sure if this helps...
Regards - Tony
-----Original Message-----
From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Gregor Hagedorn
Sent: Friday, 26 November 2010 8:14 AM
To: Peter DeVries
Cc: tdwg-content@lists.tdwg.org; Markus Döring (GBIF)
Subject: Re: [tdwg-content] Does a species entail a specific classification or does it have many classifications.
If this is so trivial then why does GBIF have one map for Felis concolor and one map for Puma concolor? Also why does the Barcode of Life not have all their Aedes triseriatus and Ochlerotatus triseriatus mapped to one id rather than one for each name and misspelling?
I think it shows that the real problem is not URLs versus strings, the problem is the knowledge behind these strings.
se:Puma_concolor_sec._Smith
How many lexical variants of the string above are there likely to be?
This is a misunderstanding, my example is a URI, with owl:sameAs etc. behind
What you see as http://en.wikipedia.org/wiki/Cassia_(legume) is actually http://en.wikipedia.org/wiki/Cassia_%28legume%29
Things like & need to be replaced with &amp;. Note that if you are not careful this can become &amp;amp;amp; as it is re-encoded.
It certainly happens that bugs appear even after 20 years of using URLs, but it is a class of highly generic bugs in any web-software. It does not appear a good argument to me.
I am not aware that people have problems linking to Wikipedia or DBpedia, which happens to use exactly these human-proofreadable URIs.
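The encoding issue raised in the quoted text can be reproduced with a minimal sketch, assuming Python's standard library stands in for whatever web software is doing the encoding. It shows the percent-encoded form of the Wikipedia title and how an ampersand compounds when already-escaped text is escaped again.

```python
from html import escape, unescape
from urllib.parse import quote, unquote

# Percent-encoding: the human-readable title and its encoded form.
title = "Cassia_(legume)"
encoded = quote(title, safe="")      # parentheses become %28 and %29
assert unquote(encoded) == title     # decoding recovers the readable form

# HTML escaping applied once is correct...
once = escape("&")
# ...but escaping text that is already escaped compounds the entity.
twice = escape(once)
thrice = escape(twice)

print(encoded)
print(once, twice, thrice)
```

This is a generic bug class, as the reply says: it arises from escaping at more than one layer, not from the readable form of the identifier itself.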
Also aren't TDWG URI's supposed to be opaque?
Why do we use dwc:scientificName instead of dwc:entity013030d4a93abdd6206234b683c51b31 ?
I am sure the semantic vocabulary management system for DarwinCore would show the proper label for the opaque URI... :-)
Basically, programmers demand human readability for their own domain, but deny it to the biodiversity domain itself...
I fully believe you and all who are doing it do it with careful consideration of the needs as they see it. I just believe that those taking these decisions have a specific perspective and use case scenarios, that involves biologists only after the perfect software user interface system is finished. I challenge the last assumption ...
Redesign tdwg vocabularies and DarwinCore with opaque dwc:concept013030d4a93abdd6206234b683c51b31 URIs instead of dwc:commonName (where I really prefer the synonym vernacularName - or is it the other way round?) and prove that it works well for communication and discussion.
I believe Opaque IDs work OK if they can be systematically and unambiguously assigned. Taxon names and concepts can not, they need to be discussed and "debugged" probably over decades. Just like tdwg vocabularies -- just 6 orders of magnitude greater scope.
Gregor
Hi Gregor,
I have had some more time to think about this issue.
In an ideal world the combination of genus and specific epithet should map to one concept.
I believe that was Linnaeus's original intent.
Unfortunately the current system seems to have strayed from this original goal.
Also, it was unanticipated that changes in phylogenetic thinking would cause as many changes to the "identifier" or name for a given species.
Dima and I came up with the somewhat strange URI's for name strings so we could look at ways to connect some sort of species concept to the various names in the GNI.
The GNI itself does not always know what kind of thing a given name string represents.
It also does not know if a given name string is valid or not.
The GNI needs some additional parts to make sense of these. I will leave it to others to describe those additional parts.
After looking through a large set of species related publications etc. it is clear that the vast majority of information sources only list the genus and specific epithet.
To make sense of those and do entity extraction etc. I think the following service would be useful.
You have a name, let's say Duplictus annoyii.
To get information about what that name might be you reference a semantic web service using a URL form of the name.
http://example.org/names/Duplictus_annoyii
If you are a human using a web browser this will return a page that lists all the things it might be.
Kingdom Animalia Duplictus annoyii Schmoe 1886 => appears in database in X, Y, Z
Kingdom Animalia Duplictus annoyii Schmoe 1886 sec Thomas 1920 => appears in database in A, I, P
Kingdom Plantae Duplictus annoyii Carefulnot => appears in database M
If there are no exact matches it could return:
Similar Name Strings
Kingdom Animalia Duplictus annoyi Schmoe 1886 => appears in database in X, Y, Z
Kingdom Plantae Duplictus annous Carefulnot => appears in database in A, I, P
Kingdom Plantae Duplictus annous Carefulnot => appears in database M
Kingdom Plantae Duplicta annous (Carefulnot) => appears in database in X, Y, Z
A semantic web program would receive an RDF version of the information above and could use various heuristics to determine which are the most likely full names.
Eventually the service could also include opinions about which name variants are valid and which are not.
Felis concolor => Felis concolor Linnaeus 1771
"Valid" names would include properly formed synonyms so Felis concolor would return Felis concolor Linnaeus 1771.
The associated URLs for each of these forms would be
HTML Page http://example.org/names/Duplictus_annoyii.html
RDF Page http://example.org/names/Duplictus_annoyii.rdf
This service would help map a given genus specific epithet combination to possible full scientific names with authorship.
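A minimal sketch of the lookup described above, assuming an in-memory index in place of a real service. The index contents, the function names, and the similarity cutoff are all hypothetical; only the URL pattern and the example names come from the message. Exact matches are returned first, and similar name strings are offered only when nothing matches exactly.

```python
from difflib import get_close_matches

# Hypothetical index: "Genus epithet" -> (full name string, kingdom, databases).
NAME_INDEX = {
    "Duplictus annoyii": [
        ("Duplictus annoyii Schmoe 1886", "Animalia", ["X", "Y", "Z"]),
        ("Duplictus annoyii Schmoe 1886 sec Thomas 1920", "Animalia", ["A", "I", "P"]),
        ("Duplictus annoyii Carefulnot", "Plantae", ["M"]),
    ],
    "Duplictus annous": [
        ("Duplictus annous Carefulnot", "Plantae", ["A", "I", "P"]),
    ],
}


def name_from_url(url):
    """Turn .../names/Duplictus_annoyii(.html|.rdf) into 'Duplictus annoyii'."""
    slug = url.rstrip("/").rsplit("/", 1)[-1]
    for ext in (".html", ".rdf"):
        if slug.endswith(ext):
            slug = slug[: -len(ext)]
    return slug.replace("_", " ")


def lookup(url, cutoff=0.8):
    """Return exact candidates, or candidates for similar name strings."""
    name = name_from_url(url)
    if name in NAME_INDEX:
        return {"match": "exact", "candidates": NAME_INDEX[name]}
    # No exact hit: fall back to lexically similar name strings, best first.
    similar = get_close_matches(name, list(NAME_INDEX), n=5, cutoff=cutoff)
    candidates = [c for s in similar for c in NAME_INDEX[s]]
    return {"match": "similar", "candidates": candidates}


for full_name, kingdom, dbs in lookup("http://example.org/names/Duplictus_annoyii.rdf")["candidates"]:
    print(f"Kingdom {kingdom} {full_name} => appears in database {', '.join(dbs)}")
```

A real service would return RDF or HTML depending on the requested form; here both extensions resolve to the same record to keep the sketch short.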
You would need an additional even more opinionated web service that would tell you what the most current full name is for a given scientific name with authorship.
One that recognizes that "Felis concolor Linnaeus 1771" is an older synonym for "Puma concolor (Linnaeus, 1771)"
In GNUB / GNITE these two URI's could be considered reconciliation groups.
http://example.org/recgroups/Animal/Felis_concolor => PrefName:"Felis concolor Linnaeus 1771"
http://example.org/recgroups/Animal/Puma_concolor => PrefName:"Puma concolor (Linnaeus, 1771)"
This allows a different service or different groups to make statements about these.
http://example.org/recgroups/Animal/Felis_concolor olderSynonymFor http://example.org/recgroups/Animal/Puma_concolor
or
http://example.org/recgroups/Animal/Felis_concolor isBasionymOf http://example.org/recgroups/Animal/Puma_concolor
Note that this system allows for different groups to make different and potentially conflicting statements about the different reconciliation groups.
These different statements are from different groups so can be assigned to different "graphs"
<subject> <predicate> <object> <graph>
For example:
http://example.org/recgroups/Animal/Aedes_triseriatus isPreferedNameVariantOf <speciesXYZ> urn:frustrated.medical.entomologists.org
http://example.org/recgroups/Animal/Ochlerotatus_triseriatus isPreferedNameVariantOf <speciesXYZ> urn:frustrated.dipterist.taxonomist.org
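The quad form above can be sketched with plain tuples, assuming no particular RDF library. The helper names are hypothetical; the URIs and the predicate spelling are taken from the example. The fourth element records who made the statement, so a consumer can choose which groups to trust.

```python
from collections import defaultdict

BASE = "http://example.org/recgroups/Animal/"

# Each statement is (subject, predicate, object, graph).
quads = [
    (BASE + "Aedes_triseriatus", "isPreferedNameVariantOf", "speciesXYZ",
     "urn:frustrated.medical.entomologists.org"),
    (BASE + "Ochlerotatus_triseriatus", "isPreferedNameVariantOf", "speciesXYZ",
     "urn:frustrated.dipterist.taxonomist.org"),
]


def statements_by_graph(quads):
    """Group triples under the graph (i.e. the group) that asserted them."""
    by_graph = defaultdict(list)
    for s, p, o, g in quads:
        by_graph[g].append((s, p, o))
    return dict(by_graph)


def preferred_names(quads, species, trusted_graphs):
    """Preferred name variants for a species, restricted to trusted graphs."""
    return [
        s for s, p, o, g in quads
        if p == "isPreferedNameVariantOf" and o == species and g in trusted_graphs
    ]


print(preferred_names(quads, "speciesXYZ",
                      {"urn:frustrated.medical.entomologists.org"}))
```

Both conflicting statements stay in the store; the conflict is resolved only at query time by the choice of trusted graphs.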
The vast majority of people simply care about mapping these names to the same "thing".
They simply do the following in their own knowledge base:
http://example.org/recgroups/Animal/Aedes_triseriatus sameAs http://example.org/recgroups/Animal/Ochlerotatus_triseriatus
Now for them facts like:
http://example.org/recgroups/Animal/Aedes_triseriatus larvalDevelopmentIn stdVocab:TreeHoles
http://example.org/recgroups/Animal/Ochlerotatus_triseriatus takesBloodMealsFrom stdVocab:Birds
http://example.org/recgroups/Animal/Ochlerotatus_triseriatus knownVectorOf stdVocab:LacCrosseEncephalitis
Are interpreted as being statements about the same "thing"
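The effect of the sameAs statement can be sketched as follows, assuming a small union-find over URIs in place of a real reasoner. The function names are hypothetical; the URIs, predicates and objects are the ones in the example above.

```python
BASE = "http://example.org/recgroups/Animal/"
AEDES = BASE + "Aedes_triseriatus"
OCHLEROTATUS = BASE + "Ochlerotatus_triseriatus"

# Statements held in one user's own knowledge base.
same_as = [(AEDES, OCHLEROTATUS)]
facts = [
    (AEDES, "larvalDevelopmentIn", "stdVocab:TreeHoles"),
    (OCHLEROTATUS, "takesBloodMealsFrom", "stdVocab:Birds"),
    (OCHLEROTATUS, "knownVectorOf", "stdVocab:LacCrosseEncephalitis"),
]


def build_find(pairs):
    """Return a function mapping each URI to one representative per sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Pick the lexically smaller URI so the result is deterministic.
            parent[max(ra, rb)] = min(ra, rb)
    return find


find = build_find(same_as)
merged = {}
for s, p, o in facts:
    merged.setdefault(find(s), []).append((p, o))

for thing, statements in merged.items():
    print(thing, statements)
```

All three facts end up attached to one representative, which is the sense in which they become statements about the same "thing" for that user only.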
Respectfully,
- Pete
On Thu, Nov 25, 2010 at 10:06 AM, Gregor Hagedorn g.m.hagedorn@gmail.com wrote:
A side remark, about where I believe the whole discussion is misleading:
Puma concolor se:v6n7p
That way in the future if the name changes without a change in the concept.
Eupuma concolor se:v6n7p
The data stays linked.
This always looks nice... However, with such proposals we, the computer guys, make the concept-assessment someone else's problem (i.e. the taxonomists, ecologists, pathologists, etc.), and, at the same time, do not provide them the means to communicate. The assumption is that a scientist or applied worker would know whether to add se:v6n7p to a given taxon name or not.
With my taxonomer/pathologist hat on: I mostly have no clue which concept XXX concolor is - and whether it is changed or not. Puma concolor may be a different concept than Puma concolor. We are, of course, guilty of communicating in a shamefully loose way (s.str., s. lat. etc.), which could and should be improved by citing a secundum, but beyond that: mapping concepts is a taxonomic opinion, no objective truth.
So given that any trivial mapping mechanism can map multiple IDs (Puma concolor, Eupuma concolor) to a single concept - the proposal saves this trivial processing time, but does not contribute to the problem of communicating in a way that is suitable to assess taxon concepts.
Aside: Please compare the highly linked, and generally correctly linked Wikipedias with other content management systems for the advantage of human legible IDs [[Puma concolor]] over http://x.y.net/node/234872561 - links. My own observation is that in the latter case only a fraction of the desirable links are created, and that these are quite often going to wrong, or perhaps obsoleted places.
I therefore think: se:Puma_concolor_sec._Smith would be a much more useful mechanism than all the computer-scientists-only proposals like se:v6n7p or
http://gni.globalnames.org/name_strings/772d5162-f5aa-596c-98e0-a1c6c5a29bb9
Gregor
Hi Peter,
since you address me, I have a hard time to refrain from polluting other peoples mailboxes...
I believe that was Linnaeus's original intent. Unfortunately the current system seems to have strayed from this original goal.
Linné believed that species types were objective (they are not, they depend on a species concept, which sadly is usually not mentioned in publications...) and he did not take scientific progress (and the versioning this demands) into account. Versioning is still a challenge for much of scientific communication, I think we do not need to be too self deprecating.
You have a name, lets say Duplictus annoyii. To get information about what that name might be you reference a semantic web service using a URL form of the name. http://example.org/names/Duplictus_annoyii If you a human using a web browser this will return a page that lists all the things it might be. Kingdom Animalia Duplictus annoyii Schmoe 1886 => appears in database in X, Y, Z
Yes, that is name usage indexing, which is very useful.
Eventually the service could also include opinions about which name variants are valid and which are not. Felis concolor => Felis concolor Linnaeus 1771 "Valid" names would include properly formed synonyms so Felis concolor would return Felis concolor Linnaeus 1771.
If you mean nomenclaturally validly formed name, yes, I believe that is a service that IPNI and others should perform.
The associated URLs for each of these forms would be HTML Page http://example.org/names/Duplictus_annoyii.html RDF Page http://example.org/names/Duplictus_annoyii.rdf This service would help map a given genus specific epithet combination to possible full scientific names with authorship.
I think of these names, among possible name variants, as "canonical" names, i.e. names that are formed according to best practice, e.g. as defined by the code (but possible also by others).
In GNUB / GNITE these two URI's could be considered reconciliation groups.
http://example.org/recgroups/Animal/Felis_concolor => PrefName:"Felis concolor Linnaeus 1771"
http://example.org/recgroups/Animal/Puma_concolor => PrefName:"Puma concolor (Linnaeus, 1771)"
This allows a different service or different groups to make statements about these.
http://example.org/recgroups/Animal/Felis_concolor olderSynonymFor http://example.org/recgroups/Animal/Puma_concolor
or
http://example.org/recgroups/Animal/Felis_concolor isBasionymOf ht...
Whereas the basionym- or replacement-name-relation is homotypic and unambiguous, the heterotypic synonym relation is a function of species concepts.
What I try to say about taxon concept management, is that we usually don't have the biological knowledge to document the circumscription of a taxon in an objective way. There are trivial and easy cases, like two morphoforms, long recognized, well understood, that are now understood to be truly only one species with varieties. In practice, this is, in my opinion, rare. More likely is that a new species is now detected. However, now it depends on the phrasing and structure of the identification tools previously used, whether earlier identifications according to a given publication would have included the new species or not. That is, in the light of a new species, every publication that aids identification defines its own taxon concept.
I am rather certain that biologists don't have the resources to investigate this in its full breadth. However, in some important cases, where necessary or particularly interesting, these resources have been found, and opinions about the concept mapping have been expressed. To me the function of managing taxon concepts is to be able to manage this, while still allowing to reason (with greater margin of error) on the "default assumption" that - unless otherwise known - most taxon concepts for most purposes can be "roughly" set equivalent.
This is why I was commenting on the attempt to introduce a single class of taxon-concept URIs for all living things, without contemplating the mismatch of resources to actually document and verify them.
Are interpreted as being statements about the same "thing"
What you describe is the goal!
An interesting question I have is whether I can distinguish reasoning that was made based on those cases where taxon concepts have been investigated, from those based on default nomenclatural equivalence (i.e. homotypic synonyms)?
Also, I think I am missing timelines. If I make the default assumption: "names match" for the last 20 years, I am likely to get good results. If I do this for the last 200 years, I probably end up with garbage only.
One thing I have been thinking of: provided I know when a new treatment has been published which is now generally accepted, assuming that it takes 10 years to spread (plenty of counter-examples to that...), then such a timeframe might be used to qualify the quality of inferences. Perhaps...
Also, the quality of inferences on species having been recently lumped versus species having been split is different. In the first case, pretty naive heterotypic synonymy works quite reliably, in the latter case only detailed (and usually unavailable) study of taxon concepts works.
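One possible reading of the timeframe idea above, as a rough sketch only. The function name, the labels it returns, and the way the three periods are cut are assumptions made for illustration; the 10-year spread and the lump/split asymmetry are the only parts taken from the message.

```python
def qualify_name_match(record_year, treatment_year, change, spread_years=10):
    """Qualify the default assumption "names match" for a single record.

    change is "lump" or "split": what the accepted treatment did to the taxon.
    The returned labels are illustrative, not an established vocabulary.
    """
    if change not in ("lump", "split"):
        raise ValueError("change must be 'lump' or 'split'")
    if record_year >= treatment_year + spread_years:
        # The treatment has had time to spread; assume the record follows it.
        return "assume current concept"
    if record_year >= treatment_year:
        # Either the old or the new concept may have been meant.
        return "transition period: uncertain"
    if change == "lump":
        # Older records under either name fall inside the lumped taxon.
        return "synonymy mapping is reasonable"
    # Older records under the broad name cannot be assigned to one of the
    # split species without studying the taxon concept actually used.
    return "needs taxon concept study"


print(qualify_name_match(1985, 2000, "lump"))
print(qualify_name_match(1985, 2000, "split"))
```

As the message notes, there are plenty of counter-examples to a fixed spread time, so the cut-offs here would need to be a tunable assumption rather than a rule.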
I think the present work of listing the accepted names with their present heterotypic synonymy in CoL will be a big progress. I wonder whether these data can be used to identify validity time periods and information about splitting or lumping changes?
I have no ready solutions to these problems and appreciate all who tackle it!
Gregor
Hi Gregor,
Thanks for this. :-)
What I try to say about taxon concept management, is that we usually don't have the biological knowledge to document the circumscription of a taxon in an objective way. There are trivial and easy cases, like two morphoforms, long recognized, well understood, that are now understood to be truly only one species with varieties. In practice, this is, in my opinion, rare. More likely is that a new species is now detected. However, now it depends on the phrasing and structure of the identification tools previously used, whether earlier identifications according to a given publication would have included the new species or not. That is, in the light of a new species, every publication that aids identification defines its own taxon concept.
Yes, and unfortunately they are written in a way that makes it difficult to determine which specimens are instances of this concept and which are not. In part, because they are interpreted differently by different taxonomists.
I am rather certain that biologists don't have the resources to investigate this in its full breadth. However, in some important cases, where necessary or particularly interesting, these resources have been found, and opinions about the concept mapping have been expressed. To me the function of managing taxon concepts is to be able to manage this, while still allowing to reason (with greater margin of error) on the "default assumption" that - unless otherwise known - most taxon concepts for most purposes can be "roughly" set equivalent.
This is why I was commenting on the attempt to introduce a single class of taxon-concept URIs for all living things, without contemplating the mismatch of resources to actually document and verify them.
My intent was to have a set of taxonconcept URI's but this does not prevent someone from creating other ones that are broader or narrower.
It would be best if these were done in a way that was compatible so that you could make the following statements.
Some of the existing "concept" like things that I link to using skos:closeMatch are not really the same "kind of thing".
But if someone followed the same general form but had their concept entail a slightly different set of individuals you could do the following.
<txn:SpeciesA> <vocab:sensu_lato_To> <bio:SpeciesY>
In this sense the txn concepts and "bio" concepts are the same kind of species concept but they entail different sets of individual organisms.
Within each of these namespaces there should be little overlap between concepts.
By that I mean very few individuals would fall within two or more concepts.
But a bio concept could map to multiple txn concepts (sensu lato) or the other way around.
So within a given name space like txn, or bio the concepts are relatively clean and separate, but they don't have to be clean between namespaces.
Also it is not obvious in their current form, but the txn concepts will eventually provide some information that helps determine what individuals are instances of that concept and what individuals are not instances of that concept.
Someone identifying specimens could use this to determine which species concept is most appropriate for their specimen.
They should be able to say that their specimen is an instance of a txn:SpeciesX *and* an instance of bio:SpeciesY.
In the same way that specimen might be an instance of both a sensu lato concept and a sensu stricto concept.
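The namespace idea can be sketched by treating each concept as the set of individuals it entails. The specimen ids, the concept txn:SpeciesB, and the helper names are hypothetical, and the choice of the bio concept as the broader one is an assumption; the message allows the mapping to run either way.

```python
from itertools import combinations

# Each concept is modelled as the set of individual organisms it entails.
concepts = {
    "txn:SpeciesA": {"specimen1", "specimen2"},
    "txn:SpeciesB": {"specimen3"},
    "bio:SpeciesY": {"specimen1", "specimen2", "specimen3"},
}


def namespace(uri):
    return uri.split(":", 1)[0]


def overlaps_within_namespaces(concepts):
    """Pairs of concepts in the same namespace that share individuals."""
    return [
        (a, b) for a, b in combinations(sorted(concepts), 2)
        if namespace(a) == namespace(b) and concepts[a] & concepts[b]
    ]


def broader_in_other_namespaces(concepts, narrow):
    """Concepts elsewhere that entail a strict superset of narrow's individuals."""
    return sorted(
        c for c in concepts
        if namespace(c) != namespace(narrow) and concepts[narrow] < concepts[c]
    )


def concepts_for(concepts, specimen):
    """Every concept, in any namespace, that the specimen is an instance of."""
    return sorted(c for c, members in concepts.items() if specimen in members)


print(overlaps_within_namespaces(concepts))
print(broader_in_other_namespaces(concepts, "txn:SpeciesA"))
print(concepts_for(concepts, "specimen1"))
```

Within txn the sets are disjoint, while specimen1 is still an instance of both a txn concept and the broader bio concept, which is the situation described above.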
This will take a while :-)
Respectfully,
- Pete
In an ideal world the combination of genus and specific epithet should map to one concept.
That world doesn't exist, as long as taxonomists will declare one name to be a heterotypic synonym of another.
The type specimen of the name "Centropyge fisheri" was collected in Hawaii.
The type specimen of the name "Centropyge flavicauda" was collected at Macclesfield Banks, China Sea.
Some treatments regard these as distinct species. Some regard them as synonyms. The name with priority is C. fisheri.
Thus, the combination of genus and specific epithet "Centropyge fisheri" can therefore refer to at least two different concepts: one (sensu stricto) is the set of organisms in Hawaii; the other (sensu lato) is the set of organisms in Hawaii and the China Sea (and places in-between).
As long as there are heterotypic synonyms in the world (i.e., as long as taxonomists disagree...), that ideal world will never exist.
And, of course, this doesn't even take into account homonyms.
I believe that was Linnaeus's original intent.
Only because he was a creationist.
Unfortunately the current system seems to have strayed from this original goal.
Only because taxonomists insist on believing in evolution, and have differing opinions about how best to apply names to aggregations of organisms in nature (lumpers v. splitters).
The GNI needs some additional parts to make sense of these. I will leave it to others to describe those additional parts.
I assume you mean GNUB? Or something else?
"Valid" names would include properly formed synonyms so Felis concolor would return Felis concolor Linnaeus 1771.
Careful with the use of the word "valid". It means two entirely different things to botanists and zoologists, and neither of them means it in the same way you are using it.
Aloha, Rich
On Mon, Nov 29, 2010 at 6:36 AM, Richard Pyle deepreef@bishopmuseum.org wrote:
[..snip..]
As long as there are heterotypic synonyms in the world (i.e., as long as taxonomists disagree...), that ideal world will never exist.
What? Taxonomists disagree? What an astonishing idea ! I'm gonna send your post to WikiLeaks. ;-)
Hi Rich,
The type specimen of the name "Centropyge fisheri" was collected in Hawaii.
The type specimen of the name "Centropyge flavicauda" was collected at Macclesfield Banks, China Sea.
Some treatments regard these as distinct species. Some regard them as synonyms. The name with priority is C. fisheri.
Thus, the combination of genus and specific epithet "Centropyge fisheri" can therefore refer to at least two different concepts: one (sensu stricto) is the set of organisms in Hawaii; the other (sensu lato) is the set of organisms in Hawaii and the China Sea (and places in-between).
I am not familiar with the reasoning behind this example but the problem starts when
*Centropyge fisheri* <= *Centropyge flavicauda*
Since the majority of publications use just Centropyge fisheri, it is not clear if they meant
The original *Centropyge fisheri* or *Centropyge flavicauda*
These are two concepts:
*Centropyge fisheri* anchored by one instance of the species concept, an individual organism
*Centropyge flavicauda* anchored by one instance of the species concept, an individual organism
The descriptions probably do not provide much guidance as to where one concept starts and another concept begins.
By this I mean a reliable and repeatable guide to what specimens are instances of each of these concepts.
If *Centropyge fisheri* and *Centropyge flavicauda* were kept separate then the entailed concept would be clearer.
Later someone could hypothesize that these two concepts overlap and provide evidence that these two specimens are members of the same species.
In the process they would produce some testable guide that clarifies what other individuals are also instances of this species concept.
But for now, if you are trying to assemble facts, the mapping of *Centropyge flavicauda* to *Centropyge fisheri* makes it unclear what concept was intended.
In summary, those who assert that something is a heterotypic synonym of something else should provide a guide that clarifies what other individuals would or would not be instances of that same species concept.
I assume you mean GNUB? Or something else?
Yes, I think I wrote GNITE/GNUB, or I intended to.
"Valid" names would include properly formed synonyms so Felis concolor would return Felis concolor Linnaeus 1771.
Careful with the use of the word "valid". It means two entirely different things to botanists and zoologists, and neither of them means it in the same way you are using it.
Yes, there should be a way to output code compliant names etc, but the entire process of mapping things to other things does not need to be code compliant.
There will need to be intermediate namestrings created in the workflow that may not be code compliant.
These are intermediate entities that are not covered by the codes.
For instance, it would make a lot of sense that somewhere in this process there was a link, or documentation to the publication and author behind "Linnaeus 1771" etc.
In these examples, those links exist:
http://lod.taxonconcept.org/ses/2mqjL.html
http://lod.taxonconcept.org/ses/v6n7p.html
Respectfully,
- Pete
---------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
TaxonConcept Knowledge Base http://www.taxonconcept.org/
GeoSpecies Knowledge Base http://lod.geospecies.org/
About the GeoSpecies Knowledge Base http://about.geospecies.org/
------------------------------------------------------------
participants (6)
- "Markus Döring (GBIF)"
- Bob Morris
- Gregor Hagedorn
- Peter DeVries
- Richard Pyle
- Tony.Rees@csiro.au