Hi Pete (et al.):

   I can unfortunately only comment on some of this. One thing I'd like to clarify, though perhaps it isn't needed: there's a way in which taxonomists sometimes say "different names, same concept". Meaning: there's a synonymy relationship between two or more Latin (species) names, but no real disagreement about the underlying circumscription of the taxon that's being referred to. I suppose your "Felis concolor, Puma concolor and Puma conncolor" example fits this situation.

   Now, that meaning of "sameness" ("same concept" as above) doesn't strictly work in the taxon concept "world", because concepts are in the first instance distinguished by their LABELS, i.e. the Latin name (name author) + sec. author/publication combo, as in Quercus robur L. sec. Nixon 2003 (making this one up). So if you have Quercus robur L. sec. Linnaeus 1758 and Quercus robur L. sec. Nixon 2003, these are two different labels and thus two different concepts in TCS speak, even if their meaning (referential extension) is "the same". Sorry if this was obvious and thus missed the mark. The problem is that there are at least two common meanings of "concept" in the mix.
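To make the label-versus-extension distinction concrete, here is a minimal Python sketch. The `ConceptLabel` class and the specimen identifiers are hypothetical and only illustrate the idea; they are not part of the TCS or any existing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptLabel:
    """Hypothetical label: Latin name + name author + 'sec.' source."""
    name: str         # Latin name, e.g. "Quercus robur"
    name_author: str  # author of the name, e.g. "L."
    sec: str          # "according to" author/publication

    def __str__(self) -> str:
        return f"{self.name} {self.name_author} sec. {self.sec}"


linnaeus = ConceptLabel("Quercus robur", "L.", "Linnaeus 1758")
nixon = ConceptLabel("Quercus robur", "L.", "Nixon 2003")

# Made-up extensions: the specimens each concept is taken to circumscribe.
extension = {
    linnaeus: frozenset({"specimen-A", "specimen-B"}),
    nixon: frozenset({"specimen-A", "specimen-B"}),
}

same_name = linnaeus.name == nixon.name                    # same Latin name
same_label = linnaeus == nixon                             # labels differ in "sec."
same_extension = extension[linnaeus] == extension[nixon]   # same meaning

print(linnaeus, "|", nixon)
print(same_name, same_label, same_extension)
```

In this sketch the two concepts share a name and an extension but remain distinct objects, because the "sec." part of the label differs.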

   The way I recall discussions as the TCS was designed, the role for the DarwinCore was to allow data providers to include sufficient information in the DC so that the vouchers/observations could be identified to a suitably authoritative concept. In another realm of the biodiversity informatics net, that concept would be represented in more depth, and ideally have multiple relationships mapped and/or inferred to relevant past, present, and future concepts.

   For what it's worth (really not much), I agree that the eBird project is finding a (the?) pragmatic solution to expanding its contributor base, and in all likelihood has a pretty good to excellent taxonomy to work with already. Keep in mind that I study weevils, with 65k species described and 220k species (conservatively) estimated to exist, and a mid-level classification of ~200 tribes based largely on Lacordaire's 1-2 external character system established in 1863 (claws single, versus paired; virtually none of these have any phylogenetic value). So experientially I come from that part of the knowledge spectrum where we're likely centuries away from a sufficiently stable naming system that includes, say, more than 2/3 of the actual species diversity.

   I'm not opposed to pragmatic solutions for taxa where it makes sense (again, as if anyone cared..). But, trying to foresee the very substantive classificatory shifts that many other groups likely still will experience 10, 20, 50, 100 years down the road from now, I think just the same that there are solid grounds for working out an admittedly non-pragmatic, but semantically maximally powerful solution. I think it's not unreasonable to assume that for some groups, classification 250 years from now will look just as different from today's system as Linnaeus' 1758 system looks to us today. He recognized 2 weevil genera and about 90 species. Now we have 5800 genera. We may end up with 15,000.

   The problem that at present only taxonomic experts can understand how to retrace the meanings of concepts proposed in the history of the field, and computers can't yet do this because the data are not marked up and linked to each other precisely enough (as precisely as an ontological representation would demand), is not just going to go away by some top-down adherence to a "consensus". Most taxonomists jump into the field precisely because they come to realize that the "consensus" has serious problems (read: "sucks"). So, as long as there is justified taxonomic research, there will be reclassification. [And no, IMO coming up with a synapomorphy-based or node-pointing naming system will not miraculously allow us to have a reliable system.]

   To the extent that there is a discussion here on the list as to where the DC and a possible successor of the TCS is going, I think that's a worthwhile discussion to have.

Regards,

Nico


On 5/13/2011 4:41 PM, Peter DeVries wrote:
Hi Nico,

Thanks for posting this.

I have something in the concept model to indicate the basis for the species concept.

For now I have three types. An individual species concept can have a combination of one, two or all three.

In the RDF they look like this:

<txn:speciesConceptBasedOn rdf:resource="http://lod.taxonconcept.org/ontology/txn.owl#ObjectiveSpeciesModel"/>

The first is what I call the #ObjectiveSpeciesModel - this indicates that it is a species concept because we say it is.

All the species concepts are at least an #ObjectiveSpeciesModel

*This is in part a way to handle things like the domestic cat which you want to be seen as different from the African Wildcat.

There are also tags for 

txn:PhylogeneticSpeciesModel
txn:BiologicalSpeciesModel

For now I don't have these other models set in the example data, but the fields are in the database and the code, so that an editor could state the basis for the model.
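As a rough illustration of how the three tags can combine, here is a small Python sketch. The `SpeciesModel` enum and the `concept_basis` helper are hypothetical names made up for this example; they only mirror the rule that every concept is at least an #ObjectiveSpeciesModel and may also carry one or both of the other two.

```python
from enum import Flag, auto


class SpeciesModel(Flag):
    """Hypothetical stand-ins for the three txn model tags."""
    OBJECTIVE = auto()
    PHYLOGENETIC = auto()
    BIOLOGICAL = auto()


def concept_basis(*extra: SpeciesModel) -> SpeciesModel:
    """Combine model tags; OBJECTIVE is always included."""
    basis = SpeciesModel.OBJECTIVE
    for model in extra:
        basis |= model
    return basis


# A concept that is a species only "because we say it is".
domestic_cat = concept_basis()

# A concept whose editor also stated two further bases.
well_studied = concept_basis(SpeciesModel.BIOLOGICAL, SpeciesModel.PHYLOGENETIC)

print(SpeciesModel.OBJECTIVE in domestic_cat)
print(SpeciesModel.PHYLOGENETIC in well_studied)
```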

I can think of a couple of different ways to handle the issue of alternative species concepts.

* Note that the identifications as proposed by DarwinCore don't seem to indicate what kind of model the identifications were based on.
  So it is not clear to me if a straight DarwinCore data set would allow the analysis above.

Instead of having multiple different statements like 

txn:occurrenceHasSpeciesConcept <> in the record for each occurrence

one could use different predicates to link to different kinds of species concepts.

txn:occurrenceHasUniprotConcept => <http://purl.uniprot.org/taxonomy/9696>

This would allow someone to query for the occurrences of <http://purl.uniprot.org/taxonomy/9696>
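A minimal sketch of that kind of query, using plain Python tuples in place of a real triple store. The occurrence identifiers (`occ:1` etc.), the `concept:example-a` value and the `http://example.org/...` URI are made up for illustration; only the two predicate names and the UniProt URI come from the text above.

```python
UNIPROT_9696 = "http://purl.uniprot.org/taxonomy/9696"

# (subject, predicate, object) statements; all occurrence IDs are hypothetical.
triples = [
    ("occ:1", "txn:occurrenceHasUniprotConcept", UNIPROT_9696),
    ("occ:1", "txn:occurrenceHasSpeciesConcept", "concept:example-a"),
    ("occ:2", "txn:occurrenceHasUniprotConcept", "http://example.org/taxonomy/other"),
    ("occ:3", "txn:occurrenceHasUniprotConcept", UNIPROT_9696),
]


def occurrences_of(statements, predicate, concept):
    """Return the subjects linked to `concept` through `predicate`."""
    return sorted(s for s, p, o in statements if p == predicate and o == concept)


hits = occurrences_of(triples, "txn:occurrenceHasUniprotConcept", UNIPROT_9696)
print(hits)
```

Because each kind of concept gets its own predicate, the query does not need to know which other concept types an occurrence has also been linked to.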

That said, it is not clear to me what people mean by different identifications.

Is the intent for identifications with different homotypic synonyms to be identifications of the same thing or not?

The way it works now in many data sets is that Felis concolor, Puma concolor and Puma conncolor are treated as identifications of different things.

This is another way of asking: is the namestring the concept?
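The difference shows up as soon as occurrences are counted. The sketch below is only an illustration: the `records` list, the `NAME_TO_CONCEPT` mapping and the `concept:cougar` identifier are all hypothetical.

```python
from collections import Counter

# Namestrings as they might appear on four occurrence records (made up).
records = ["Felis concolor", "Puma concolor", "Puma conncolor", "Puma concolor"]

# Hypothetical mapping from namestring to a single concept identifier.
NAME_TO_CONCEPT = {
    "Felis concolor": "concept:cougar",
    "Puma concolor": "concept:cougar",
    "Puma conncolor": "concept:cougar",  # misspelled namestring, same concept
}

# Treating the namestring as the concept: three different "things".
by_namestring = Counter(records)

# Treating the namestring as a pointer to a concept: one thing, four records.
by_concept = Counter(NAME_TO_CONCEPT[name] for name in records)

print(len(by_namestring), dict(by_concept))
```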

My understanding of the eBird project is that it allows citizen scientists to contribute their own observations. This creates a much larger data set for analysis etc.

They have created a curated list of species and a ~6 letter code for each. This serves as a guide for observers on how to encode their observations.

I think their progress would be inhibited, the occurrence coding inconsistent, and contributors frustrated, if they had a list that included many overlapping species concepts.

Thanks again for your comments,

- Pete
 
On Fri, May 13, 2011 at 3:05 AM, Nico Franz <nico.franz@upr.edu> wrote:
Hello Pete (et al.):

   For birds, Town Peterson at KU and colleagues have published these papers showing how alternative bird taxonomies affect the ranking of conservation priorities.

http://specify5.specifysoftware.org/Informatics/bios/biostownpeterson/PN_CB_1999.pdf
http://specify5.specifysoftware.org/Informatics/bios/biostownpeterson/NP_BN_2004.pdf
http://specify5.specifysoftware.org/Informatics/bios/biostownpeterson/P_BCI_2006.pdf

   Here's the abstract of the 1999 paper:

Analysis of geographic concentrations of endemic taxa is often used to determine priorities for conservation
action; nevertheless, assumptions inherent in the taxonomic authority list used as the basis for
analysis are not always considered. We analyzed foci of avian endemism in Mexico under two alternate species
concepts. Under the biological species concept, 101 bird species are endemic to Mexico and are concentrated
in the mountains of the western and southern portions of the country. Under the phylogenetic species
concept, however, total endemic species rises to 249, which are concentrated in the mountains and lowlands
of western Mexico. Twenty-four narrow endemic biological species are concentrated on offshore islands, but
97 narrow endemic phylogenetic species show a concentration in the Transvolcanic Belt of the mainland and
on several offshore islands. Our study demonstrates that conservation priorities based on concentrations of
endemic taxa depend critically on the particular taxonomic authority employed and that biodiversity evaluations
need to be developed in collaboration or consultation with practicing systematic specialists.

   There was a debate recently on Taxacom that was started and subsequently neatly summarized by Fabian Haas. The topic was "let's summarize reasons why 'donors' seem to not fund taxonomy". One point from the summary was this:

3) Taxonomy is over-accurate for most applications

Most (not all) decisions in e.g. modelling and conservation are done and can be done without complete knowledge of taxa. As it is, decisions for conservation areas are often based on flagship species (e.g. elephants), on taxa which have an excellent research background, e.g. birds (IBAs), on availability of land (e.g. land with a high Tsetse burden), importance as corridor and other factors, but never on a complete view on an all biodiversity in a specific area. Even if an inventory existed, it would be an illusion that we could collect data on ecological requirements and population dynamics for most of the species necessary for informed decisions. A complete inventory does not seem to provide an advantage for conservation.

   I personally think there's some truth to that. I also think that, while it's understandable that an accurate representation of the (sometimes) fleetingness of taxonomic consensus is not a priority for applied ecological projects, if taxonomists themselves don't find better ways to document and link these alternative perspectives, then it's not the best science we can do. That would be fine too if adopted outright as a pragmatic stance.

Regards,

Nico



On 5/13/2011 1:08 AM, Peter DeVries wrote:
I thought that I would also mention that, in addition to The Plant List, the eBird project also uses non-overlapping concepts in its bird list (it does have concepts for common hybrids).

What is clear to me is that you cannot create graphs like these if every observation can have X number of species (especially ones that overlap) without any indication of which is the most appropriate one.

eBird Occurrence Maps Northern Cardinal
NCBI is also similar.

Perhaps a member of the consensus committee can comment?

-- Pete
------------------------------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries@wisc.edu
TaxonConcept  &  GeoSpecies Knowledge Bases
A Semantic Web, Linked Open Data  Project
--------------------------------------------------------------------------------------
_______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content

