[tdwg-content] Another example of non-overlapping concepts

Peter DeVries pete.devries at gmail.com
Thu May 19 21:22:54 CEST 2011


Hi Nico,

I don't think that we are very far apart.

>    The way I recall discussions as the TCS was designed, the role for the
> DarwinCore was to allow data providers to include sufficient information in
> the DC so that the vouchers/observations could be identified to a suitably
> authoritative concept. In another realm of the biodiversity informatics net,
> that concept would be represented in more depth, and ideally have multiple
> relationships mapped and/or inferred to relevant past, present, and future
> concepts.
>

In my mind it would be best to link to an open resolvable concept, since even
the original descriptions are often not very informative regarding which
individuals should be considered instances of the same concept.
One reading of the TCS is that this is something that is applied long after
the specimen is collected and identified, most likely by someone who did
not collect and may not even have seen the actual specimen.
So in some sense this is probably more error prone than the imperfectly
defined concepts I am proposing. Or at least it is more likely to differ from
the intended meaning of the original observer / identifier.


>
>    For what it's worth (really not much), I agree that the eBird project is
> finding a (the?) pragmatic solution to expanding their contributor base, and
> in all likelihood have a pretty good to excellent taxonomy to work with
> already. Keep in mind that I study weevils, with 65k species described and
> 220k species (conservatively) estimated to exist, and a ~ 200 tribes
> mid-level classifications based largely on Lacordaire's 1-2 external
> character system established in 1863 (claws single, versus paired; virtually
> none of these have any phylogenetic value). So experientially I come from
> that part of the knowledge spectrum where we're likely centuries away from a
> sufficiently stable naming system that includes, say, more than 2/3 of the
> actual species diversity.
>

Yes, this is one reason most of my examples involve entities that are not
particularly controversial in their own right.

For now the idea is to concentrate on the model and how different things are
related to each other.

For instance, I noticed that I had some parrots that were incorrectly marked
up as "expected in" North America.

This will be fixed in the next RDF dump, along with the Freebase cougar link,
which is now fixed in the online RDF but not in the RDF dump.

This is less about having everything perfect and more about figuring out how
one might mark up these kinds of relations.

I had been watching for URI changes in DBpedia and Wikipedia but not
Freebase.

I read a blog post recently that described GBIF's efforts to clean up their
occurrence records and thought that their 1 degree areas might make good
candidates for URIs.

http://ff.im/-DwvMu

This would allow them to make a cleaned version of the species occurrences
that tags each species occurrence to a particular "degree_area".

If they existed I would try to add them to my examples.
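
As a rough sketch of what such a tag could look like, here is one way to turn
a coordinate into a "degree_area" identifier. The base URI and the naming of
the cells are made up for illustration; they are not existing GBIF
identifiers.

```python
import math

# Hypothetical base URI, used only for illustration.
BASE_URI = "http://example.org/degree_area/"


def degree_area_id(lat, lon):
    """Return an ID for the 1-degree cell containing (lat, lon).

    The cell is named after its south-west corner, so 43.07, -89.40
    falls in the cell whose corner is 43N, 90W.
    """
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    # floor() rather than int(), so negative values round to the south/west.
    # Clamp so the north pole and the antimeridian fall in the last cell.
    lat_corner = min(math.floor(lat), 89)
    lon_corner = min(math.floor(lon), 179)
    ns = "N" if lat_corner >= 0 else "S"
    ew = "E" if lon_corner >= 0 else "W"
    return f"{abs(lat_corner):02d}{ns}_{abs(lon_corner):03d}{ew}"


def degree_area_uri(lat, lon):
    """Build the (hypothetical) URI for the cell containing (lat, lon)."""
    return BASE_URI + degree_area_id(lat, lon)


print(degree_area_uri(43.07, -89.40))
```

Every occurrence that falls in the same cell gets the same URI, so the cell
can be used as a shared node that occurrences link to.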

Also note that I include name variations that link to the GNI. These are not
true synonyms; they should be interpreted as "*someone said this might be a
synonym of the current name*".
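
One way to picture this is to store each name variation as a claim attached
to a concept, not as an asserted synonym. A minimal sketch follows; the
concept key, the status labels and the function names are hypothetical and
are not the ones used in the TaxonConcept RDF.

```python
from collections import defaultdict

# Hypothetical concept key; a real concept would be identified by a URI.
COUGAR = "concept:cougar"

# (name string, concept, status). "variation" means only that someone
# said this string might be a synonym of the current name.
claims = [
    ("Puma concolor", COUGAR, "current"),
    ("Felis concolor", COUGAR, "variation"),
    ("Puma conncolor", COUGAR, "variation"),  # spelling variant
]


def names_by_concept(claims):
    """Group name strings under the concept they are claimed for."""
    grouped = defaultdict(list)
    for name, concept, _status in claims:
        grouped[concept].append(name)
    return dict(grouped)


def concepts_for(name, claims):
    """Return every concept that a name string has been linked to."""
    return sorted({c for n, c, _s in claims if n == name})


print(names_by_concept(claims))
print(concepts_for("Felis concolor", claims))
```

A lookup by name string then returns candidate concepts, and the status
records how strong the claim is.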

In addition, what I call BasionymName is not the same as what others call
it.

I use this field to put in what appears to be the first name used for the
species. In zoology this would be the name that does not contain (), i.e. the
one whose authorship is not in parentheses.

It might be best to change to some other term in the future to avoid
confusion.
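
To make the parentheses rule concrete: in zoological names the author and
year are put in parentheses when the species has been moved out of the genus
it was originally described in. A rough check, assuming the authorship comes
at the end of the name string (the function name is made up, and real name
strings are messier than this):

```python
import re

# Matches a parenthesised authorship at the end of the string,
# e.g. "(Linnaeus, 1771)". Parentheses in the middle of the string,
# such as a subgenus, are deliberately not matched.
_PAREN_AUTHOR = re.compile(r"\([^()]+\)\s*$")


def is_original_combination(name_with_author):
    """Return True if the trailing authorship is NOT in parentheses."""
    return _PAREN_AUTHOR.search(name_with_author.strip()) is None


for name in ("Felis concolor Linnaeus, 1771",
             "Puma concolor (Linnaeus, 1771)"):
    print(name, "->", is_original_combination(name))
```

Note that a bare name string with no authorship at all also passes this
check, so it only helps when the authorship is actually present.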


>    I'm not opposed to pragmatic solutions for taxa where it makes sense
> (again, as if anyone cared..). But, trying to foresee the very substantive
> classificatory shifts that many other groups likely still will experience
> 10, 20, 50, 100 years down the road from now, I think just the same that
> there are solid grounds for working out an admittedly non-pragmatic, but
> semantically maximally powerful solution. I think it's not unreasonable to
> assume that for some groups, classification in 250 years from now will look
> just as different from today's system as Linnaeus' 1758 system looks to us
> today. He recognized 2 weevil genera and about 90 species. Now we have 5800
> genera. We may end up with 15,000.
>
>
Yes, you are correct. I am particularly concerned about tying the concepts
to one taxonomic hierarchy.

For your weevils, I think that there would be some advantages in marking up
what you think exists, documenting those with photos and links to specimens
and name variations.
Open, accessible versions of these are much more likely to be improved than
ones that are hidden in a dispersed collection of hard to find journals -
many of which have limits on the number of photographs etc.

This would certainly make it easier for someone with a specimen that they
think applies to find you and the other potential concept candidates.

I don't have many weevils so you might want to think about doing this
yourself. My only concern is how to keep the vocabularies etc in sync while
everything is being discussed.

>    The problem that at present only taxonomic experts can understand how to
> retrace the meanings of concepts proposed in the history of the field, and
> computers can't yet do this because the data are not marked up and linked to
> each other precisely enough (as precisely as an ontological representation
> would demand it), is not just going to go away by some top-down adherence to
> a "consensus". Most taxonomists jump into the field precisely because they
> come to realize that the "consensus" has serious problems (read: "sucks").
> So, as long as there is justified taxonomic research, there will be
> reclassification. [And no, IMO coming up with a synapomorphy-based or
> node-pointing naming system will not miraculously allow us to have a
> reliable system.]
>
>
Yes, this will never really go away. My point with eBird and The Plant List
is that there are groups doing this, seemingly without much controversy, so
why is what I propose so controversial?

>    To the extent that there is a discussion here on the list as to where the
> DC and a possible successor of the TCS is going, I think that's a worthwhile
> discussion to have.
>
>
I hope that others agree with you. :-)

- Pete



> Regards,
>
> Nico
>
>
>
> On 5/13/2011 4:41 PM, Peter DeVries wrote:
>
> Hi Nico,
>
>  Thanks for posting this.
>
>  I have something in the concept model to indicate the basis for the
> species concept.
>
>  For now I have three types. An individual species concept can have a
> combination of one, two or all three
>
>  In the RDF they look like this
>
>  <txn:speciesConceptBasedOn rdf:resource="
> http://lod.taxonconcept.org/ontology/txn.owl#ObjectiveSpeciesModel"/>
>
>  The first is what I call the #ObjectiveSpeciesModel - this indicates that
> it is a species concept because we say it is.
>
>  All the species concepts are at least an #ObjectiveSpeciesModel
>
>  *This is in part a way to handle things like the domestic cat which you
> want to be seen as different from the African Wildcat.
>
>  There are also tags for
>
>  txn:PhylogeneticSpeciesModel
> txn:BiologicalSpeciesModel
>
>  For now I don't have these other models set in the example data, but the
> fields are in the database and the code so that an editor could state the
> basis for the model.
>
>  I can think of a couple of different ways to handle the issue of
> alternative species concepts.
>
>  * Note that the identifications as proposed by DarwinCore don't seem to
> indicate what kind of model the identifications were based on.
>   So it is not clear to me if a straight DarwinCore data set would allow
> the analysis above.
>
>  Instead of having multiple different statements like
>
>  *txn:occurrenceHasSpeciesConcept <> *in the record for each occurrence
>
>  one could use different predicates to link to different kinds of species
> concepts.
>
>  *txn:occurrenceHasUniprotConcept* => <
> http://purl.uniprot.org/taxonomy/9696>
>
>  This would allow someone to query for the occurrences of <
> http://purl.uniprot.org/taxonomy/9696>
>
>  That said, it is not clear to me what people mean by different
> identifications.
>
>  Is the intent to have identifications with different homotypic synonyms
> to be an identification of the same thing or not?
>
>  The way it works now in many data sets is that Felis concolor, Puma
> concolor and Puma conncolor are treated as identifications of different
> things.
>
>  This is another way of saying *is the namestring the concept?*
>
> My understanding of the eBird project is that it allows citizen scientists
> to contribute their own observations. This creates a much larger data set
> for analysis etc.
>
>  They have created a curated list of species and a ~6 letter code for
> each. This serves as a guide for observers on how to encode their
> observations.
>
>  I think their progress would be inhibited, the occurrence coding
> inconsistent, and contributors frustrated, if they had a list that included
> many overlapping species concepts.
>
>  Thanks again for your comments,
>
>  - Pete
>
> On Fri, May 13, 2011 at 3:05 AM, Nico Franz <nico.franz at upr.edu> wrote:
>
>>  Hello Pete (et al.):
>>
>>    For birds, Town Peterson at KU and colleagues have published these
>> papers showing how alternative bird taxonomies affect the ranking of
>> conservation priorities.
>>
>>
>>
>> http://specify5.specifysoftware.org/Informatics/bios/biostownpeterson/PN_CB_1999.pdf
>>
>> http://specify5.specifysoftware.org/Informatics/bios/biostownpeterson/NP_BN_2004.pdf
>>
>> http://specify5.specifysoftware.org/Informatics/bios/biostownpeterson/P_BCI_2006.pdf
>>
>>    Here's the abstract of the 1999 paper:
>>
>> Analysis of geographic concentrations of endemic taxa is often used to
>> determine priorities for conservation
>> action; nevertheless, assumptions inherent in the taxonomic authority list
>> used as the basis for
>> analysis are not always considered. We analyzed foci of avian endemism in
>> Mexico under two alternate species
>> concepts. Under the biological species concept, 101 bird species are
>> endemic to Mexico and are concentrated
>> in the mountains of the western and southern portions of the country.
>> Under the phylogenetic species
>> concept, however, total endemic species rises to 249, which are
>> concentrated in the mountains and lowlands
>> of western Mexico. Twenty-four narrow endemic biological species are
>> concentrated on offshore islands, but
>> 97 narrow endemic phylogenetic species show a concentration in the
>> Transvolcanic Belt of the mainland and
>> on several offshore islands. Our study demonstrates that conservation
>> priorities based on concentrations of
>> endemic taxa depend critically on the particular taxonomic authority
>> employed and that biodiversity evaluations
>> need to be developed in collaboration or consultation with practicing
>> systematic specialists.
>>
>>    There was a debate recently on Taxacom that was started and
>> subsequently neatly summarized by Fabian Haas. The topic was "let's
>> summarize reasons why 'donors' seem to not fund taxonomy". One point from
>> the summary was this:
>>
>> 3) Taxonomy is over-accurate for most applications
>>
>> Most (not all) decisions in e.g. modelling and conservation are done and
>> can be done without complete knowledge of taxa. As it is, decisions for
>> conservation areas are often based on flagship species (e.g. elephants), on
>> taxa which have an excellent research background, e.g. birds (IBAs), on
>> availability of land (e.g. land with a high Tsetse burden), importance as
>> corridor and other factors, but never on a complete view on an all
>> biodiversity in a specific area. Even if an inventory existed, it would be
>> an illusion that we could collect data on ecological requirements and
>> population dynamics for most of the species necessary for informed
>> decisions. A complete inventory does not seem to provide an advantage for
>> conservation.
>>    I personally think there's some truth to that. I also think that, while
>> it's understandable that an accurate representation of the (sometimes)
>> fleetingness of taxonomic consensus is not a priority for applied ecological
>> projects, if taxonomists themselves don't find better ways to document and
>> link these alternative perspectives, then it's not the best science we can
>> do. That would be fine too if adopted outright as a pragmatic stance.
>>
>> Regards,
>>
>> Nico
>>
>>
>>
>> On 5/13/2011 1:08 AM, Peter DeVries wrote:
>>
>>  I thought that I would also mention that in addition to The Plant List,
>> the eBird project also uses non-overlapping concepts in its bird list (it
>> does have concepts for common hybrids).
>>
>>  What is clear to me is that you cannot create graphs like these if every
>> observation can have X number of species (especially ones that overlap)
>> without any indication of which is the most appropriate one.
>>
>>  eBird Occurrence Maps Northern Cardinal
>> http://ebird.org/content/ebird/about/occurrence-maps/northern-cardinal
>>
>>  NCBI is also similar.
>>
>>  Perhaps a member of the consensus committee can comment?
>>
>> -- Pete
>>
>> ------------------------------------------------------------------------------------
>> Pete DeVries
>> Department of Entomology
>> University of Wisconsin - Madison
>> 445 Russell Laboratories
>> 1630 Linden Drive
>> Madison, WI 53706
>> Email: pdevries at wisc.edu
>> TaxonConcept <http://www.taxonconcept.org/>  &  GeoSpecies<http://about.geospecies.org/> Knowledge
>> Bases
>> A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
>>
>> --------------------------------------------------------------------------------------
>>
>>
>> _______________________________________________
>> tdwg-content mailing list
>> tdwg-content at lists.tdwg.org
>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>
>>
>>
>
>


-- 
------------------------------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
Email: pdevries at wisc.edu
TaxonConcept <http://www.taxonconcept.org/> &
GeoSpecies <http://about.geospecies.org/> Knowledge Bases
A Semantic Web, Linked Open Data <http://linkeddata.org/>  Project
--------------------------------------------------------------------------------------