Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?

21 Nov 2010

Dear Bob:

I think that your "metapoint", as I understand it, i.e. (1) that 
different levels of semantic resolution will be necessary and/or 
sufficient for a particular task, and (2) that considerations of 
specific use cases should take into account how much resolution is 
needed, is solid. The argument from example can work the other way as 
well, however:

http://onlinelibrary.wiley.com/doi/10.1046/j.1523-1739.1999.013002427.x/pdf

I also think that your anecdote touches on another subject that might 
need more attention. Here's a set of related (and similarly 
self-exploring) quotes from my "Letter to Linnaeus":

"We’re at a juncture in systematics when more precise phylogenetic 
estimates are published at an increasing rate. There is a concomitant 
trend to archive the results in networked repositories intended to serve 
as the primary ‘hubs’ for systematic information. Both the systematic 
and the computer science community seem to have bought into this vision. 
However it is likely that each community underestimates just how much we 
need to adjust our linguistic habits in order to achieve long-term 
integration of systematic products. Computer scientists use a formal 
language (description logic) to build highly structured networks 
(ontologies) that may include classes, instances, parts, properties, 
relationships, and other components and qualifiers. Once the structure 
is in place then powerful algorithms can ‘reason’ about the constituent 
elements, connect them to other ontologies created for related subject 
areas, and so on.

As computer scientists learn about systematics they must initially see a 
strong match between an ontology and a published taxonomy. However, as 
we’ve seen, a classification is never entirely comprehensible in 
isolation, and instead represents a complex mosaic of previous and new 
elements with implicit identities and relationships to each other. Too 
often such expert-made classifications are only comprehensible to other 
expert speakers, i.e., persons who share an intimate understanding of 
the contextuality of the new system and are thus able to make explicit 
the implicit semantic links to previous systems." [...]

"Why have systematists relied so much on painstakingly acquired, 
implicit assumptions about the taxonomic history of particular groups 
when presenting their new classifications? I believe the reason is 
neither some form of elitism (“take that, users!”) nor a lack of 
self-esteem (“who wants to read about all these subtle similarities and 
differences?”). More likely, it’s simply human habit—we make things just 
as explicit as we think is needed at the moment—paired with the 
similarly human notion that the latest perspective is really the one 
that’s going to last for a long time, in spite of all historical 
evidence to the contrary. And so we pass the burden of full semantic 
resolution, both looking backward and forward, on to future 
specialists." [...]

"However, the Linnaean system is not capable of capturing the entirety 
of semantic adjustments that occur when a previous classification is 
revised in light of new evidence. [...] Instead of abandoning the 
Linnaean system, this observation should lead us to express more clearly 
and more consistently what we mean when presenting a new classification. 
[...] At the human level, this requires that we routinely acknowledge 
the ephemerality of our latest insights, spend more time comparing our 
perspective to a previous one that we no longer think holds true, and 
generally pay more attention to the context in which we use taxonomic 
names. [...] If we supplement the Linnaean system with these 
conventions, there will be more linguistic transparency and less 
mistaken urgency to purge the idiosyncrasies of the past or legislate a 
wrong consensus."

http://academic.uprm.edu/~franz/publications/LetterLinnaeus.pdf

---------------

A few additional comments:

The view that taxonomic concepts represent hypotheses about how certain 
names, types, and descriptions relate to perceived entities in nature is 
by no means incompatible with your point that people understand each 
other sufficiently well in many particular situations using even crude 
shorthands for names and concepts (euphorb versus cactus). The 
reliability of the notion that those two lineages are phylogenetically 
distinct is in the ballpark of that of the law of gravity (well, 
actually I am rather foolishly relying on the veracity of your judgment 
of Dr. Thiele's expertise...making all kinds of ancillary assumptions 
that undermine deductive reasoning, sorry Prof. Popper). So yes, we are 
advancing.

But isn't the point of some of the most critical use cases (e.g., the 
EEA one) not just to properly spell names, but to load up the "system" 
(ontologies, databases, metadata annotations, what have you) with some 
degree of specific taxonomic insight? If so, then we shouldn't 
assume that matters of contextuality are going to be largely 
insignificant; instead, at some level we will have to "teach" the system 
that contextuality.

Binomials and informal names are shorthands that can hold water in 
most casual conversations among humans, especially if and when the 
involved speakers share a similar scientific and even taxon-specific 
training (in addition to all the other semantic and inferential 
expertise they share just by having been born into and raised in 
society; see Quine). I do feel, however, that our reluctance (if there 
actually is one) to go deeper with ontological representations is 
neither necessarily due to an obvious limitation of computers - that 
remains to be shown - nor the most prudent way to move ahead.

Nico Franz

On 11/20/2010 5:45 PM, Bob Morris wrote:
...
What puzzles me about the highly taxonomically technical parts of
these threads is not that the codes of nomenclature seem difficult to
parse in the sense of formal languages---that's true of lots of
human-produced legislation.  It is that in 15 years of hanging out
with biologists, I have rarely heard them use anything other than
binomials in conversation about anything other than whether binomials
are adequate. Why, I wonder, are they not utterly confused during all
those other conversations, and if they are, does that mean that
conversations about biological topics cannot advance biology? (This
seems unlikely to me, else why do they keep doing it?). Does it mean
that "only" hypotheses can come out of these discussions, but that
support for hypotheses can only come from data that is rigorously tied
to delicate name formalisms?  It is hard to believe that only hypotheses
can be the subject of these conversations, except for the position
that everything in science is "only" hypothesis. But maybe when the
amateurs leave the room, they suddenly start talking in more
code-compliant names.
There are plenty of use cases--and successful information
systems---that don't depend on rigorous names.  Some aspects of
morphology form a simple example. For some uses, it is not a problem
to illustrate what a sepal is with several images of different taxa
which are either not named, inadequately named, or even incorrectly
named. Furthermore, this wouldn't change if those images were fetched
from a database in which it is impossible to decide which of those
name defects is in play, e.g. one in which there is nothing other than
binomials as names.
Another example is this conversation, recalled from memory, that I was
party to in Morocco a few years ago:
Bob Morris: Ooh, that's a beautiful cactus.
Kevin Thiele: It's a Euphorb, not a cactus.  There are no cacti here.
Bob: Why does it look like a cactus?
Kevin: It's pretty much the most successful way to deal with very dry
environments. But they are pretty distant from a phylogenetic point
of view.
Since most of the listeners were biologists, I imagine I was the only
one this was news to.  But what I don't believe is that some of the
party had a radically different understanding of the conversation than
I did.
So, the importance of code-compliant names notwithstanding, I would
find it very interesting to see a resource devoted to use cases and
competency questions that are independent of them, along with
accompanying "not fit for use X" annotations. Sort of like warnings on
pharmaceuticals.
Bob Morris
[...]