<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=windows-1252"

 http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

I read Rich's email as quoted in Nico's reply - I think maybe Rich's

post didn't actually go out on the tdwg-tag or RDF group lists.  Rich

mentions that he is swamped and will reply later.  For the moment it

may be helpful to cite an earlier email of Rich's which it took me some

time to dig out of the tdwg-content email list:<br>

<br>

<a class="moz-txt-link-freetext" href="http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001703.html">http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001703.html</a><br>

<br>

In that post, Rich was responding to a thread that started when I asked

how one would handle a real-life situation (the specimen pictured in

<a class="moz-txt-link-freetext" href="http://images.cyberfloralouisiana.com/images/specimensheets/lsu/0/0/4/28/LSU00000428_l.jpg">http://images.cyberfloralouisiana.com/images/specimensheets/lsu/0/0/4/28/LSU00000428_l.jpg</a>). 

The relevant part begins about half way down the page with "In the web

example given by Steve, we have... ".  In that section, Rich notes that

<br>

<br>

"Eventually, a third party may be able to deduce (perhaps through a

suite of<br>

other, external information) a RelationshipAssertion that maps the TNU<br>

"[Juncus] diffusissimus Buckl. sec L. Urbatsch 2009" to some other,

perhaps<br>

published and well-defined taxon concept (of the same or different

name).<br>

Also, if there are 100 specimens in the collection that L. Urbatsch<br>

identified as "Juncus diffusissimus Buckl." in 2009, then anchoring all

100<br>

Identification instances to the one TNU, allows all of those specimens

to<br>

inherit the mapping of the one "[Juncus] diffusissimus Buckl. sec L.<br>

Urbatsch 2009" TNU instance to some other better-defined taxon concept."<br>

<br>

From that post, I understood that a TNU (a.k.a. "assertion" in Pyle

2004 <a class="moz-txt-link-freetext" href="http://systbio.org/files/phyloinformatics/1.pdf">http://systbio.org/files/phyloinformatics/1.pdf</a>) can be as vague

as an idea that some determiner had in his/her head about how

organism/specimen instances should be mapped to a name.  I think from

what Rich said there that there is the potential that we as metadata

aggregators may at some later point be able to map how that idea in the

determiner's head fits in with a more well-defined (e.g. published)

taxon description which one may choose to call a taxon concept rather

than a TNU.  <br>
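<br>
The inheritance Rich describes in the quoted passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the identifier strings, the "congruent" relationship label, and the dictionary layout are all hypothetical and are not meant to represent GNUB's actual data model.<br>
<pre>
```python
# One TNU record; the identifier string is a hypothetical placeholder.
TNU_ID = "tnu:juncus-diffusissimus-sec-urbatsch-2009"

# 100 Identification instances, each anchoring one specimen to the same TNU.
# The specimen identifiers are made up for illustration.
identifications = {f"specimen-{n:03d}": TNU_ID for n in range(1, 101)}

# RelationshipAssertions: TNU mapped to (relationship, better-defined concept).
# Empty until a third party deduces a mapping.
relationship_assertions = {}


def resolve(specimen_id):
    """Return the (relationship, concept) a specimen inherits via its TNU, or None."""
    tnu = identifications[specimen_id]
    return relationship_assertions.get(tnu)


# Before any mapping exists, a specimen resolves to nothing.
before = resolve("specimen-042")

# A single assertion added later; the label and concept id are hypothetical.
relationship_assertions[TNU_ID] = ("congruent", "concept:published-treatment")

# Every specimen anchored to the TNU now inherits the same mapping.
after = [resolve(s) for s in identifications]
```
</pre>
The mapping is stored once, against the TNU, so none of the 100 Identification records has to be touched when the mapping eventually arrives.<br>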

<br>

As so often is the case, I think the problem here boils down to

identifiers and the metadata that we associate with them.  Let's say in

the real-life example above, somebody (we can say GNUB) assigns a

persistent identifier (perhaps a URI constructed from an opaque UUID)

to "Juncus diffusissimus Buckl. sec L. Urbatsch 2009".  We could say

with an rdf:type statement that the resource identified by the URI is a

TNU.  We can give that resource a tc:hasName property linking it to the

name which is represented by the string "Juncus diffusissimus Buckl.". 

(I'm not sure what property we use to say that L. Urbatsch made the

assertion).  Now let's say that L. Urbatsch publishes a paper

describing in detail her concept of Juncus diffusissimus Buckl.  We can

now assign the resource identified by the URI a tc:accordingTo property

whose value is the DOI of the paper she wrote.  If we want, we can

replace the previous rdf:type statement with a different one stating that

the resource is a taxon concept rather than a TNU, or if we believe

that all taxon concepts are also TNUs we can leave the rdf:type

statement that we had before and just add a second one saying that the

resource is also of type taxon concept.  <br>
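<br>
A minimal sketch of that sequence in Python, using plain tuples in a set as a stand-in for an RDF store. Several things here are assumptions rather than facts from this thread: the tc: namespace URI should be checked against the TDWG Taxon Concept ontology, the TNU class URI and the name URI are hypothetical, and the DOI is a placeholder that does not point to a real paper.<br>
<pre>
```python
import uuid

# Assumed namespace for the TDWG Taxon Concept ontology; verify before use.
TC = "http://rs.tdwg.org/ontology/voc/TaxonConcept#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical class URI for "taxon name usage"; not a published term.
TNU_CLASS = "http://example.org/terms/TaxonNameUsage"


def mint_uri(base="http://example.org/tnu/"):
    """Build an opaque identifier from a random UUID (the base URI is hypothetical)."""
    return base + str(uuid.uuid4())


# Hypothetical URI standing for the name "Juncus diffusissimus Buckl."
name_uri = "http://example.org/name/juncus-diffusissimus-buckl"
# Placeholder DOI, not a real publication.
paper_doi = "https://doi.org/10.9999/placeholder"

tnu = mint_uri()

# Stage 1: all we know is that the resource is a TNU linked to a name.
triples = {
    (tnu, RDF_TYPE, TNU_CLASS),
    (tnu, TC + "hasName", name_uri),
}

# Stage 2: a paper is published. Metadata is added; the identifier is unchanged.
triples.add((tnu, TC + "accordingTo", paper_doi))
triples.add((tnu, RDF_TYPE, TC + "TaxonConcept"))

# All statements still hang off the one identifier, which now has two types.
subjects = {s for (s, p, o) in triples}
types = {o for (s, p, o) in triples if p == RDF_TYPE}
```
</pre>
This sketch takes the "add a second rdf:type" option. Replacing the first type statement instead would mean removing one tuple before adding the other, and either way the URI minted in stage 1 never changes.<br>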

<br>

The point I'm trying to make is that as long as this "thing" that we

are variously calling "taxon name usage", "taxon concept", "shallow

taxonomic concept", or "deep taxonomic concept" can be assigned an

identifier, what really matters is the metadata we associate with it,

not really what we call it.  The more metadata that we can connect with

it, either through datatype properties like name strings or object

properties that describe how the "thing" is related to other resources,

the "deeper" the concept.  On the other extreme, we may know nothing

more than the name string.  In that case we could call it a "nominal

concept", but we could still assign it an identifier and maybe with

luck we could associate more metadata with it (make it "deeper") at

some point in the future.  <br>

<br>

Returning to the original question of the thread (which was about the

utility of TCS), TCS tries to deal with this problem using what it calls

"signatures" (section 17.2, see

<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu/pages/TCS-Schema-UserGuide-v1.3.pdf">http://bioimages.vanderbilt.edu/pages/TCS-Schema-UserGuide-v1.3.pdf</a>)

which are a somewhat crude attempt to make identifying strings unique

by standardizing their format.  However, TCS was written in 2005-2006. 

Since then, the development of DOIs, the TDWG GUID Applicability

Statement standard, and best practices in the Linked Data world have

provided well-established and standardized ways to create persistent

and dereferenceable identifiers.  So there isn't any reason I can see

why we can't use them.  <br>

<br>

I am going to be bold and say that we already have the minimum tools

required to get started implementing TNUs/TaxonConcepts: <br>

- URI GUIDs (which if one preferred could be UUIDs or  LSIDs -- HTTP

proxied to make Linked Data people happy; see the TDWG GUID

Applicability Statement standard if you don't know how to do this) to

identify the TNU/concepts, <br>

- the two terms tc:hasName and tc:accordingTo (from the TDWG Taxon

Concept ontology) to relate the TNUs/TaxonConcepts to names and sec.

references, and <br>

- some sources for name and publication URI GUIDs.  <br>

There are deficiencies all over the place for that last item, but they

can be addressed over time by improving the scope of the relevant

databases and the quality of the metadata provided.  uBio has URIs for

almost every name I've ever looked for.  BHL has a growing collection

of old literature which has been assigned identifiers by  Rod Page's

BioStor, new literature usually has an assigned, dereferenceable

proxied DOI, and one can even make valid URIs from ISBNs of books

(although they aren't resolvable).  I'm not sure how one should address

the situation where the "sec." reference of a TNU is a person and date

since there isn't a standard database of people (as far as I know). 

But that could be remedied.  Ultimately, one could create the kinds of

mapping tools that Nico and Rich are talking about which relate

different taxon concepts/TNUs which have set theory relationships. 

Whether that would be done with RDF, OWL, or something completely

different I don't know, but the basic anchoring of persistent

identifiers for the TNU/concepts to the names and sec. references

wouldn't have to wait on that.  We could also get hung up about what

terms to use to express the metadata describing the basic TNU/name/sec.

resources, but there is nothing that says that metadata can't change or

be improved over time.  It's the identifier that shouldn't change.  <br>
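<br>
For the first item in the list above, here is a rough sketch of the two identifier styles mentioned: a URI built from a UUID, and an LSID wrapped in an HTTP proxy URI. The proxy base and the sample LSID are hypothetical. The actual proxying convention should be taken from the TDWG GUID Applicability Statement, not from this sketch.<br>
<pre>
```python
import uuid

# Hypothetical proxy base; substitute the resolver the GUID provider actually runs.
PROXY_BASE = "http://example.org/lsid-proxy/"


def proxied_lsid(lsid):
    """Prefix an LSID with an HTTP proxy base so that it can be dereferenced."""
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError("not an LSID: " + lsid)
    return PROXY_BASE + lsid


def uuid_uri(base="http://example.org/tnu/"):
    """Build an HTTP URI from a random, opaque UUID (the base URI is hypothetical)."""
    return base + str(uuid.uuid4())


# The sample LSID below is made up for illustration.
example = proxied_lsid("urn:lsid:example.org:tnu:12345")
another = uuid_uri()
```
</pre>
Either form gives an HTTP URI that can serve as the subject of tc:hasName and tc:accordingTo statements.<br>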

<br>

Am I wrong about this???<br>

<br>

Steve<br>

<br>

Nico Franz wrote:

<blockquote

 cite="mid:CALZMekkx431h=OB7t0dOtcvate85BgLQQiF2uCLALJxCP+Sp1g@mail.gmail.com"

 type="cite">Thank you, Rich.<br>

  <br>

   So we seem to agree on something like this:<br>

  <br>

Rich                                    Nico<br>

taxon name usage   &lt;===&gt;   "shallow" taxonomic concept<br>

taxon concept         &lt;===&gt;   "deep" taxonomic concept<br>

  <br>

Both: labeling is via name sec. author<br>

Both: authoring concepts/usages vs. identifying to those =&gt; slippery

issue; ideally requires proper speaker awareness.<br>

  <br>

   Why the latter? - well, because (again) the desirable effect of

using concepts - the desirable situation where these would have a

justification that goes beyond just really meticulous data management

and advances to the level of facilitating better science qua more

precise taxonomic semantics - only obtains if a great number of name

occurrences in a wide range of shallow-ish sources is linked via

identification to a presumably smaller number of occurrences where

those names are well defined and where successive definitions of names

are semantically linked. So there needs to be an emerging culture of

minimizing concept inflation. Otherwise we obtain what we have now

(mostly just names) and on top of that add new baggage (lots of really

shallow concepts) that nobody can do good semantics with. <br>

  <br>

   Here is where I think we disagree, perhaps just in terms of sales

strategy:<br>

  <br>

   You seem to suggest that making an a priori distinction between TNUs

and concepts is (1) possible in a good number of cases, (2) is

desirable perhaps in the form of registry, and (3) even necessary for

building and populating databases, etc.<br>

  <br>

   Here I disagree, for a number of reasons. First off I do believe

that not defining certain things too soon or too narrowly is sometimes

actually really good science and on the other hand, doing so can be a

show stopper if other people don't share this narrowness and find it

limiting. Second, while we can perhaps readily agree that a lengthy

monograph published in American Museum Novitates rises to the level of

authoring new concepts whereas a label saying "Family Carabidae" on a

specimen submitted as part of an insect student collection does not,

there are enough in-between cases where only time will tell.<br>

  <br>

   Example: USDA Plants promotes a particular perspective of

groundcherry taxonomy, genus-level concept Physalis - <a

 moz-do-not-send="true"

 href="http://plants.usda.gov/java/profile?symbol=physa" target="_blank">http://plants.usda.gov/java/profile?symbol=physa</a>

- with some 29 species-level concepts recognized. ASU's herbarium

curator Les Landrum is a bit of a groundcherry nerd (I say this with

admiration). If you go here: <a moz-do-not-send="true"

 href="http://swbiodiversity.org/seinet/index.php" target="_blank">http://swbiodiversity.org/seinet/index.php</a>,

then Search Collections =&gt; Select All =&gt; Next =&gt; Scientific

Name = Physalis =&gt; Search, you get some 3700 pertinent specimen

records. If you then switch to the Species List tab, you see 115

concepts listed overall. Switching to the USDA Plants Thesaurus will

give you only 46 concepts that these 3700 specimens are mapped to.

Using instead the ASU Taxonomic Thesaurus will yield 89 concepts

linking variously to those specimens. This is based on Les'

classification of groundcherries which is not further documented in the

SEINet environment at this moment.<br>

  <br>

   Now, saying a priori whether Les' list represents a set of TNUs

versus concepts would presumably require you to assert that there is

nobody who is Les or very much like him that can provide a semantically

very accurate mapping of the 89 name usages in the SEINet-ASU Physalis

list to the much more thoroughly circumscribed USDA Plants concepts.

That could seem like a daring prediction given how little Les might

think of the USDA perspective. At the very moment that Les or someone

very much like him DOES provide the mapping, what looked like a list of

TNUs then all of a sudden acquires - via the mapping - a much deeper

semantic status where others can readily go from one classification to

the next, even though each comes with very different amounts of

information in their original appearances. Some people may prefer Les'

concepts at least for Arizonan groundcherries, and in either case, the

mapping puts both on an even playing field.<br>

  <br>

   So this exemplifies IMO why so far the concept approach has been too

abstract, the TCS has been too depauperate on the relationships/mapping

side (worrying instead almost needlessly about what constitutes a

concept per se), and definitions between identifications, name usages,

shallow, deep concepts have been too abstract as well. I believe we

should focus less discussion on those issues and more emphasis on

building mapping tools that can carry a wide range of input and

logically infer additional implied mappings from the initial

expert-given set. The actual semantic properties of that input will

emerge a posteriori and will be hard to predict in some cases. Some

descriptions are lengthy but nobody understands them. Some name lists

are profoundly informative if the context of their origin is well known

to an expert. <br>

  <br>

   There will be some obvious overreaches in both directions (too many

unconnected items, some items that are connected more precisely than

their inherent information would seem to justify). I think these

overreaches would be tolerable. What's less productive to me is a

restrictive set of definitions that provide an early blockage in the

way towards an environment where mapping is supposed to occur very

frequently. We're not at the registry stage yet. More at the "can this

work in principle" stage. As I mentioned before, the mappings ARE the

concepts under a certain viewpoint. We don't want to pre-determine

their fate by separating TNUs from concepts in a great number of cases.<br>

  <br>

   I hope this was not a misrepresentation of your view and also a

clarification of my view. In the end, we both advocate some sort of

balance for the same concerns, but perhaps disagree only strategically

about the moment where/when that balance will materialize - upfront via

precise definitions and registration or later on via the presence/lack

of actual mappings.<br>

  <br>

Best,<br>

  <br>

Nico<br>

  <br>

  <br>

  <div class="gmail_quote">On Mon, Nov 26, 2012 at 5:18 PM, Richard

Pyle <span dir="ltr">&lt;<a moz-do-not-send="true"

 href="mailto:deepreef@bishopmuseum.org" target="_blank">deepreef@bishopmuseum.org</a>&gt;</span>

wrote:<br>

  <blockquote class="gmail_quote"

 style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

    <div link="blue" vlink="purple" lang="EN-US">

    <div>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">I

want to get into this topic in more detail (going back to Steve’s

original post), but this week is hell-week for me, so only a quick

comment now.</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">I

generally agree with everything Nico says, but I think we need to be a

little clearer about what we mean by “name sec. author”.</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">The

core unit of the data model we’ve been building towards (GNUB, which

underlies ZooBank) uses as its fundamental unit something we’ve been

calling a “Taxon Name Usage Instance” (TNU).  The scope of what can be

a TNU is intentionally very broad – anything from an original taxon

name description, to a mention in a newspaper article, and potentially

even a scribbled hand-written label or letter.  The only requirement is

that it be static – that is, a snapshot in time.  I mention this

because database records can be represented as TNUs, but only as a

static snapshot of the record.  If the essence of the database record

changes over time (e.g., due to changing taxonomic opinion), then a new

TNU is generated for a different snapshot in time.</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">A

very small subset of the universe of TNUs represent Code-governed

Nomenclatural Acts (original descriptions of new names and other

code-governed nomenclatural actions). In the case of such TNUs

involving the ICZN Code (for example), the TNUs are registered in

ZooBank.  But the point is, one subset of all TNUs are those that

involve actions governed by a Code of nomenclature.</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">The

reason I mention this is that, if I read Nico’s email correctly, I

think he’s saying that not all TNUs de-facto represent taxon concepts. 

Rather, analogous to the nomenclatural subset of TNUs, there is a

subset of TNUs that rise to the level of representing Taxon Concept

definitions.  In the case of nomenclatural acts, someone must make some

sort of declaration (assertion) that a specific TNU constitutes a

Code-governed nomenclatural act, along with relevant metadata relating

to that assertion and the nature of the Act.  In the case of zoological

names, ZooBank is intended to facilitate this role (i.e., when a person

registers a TNU in ZooBank, there is an implied assertion that the TNU

represents a nomenclatural act under the ICZN Code).</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">What

would be nice to have (and what TDWG could play a helpful role in

facilitating), is a registry of sorts (analogous to ZooBank) for those

TNUs that represent taxon concepts.  In other words, a mechanism for

people to “register” the subset of all TNUs that represent taxon

concepts. Secondarily, there would also be a mechanism to make

assertions about how registered taxon concepts map to each other (via

some sort of set theory relationship[s]).</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">In

summary, my points are </span></p>

    <p><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"><span>1)<span

 style="font-family: &quot;Times New Roman&quot;; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">     

    </span></span></span><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">We

should be clear when we say “name sec. author” whether we mean it sensu

lato (i.e., all TNUs); or sensu stricto (i.e., only those TNUs that

rise to the level of representing taxon concepts).</span></p>

    <p><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"><span>2)<span

 style="font-family: &quot;Times New Roman&quot;; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">     

    </span></span></span><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">There

ought to be a registry (perhaps administered by CoL?) for identifying

the subset of TNUs that represent concept definitions, and it should

include a mechanism for making set-theory relationship assertions among

registered concept-TNUs.</span></p>

    <p><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"><span>3)<span

 style="font-family: &quot;Times New Roman&quot;; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">     

    </span></span></span><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">The

two things mentioned in #2 should be separate; that is, one can assert

that a particular TNU represents a taxon concept separately from

(potentially multiple) assertions about how that taxon concept relates

to other taxon concepts.</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">More

later.</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">Aloha,</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">Rich</span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);"> </span></p>

    <p class="MsoNormal"><span

 style="font-size: 11pt; font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31, 73, 125);">P.S

By my standards that WAS quick!</span></p>

    </div>

    </div>

  </blockquote>

  </div>

</blockquote>

<br>

<pre class="moz-signature" cols="72">-- 

Steven J. Baskauf, Ph.D., Senior Lecturer

Vanderbilt University Dept. of Biological Sciences

postal mail address:

VU Station B 351634

Nashville, TN  37235-1634,  U.S.A.

delivery address:

2125 Stevenson Center

1161 21st Ave., S.

Nashville, TN 37235

office: 2128 Stevenson Center

phone: (615) 343-4582,  fax: (615) 343-6707

<a class="moz-txt-link-freetext" href="http://bioimages.vanderbilt.edu">http://bioimages.vanderbilt.edu</a>

</pre>

</body>

</html>