Taxonomic name usage files
Hi TDWG Content group:
Perhaps someone can answer these questions? Suppose I submitted a new biodiversity study to a journal for peer review and ultimately publication. My study mentions taxonomic names, but some names are used in more than one specific sense throughout the manuscript. As part of my study's data body, I want to say things like: at this point or in these sections of my manuscript, I am using the name in the sense of authors X. And: later on in the Discussion, I am using the name in "my new sense" (as an example). I want to submit a table with structured metadata on the various usages of names in my manuscript, as part of the supplementary data provided to the journal. I believe part of what the table would have to reflect, for each usage, is whether this is my usage, or that of someone else that I am ok with (=> define speaker role).
Is there a best TDWG standard to glean terms and definitions from to draft up that table? I assume it is Darwin Core and/or the TCS, but then has someone actually tried this (= extract the subset of terms needed to identify names, usages, speaker roles) in conjunction with (e.g.) a biodiversity inventory or taxonomic revision to be published? The key purpose here would be to facilitate better name usage data practices, tied to the process of publishing new data via journals. To make data about name usages part of the supplementary data, in a structured and rather explicit format.
Thanks and best,
Nico
On 18 Apr 2016, at 4:18 PM, Nico Franz nico.franz@asu.edu wrote:
[...]
I don’t know if this is the standard way to do it, but the relevant Darwin Core term is http://rs.tdwg.org/dwc/terms/nameAccordingTo (or http://rs.tdwg.org/dwc/terms/nameAccordingToID). You could establish a set of taxonIDs in any Darwin Core files that go with your publication, some of which would then share the same scientificName but have a different nameAccordingTo.
I don’t know the best way to represent different agents (outside of citing them in nameAccordingTo), but you can always include that information in http://rs.tdwg.org/dwc/terms/taxonRemarks
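A minimal sketch of the layout described above, using only Python's standard library. The column names taxonID, scientificName, nameAccordingTo and taxonRemarks are the Darwin Core terms mentioned in this message; every identifier, citation and remark in the rows is invented for illustration.

```python
import csv
import io

# Hypothetical rows: two taxonIDs share one scientificName but differ in
# nameAccordingTo. All values below are made up for illustration.
rows = [
    {"taxonID": "ms-usage-1", "scientificName": "Aus bus L.",
     "nameAccordingTo": "Smith (hypothetical earlier reference)",
     "taxonRemarks": "usage adopted from an earlier author"},
    {"taxonID": "ms-usage-2", "scientificName": "Aus bus L.",
     "nameAccordingTo": "this manuscript",
     "taxonRemarks": "the manuscript author's own new sense"},
]


def to_csv(records):
    """Serialize the records as CSV text with a Darwin Core style header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def senses_by_name(records):
    """Group the nameAccordingTo values under each scientificName."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["scientificName"], []).append(
            record["nameAccordingTo"])
    return grouped


taxon_csv = to_csv(rows)
print(taxon_csv)
```

Grouping by scientificName makes it easy to spot names that are used in more than one sense within the same supplementary file.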
cheers, Gaurav
Hi Nico,
I think it would be helpful to define a couple of terms first. I know you already know much of this, but for the benefit of others I think it would help set a baseline for conversation.
Name-String: A literal string of characters representing a Taxon Name. These may include authorships, rank indicators and other qualifiers, and concept qualifiers (e.g., the “sec.” bits.)
Reference: A “reference” is any form of static documentation, at any granularity (usually a publication in the traditional sense, but unpublished documents are also included in the scope of “reference”, as are subsections of traditionally cited publication units, such as a particular section of a single article).
Taxon Name Usage (TNU): The Global Names Usage Bank (GNUB) defines this as essentially the treatment of a taxon name within a Reference. The primary components of each TNU are:
1. A link to the TNU that represents the original establishment of the name (called the “Protonym”, analogous to “basionym” in some ways). When a TNU is itself a Protonym, then this link is self-referential. This is effectively the “Name” part of the TNU.
2. A link to the Reference (as defined above). Again, this usually corresponds to a traditionally-citable publication unit (Article, Book, Chapter, etc.), but may represent an unpublished document (manuscript, correspondence, etc.) or a finer granularity of a more traditionally-cited work (e.g., a few specific pages within an article).
3. The verbatim spelling of the name as it appears in the Reference (as best as can be represented by the UTF-8 Character set).
4. An indication of the taxonomic rank at which a name was used (e.g., species vs. subspecies vs. variety, etc.)
5. An indication of the immediate hierarchical parent of the name, as it was used (always another TNU with a higher taxon rank, within the same Reference)
6. An indication of the subjective validity of the name (recursive to the same record when the name is used as a valid taxon, or to another TNU within the same Reference representing the senior heterotypic synonym).
There are some other properties of TNUs, but this is the core.
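The six core properties listed above can be sketched as a small record type. This is only an illustration and not GNUB's actual schema: the class name, field names and identifiers are assumptions, and the convention that an omitted link points back to the record itself is a shortcut for the self-referential cases in items 1 and 6.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TNU:
    """Hypothetical record for one Taxon Name Usage (not GNUB's schema)."""
    tnu_id: str
    reference_id: str                   # 2. the Reference containing the usage
    verbatim_name: str                  # 3. spelling as it appears there
    rank: str                           # 4. rank at which the name was used
    protonym_id: Optional[str] = None   # 1. TNU that established the name
    parent_id: Optional[str] = None     # 5. immediate parent TNU, same Reference
    valid_as_id: Optional[str] = None   # 6. TNU treated as the valid name

    def __post_init__(self):
        # Assumption of this sketch: an omitted link is self-referential.
        if self.protonym_id is None:
            self.protonym_id = self.tnu_id
        if self.valid_as_id is None:
            self.valid_as_id = self.tnu_id

    @property
    def is_protonym(self) -> bool:
        return self.protonym_id == self.tnu_id

    @property
    def is_valid(self) -> bool:
        return self.valid_as_id == self.tnu_id


# Invented identifiers for the original description and a later usage.
bus_l = TNU("tnu-bus-L", "ref-L", "bus", "species")
aus_smith = TNU("tnu-Aus-Smith", "ref-Smith", "Aus", "genus",
                protonym_id="tnu-Aus-L")
bus_smith = TNU("tnu-bus-Smith", "ref-Smith", "bus", "species",
                protonym_id=bus_l.tnu_id, parent_id=aus_smith.tnu_id)

print(bus_l.is_protonym, bus_smith.is_protonym, bus_smith.is_valid)
```

A fuller model would also need an explicit way to record that a property was left unstated by the Reference, which this sketch does not attempt.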
Appearance: This term was established by James Ytow for his Nomencurator data model, and represents the actual appearance of a text-string name within a Reference. Each TNU (the conceptual treatment of a name within a Reference) might involve only a single Appearance (i.e., the name-string appeared only once within the Reference); or it might involve many appearances of the name within a single TNU (e.g., a study on the biology of Aus bus might include dozens of instances of the text string “Aus bus” or its implied surrogates such as “A. bus” or in some cases, in context, even just “bus”).
Usage Citation: These are less common, but fall into your realm: these are the conceptual citations within a Reference to other specific usages in other References. These usually take the form of “Aus bus L. sec. Smith” (or “sensu” instead of “sec.”), but it’s not the literal name-string itself; rather it’s the implied citation linking a TNU in one Reference to a TNU in a different Reference (best represented in TDWG-land via “relationshipAssertion” of TCS).
If we can keep these four very different concepts separate, it may be helpful to address the example you gave.
For simplicity, let’s call your “study” a single reference. That is, we will pick a snapshot static version of the manuscript or publication. If you wanted to track multiple versions of a MS, then each version would be a separate Reference.
For further simplicity, let’s stick with two qualified Name-Strings for your example: “Aus bus L. sec. Smith” and “Xus bus (L.) sec. Jones”. Obviously, they may show up sometimes as “A. bus”, “X. bus (L.)”, etc. – all of which are represented in their respective Appearances; but again, for simplicity, let’s assume that the two name-strings are used consistently in different parts of the Reference.
Hokay… with THAT out of the way…..
Let’s say that within your Reference there are 12 Appearances of the Name-String “Aus bus L. sec. Smith” on pages 6-15, and 8 Appearances of the Name-String “Xus bus (L.) sec. Jones” on pages 16-20 (again, trying to keep it simple).
You indicated in your example that the latter is being used as "my new sense"; in which case you (author of MS) are Jones. You don’t have to be Jones – you could be Franz and your study compares two separate senses of “bus” by earlier authors “Smith” and “Jones”. In any case, we know the following exist:
Name-Strings:
- “Aus bus L. sec. Smith”
- “Xus bus (L.) sec. Jones”
References:
- Reference represented by “L.”
- Reference represented by “Smith”
- Reference represented by “Jones”
TNUs:
- Usage of “Aus” by L.
- Usage of “bus” by L. (Protonym of “bus”)
- Usage of “Aus” by Smith
- Usage of “bus L.” by Smith
- Usage of “Xus” by Jones
- Usage of “bus L.” by Jones
[You can work out the other properties of each of these 6 TNUs.]
Appearances:
- 12 instances of Name-String “Aus bus L. sec. Smith” among pp. 6-15 of Jones
- 8 instances of Name-String “Xus bus (L.) sec. Jones” among pp. 16-20 of Jones
In summary, so far we have two Name-Strings, three References, six TNUs, and 20 Appearances.
Conspicuously, I have not indicated how many Usage Citations there are in this example. There’s a very simple reason for this: whereas I have spent shamefully large numbers of hours thinking about the other classes of items, I have yet to get my head around how best to model “Usage Citation”. It’s not that I don’t know how to do it; it’s that there are many different ways to do it, and I haven’t yet worked out which way makes the most sense. However, not all is lost, because in all probability they are not needed to solve your issue.
Now (finally) getting back to the answer to your question…. I see two different ways to approach it. The first is probably the most robust and appropriate, which is to create twenty Rows; one for each Appearance, with all the respective properties of each. However, I suspect that would be seen by most people (possibly even myself) as overkill.
What it seems like you’re really looking for is two rows: one representing the TNU of “bus L.” by Smith, and one representing the TNU of “bus L.” by Jones. Columns would break out all the respective bits of these two TNUs, and you’d add another column to indicate which subsections of the MS each one applies to.
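The two-row option can be sketched as follows, reusing the page ranges and Appearance counts from the example above. The column names and the lookup helper are assumptions made for illustration; they are not Darwin Core or TCS terms.

```python
# Two rows, one per TNU of "bus", with a column for the manuscript pages
# where each usage applies. Column names are hypothetical.
usage_rows = [
    {"name_string": "Aus bus L. sec. Smith", "reference": "Smith",
     "manuscript_pages": "6-15", "appearances": 12},
    {"name_string": "Xus bus (L.) sec. Jones", "reference": "Jones",
     "manuscript_pages": "16-20", "appearances": 8},
]


def page_range(pages):
    """Turn a 'first-last' string into an inclusive range of page numbers."""
    first, last = (int(part) for part in pages.split("-"))
    return range(first, last + 1)


def usage_for_page(page, rows):
    """Return the row whose page range covers the page, or None."""
    for row in rows:
        if page in page_range(row["manuscript_pages"]):
            return row
    return None


total_appearances = sum(row["appearances"] for row in usage_rows)
print(total_appearances)
print(usage_for_page(10, usage_rows)["name_string"])
```

A reader or a script could then ask which sense of the name is in force on any given page without iterating over all twenty Appearances.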
A compromise would be to iterate all six TNUs, but that might add more stuff without addressing your specific stated desire to distinguish the two usages of “bus”, and how to refer to them.
I’d be happy to discuss further if none of these quite addresses your main question.
DwC/TCS covers most of what you need, but is missing the “Appearance” class and all of its associated properties. I would refer you to James Ytow’s works on Nomencurator to see how he defines those terms. I’m not aware of anyone who has (yet) tried to tackle the “Usage Citation” in the real world, but as I said earlier, it’s best to start with the “relationshipAssertion” parts of TCS.
If I’m missing some aspect of what you’re asking about, please let me know.
Apologies to everyone else on the list for this massive missive.
Aloha,
Rich
Richard L. Pyle, PhD Database Coordinator for Natural Sciences | Associate Zoologist in Ichthyology | Dive Safety Officer Department of Natural Sciences, Bishop Museum, 1525 Bernice St., Honolulu, HI 96817 Ph: (808)848-4115, Fax: (808)847-8252 email: deepreef@bishopmuseum.org http://hbs.bishopmuseum.org/staff/pylerichard.html
From: tdwg-content [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Nico Franz Sent: Monday, April 18, 2016 12:19 PM To: tdwg-content@lists.tdwg.org; vocab@noreply.github.com Subject: [tdwg-content] Taxonomic name usage files
[...]
Thank you, Gaurav, Rich & Greg!
Really appreciate the detailed information, which will take some time to digest but is very helpful and concrete. I should note that my colleague Mike Rosenberg brought this up (you might be aware of this: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0101704); I was just fishing the TDWG waters for input - productively, as it turns out. Greg - those files are great - thanks! Personal opinion: not the first time that you & Australian colleagues are ahead of the pack.
At a fairly high level, it does seem to me that if certain kinds of speaker intentions facilitate particular semantic integration services (or not so much), then finding a structured way to record these intentions may be worth exploring. For instance, could I possibly be the author of a biodiversity data paper where all TNUs are explicitly not "mine"? Or: how do I record a case where I say "our higher-level classification follows X", but then X's family-level concepts show up as paraphyletic on my phylogeny. Apparently I only followed X so much (at which point I 'said': "screw X, this is better"). All this might mean getting into taxonomists' business, which makes it ambitious.
Best, Nico
On Mon, Apr 18, 2016 at 4:47 PM, Richard Pyle deepreef@bishopmuseum.org wrote:
Hi Nico,
I think it would be helpful to define a couple of terms first. I know you already know much of this, but for the benefit of other I think it would help set a baseline for convsersation.
[...]
Hi Nico,
> At a fairly high level, it does seem to me that if certain kinds of speaker intentions facilitate particular semantic integration services (or not so much), then finding a structured way to record these intentions may be worth exploring.
DEFINITELY agree!
> For instance, could I possibly be the author of a biodiversity data paper where all TNUs are explicitly not "mine"?
By the definition of a TNU, no. The implication of a TNU is that the associated Reference explicitly asserts taxonomic opinion concerning spelling, rank, classification, and validity/synonymy, as captured in the core properties of a TNU. In cases where a Reference does not make an explicit assertion about one or more core properties (e.g., "I recognize this as a distinct species, but I don't know what genus it belongs to."; or "Some authors treat this as a valid species, and others treat it as a junior heterotypic synonym of another species, and I have no opinion on the matter."), the TNU simply records the ambiguous property as "Unspecified". However, the TNU itself, as anchored to Reference X, does not directly defer to a treatment by Reference Y. Such statements would be captured via a "Usage Citation" mechanism that I described in my previous message. In other words, even if Reference X ("my" reference) simply states, "With regard to Aus bus L., I follow the treatment of Smith", then we'd have two TNUs: one for Reference X, and one for Smith, both of which would otherwise have effectively identical TNU properties. Then we would need to leverage the "Usage Citation" mechanism (TCS: relationshipAssertion), to effectively state "[Aus bus L. sec. Reference X] is congruent to [Aus bus L. sec. Smith]". So, even if the taxon concept is not "mine" per se, the TNU most definitely is.
I am unaware of examples where someone asserts "Aus is a valid genus, and Xus is a valid genus (separate from Aus), and the species bus simultaneously belongs to both genera"; or "Aus bus L. is both a valid species and a junior heterotypic synonym of Aus dus". The latter is sometimes incorrectly assumed in cases where there are "In Part" synonymies. However, it's a Taxon NAME Usage, not a Taxon CONCEPT Usage; and as such, a NAME is anchored to a name-bearing type, so can only belong to one species taxon or the other. Edge cases of types that are of hybrid origins, and syntype series with multiple taxa represented, are rectified through mechanisms prescribed by the various Codes.
> Or: how do I record a case where I say "our higher-level classification follows X", but then X's family-level concepts show up as paraphyletic on my phylogeny. Apparently I only followed X so much (at which point I 'said': "screw X, this is better").
This is why TNUs are clean and straightforward (almost always unambiguous), whereas relationshipAssertions are a bit messier (i.e., how MUCH do I actually follow of Smith's treatment of Aus bus? That it's a valid species? What classification to place it in? What rank to treat it as? The spelling of the name? The taxon concept circumscription?). We have our five set-theory relationship types (congruent, includes, included in, overlaps, excludes), but those are somewhat limited when we have a long tradition of conflating hierarchical classification with taxon circumscriptions under the umbrella term of "taxon concept".
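The five set-theory relationship types can be illustrated with a short sketch. It assumes that each circumscription can be written down as a set of member identifiers (for example specimens), which real taxon concepts rarely allow in full; the function name and the identifiers are hypothetical.

```python
from enum import Enum


class Relationship(Enum):
    CONGRUENT = "congruent"
    INCLUDES = "includes"
    INCLUDED_IN = "included in"
    OVERLAPS = "overlaps"
    EXCLUDES = "excludes"


def relate(a, b):
    """Classify how circumscription a relates to circumscription b.

    Both arguments are sets of member identifiers, an assumption made
    purely for this illustration.
    """
    a, b = set(a), set(b)
    if a == b:
        return Relationship.CONGRUENT
    if a > b:            # b is a proper subset of a
        return Relationship.INCLUDES
    if a < b:            # a is a proper subset of b
        return Relationship.INCLUDED_IN
    if a & b:            # some, but not all, members are shared
        return Relationship.OVERLAPS
    return Relationship.EXCLUDES


# Invented specimen identifiers for two usages of the same name.
sec_smith = {"spec-1", "spec-2", "spec-3"}
sec_reference_x = {"spec-1", "spec-2"}
print(relate(sec_smith, sec_reference_x).value)
```

The sketch only compares circumscriptions; it says nothing about rank, classification or spelling, which is exactly where the five relationship types run out.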
Aloha, Rich
P.S. I did not see Greg's reply or files on the list -- was that off-list? I would, of course, love to see them!
Thanks, Rich.
Again, this helps a lot. Not really with permission but probably ok, Greg gave this link: https://www.anbg.gov.au/ibis25/x/UYG2 (see e.g. "Attribute description" files).
Here are some other intentions that I think I sometimes have:
1. Given a multitude of well established and precise historical name usages, I explicitly don't want to commit to one in particular that my present usage is congruent with, or not. (Indeed, I kind of think this is what the name withOUT a sec. does, "explicitly"). I choose to be vague - any past usage is ok with me, here. I think we can presently model the vagueness (by integrating on the strings alone), but not the deliberateness thereof (in contrast to other situations where vagueness is not intended)?
2. Franz. 2010. Revision of Apotomoderes (Insecta: Coleoptera). => Actually, "Insecta" here is more of a social concession to an outdated data filing paradigm than a claim to an active speaker role (related to name usages that I actually care about). I am not intending to apply my taxonomic expertise to "Insecta"; that is *out of scope* (though the string is being written).
Best, Nico
On Tue, Apr 19, 2016 at 1:17 PM, Richard Pyle deepreef@bishopmuseum.org wrote:
[...]
> - Given a multitude of well established and precise historical name usages, I explicitly don't want to commit to one in particular that my present usage is congruent with, or not. (Indeed, I kind of think this is what the name withOUT a sec. does, "explicitly").
Yes! Exactly!
> I choose to be vague - any past usage is ok with me, here.
During the TCS days, this is what we referred to as a "Nomenclatural Concept", which is roughly the sum/average of all historical treatments, more or less (ambiguity and vagueness deliberate/intentional).
> I think we can presently model the vagueness (by integrating on the strings alone), but not the deliberateness thereof (in contrast to other situations where vagueness is not intended)?
Yes - a TNU without any relationshipAssertions. Basically, the only implied association with other TNUs is via the Protonym link (i.e., a Nomenclatural Assertion).
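A minimal sketch of that idea, assuming TNUs are identified by plain strings and relationshipAssertions are stored as (source TNU, relationship, target TNU) triples; both the layout and the identifiers are hypothetical.

```python
# Hypothetical TNU identifiers for one Reference ("X") and one earlier usage.
tnus_in_reference_x = ["X:Aus bus", "X:Aus cus"]

# Hypothetical relationshipAssertions as (source, relationship, target).
assertions = [
    ("X:Aus bus", "congruent", "Smith:Aus bus"),
]


def without_assertions(tnu_ids, assertion_triples):
    """Return the TNUs that make no relationshipAssertion to another TNU."""
    asserted = {source for source, _relationship, _target in assertion_triples}
    return [tnu_id for tnu_id in tnu_ids if tnu_id not in asserted]


unqualified = without_assertions(tnus_in_reference_x, assertions)
print(unqualified)
```

The absence of an assertion is all this can detect. Whether the vagueness was deliberate would still need a separate, explicit field.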
> - Franz. 2010. Revision of Apotomoderes (Insecta: Coleoptera). => Actually, "Insecta" here is more of a social concession to an outdated data filing paradigm than a claim to an active speaker role (related to name usages that I actually care about). I am not intending to apply my taxonomic expertise to "Insecta"; that is *out of scope* (though the string is being written).
So... in that case, the question is whether to represent "Insecta" as used in Franz 2010 as a scientific name, or vernacular name (two different ways of modelling names, as the latter do not have structured Codes). Even if you fall on the side of using it as a scientific (taxon) name, you can still create a TNU for it. In the spirit of Walter's pioneering work on this stuff, TNUs only represent "potential" taxa.
Aloha, Rich
Thanks, Rich, for catching me up.
So then if we can apparently cover a good bit of ground (with what I assume you'd call a "smart [or the only] way of implementing GNUB"), this does raise the question of how to bring this as close as possible to the peer-review/publication process where an author team may be motivated to express their intentions related to name usages in various explicit, structurally recorded ways.
Best, Nico
On Tue, Apr 19, 2016 at 2:09 PM, Richard Pyle deepreef@bishopmuseum.org wrote:
[...]
Good questions! Note that the stuff I was describing isn’t just GNUB; it’s embedded in the DwC terms as well. And the few elements that are missing from DwC are embedded in TCS. The main other aspects are the Reference metadata (the ill-fated TDWG Reference standard; now looking towards NLMS), and Agents (most people I know follow FOAF). Other than those, the main holes (in both GNUB and DwC) are the Appearance stuff, and perhaps a more robust approach to the relationshipAssertion stuff.
Maybe a task for TCS 2.0?
Once the standards are fleshed out, I imagine the way to implement this stuff in new literature is via something along the lines of Pensoft’s PWT (now arpha): http://arpha.pensoft.net/ The natural group to focus on this would be PLAZI, which has been going gangbusters in recent months on capturing structured treatments from existing literature. [Pssst… Donat…. That’s your cue….]
Rich
From: Nico Franz [mailto:nico.franz@asu.edu] Sent: Tuesday, April 19, 2016 11:58 AM To: Richard Pyle Cc: Gaurav Vaidya; greg whitbread; tdwg-content@lists.tdwg.org; vocab@noreply.github.com; Michael Rosenberg (Faculty) Subject: Re: [tdwg-content] Taxonomic name usage files
[...]
I am trying to answer two questions in this thread.
TaxonX. Chuck asked about TaxonX. We are moving away from TaxonX to an archival version of TaxPub, TaxPub-green (https://github.com/plazi/TaxPubJATS-green ), which is related to the TaxPub that Pensoft already uses for publishing and which is based on NLM JATS, a widely used “standard”, albeit in different flavors. Basing everything on JATS-TaxPub makes it easier to make use of both legacy articles (converted into TaxPub) and prospective articles (born digital in TaxPub). Once we have finished this move we will release a note and explain it in detail. But in brief, let’s focus on TaxPub.
Taxonomic names. For our work with taxonomic publications we tag all scientific taxonomic names we discover in the articles. Our main focus is on names for which a taxonomic treatment is provided in the respective article, or which are cited as a reference to an earlier usage and, with that, to a taxonomic treatment.
As an example: http://treatment.plazi.org/id/87F5BDFA-6C63-C7B7-4A85-A1B39AE6CACE Agosti (1990) provides a treatment of Cataglyphis turcomanicus (Emery). He cites all the earlier usages of this name, from the protonym through all the later changes in combination. By doing so he says that, under his understanding of this taxon, he subsumes all the earlier name usages. He does this by citing them. A visualization is provided on the right-hand side under "taxonomy".
A more recent example is this crab: http://treatment.plazi.org/id/F017F605-AC1E-BB43-FF2A-90F9555CF9CD
This treatmentCitation allows us to build a citation graph that is specific in the sense that these taxonomic names have an explicit relationship to an earlier specified usage. Names that are merely found in the text, by contrast, have this relationship only implicitly: it is probably known to the author in most cases and, in a few exceptional cases, to a community that maintains a name server recording the currently accepted name and the history that eventually leads back to the protonym, as is the case for ants (and many more Hymenoptera) on the Hymenoptera Name Server. But this is a discussion in itself.
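As a rough illustration of such a citation graph, the sketch below stores, for each treatment, the earlier treatments it cites, and collects every usage a given treatment subsumes directly or indirectly. The identifiers are placeholders, and the dictionary layout and function name are assumptions for the example, not Plazi's implementation.

```python
from typing import Dict, List

# Hypothetical identifiers. Each key is a treatment; each value lists the
# earlier treatments that it cites.
treatment_citations: Dict[str, List[str]] = {
    "treatment:C": ["treatment:B", "treatment:A"],
    "treatment:B": ["treatment:A"],
    "treatment:A": [],  # stands in for the protonym's original treatment
}


def cited_usages(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Return all treatments reachable from `start` via citation edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for earlier in graph.get(node, []):
            if earlier not in seen:
                seen.add(earlier)
                stack.append(earlier)
    return sorted(seen)


subsumed = cited_usages(treatment_citations, "treatment:C")
```

Names that only occur in running text would have no entry in such a graph, which is the implicit case described above.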
In TaxPub this is handled (somewhat cryptically) as follows:
<nomenclature-citation-list>
  <nomenclature-citation> </nomenclature-citation>
  <nomenclature-citation> </nomenclature-citation>
</nomenclature-citation-list>
See https://htmlpreview.github.io/?https://raw.githubusercontent.com/tcatapano/T...
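A minimal sketch of reading such a citation list with Python's standard library follows. The element names are the ones shown above; the text inside each element is a placeholder invented for the example, and real TaxPub files may use a namespace prefix and nested child elements that this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Placeholder content; only the element names come from the snippet above.
SNIPPET = """
<nomenclature-citation-list>
  <nomenclature-citation>Earlier usage A (placeholder)</nomenclature-citation>
  <nomenclature-citation>Earlier usage B (placeholder)</nomenclature-citation>
</nomenclature-citation-list>
"""


def list_citations(xml_text: str) -> list:
    """Return the text of every nomenclature-citation element."""
    root = ET.fromstring(xml_text.strip())
    return [
        "".join(element.itertext()).strip()
        for element in root.iter("nomenclature-citation")
    ]


citations = list_citations(SNIPPET)
```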
In RDF we make use of CITO (http://purl.org/spar/cito/) to model this relationship
https://github.com/plazi/TreatmentOntologies and here an example
http://tb.plazi.org/GgServer/rdf/87F5BDFA6C63C7B74A85A1B39AE6CACE
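To give a feel for what such a CITO statement looks like, the sketch below writes one N-Triples line by hand. It assumes the generic cito:cites property; which CITO properties Plazi actually uses is not stated here, and the cited URI is a placeholder.

```python
CITO = "http://purl.org/spar/cito/"


def cites_triple(citing_uri: str, cited_uri: str, prop: str = "cites") -> str:
    """Format one N-Triples statement linking a treatment to a cited one."""
    return f"<{citing_uri}> <{CITO}{prop}> <{cited_uri}> ."


triple = cites_triple(
    "http://treatment.plazi.org/id/87F5BDFA-6C63-C7B7-4A85-A1B39AE6CACE",
    "http://example.org/treatment/earlier-usage",  # placeholder URI
)
```

In practice an RDF library would normally be used instead of string formatting; the hand-written form is only meant to show the shape of the statement.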
Donat
From: tdwg-content [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Richard Pyle Sent: Wednesday, April 20, 2016 12:30 AM To: 'Nico Franz' nico.franz@asu.edu Cc: tdwg-content@lists.tdwg.org; vocab@noreply.github.com; 'Michael Rosenberg (Faculty)' msr@asu.edu Subject: Re: [tdwg-content] Taxonomic name usage files
Good questions! Note that the stuff I was describing isn’t just GNUB; it’s embedded in the DwC terms as well. And the few elements that are missing from DwC are embedded in TCS. The other main aspects are the Reference metadata (the ill-fated TDWG Reference standard; now looking towards NLMS), and Agents (most people I know follow FOAF). Other than those, the main holes (in both GNUB and DwC) are the Appearance stuff, and perhaps a more robust approach to the relationshipAssertion stuff.
Maybe a task for TCS 2.0?
Once the standards are fleshed out, I imagine the way to implement this stuff in new literature is via something along the lines of Pensoft’s PWT (now arpha): http://arpha.pensoft.net/ The natural group to focus on this would be PLAZI, which has been going gangbusters in recent months on capturing structured treatments from existing literature. [Pssst… Donat…. That’s your cue….]
Rich
From: Nico Franz [mailto:nico.franz@asu.edu] Sent: Tuesday, April 19, 2016 11:58 AM To: Richard Pyle Cc: Gaurav Vaidya; greg whitbread; tdwg-content@lists.tdwg.org; vocab@noreply.github.com; Michael Rosenberg (Faculty) Subject: Re: [tdwg-content] Taxonomic name usage files
Thanks, Rich, for catching me up.
participants (4)
- Donat Agosti
- Gaurav Vaidya
- Nico Franz
- Richard Pyle