Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?
Just re-sending the message below because it bounced the first time...
Markus/all,
I guess my point is that (as I understand it) scientificName is a required field in DwC, so the question is what it should be populated with. If it is (e.g.) genus + species epithet + authority, then is it beneficial to supply these fields individually / atomised as well, maybe with other qualifiers as needed?
Just looking for an example "best practice" here - or maybe it exists somewhere and you can just point to it. In other words:
(a) <scientificName>Homo sapiens</scientificName> <scientificNameAuthorship>Linnaeus, 1758</scientificNameAuthorship>
or (b): <scientificName>Homo sapiens Linnaeus, 1758</scientificName> <genus>Homo</genus> <specificEpithet>sapiens</specificEpithet> <scientificNameAuthorship>Linnaeus, 1758</scientificNameAuthorship>
if you get my drift...
Regards - Tony
Tony Rees
Manager, Divisional Data Centre, CSIRO Marine and Atmospheric Research, GPO Box 1538, Hobart, Tasmania 7001, Australia
Ph: 0362 325318 (Int: +61 362 325318) Fax: 0362 325000 (Int: +61 362 325000)
e-mail: Tony.Rees@csiro.au
Manager, OBIS Australia regional node, http://www.obis.org.au/
Biodiversity informatics research activities: http://www.cmar.csiro.au/datacentre/biodiversity.htm
Personal info: http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566
Sorry if I wasn't clear, but definitely (b). Not all names can be easily reassembled from just the atoms. Autonyms need a bit of caution, hybrid formulas surely won't fit into the atoms, and things like Inula L. (s.str.) or Valeriana officinalis s. str. won't be possible either. dwc:scientificName should be the most complete representation of the full name. The (redundant) atomised parts are a recommended nice-to-have that saves consumers from parsing names.
As a consumer this leads to trouble, as there is no guarantee that all terms exist. But the same problem exists with all of the ID terms and their verbatim counterparts. Only additional best practice guidelines can make sure we have the most important terms, such as taxonRank or taxonomicStatus, available.
Markus
On Nov 18, 2010, at 23:26, Tony.Rees@csiro.au wrote:
[...]
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
Well, that sounds fine to me, however you may note that the ICZN Code at least expressly states that authorship is *not* part of the scientific name:
"Article 51. Citation of names of authors.
51.1. Optional use of names of authors. The name of the author does not form part of the name of a taxon and its citation is optional, although customary and often advisable."
I vaguely remember this has been discussed before - would anyone care to comment further?
Cheers - Tony
-----Original Message----- From: Markus Döring [mailto:m.doering@mac.com] Sent: Friday, 19 November 2010 9:49 AM To: Rees, Tony (CMAR, Hobart); David Remsen Cc: tdwg-content@lists.tdwg.org List Subject: Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?
[...]
Including the authors, dates and anything else (with the exception of the infraspecific rank and the hybrid symbol in botany) as part of a thing called "the name" is an unholy abomination, a lexical atrocity, an affront to logic and an insult to the natural order of the cosmos and any deity conceived by humankind.
In botany at least, the "name" (which I take to be the basic communication handle for a taxon) is conventionally the genus plus the species epithet (plus the infraspecies rank and the infraspecies name, if present). All else is protologue and other metadata (e.g. s.l., s.s., taxonomic qualifiers), some of which may be essential for name resolution, but metadata nevertheless. In much communication, the name can and does travel in the absence of its metadata; that is not to say it is a good or a bad thing, only that it happens. I am not saying this binominal approach is a good thing; in many respects Linnaeus and the genus have a lot to answer for, but it is what we have been given to work with.
In zoology... well, who can say what evil lurks within... but if what you say below is right, at least they got it right with the authorship... ;)
I think it is a really bad move to attempt to redefine "name" so as to include the name metadata to achieve some degree of name resolution (basically the list of attributes does not end until you have almost a complete bibliographic citation - is author abbreviation enough? no, add the full author surname? no, add the author initials? no, add the first name? no, add the transferring author? no, add the year of the publication? no, add the journal? no, add the article title? no, add the type specimen? no, add the... )
That is not to say these strings of the name and selected metadata are not useful, perhaps even essential, in certain contexts; only that we should not pretend or declare they are the "name". They are something else and we should find another "name" for them. "Scientific name" is not good enough, as a normal person would interpret this as the Latin name.
Fortunately I think nearly every modern application keeps all the bits of the name and publication metadata separate in some form, so it is just a matter of geekery to glue them together in whatever combination we might require...
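As a rough illustration of the "glue them together" point, here is a minimal Python sketch that builds both the bare name and the name with authorship from separately stored parts. The function names are hypothetical and not part of Darwin Core; the sketch also ignores the cases Markus mentions (autonyms, hybrid formulas, s.str. qualifiers), which do not fit this simple pattern.

```python
def canonical_name(genus, specific_epithet=None, rank=None, infraspecific_epithet=None):
    """Join the atomised parts into the bare name, without authorship."""
    parts = [genus, specific_epithet]
    if infraspecific_epithet:
        # A rank marker (e.g. "subsp.", "var.") is only used alongside an epithet.
        parts += [rank, infraspecific_epithet]
    return " ".join(p for p in parts if p)


def name_with_authorship(canonical, authorship=None):
    """Append the authorship string, if any, to the bare name."""
    return f"{canonical} {authorship}" if authorship else canonical


bare = canonical_name("Homo", "sapiens")
print(bare)                                          # Homo sapiens
print(name_with_authorship(bare, "Linnaeus, 1758"))  # Homo sapiens Linnaeus, 1758
```

Keeping the two functions separate mirrors the distinction argued for in this thread: the bare name is one thing, and the name plus selected metadata is another string derived from it.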
jim
On Fri, Nov 19, 2010 at 9:57 AM, Tony.Rees@csiro.au wrote:
[...]
I'm with Jim. For the love of God let's keep things clean and simple. Have a field for the name without any extraneous junk (and by that I include authorship), and have a separate field for the name plus all the extra stuff. Having fields that atomise the name is also useful, but not at the expense of a field with just the name.
Please, please think of data consumers like me who have to parse this stuff. There is no excuse in this day and age for publishing data that users have to parse before they can do anything sensible with it.
Regards
Rod
On 19 Nov 2010, at 07:06, Jim Croft wrote:
[...]
--
Jim Croft ~ jim.croft@gmail.com ~ +61-2-62509499 ~ http://www.google.com/profiles/jim.croft
'A civilized society is one which tolerates eccentricity to the point of doubtful sanity.' - Robert Frost, poet (1874-1963)
Please send URIs, not attachments: http://www.gnu.org/philosophy/no-word-attachments.html
--------------------------------------------------------- Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
What Darwin Core offers right now are 2 ways of expressing the name:
A) the complete string as dwc:scientificName
B) the atomised parts: genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank (+taxonRank), scientificNameAuthorship
Those 2 options are there to satisfy the different needs we have seen in this thread - the consumers call for a simple input and the need to express complex names in their verbatim form. Is there really anything we are missing?
When it comes to how it's being used in the wild right now, I agree with Dima that there is a lot of variety out there. It would be very, very useful if everyone would always publish both options in a consistent way.
Right now the full name can be found in one of these combinations:
- scientificName
- scientificName & scientificNameAuthorship
- scientificName, taxonRank & scientificNameAuthorship
- scientificName, verbatimTaxonRank & scientificNameAuthorship
- genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, scientificNameAuthorship
- genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank, scientificNameAuthorship
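To make the consumer's side of this concrete, the following Python sketch tries to recover one full name string from whichever of the combinations above a record happens to use. It assumes a record is a plain dict keyed by Darwin Core term names; the function name `full_name` and the fallback rules are illustrative assumptions, not anything prescribed by the standard.

```python
def full_name(record):
    """Best-effort full name from a dict keyed by Darwin Core term names."""
    authorship = (record.get("scientificNameAuthorship") or "").strip()
    name = (record.get("scientificName") or "").strip()

    if not name:
        # No complete string supplied: assemble one from the atomised parts.
        parts = [record.get("genus")]
        if record.get("subgenus"):
            parts.append(f"({record['subgenus']})")
        parts.append(record.get("specificEpithet"))
        if record.get("infraspecificEpithet"):
            # Prefer the verbatim rank marker when the publisher supplied one.
            # Assumption: taxonRank may hold a word like "subspecies" rather
            # than a marker, so this fallback can produce an odd-looking name.
            parts.append(record.get("verbatimTaxonRank") or record.get("taxonRank"))
            parts.append(record["infraspecificEpithet"])
        name = " ".join(p.strip() for p in parts if p and p.strip())

    # Append the authorship only if the string does not already contain it.
    if authorship and authorship not in name:
        name = f"{name} {authorship}".strip()
    return name


print(full_name({"scientificName": "Homo sapiens",
                 "scientificNameAuthorship": "Linnaeus, 1758"}))
print(full_name({"genus": "Homo", "specificEpithet": "sapiens",
                 "scientificNameAuthorship": "Linnaeus, 1758"}))
```

The substring check is deliberately crude: it avoids duplicating an authorship that is already embedded in scientificName, but it cannot tell whether the embedded authorship is formatted the same way as the separate field.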
To make matters worse, the way the authorship is expressed is also impressively rich in variants. In particular, the use of brackets is not always consistent. You find things like:
# regular botanical names with ex authors
Mycosphaerella eryngii (Fr. ex Duby) Johanson ex Oudem. 1897
# original name authors not in brackets, but year is
Lithobius chibenus Ishii & Tamura (1994)
# original name in brackets but year not
Zophosis persis (Chatanay), 1914
# names with imprint years cited
Ctenotus alacer Storr, 1970 ["1969"]
Anomalopus truncatus (Peters, 1876 ["1877"])
Deyeuxia coarctata Kunth, 1815 [1816]
Proasellus arnautovici (Remy 1932 1941)
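A consumer who wants to see how much of this variety a dataset contains could start with something like the sketch below. It is not a name parser: it only pulls out four-digit years and flags the bracket patterns listed above. It expects the authorship portion only (not the full name), and the function name and the returned keys are made up for illustration.

```python
import re

# Four-digit years from 1500 to 2099, as standalone tokens.
YEAR = re.compile(r"\b(1[5-9]\d\d|20\d\d)\b")


def describe_authorship(authorship):
    """Flag year and bracket patterns in an authorship string."""
    text = authorship.strip()
    return {
        # All years found, in order of appearance.
        "years": YEAR.findall(text),
        # String opens with "(" followed by a non-digit, e.g. "(Chatanay), 1914".
        "starts_with_bracketed_author": bool(re.match(r"\(\s*[^\d\s()]", text)),
        # A year alone inside round brackets, e.g. "Ishii & Tamura (1994)".
        "year_in_own_brackets": bool(re.search(r"\(\s*\d{4}\s*\)", text)),
        # A second year in square brackets, e.g. 'Storr, 1970 ["1969"]'.
        "has_imprint_year": bool(re.search(r'\[\s*"?\d{4}"?\s*\]', text)),
    }


for example in ["(Fr. ex Duby) Johanson ex Oudem. 1897",
                "Ishii & Tamura (1994)",
                "(Chatanay), 1914",
                'Storr, 1970 ["1969"]',
                "(Remy 1932 1941)"]:
    print(example, "->", describe_authorship(example))
```

Counting how often each flag fires across a dataset gives a quick picture of which conventions a publisher follows, before deciding how much normalisation is worth attempting.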
On Nov 19, 2010, at 8:50, Roderic Page wrote:
[...]
I'm coming in a bit late on this conversation so I hope I am not repeating what has already been said, but botanical names can also have authorship at both specific and infraspecific levels, e.g. Centaurea apiculata Ledeb. ssp. adpressa (Ledeb.) Dostál
And to make it even more complex, you can have varieties within a subspecies, so 2 infraspecific levels, e.g. Centaurea affinis Friv. ssp. affinis var. affinis
Atomising this properly could be quite complex, but it is necessary in order to present the name as it should be written, with italics in the correct places. E.g. in the example above, the author strings and rank strings are not normally italicised, but the rest of the name is. Unless we can include this formatting information in dwc:scientificName?
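To show what "italics in the correct place" would require from the data, here is a small Python sketch that renders a name from an ordered list of (text, is_italic) pairs. That list structure is purely an assumption for illustration; Darwin Core itself has no such term, which is exactly the gap being raised here.

```python
from html import escape


def italic_name(parts):
    """Render (text, is_italic) pairs as an HTML string."""
    rendered = []
    for text, is_italic in parts:
        safe = escape(text)
        rendered.append(f"<i>{safe}</i>" if is_italic else safe)
    return " ".join(rendered)


# Epithets are italicised; author strings and rank markers are not.
parts = [
    ("Centaurea apiculata", True),
    ("Ledeb.", False),
    ("ssp.", False),
    ("adpressa", True),
    ("(Ledeb.) Dostál", False),
]
print(italic_name(parts))
# <i>Centaurea apiculata</i> Ledeb. ssp. <i>adpressa</i> (Ledeb.) Dostál
```

Note that the species-level author ("Ledeb.") sits in the middle of the name, so a single scientificNameAuthorship value appended at the end could not reproduce this string.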
Regards
John
-----Original Message----- From: tdwg-content-bounces@lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of "Markus Döring (GBIF)" Sent: 19 November 2010 09:24 To: Roderic Page Cc: tdwg-content@lists.tdwg.org; Jim Croft Subject: Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?
What Darwin Core offers right now are 2 ways of expressing the name:
A) the complete string as dwc:scientificName B) the atomised parts: genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank (+taxonRank), scientificNameAuthorship
Those 2 options are there to satisfy the different needs we have seen in this thread - the consumers call for a simple input and the need to express complex names in their verbatim form. Is there really anything we are missing?
When it comes to how its being used in the wild right now I agree with Dima that there is a lot of variety out there. It would be very, very useful if everyone would always publish both options in a consistent way.
Right now the fulI name can be found in once of these combinations: - scientificName - scientificName & scientificNameAuthorship - scientificName, taxonRank & scientificNameAuthorship - scientificName, verbatimTaxonRank & scientificNameAuthorship - genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, scientificNameAuthorship - genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank, scientificNameAuthorship
To make matters worse the way the authorship is expressed is also impressively rich of variants. In particular the use of brackets is not always consistent. You find things like:
# regular botanical names with ex authors Mycosphaerella eryngii (Fr. ex Duby) Johanson ex Oudem. 1897
# original name authors not in brackets, but year is Lithobius chibenus Ishii & Tamura (1994)
# original name in brackets but year not Zophosis persis (Chatanay), 1914
# names with imprint years cited Ctenotus alacer Storr, 1970 ["1969"] Anomalopus truncatus (Peters, 1876 ["1877"]) Deyeuxia coarctata Kunth, 1815 [1816] Proasellus arnautovici (Remy 1932 1941)
On Nov 19, 2010, at 8:50, Roderic Page wrote:
I'm with Jm. For the love of God let's keep things clean and simple. Have a field for the name without any extraneous junk (and by that I include authorship), and have a separate field for the name plus all the extra stuff. Having fields that atomise the name is also useful, but not at the expense of a field with just the name.
Please, please think of data consumers like me who have to parse this stuff. There is no excuse in this day and age for publishing data that users have to parse before they can do anything sensible with it.
Regards
Rod
On 19 Nov 2010, at 07:06, Jim Croft wrote:
Including the authors, dates and any thing else (with the exception of the infraspecific rank and teh hybrid symbol and in botany) as part of a thing called "the name" is an unholy abomination, a lexical atrocity, an affront to logic and an insult the natural order of the cosmos and any deity conceived by humankind.
In botany at least, the "name" (which I take to be the basic communication handle for a taxon) is conventionally the genus plus the species epithet (plus the infraspecies rank and the infraspecies name, if present). All else is protologue and other metadata (e. s.l. s.s, taxonomic qualifiers) some of which may be essential for name resolution, but metadata nevertheless. In much communication, the name can and does travel in the absense of its metadata; that is not to say it is a good or a bad thing, only that it happens. I am not saying thi binominal approach is a good thing, in many respects Linnaeus and the genus have a lot to answer for; but it what we have been given to work with.
in zoology... well, who can say what evil lurks within... but if what you say below is right, at least they got it right with the authorship... ;)
I think it is a really bad move to attempt to redefine "name" so as to include the name metadata to achieve some degree of name resolution (basically the list of attributes does not end until you have almost a complete bibliographic citation - is author abbreviation enough? no, add the full author surname? no, add the author initials? no, add the first name? no, add the transferring author? no, add the year of the publication? no, add the journal? no, add the article title? no, add the type specimen? no, add the... )
That is not to say these strings of the name and selected metadata are not useful, perhaps even essential, in certain contexts; only that we should not pretend or declare they are the "name". They are something else and we should find another "name" for them. "Scientific name" is not good enough, as a normal person would interpret this as the Latin name.
Fortunately I think nearly every modern application keeps all the bits of the name and publication metadata separate in some form, so it is just a matter of geekery to glue them together in whatever combination we might require...
jim
On Fri, Nov 19, 2010 at 9:57 AM, Tony.Rees@csiro.au wrote:
Well, that sounds fine to me, however you may note that the ICZN Code at least expressly states that authorship is *not* part of the scientific name:
"Article 51. Citation of names of authors.
51.1. Optional use of names of authors. The name of the author does not form part of the name of a taxon and its citation is optional, although customary and often advisable."
I vaguely remember this has been discussed before - would anyone care to comment further?
Cheers - Tony
-----Original Message----- From: Markus Döring [mailto:m.doering@mac.com] Sent: Friday, 19 November 2010 9:49 AM To: Rees, Tony (CMAR, Hobart); David Remsen Cc: tdwg-content@lists.tdwg.org List Subject: Re: [tdwg-content] [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?
Sorry if I wasn't clear, but definitely b). Not all names can be easily reassembled with just the atoms. Autonyms need a bit of caution, hybrid formulas surely won't fit into the atoms, and things like Inula L. (s.str.) or Valeriana officinalis s. str. won't be possible either. dwc:scientificName should be the most complete representation of the full name. The (redundant) atomised parts are a recommended nice-to-have to avoid any name parsing.
As a consumer this leads to trouble as there is no guarantee that all terms exist. But the same problem exists with all of the ID terms and their verbatim counterpart. Only additional best practice guidelines can make sure we have the most important terms such as taxonRank or taxonomicStatus available.
Markus
On Nov 18, 2010, at 23:26, Tony.Rees@csiro.au wrote:
Just re-sending the message below because it bounced the first time.
Markus/all,
I guess my point is that (as I understand it) scientificName is a required field in DwC, so the question is what it should be populated with. If it is (e.g.) genus + species epithet + authority, then is it beneficial to supply these fields individually / atomised as well, maybe with other qualifiers as needed?
Just looking for an example "best practice" here - or maybe it exists somewhere and you can just point to it.
in other words:
(a) <scientificName>Homo sapiens</scientificName> <scientificNameAuthorship>Linnaeus, 1758</scientificNameAuthorship>
or (b): <scientificName>Homo sapiens Linnaeus, 1758</scientificName> <genus>Homo</genus> <specificEpithet>sapiens</specificEpithet> <scientificNameAuthorship>Linnaeus, 1758</scientificNameAuthorship>
if you get my drift...
Regards - Tony
Tony Rees
Manager, Divisional Data Centre, CSIRO Marine and Atmospheric Research,
GPO Box 1538, Hobart, Tasmania 7001, Australia
Ph: 0362 325318 (Int: +61 362 325318) Fax: 0362 325000 (Int: +61 362 325000)
e-mail: Tony.Rees@csiro.au
Manager, OBIS Australia regional node, http://www.obis.org.au/
Biodiversity informatics research activities: http://www.cmar.csiro.au/datacentre/biodiversity.htm
Personal info: http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
-- _________________ Jim Croft ~ jim.croft@gmail.com ~ +61-2-62509499 ~ http://www.google.com/profiles/jim.croft 'A civilized society is one which tolerates eccentricity to the point of doubtful sanity.'
- Robert Frost, poet (1874-1963)
Please send URIs, not attachments: http://www.gnu.org/philosophy/no-word-attachments.html
Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: r.page@bio.gla.ac.uk Tel: +44 141 330 4778 Fax: +44 141 330 2792 AIM: rodpage1962@aim.com Facebook: http://www.facebook.com/profile.php?id=1112517192 Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com Home page: http://taxonomy.zoology.gla.ac.uk/rod/rod.html
From a harvesting point of view it is a pain to figure out how an author decided to express the scientific name, and thus this list of combinations
- scientificName
- scientificName & scientificNameAuthorship
- scientificName, taxonRank & scientificNameAuthorship
- scientificName, verbatimTaxonRank & scientificNameAuthorship
- genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, scientificNameAuthorship
- genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank, scientificNameAuthorship
is a nightmare when trying to figure out the intention of an archive author during the harvest. If there were 2 terms -- one only for the 'canonical name' and one for a name with authorship (if known) -- I would be much obliged!
Dima
On Fri, Nov 19, 2010 at 4:24 AM, "Markus Döring (GBIF)" mdoering@gbif.org wrote:
What Darwin Core offers right now are 2 ways of expressing the name:
A) the complete string as dwc:scientificName
B) the atomised parts: genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank (+taxonRank), scientificNameAuthorship
Those 2 options are there to satisfy the different needs we have seen in this thread - the consumers call for a simple input and the need to express complex names in their verbatim form. Is there really anything we are missing?
When it comes to how it's being used in the wild right now I agree with Dima that there is a lot of variety out there. It would be very, very useful if everyone would always publish both options in a consistent way.
Right now the full name can be found in one of these combinations:
- scientificName
- scientificName & scientificNameAuthorship
- scientificName, taxonRank & scientificNameAuthorship
- scientificName, verbatimTaxonRank & scientificNameAuthorship
- genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, scientificNameAuthorship
- genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank, scientificNameAuthorship
To make matters worse, the way the authorship is expressed is also impressively rich in variants. In particular the use of brackets is not always consistent. You find things like:
# regular botanical names with ex authors
Mycosphaerella eryngii (Fr. ex Duby) Johanson ex Oudem. 1897
# original name authors not in brackets, but year is
Lithobius chibenus Ishii & Tamura (1994)
# original name in brackets but year not
Zophosis persis (Chatanay), 1914
# names with imprint years cited
Ctenotus alacer Storr, 1970 ["1969"]
Anomalopus truncatus (Peters, 1876 ["1877"])
Deyeuxia coarctata Kunth, 1815 [1816]
Proasellus arnautovici (Remy 1932 1941)
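The combinations listed above can be handled by a consumer roughly as follows. This is only a minimal sketch, assuming each record arrives as a plain Python dict keyed by unprefixed Darwin Core term names; the function name `full_name` and the joining rules (subgenus in parentheses, rank marker before the infraspecific epithet) are illustrative assumptions, not anything Darwin Core prescribes, and hybrid formulas or qualifiers such as "s. str." are not covered.

```python
def full_name(record):
    """Return a best-effort full name string from a dict of Darwin Core terms."""
    def get(term):
        # Treat missing keys and None values as empty strings.
        return (record.get(term) or "").strip()

    name = get("scientificName")
    if not name:
        # Fall back to the atomised parts when no complete string is given.
        parts = [get("genus")]
        if get("subgenus"):
            parts.append("(" + get("subgenus") + ")")
        parts.append(get("specificEpithet"))
        if get("infraspecificEpithet"):
            # Use the verbatim rank marker (e.g. "subsp.") if one is supplied.
            parts.append(get("verbatimTaxonRank"))
            parts.append(get("infraspecificEpithet"))
        name = " ".join(p for p in parts if p)

    author = get("scientificNameAuthorship")
    # Append the authorship only if the name does not already end with it.
    if author and not name.endswith(author):
        name = (name + " " + author).strip()
    return name
```

The check on the trailing authorship is what lets the same function accept both a scientificName that already carries the authority and one that does not; it will still be fooled by the inconsistent bracket and year styles shown above.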
Markus Döring (GBIF) wrote:
What Darwin Core offers right now are 2 ways of expressing the name:
A) the complete string as dwc:scientificName
B) the atomised parts: genus, subgenus, specificEpithet, infraspecificEpithet, verbatimTaxonRank (+taxonRank), scientificNameAuthorship
Those 2 options are there to satisfy the different needs we have seen in this thread - the consumers call for a simple input and the need to express complex names in their verbatim form. Is there really anything we are missing?
Yes! A consensus on dwc:taxonRank as to whether Latin or English should be used (see http://code.google.com/p/darwincore/wiki/Taxon#taxonRank). I think it is bad to have a controlled vocabulary with two options for every possible value. Why make all software check for two alternatives when a consensus would fix the problem? (Consensus... did I say that word in a tdwg-content email????)
Steve
On Fri, Nov 19, 2010 at 8:56 AM, Steve Baskauf steve.baskauf@vanderbilt.edu wrote:
[consensus discussion] ... Why make all software check for two alternatives when a consensus would fix the problem? (Consensus... did I say that word in a tdwg-content email????)
Ummm, because plenty of data will not meet the consensus? Because robust software checks for things that may occur even if they violate expectations, rules, standards, recommendations, conventions, or consensus?
One model of consensus might be "what most of the real current data does". Since so much biodiversity data is dirty, I'm pretty sure that would bring howls from taxonomists following this list. But it would also have a shot at exposing more data, if one also believes that most consuming applications are not robust. Better is for the community to make recommendations (by consensus or some other mechanism) and let developers of non-robust applications accept responsibility for their non-robustness.
My self-serving(*) position is that a huge amount of dirty current data is being served by organizations/people who have no idea that it is dirty, and no systematic way to find out that it is. By contrast, the relatively small number of client developers are likely to have a good idea of where the dirt is and often can deal with it. The well-known social problem with that arises in circumstances such as yours, where domain scientists are writing software out of necessity or urgency, and rightfully want to get on with their science. They then have to make choices about where to spend their time: on software engineering or science. Many have little choice, since they are paid to be scientists, not software engineers. Nor are lay software engineers the only authors of non-robust software. Analogous time-constraints imposed on professionals often result in the same kinds of problems. Alas, there is no single solution to this conundrum. But my (not biologically informed) guess is that for the problem in this thread, supporting both alternatives does not impose a big burden on developers. In which case there is no need for consensus. :-)
Bob Morris (*) http://etaxonomy.org/mw/FP2010:_Continuous_Quality_Control
Well, I get your point and it certainly wouldn't be a huge burden on developers to check both the Latin and English names. But I thought the whole point of suggesting a controlled vocabulary was so that people would have one recommended value to use for each state they want to represent. Why do we have ISO 639-1 language codes? A good developer could just check for English, english, EN, en, eng., Eng., Engl, engl., etc. etc.
Steve
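Both positions in this exchange can be sketched in a few lines: accept Latin or English rank values on input, but emit a single recommended value. This is a hedged illustration only; the mapping table is a small hand-picked subset, the function name `normalise_rank` is hypothetical, and neither is taken from the GBIF rank vocabulary.

```python
# Illustrative subset only: Latin rank terms mapped to English equivalents.
LATIN_TO_ENGLISH = {
    "regnum": "kingdom",
    "classis": "class",
    "ordo": "order",
    "familia": "family",
    "varietas": "variety",
    "forma": "form",
}

# Ranks this sketch recognises; several are spelled the same in both languages.
KNOWN_RANKS = {
    "kingdom", "phylum", "class", "order", "family",
    "genus", "species", "subspecies", "variety", "form",
}


def normalise_rank(value):
    """Map a Latin or English rank string to one English value, or None."""
    if value is None:
        return None
    # Ignore case, surrounding whitespace and a trailing full stop.
    key = value.strip().lower().rstrip(".")
    key = LATIN_TO_ENGLISH.get(key, key)
    # Unknown values are reported as None rather than guessed at.
    return key if key in KNOWN_RANKS else None
```

Returning None for anything unrecognised keeps dirty values visible to the caller instead of silently passing them through.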
[consensus discussion] ... Why make all software check for two alternatives when a consensus would fix the problem? (Consensus... did I say that word in a tdwg-content email????)
Ummm, because plenty of data will not meet the consensus? Because robust software checks for things that may occur even if they violate expectations, rules, standards, recommendations, conventions, or consensus?
Nevertheless it's worthwhile trying to converge towards a standard vocabulary. So we definitely should recommend a best practice that more and more people can follow over time!
At GBIF we recommend the English values, but if there is a consensus to change that to Latin I think we don't mind: http://rs.gbif.org/vocabulary/gbif/rank.xml
Markus
I agree wholly. There does remain the question of what changes, if any, are being urged for legacy data, and what is the scope of such changes. One thing that ought to happen is that any recommendations be versioned so that software can document and act upon the version it means to. I forget whether GBIF recommendations do this, but in another context I'd urgently like to know. :-)
"Markus Döring (GBIF)" mdoering@gbif.org wrote:
[consensus discussion] ... Why make all software check for two alternatives when a consensus would fix the problem? (Consensus... did I say that word in a tdwg-content email????)
Ummm, because plenty of data will not meet the consensus? Because robust software checks for things that may occur even if they violate expectations, rules, standards, recommendations, conventions, or consensus?
Nevertheless its worthwhile trying to converge towards a standard vocabulary. So we definitely should recommend a best practice that more and more people can follow over time!
At GBIF we recommend the english values, but if there is a consensus to change that to latin I think we dont mind: http://rs.gbif.org/vocabulary/gbif/rank.xml
Markus
In time and from scratch... yes, converging simplifies things in Taxspeak. Smaller dictionary: doubleplusgood.
But existing, real life data is stubbornly diverse. Non-Oceanians would probably favour robust (i.e. multiple) approaches for a while.
Arturo
Folks,
I found the following worked example at http://code.google.com/p/darwincore/wiki/Examples#Taxonomic_Treatment,_norma...
which agrees with Markus' reading below:
dwc:taxonID=10400156
dwc:parentNameUsageID=10400152
dwc:scientificName=Philander opossum Linnaeus, 1758
dwc:scientificNameAuthorship=Linnaeus, 1758
dwc:taxonRank=species
dwc:taxonomicStatus=valid
dwc:nomenclaturalCode=ICZN
dwc:namePublishedIn=Syst. Nat., 10th ed., 1: 55.
dwc:taxonRemarks=Corbet and Hill (1980), Hall (1981), Husson (1978), and Pine (1973) used Metachirops opossum for this species. Reviewed by Castro-Arellano et al. (2000, Mammalian Species, 638). The name D. larvata Jentink, 1888, is a nomen nudum. Didelphis opossum Linnaeus, 1758, is the type species for Holothylax Cabrera, 1919.
dwc:vernacularName=Gray Four-eyed Opossum
dc:source=http://www.bucknell.edu/msw3/browse.asp?id=10400156
However the practical implication of this is that a name parser now has to extract the "scientific name without authority" (a.k.a. canonical name) elements from the element "dwc:scientificName", by one of 2 possible methods: (1) from first principles as per the GNA names parser (which could in theory be tripped up by unexpected/non-standard content), or (2) by doing a subtraction / difference between the "dwc:scientificName" and "dwc:scientificNameAuthorship" fields, which could be similarly tripped up by bad or mismatched content, or by a null in the second field even though there is an authority included in the first.
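Method (2) can be sketched as below. This is a minimal illustration under stated assumptions: the function name `canonical_name` and the two status labels are hypothetical, and the subtraction is only attempted when the authorship string appears verbatim at the end of the name (after collapsing whitespace).

```python
def canonical_name(scientific_name, authorship):
    """Subtract the authorship from the end of the name where possible.

    Returns (name, status): status is "subtracted" when the authorship was
    found and removed, and "unverified" when nothing could be checked.
    """
    # Collapse runs of whitespace so trivial spacing differences do not matter.
    name = " ".join((scientific_name or "").split())
    author = " ".join((authorship or "").split())

    if author and name.endswith(" " + author):
        return name[: -len(author)].rstrip(), "subtracted"

    # Either the authorship is null, or it does not match the end of the name.
    # The name may still contain an authority, so it is flagged, not trusted.
    return name, "unverified"
```

For the record above, `canonical_name("Philander opossum Linnaeus, 1758", "Linnaeus, 1758")` gives `("Philander opossum", "subtracted")`. The "unverified" status covers exactly the failure cases described here: a null second field, or mismatched content in the two fields.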
So, is this still the best method? One could say that genus, species epithet fields etc. are also available which could solve this, but in practice these are populated erratically by different data providers...
Suggestions, comments still appreciated...
Regards - Tony
On Nov 19, 2010, at 6:11 PM, Tony.Rees@csiro.au Tony.Rees@csiro.au wrote:
dwc:scientificName=Philander opossum Linnaeus, 1758
dwc:scientificNameAuthorship=Linnaeus, 1758
If we have the latter, I don't understand why the author and year needs to be in the former, too, frankly. Doesn't this just complicate our lives unnecessarily?
-hilmar
Hi Hilmar, all,
Actually this is similar to my belief when starting this thread. I think it looks odd because there is a missing implied element, i.e. this would make a lot more sense:
dwc:scientificName=Philander opossum Linnaeus, 1758
dwc:canonicalName=Philander opossum
dwc:scientificNameAuthorship=Linnaeus, 1758
However the element "dwc:canonicalName" does not seem to exist, e.g. not present in http://rs.tdwg.org/dwc/terms/index.htm - Maybe it should...
Regards - Tony
________________________________________ From: Hilmar Lapp [hlapp@nescent.org] Sent: Sunday, 21 November 2010 5:34 AM To: Rees, Tony (CMAR, Hobart) Cc: m.doering@mac.com; dremsen@gbif.org; tdwg-content@lists.tdwg.org Subject: Re: [tdwg-content] [ExternalEmail] Re: [tdwg-tag] Inclusion of authorship in DwC scientificName: good or bad?
On Nov 19, 2010, at 6:11 PM, Tony.Rees@csiro.au Tony.Rees@csiro.au wrote:
dwc:scientificName=Philander opossum Linnaeus, 1758
dwc:scientificNameAuthorship=Linnaeus, 1758
If we have the latter, I don't understand why the author and year needs to be in the former, too, frankly. Doesn't this just complicate our lives unnecessarily?
-hilmar -- =========================================================== : Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org : ===========================================================
On 21/11/2010, at 9:00 AM, Tony.Rees@csiro.au Tony.Rees@csiro.au wrote:
dwc:scientificName=Philander opossum Linnaeus, 1758
dwc:canonicalName=Philander opossum
dwc:scientificNameAuthorship=Linnaeus, 1758
However the element "dwc:canonicalName" does not seem to exist, e.g. not present in http://rs.tdwg.org/dwc/terms/index.htm - Maybe it should...
The current TDWG vocabulary calls this "nameComplete".
_______________________ Moral certainty is always a sign of cultural inferiority. The more uncivilized the man, the surer he is that he knows precisely what is right and what is wrong. All human progress, even in morals, has been the work of men who have doubted the current moral values, not of men who have whooped them up and tried to enforce them. The truly civilized man is always skeptical and tolerant. ~ H. L. Mencken, Minority Report (1956), quoted from James A. Haught, ed., 2000 Years of Disbelief
If there is a way to make approach b) foolproof and to make sure everyone follows it, it would be fine for the purpose of converting a DwCA file to something else. In practice, however, when I see both scientificName and scientificNameAuthorship in a core file, scientificName usually contains only the canonical name.
Dima
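To make the practical problem concrete: a converter that wants the full name has to cope with both conventions, because scientificName may or may not already carry the authorship. A rough Python sketch of one way to handle that, assuming records arrive as plain dicts keyed by DwC term names; the function name full_scientific_name is hypothetical, and the substring check is a heuristic, not a substitute for a name parser.

```python
def full_scientific_name(record: dict) -> str:
    """Return scientificName with authorship, whichever convention was used.

    Heuristic (an assumption of this sketch): if the authorship string
    already occurs anywhere in scientificName, the name is taken to be
    complete; otherwise the authorship is appended.
    """
    name = (record.get("scientificName") or "").strip()
    author = (record.get("scientificNameAuthorship") or "").strip()
    if author and author not in name:
        return f"{name} {author}".strip()
    return name


# Convention (a): canonical name plus separate authorship.
rec_a = {"scientificName": "Homo sapiens",
         "scientificNameAuthorship": "Linnaeus, 1758"}
# Convention (b): authorship already included in scientificName.
rec_b = {"scientificName": "Homo sapiens Linnaeus, 1758",
         "scientificNameAuthorship": "Linnaeus, 1758"}

print(full_scientific_name(rec_a))
print(full_scientific_name(rec_b))
```

Very short author strings could match by accident inside a name, so a check like this is only a stopgap until providers agree on one convention.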
On Thu, Nov 18, 2010 at 5:49 PM, Markus Döring m.doering@mac.com wrote:
Sorry if I wasn't clear, but definitely b). Not all names can be easily reassembled from just the atoms. Autonyms need a bit of caution, hybrid formulas surely won't fit into the atoms, and things like Inula L. (s.str.) or Valeriana officinalis s. str. won't be possible either. dwc:scientificName should be the most complete representation of the full name. The (redundant) atomised parts are a recommended nice-to-have to avoid any name parsing.
As a consumer this leads to trouble as there is no guarantee that all terms exist. But the same problem exists with all of the ID terms and their verbatim counterpart. Only additional best practice guidelines can make sure we have the most important terms such as taxonRank or taxonomicStatus available.
Markus
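The consumer side of this can be sketched roughly as follows: take dwc:scientificName verbatim when it is present, and only fall back to gluing the atoms together when it is missing, since there is no guarantee that every term exists. This is a minimal Python sketch, assuming records are plain dicts keyed by DwC term names; display_name and ATOM_TERMS are hypothetical names, and the fallback will not reproduce hybrid formulas or qualifiers such as "s. str.".

```python
# Atomised terms used for the fallback, in display order.
ATOM_TERMS = ("genus", "specificEpithet", "infraspecificEpithet",
              "scientificNameAuthorship")


def display_name(record: dict) -> str:
    """Prefer the complete scientificName; assemble atoms only as a fallback."""
    full = (record.get("scientificName") or "").strip()
    if full:
        return full
    # Any of the atomised terms may be absent, so skip missing or empty ones.
    parts = [(record.get(term) or "").strip() for term in ATOM_TERMS]
    return " ".join(p for p in parts if p)


with_full = {"scientificName": "Valeriana officinalis s. str.",
             "genus": "Valeriana", "specificEpithet": "officinalis"}
atoms_only = {"genus": "Homo", "specificEpithet": "sapiens",
              "scientificNameAuthorship": "Linnaeus, 1758"}

print(display_name(with_full))
print(display_name(atoms_only))
```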
On Nov 18, 2010, at 23:26, Tony.Rees@csiro.au wrote:
Just re-sending the message below because it bounced the first time…
Markus/all,
I guess my point is that (as I understand it) scientificName is a required field in DwC, so the question is what it should be populated with. If it is (e.g.) genus + species epithet + authority, then is it beneficial to supply these fields individually / atomised as well, maybe with other qualifiers as needed?
Just looking for an example "best practice" here - or maybe it exists somewhere and you can just point to it. In other words:
(a) <scientificName>Homo sapiens</scientificName> <scientificNameAuthorship>Linnaeus, 1758</scientificNameAuthorship>
or (b): <scientificName>Homo sapiens Linnaeus, 1758</scientificName> <genus>Homo</genus> <specificEpithet>sapiens</specificEpithet> <scientificNameAuthorship>Linnaeus, 1758</scientificNameAuthorship>
if you get my drift...
Regards - Tony
Tony Rees Manager, Divisional Data Centre, CSIRO Marine and Atmospheric Research, GPO Box 1538, Hobart, Tasmania 7001, Australia Ph: 0362 325318 (Int: +61 362 325318) Fax: 0362 325000 (Int: +61 362 325000) e-mail: Tony.Rees@csiro.au Manager, OBIS Australia regional node, http://www.obis.org.au/ Biodiversity informatics research activities: http://www.cmar.csiro.au/datacentre/biodiversity.htm Personal info: http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
participants (12)
- "Markus Döring (GBIF)"
- Arturo H Ariño
- Bob Morris
- Dmitry Mozzherin
- Hilmar Lapp
- Jim Croft
- John van Breda
- Markus Döring
- Paul Murray
- Roderic Page
- Steve Baskauf
- Tony.Rees@csiro.au