Dear John & DwC friends,
After finally having time to review the current DwC terms again, I came across a couple of issues I'd like to see discussed or even changed. I have been working with the new terms for nearly a year now during their development, especially with the new and modified taxonomic terms. So far they work very well in practice, but there are a few improvements I can think of, mostly related to the latest changes made shortly before the public review started. I have added them as separate issues on the Google Code site, but list them here in one go. The number of issues is larger than I had hoped, but most of them are minor terminology issues concerning consistency and do not touch the core meaning of the terms.
Markus
--- #47 rename basionym(ID) to originalName(ID)
http://code.google.com/p/darwincore/issues/detail?id=47
The intent of this term is really to reflect where a name originally comes from in case it is a recombination. The term basionym is mostly used by botanists and covers only the cases in which the epithet remains the same, i.e. not replacement names. The best-matching, broader term is therefore originalName, I think. The change has to be made to both the verbatim name term and the ID term.
Good examples of synonyms, basionyms, replaced names etc. can be found in this document: http://www.peabody.yale.edu/other/PROTEM/TAXSIG/taxonomy_synonyms_examples.p...
--- #48 remove taxonConceptID
http://code.google.com/p/darwincore/issues/detail?id=48
The conceptID is intended to state that two name usages / potential taxa are the same, even if they use different names. This is a special case of true concept relations, and I would much prefer to see it covered in a dedicated extension treating all concept relations, especially frequent cases such as includes, overlaps, etc. I am more than willing to define such an extension.
--- #49 rename scientificNameID, acceptedScientificNameID and higherTaxonNameID
http://code.google.com/p/darwincore/issues/detail?id=49
No matter what the final term names are, I think the three should be consistent. Originally the intention was to call them taxonID, acceptedTaxonID and higherTaxonID with a loose definition of a taxon, based more on the idea that all terms here are taxonomic terms and therefore contain "taxon" in their name. The current version (scientificNameID, acceptedScientificNameID and higherTaxonNameID) intends to do the same, I believe, but from what I have seen in practice so far the terminology invites people to use them without referring to each other. Concrete recommendations:
#49a replace scientificNameID with nameUsageID
There is a need to uniquely identify a taxon concept with a given name, i.e. a name usage. A nameID suggests the name is unique, which it isn't if combined with a sec reference, aka taxonAccordingTo. A taxonID suggests a reference to a distinct taxon concept. A name usage seems to be the smallest entity and can therefore act as a sort of unique key for names, taxa, taxon concepts or just usages of a name. All other taxonomic DwC ID terms can and should then point to a name usage ID. This makes me wonder whether most or all other IDs should reflect this in their names, see below.
It could make sense to keep scientificNameID as an ID for the name as defined by a nomenclator. But this ID can also be used as a name usage ID, so for the sake of clarity I would prefer to have the term removed.
#49b rename acceptedScientificName(ID) to acceptedNameUsage(ID)
This term should point to the name usage that reflects the "accepted" taxon in the case of synonyms, no matter whether they are objective or subjective. AcceptedScientificName sounds more like a nomenclatural exercise, and in accordance with #3 (nameUsageID) the term acceptedNameUsage(ID) would be the best fit in my eyes.
#49c rename higherTaxonName(ID) to higherNameUsage(ID)
For consistency with nameUsage & acceptedNameUsage.
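To make the "single key" idea in #49 a bit more concrete, here is a minimal sketch in Python. The term names follow the renames proposed above; the record IDs ("u1" etc.), the dictionary layout and the helper functions are made up for illustration and are not part of any DwC specification.

```python
# Hypothetical name usage records keyed by nameUsageID.
# IDs and layout are illustrative assumptions, not DwC-defined values.
usages = {
    "u1": {"scientificName": "Puma", "rank": "genus"},
    "u2": {
        "scientificName": "Puma concolor",
        "rank": "species",
        "higherNameUsageID": "u1",   # direct parent
        "originalNameID": "u3",      # original combination
    },
    "u3": {
        "scientificName": "Felis concolor",
        "rank": "species",
        "acceptedNameUsageID": "u2", # synonym pointing to the accepted usage
    },
}


def resolve(usage_id, term):
    """Follow one ID term from a usage to the usage it points to, or None."""
    target = usages[usage_id].get(term)
    return usages[target] if target is not None else None


def accepted_name(usage_id):
    """Return the accepted scientificName for a usage (itself if not a synonym)."""
    target = resolve(usage_id, "acceptedNameUsageID")
    return (target or usages[usage_id])["scientificName"]


print(accepted_name("u3"))
print(resolve("u2", "higherNameUsageID")["scientificName"])
```

Because every ID term points into the same key space, one lookup function is enough for the parent, the original name and the accepted usage alike.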
--- #50 remove recommendation to concatenate multiple values, especially for higherTaxonName/higherNameUsage
http://code.google.com/p/darwincore/issues/detail?id=50
Similar to originalName or acceptedNameUsage, this term is meant to be a verbatim pointer to the higher taxon as an alternative to using higherTaxonNameID. Therefore it should only contain a single name, the direct parent, in my eyes. There are also already the 7 major ranks as separate terms that can be used to express a flattened hierarchy. I am aware DwC suggests using concatenated lists in a single term in other places, e.g. , but I believe it would be better to keep the meaning singular and use multiple instances of that term to express multiple values. Dublin Core also recommends using multiple XML elements for multiple values, see recommendation 5 in http://dublincore.org/documents/dc-xml-guidelines/
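As a rough illustration of the difference between one concatenated value and repeated elements, here is a small Python sketch. The element names "record" and "exampleTerm", the ";" delimiter and the sample values are placeholders chosen for this example only; none of them are taken from the DwC or Dublin Core documents.

```python
import xml.etree.ElementTree as ET


def repeated_elements(record_tag, term, values):
    """Build one record element holding one child element per value."""
    record = ET.Element(record_tag)
    for value in values:
        ET.SubElement(record, term).text = value
    return record


# A concatenated value as it might appear in a single field.
# The ";" delimiter is an assumption for this sketch.
concatenated = "value one; value two; value three"
values = [v.strip() for v in concatenated.split(";") if v.strip()]

record = repeated_elements("record", "exampleTerm", values)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

With repeated elements a consumer does not need to know or guess the delimiter, which is the main point of the recommendation.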
--- #51 rename namePublicationID to namePublishedInID
http://code.google.com/p/darwincore/issues/detail?id=51
For consistency with namePublishedIn.
--- #52 rename (verbatim)scientificNameRank to (verbatim)rank
http://code.google.com/p/darwincore/issues/detail?id=52
To avoid discussions about whether the rank belongs to the name or the taxon, and also because it's nice and short and there is no clash in biological terminology.
Hi Markus and others,
My expertise is insufficient to assess all of these proposed changes. I will have to take lack of response to be the same as accord. So, if you have any objections or comments, please make them soon.
I do have a few comments nonetheless. Many of these proposed changes would also require changes to the wording of the definitions and examples. Markus, can you provide those?
The one suggestion with which I disagree is the one for higherTaxonName (Issue #50). The definition of the term currently reflects its origin and a need that could not otherwise be met - a way to provide a complete classification in Simple Darwin Core, in which only one instance of a term may occur per record. I don't think this functionality should be lost. However, I do agree that the term higherTaxonName should be consistent with the related identifier term higherTaxonNameID, which is specifically about the parent name of the scientificName only. So, I recommend that the current higherTaxonName term be renamed to higherClassification and that a new higherTaxonName term be added with the following attributes (adjusted later for name changes to terms, as necessary):
Definition: The name that is the parent of the scientificName.
Comment: Examples: "Animalia", "Chordata", "Vertebrata", "Mammalia", "Theria", "Eutheria", "Rodentia", "Hystricognatha", "Hystricognathi", "Ctenomyidae", "Ctenomyini", "Ctenomys".
ABCD 2.06: DataSets/DataSet/Units/Unit/Identifications/Identification/TaxonIdentified/HigherTaxa/HigherTaxon/HigherTaxonName
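A small Python sketch of how the two terms could sit side by side in a single Simple Darwin Core record: higherClassification carries the whole list in one value, while higherTaxonName carries only the direct parent. Treating the example names above as one lineage, the ";" delimiter, the species name and the helper function are all assumptions made for illustration.

```python
# Example names from the Comment above, assumed here to form one lineage
# ordered from the root down to the direct parent.
lineage = [
    "Animalia", "Chordata", "Vertebrata", "Mammalia", "Theria", "Eutheria",
    "Rodentia", "Hystricognatha", "Hystricognathi", "Ctenomyidae",
    "Ctenomyini", "Ctenomys",
]


def classification_terms(scientific_name, lineage, delimiter=";"):
    """Return a flat record with the full classification and the direct parent."""
    return {
        "scientificName": scientific_name,
        # One term instance per record, so the list is concatenated.
        "higherClassification": delimiter.join(lineage),
        # Only the name that is the parent of the scientificName.
        "higherTaxonName": lineage[-1] if lineage else None,
    }


# "Ctenomys sociabilis" is used only as an illustrative species name.
record = classification_terms("Ctenomys sociabilis", lineage)
print(record["higherTaxonName"])
print(record["higherClassification"])
```

This keeps the one-instance-per-record constraint of Simple Darwin Core while letting higherTaxonName stay consistent with higherTaxonNameID.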
Finally, to me the term name "nameUsage" does nothing to suggest its content. This may require people to read the definition, which would be a good thing, but it might just as easily slip people's notice as something they don't understand and therefore assume to be irrelevant. For such important terms I think that is too dangerous. I'd suggest better names, and definitely request clear definitions that let those currently using the term called "scientificName" know where that same information should go.
John
On Thu, Aug 6, 2009 at 5:31 AM, Markus Döring <m.doering@mac.com> wrote:
[...]
_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content
> I do have a few comments nonetheless. Many of these proposed changes would also require changes to the wording of the definitions and examples. Markus, can you provide those?
David Remsen and I will be happy to do so. Which definitions exactly are currently missing?
> The one suggestion with which I disagree is the one for higherTaxonName (Issue #50). The definition of the term currently reflects its origin and a need that could not otherwise be met - a way to provide a complete classification in Simple Darwin Core, in which only one instance of a term may occur per record. I don't think this functionality should be lost. However, I do agree that the term higherTaxonName should be consistent with the related identifier term higherTaxonNameID, which is specifically about the parent name of the scientificName only. So, I recommend that the current higherTaxonName term be renamed to higherClassification and that a new higherTaxonName term be added with the following attributes (adjusted later for name changes to terms, as necessary):
> Definition: The name that is the parent of the scientificName.
> Comment: Examples: "Animalia", "Chordata", "Vertebrata", "Mammalia", "Theria", "Eutheria", "Rodentia", "Hystricognatha", "Hystricognathi", "Ctenomyidae", "Ctenomyini", "Ctenomys".
> ABCD 2.06: DataSets/DataSet/Units/Unit/Identifications/Identification/TaxonIdentified/HigherTaxa/HigherTaxon/HigherTaxonName
That would be the best option in my eyes too, I agree. And it lines up nicely with ABCD.
> Finally, to me the term name "nameUsage" does nothing to suggest its content. This may require people to read the definition, which would be a good thing, but it might just as easily slip people's notice as something they don't understand and therefore assume to be irrelevant. For such important terms I think that is too dangerous. I'd suggest better names, and definitely request clear definitions that let those currently using the term called "scientificName" know where that same information should go.
I never wanted to challenge scientificName. My main concern is and was the term providing an ID for the "taxon class". I have had great success so far using the taxonomic terms in different contexts, representing nomenclatural work, classic taxonomies or even concept-based taxonomies. Using the different ID terms for the classification, the basionym and the pointer from synonyms to the accepted taxon works especially nicely if you have a single "primary key" to refer to rather than different terms for pure names, taxa or taxon concepts. This is where I thought name usage (see global names usage database, GNUB) would be the least common denominator and better suited as a term than scientificNameID or taxonID. If we can provide a definition of scientificNameID that covers its use for taxa or for any "name usage", as in a name that has been used somewhere (a publication or a digital source), then I am glad to keep scientificNameID. It just sounds very nomenclaturally biased when you come across it without reading the documentation.
It seems we will need new definitions of each of the terms you propose to change as well as any definitions or comments that reference the names or proposed content of any of these terms.
On Sun, Aug 16, 2009 at 11:24 PM, Markus Döring <m.doering@mac.com> wrote:
[...]
While thinking further about how to implement the suggested changes, another question occurred to me. Issue #48 recommends removing taxonConceptID. If it is removed, how would anyone be able to capture the proposition that a given specimen is a member of a circumscription identified by a registered taxon concept (one having a resolvable GUID)? I submit that one could not, because we would be left with name terms only. Unless I'm getting something wrong, I believe this term cannot be removed.
John, I think this comes down to our different understandings of the other IDs. If scientificNameID is purely for the name, as the term suggests, I agree with you that taxonConceptID is still needed. But as David and I have argued, we would prefer a wider definition closer to the originally suggested taxonID (which was turned into scientificNameID at some point): an identifier for anything that is described by the taxonomic terms, be it a name, a taxon (concept) or any other use of a name. So the same name can effectively have different IDs if it has been used in different places, thereby representing different taxonomic concepts. This would make the conceptID superfluous. If the taxon(Concept)ID is to take on this role and the scientificNameID is a purely nomenclatural name identifier, I am with you.
One thing I would very much like to avoid, though, is that some ID terms refer to the scientificNameID (like originalNameID) while others, like higherTaxonID, reference the taxonConceptID. I think it all becomes a lot simpler if there is a single taxon/nameID for all purposes. Similarly, I don't think we would want separate occurrenceID, specimenID and fossilID terms.
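To make the single-identifier idea concrete, here is a minimal sketch in Python. It is not part of any DwC specification: the NameUsage class, its snake_case field names, the IDs and the "Aus bus" records are all invented placeholders. It only illustrates that the same name string can carry several IDs, one per usage, and that every referencing term points into that one ID space.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameUsage:
    """Hypothetical record: one use of a name according to one source."""
    taxon_id: str                             # the single ID for the whole record
    scientific_name: str
    taxon_according_to: str                   # the "sec." reference
    accepted_taxon_id: Optional[str] = None   # points to another taxon_id
    higher_taxon_id: Optional[str] = None     # points to another taxon_id


# Invented placeholder data: the same name used by two different sources.
usages = [
    NameUsage("u1", "Aus bus L.", "Smith 1990"),
    NameUsage("u2", "Aus bus L.", "Jones 2005", accepted_taxon_id="u3"),
    NameUsage("u3", "Aus cus Mill.", "Jones 2005"),
]
by_id = {u.taxon_id: u for u in usages}


def ids_for_name(name: str) -> list:
    """All usage IDs that share one name string."""
    return sorted(u.taxon_id for u in usages if u.scientific_name == name)


def accepted(usage_id: str) -> NameUsage:
    """Follow accepted_taxon_id if present, otherwise the usage is itself accepted."""
    usage = by_id[usage_id]
    return by_id[usage.accepted_taxon_id] if usage.accepted_taxon_id else usage
```

In this sketch the name "Aus bus L." resolves to two IDs, and only the Jones 2005 usage is treated as a synonym, which is the kind of distinction a purely nomenclatural name ID could not express.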
Markus
Right, that all makes sense now, and is exactly the kind of simplification that was already in place in the Location class, where the locationID refers to the Location as a whole, not some part of it, such as a country in one case or a city in another case. So, I agree, remove the taxonConceptID.
I've been struggling to come up with a better term name than nameUsage. After reading the arguments again with every alternative I can come up with (scientificName, taxonName, taxon_name, nameAsUsed, nameAsPublished, publishedName, publishedTaxon), I'm not sure I can really do any better for a name that states specifically what you are trying to encompass with that term. Nevertheless, the term seems awkward, especially on first encounter. The terms would have to be very carefully described (but I guess all terms should be). The trouble is that I think the same difficulty in recognizing what the term is for would recur on the second encounter as well ("What was that term for again?"). I don't think that would happen with terms that were more familiar, even if their meaning is broad. To me, "taxon" works, because it could be a name or a concept - exactly what we're trying to encompass.
So here's what I'd do in an attempt to be clear, concise, and consistent.
Given that the Class is Taxon (which captures the idea of a name as well as it does a concept), consistency would argue that the id term for a record of the class should be taxonID. The list of terms under this scenario would be: taxonID, acceptedTaxonID, higherTaxonID, originalTaxonID, scientificName, acceptedTaxon, higherTaxon, originalTaxon, higherClassification, kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, verbatimTaxonRank, scientificNameAuthorship, nomenclaturalCode, taxonPublicationID, taxonPublication, taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName.
I retained "scientificName" for two big reasons. First, the obvious alternative "taxon" would be too easily confused with the name of the Class "Taxon". Second, scientificName has broad current usage and will immediately suggest the appropriate content for most users. An additional minor reason is that the term contrasts with and is nicely consistent with "vernacularName".
The rest is all dependent on good definitions. Here are some drafts for new definitions for terms that need them. Please suggest any necessary revisions.
taxonID: An identifier for a specific taxon-related name usage (a Taxon record). May be a global unique identifier or an identifier specific to the data set.
acceptedTaxonID: A unique identifier for the acceptedTaxon.
higherTaxonID: A unique identifier for the taxon that is the parent of the scientificName.
originalTaxonID: A unique identifier for the basionym (botany), basonym (bacteriology), or replacement of the scientificName.
scientificName: The taxon name (with date and authorship information if applicable). When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain Identification qualifications, which should instead be supplied in the IdentificationQualifier term.
acceptedTaxon: The currently valid (zoological) or accepted (botanical) name for the scientificName.
higherTaxon: The taxon that is the parent of the scientificName.
originalTaxon: The basionym (botany), basonym (bacteriology), or replacement of the scientificName.
higherClassification: A list (concatenated and separated) of the names for the taxonomic ranks less specific than that given in the scientificName.
kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet - all unchanged.
taxonRank: The taxonomic rank of the scientificName. Recommended best practice is to use a controlled vocabulary.
verbatimTaxonRank: The verbatim original taxonomic rank of the scientificName.
scientificNameAuthorship, nomenclaturalCode - unchanged
taxonPublicationID: A unique identifier for the publication of the Taxon.
taxonPublication: A reference for the publication of the Taxon.
taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName - unchanged.
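As an illustration of how the draft terms above could fit together, here is a minimal Python sketch. It is only a sketch under stated assumptions: the record IDs (t1 to t4), the helper function names and the ";" separator for higherClassification are invented for this example and are not prescribed by the draft definitions.

```python
# Hypothetical Taxon records keyed by taxonID, using the draft term names.
records = {
    "t1": {"taxonID": "t1", "scientificName": "Felidae", "taxonRank": "family"},
    "t2": {"taxonID": "t2", "scientificName": "Puma", "taxonRank": "genus",
           "higherTaxonID": "t1"},
    "t3": {"taxonID": "t3", "scientificName": "Puma concolor (Linnaeus, 1771)",
           "taxonRank": "species", "higherTaxonID": "t2",
           "originalTaxonID": "t4"},
    "t4": {"taxonID": "t4", "scientificName": "Felis concolor Linnaeus, 1771",
           "taxonRank": "species", "acceptedTaxonID": "t3"},
}


def higher_classification(taxon_id, sep=";"):
    """Walk higherTaxonID links upward and join the names, highest rank first.

    The separator is an assumption; the draft only says "concatenated and separated".
    """
    names = []
    seen = set()  # guards against accidental cycles in the data
    current = records[taxon_id].get("higherTaxonID")
    while current is not None and current not in seen:
        seen.add(current)
        names.append(records[current]["scientificName"])
        current = records[current].get("higherTaxonID")
    return sep.join(reversed(names))


def accepted_name(taxon_id):
    """Resolve acceptedTaxonID; a record without one is taken as accepted itself."""
    record = records[taxon_id]
    return records[record.get("acceptedTaxonID", taxon_id)]["scientificName"]
```

Here every ID term (acceptedTaxonID, higherTaxonID, originalTaxonID) points at a taxonID in the same set of records, so a flattened higherClassification string can be derived from the links rather than maintained by hand.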
John, in haste, as I prepare to bike my son to school: as another option for nameUsage, I've been favoring the term "taxonReference".
It's more atomic than taxonID, in the sense that a taxon can be described by a series of taxon references. I think the other linked terms you listed could stay as they are, but I think taxonID and scientificNameID carry implications that don't match the cardinality of the term, at least in some cases, whereas in all cases there is a reference to a taxon.
I notice that namePublishedIn and taxonAccordingTo have been merged into a single taxonPublication. In working through various use cases for modelling published taxon concepts I used these different terms as well as a bibliographic extension and will re-evaluate these with the proposed changes when I get to work.
Cheers, David
On Aug 25, 2009, at 7:00 AM, John R. WIECZOREK wrote:
Right, that all makes sense now, and is exactly the kind of simplification that was already in place in the Location class, where the locationID refers to the Location as a whole, not some part of it, such as a country in one case or a city in another case. So, I agree, remove the taxonConceptID.
I've been struggling with trying to come up with a better term name than nameUsage. After reading the arguments again with every alternative I can come up with (scientificName, taxonName, taxon_name, nameAsUsed, nameAsPublished, publishedName, publishedTaxon) I'm not sure I can really do any better for a name that states specifically what you are trying to encompass with that term. Nevertheless, the term seems awkward, especially on first encounter. The terms would have to be very carefully described (but I guess all terms should be). The problem is, I think the same problem with recognizing what the term is for would happen on the second encounter as well ("What was that term for again?"). I don't think that would happen with terms that were more familiar, even if their meaning is broad. To me, "taxon" works, because it could be a name or a concept - exactly what we're trying to encompass.
So here's what I'd do in an attempt to be clear, concise, and consistent.
Given that the Class is Taxon (which captures the idea of a name as well as it does a concept), consistency would argue that the id term for a record of the class should be taxonID. The list of terms under this scenario would be: taxonID, acceptedTaxonID, higherTaxonID, originalTaxonID, scientificName, acceptedTaxon, higherTaxon, originalTaxon, higherClassification, kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, verbatimTaxonRank, scientificNameAuthorship, nomenclaturalCode, taxonPublicationID, taxonPublication, taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName.
I retained "scientificName " for two big reasons. First, the obvious alternative "taxon" would be too easily confused with the name of the Class "Taxon". Second, scientificName has broad current usage and will immediately suggest the appropriate content for most users. An additional minor reason is that the term contrasts with and is nicely consistent with "vernacularName".
The rest is all dependent on good definitions. Here are some drafts for new definitions for terms that need them. Please suggest any necessary revisions.
taxonID: An identifier for a specific taxon-related name usage (a Taxon record). May be a global unique identifier or an identifier specific to the data set.
acceptedTaxonID: A unique identifier for the acceptedTaxon.
higherTaxonID: A unique identifier for the taxon that is the parent of the scientificName.
originalTaxonID: A unique identifier for the basionym (botany), basonym (bacteriology), or replacement of the scientificName.
scientificName: The taxon name (with date and authorship information if applicable). When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain Identification qualifications, which should instead be supplied in the IdentificationQualifier term.
acceptedTaxon: The currently valid (zoological) or accepted (botanical) name for the scientificName.
higherTaxon: The taxon that is the parent of the scientificName.
originalTaxon: The basionym (botany), basonym (bacteriology), or replacement of the scientificName..
higherClassification: A list (concatenated and separated) of the names for the taxonomic ranks less specific than that given in the scientificName.
kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet - all unchanged.
taxonRank: The taxonomic rank of the scientificName. Recommended best practice is to use a controlled vocabulary.
verbatimTaxonRank: The verbatim original taxonomic rank of the scientificName.
scientificNameAuthorship, nomenclaturalCode - unchanged
taxonPublicationID: A unique identifier for the publication of the Taxon.
taxonPublication: A reference for the publication of the Taxon.
taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName - unchanged.
On Mon, Aug 24, 2009 at 4:15 PM, "Markus Döring (GBIF)"mdoering@gbif.org wrote:
John, I think this is based on the different understanding of the other IDs we are having. If ScientificNameID is purely for the name as the term suggests, I do agree with you that taxonConceptID is still needed. But as me and David have argued we would prefer a wider definition closer to the originally suggested taxonID (which was turned into scientificNameID at some point). An identifier for anything that is described by the taxonomic terms, let it be a name, a taxon (concept) or any other use of a name. So the same name effectively can have different IDs if it has been used in different places, thereby representing different taxonomic concepts. This would make the conceptID superflous. If the taxon(Concept)ID is to take on this role and the scientificNameID is a purely nomenclatural name identifier only, I am with you.
One thing I would like to avoid very much though is that some ID terms would refer to the scientificNameID (like originalNameID) while others like the higherTaxonID would reference the taxonConceptID. I think it all becomes a lot simpler if there is a single taxon/ nameID for all purpuses. Similarly I dont think we would want a separate occurrenceID, specimenID and fossilID.
Markus
On Aug 25, 2009, at 0:55, John R. WIECZOREK wrote:
While thinking further in trying to implement the suggested changes another question occurred to me. The recommendation was made in Issue #48 to remove taxonConceptID. If it is removed, how would anyone be able to capture the proposition that a given specimen was a member of a circumscription identified by a registered (having a resolvable GUID) taxon concept? I pose that one could not, because we would be left only with name terms. Unless I'm getting something wrong, I believe this term cannot be removed.
On Thu, Aug 6, 2009 at 5:31 AM, Markus Döringm.doering@mac.com wrote:
Dear John & DwC friends,
after finally having time to review the current dwc terms again I came across a couple of issues I'd like to see discussed or even changed. I am working for nearly 1 year now with the new terms during their development, especially with the new and modified taxonomic terms. So far they work very well in practice, but there are a few improvements I can think of, mostly related to the latest changes shortly before the public review started. I have added them as separate issues to the google code site, but list them here in one go. The number of issues is larger than I hoped for, but most of them are minor terminology issues for consistency and not touching the core meaning of the terms.
Markus
#47 rename basionym(ID) to originalName(ID) http://code.google.com/p/darwincore/issues/detail?id=47 The intend for this term is really to reflect where a name originally comes from in case it is a recombination. The term basionym is mostly used with botanists and covers only the cases when an epithet remains the same, i.e. not replacement names. The best matching, broader term therefore is originalName I think. Changes have to be done to both the verbatim name and the ID.
Good examples for synonyms, basionyms, replaced names etc can be found in this document:
http://www.peabody.yale.edu/other/PROTEM/TAXSIG/taxonomy_synonyms_examples.p...
#48 remove taxonConceptID http://code.google.com/p/darwincore/issues/detail?id=48 The conceptID is intended to state that 2 name usages / potential taxa are the same, even if they use a different name. This is a special case of true concept relations, and I would much prefer to see this covered in a dedicated extension treating all concept relations, especially frequent cases such as includes, overlaps, etc. I am more than willing to define such an extension.
#49 rename scientificNameID, acceptedScientificNameID and higherTaxonNameID http://code.google.com/p/darwincore/issues/detail?id=49 No matter what the final term names are, I think the three should be consistent. Originally it was intended to call them taxonID, acceptedTaxonID and higherTaxonID with a loose definition of a taxon, based more on the idea that all terms here are taxonomic terms and therefore contain taxon in their name. The current version (scientificNameID, acceptedScientificNameID and higherTaxonNameID) intends to do the same, I believe, but from what I have seen so far in practice the terminology invites people to use them without referring to each other. Concrete recommendations:
#49a replace scientificNameID with nameUsageID There is a need to uniquely identify a taxon concept with a given name, i.e. a name usage. A nameID suggests the name is unique, which it isn't if combined with a sec. reference, aka taxonAccordingTo. A taxonID suggests a reference to a distinct taxon concept. A name usage seems to be the smallest entity and can therefore act as a sort of unique key for names, taxa, taxon concepts or just usages of a name. All other taxonomic DwC ID terms can and should then point to a name usage ID. This makes me wonder whether most or all other IDs should reflect this in their names, see below.
It could make sense to keep scientificNameID as an ID for the name as defined by a nomenclator. But this ID can also be used as a name usage ID, so in order to gain clarity I would prefer to have the term removed.
#49b rename acceptedScientificName(ID) to acceptedNameUsage(ID) This term should point to the name usage that reflects the "accepted" taxon in case of synonyms, no matter whether they are objective or subjective. AcceptedScientificName sounds more like a nomenclatural exercise, and in accordance with #3 (nameUsageID) the term acceptedNameUsage(ID) would be the best fit in my eyes.
#49c rename higherTaxonName(ID) to higherNameUsage(ID) in consistency with nameUsage & acceptedNameUsage
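As an illustration of what #49a to #49c are getting at, here is a minimal Python sketch. It is not part of the proposal: the `NameUsage` class, the helper `accepted_usage`, and all sample names, references and identifiers are made up for illustration. Only the term names (nameUsageID, acceptedNameUsageID, higherNameUsageID, taxonAccordingTo) come from the text above. It shows the same name string carrying two usage IDs because it is used according to two different references, and the other ID terms pointing at usage IDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameUsage:
    # Hypothetical record layout; only the term names come from the proposal.
    nameUsageID: str
    scientificName: str
    taxonAccordingTo: Optional[str] = None
    acceptedNameUsageID: Optional[str] = None  # filled in for synonyms
    higherNameUsageID: Optional[str] = None    # the direct parent usage


# Made-up names, references and local identifiers, for illustration only.
usages = [
    NameUsage("u1", "Aus", "Smith 1990"),
    NameUsage("u2", "Aus bus", "Smith 1990", higherNameUsageID="u1"),
    NameUsage("u3", "Aus bus", "Jones 2005", higherNameUsageID="u1"),
    NameUsage("u4", "Xus bus", "Jones 2005", acceptedNameUsageID="u3"),
]

# The usage ID works as the unique key, even though "Aus bus" occurs twice.
by_id = {u.nameUsageID: u for u in usages}


def accepted_usage(usage: NameUsage) -> NameUsage:
    """Follow acceptedNameUsageID once; a usage without it is treated as accepted."""
    if usage.acceptedNameUsageID is None:
        return usage
    return by_id[usage.acceptedNameUsageID]
```

In this sketch `accepted_usage(by_id["u4"])` resolves to the "Aus bus" usage according to Jones 2005, not the one according to Smith 1990, which is the distinction a plain name ID could not express.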
#50 remove recommendation to concatenate multiple values, especially for higherTaxonName/higherNameUsage http://code.google.com/p/darwincore/issues/detail?id=50
Similar to originalName or acceptedNameUsage, this term is meant to be a verbatim pointer to the higher taxon as an alternative to using higherTaxonNameID. Therefore it should only contain a single name, the direct parent, in my eyes. There are also already the 7 major ranks as separate terms that can be used to express a flattened hierarchy. I am aware DwC suggests using concatenated lists in a single term in other places, e.g. , but I believe it would be better to keep the meaning singular and use multiple instances of that term to express multiple values. Dublin Core also recommends using multiple XML elements for multiple values, see recommendation 5 in http://dublincore.org/documents/dc-xml-guidelines/
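The difference between repeated elements and a concatenated value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `ex:` prefix with its example.org namespace URI, the sample values and the choice of `;` as separator are all placeholders, not anything defined by DwC or Dublin Core.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; element names and namespace URI are placeholders.
RECORD = """
<record xmlns:ex="http://example.org/terms/">
  <ex:scientificName>Aus bus</ex:scientificName>
  <ex:vernacularName>common bus</ex:vernacularName>
  <ex:vernacularName>lesser bus</ex:vernacularName>
</record>
"""

NS = {"ex": "http://example.org/terms/"}
root = ET.fromstring(RECORD)

# Repeated elements: each value arrives intact, no separator convention needed.
repeated = [e.text for e in root.findall("ex:vernacularName", NS)]

# Concatenated alternative: the consumer has to know (or guess) the separator,
# and a value that itself contains the separator would be split wrongly.
concatenated = "common bus; lesser bus"
split_values = [v.strip() for v in concatenated.split(";")]
```

Both routes yield the same two values here, but only the second depends on an agreement about the separator that the data itself does not carry.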
#51 rename namePublicationID to namePublishedInID http://code.google.com/p/darwincore/issues/detail?id=51 for consistency with namePublishedIn
#52 rename (verbatim)scientificNameRank to (verbatim)rank http://code.google.com/p/darwincore/issues/detail?id=52 to avoid discussions about whether the rank belongs to the name or the taxon, and also because it's nice and short and there is no clash in biological terminology. _______________________________________________ tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
If we're striving for compatibility with previous work wherever possible, the TNC schema should be the appropriate "higher" authority when it comes to names and concepts. Departure from that should come only if we're aware of a problem. (We would be justified in abandoning previous DwC terms in order to promote consistency with TNC.)
If the new DwC is supposed to transport purely taxonomic data, and not just identifications of specimens and observations, then I think DwC needs to include at least the most important elements in TNC. (BTW, in the spirit of acknowledging prior art, my earliest knowledge of this distinction, in conjunction with explicit information modeling methods, goes back to 1993 [Beach, Pramanik, and Beaman 1993; as well as Croft and Whitbread pers. comm.])
I understood that TNC distinguishes taxonomic concepts (taxa) from taxonomic names as the biological entities versus the names that refer to them. If a taxonomic name is admitted to the domain of taxa (taxonomic concepts), it's the case referred to as the nominal concept, which I understood to be the union of taxa ever having borne that name.
Taxonomic identifications -- the relationship between an existing concept (taxon) and an organism or a particular group of organisms -- should refer to a taxonomic concept, even if the name used does not include enough information to reference a specific concept. This is in contrast to the application of a name (existing or new) to a taxon that is intended to be (published as) a taxonomic or nomenclatural act. The designation of a type specimen, for example, is not an identification. Perhaps naively, I think it's worth distinguishing identifications and the representation of taxonomic acts.
So in the context of transporting purely taxonomic data we have the need for: 1) a taxonomic name, 2) the concept it designates, 3) the ID of that concept, and 4) the ID of the name itself, if we accept that it has properties other than the string and the string itself does not uniquely identify the "name" in the context of taxonomic nomenclature (i.e., it can be modified for gender agreement and still be the same thing). When dealing with identifications, however, taxonomic names are just supposed to be part of the shorthand for referencing a taxonomic concept; so a taxonID (=taxonConceptID) is needed, but not a taxonNameID, except in the case we want to reveal that a specimen has been designated the type of a particular name and the name can be referenced with an actionable GUID. But again, I don't think of this latter case as an identification.
Do I have the concepts and distinctions correct?
Am I right in assuming that all of these various IDs are supposed to be actionable GUIDs?
-Stan
________________________________________ From: tdwg-content-bounces@lists.tdwg.org [tdwg-content-bounces@lists.tdwg.org] On Behalf Of John R. WIECZOREK [tuco@berkeley.edu] Sent: Monday, August 24, 2009 10:00 PM To: Markus Döring (GBIF) Cc: tdwg-content@lists.tdwg.org Subject: Re: [tdwg-content] DwC taxonomic terms
Right, that all makes sense now, and is exactly the kind of simplification that was already in place in the Location class, where the locationID refers to the Location as a whole, not some part of it, such as a country in one case or a city in another case. So, I agree, remove the taxonConceptID.
I've been struggling with trying to come up with a better term name than nameUsage. After reading the arguments again with every alternative I can come up with (scientificName, taxonName, taxon_name, nameAsUsed, nameAsPublished, publishedName, publishedTaxon) I'm not sure I can really do any better for a name that states specifically what you are trying to encompass with that term. Nevertheless, the term seems awkward, especially on first encounter. The terms would have to be very carefully described (but I guess all terms should be). The problem is, I think the same problem with recognizing what the term is for would happen on the second encounter as well ("What was that term for again?"). I don't think that would happen with terms that were more familiar, even if their meaning is broad. To me, "taxon" works, because it could be a name or a concept - exactly what we're trying to encompass.
So here's what I'd do in an attempt to be clear, concise, and consistent.
Given that the Class is Taxon (which captures the idea of a name as well as it does a concept), consistency would argue that the id term for a record of the class should be taxonID. The list of terms under this scenario would be: taxonID, acceptedTaxonID, higherTaxonID, originalTaxonID, scientificName, acceptedTaxon, higherTaxon, originalTaxon, higherClassification, kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, verbatimTaxonRank, scientificNameAuthorship, nomenclaturalCode, taxonPublicationID, taxonPublication, taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName.
I retained "scientificName" for two big reasons. First, the obvious alternative "taxon" would be too easily confused with the name of the Class "Taxon". Second, scientificName has broad current usage and will immediately suggest the appropriate content for most users. An additional minor reason is that the term contrasts with and is nicely consistent with "vernacularName".
The rest is all dependent on good definitions. Here are some drafts for new definitions for terms that need them. Please suggest any necessary revisions.
taxonID: An identifier for a specific taxon-related name usage (a Taxon record). May be a global unique identifier or an identifier specific to the data set.
acceptedTaxonID: A unique identifier for the acceptedTaxon.
higherTaxonID: A unique identifier for the taxon that is the parent of the scientificName.
originalTaxonID: A unique identifier for the basionym (botany), basonym (bacteriology), or replacement of the scientificName.
scientificName: The taxon name (with date and authorship information if applicable). When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain Identification qualifications, which should instead be supplied in the IdentificationQualifier term.
acceptedTaxon: The currently valid (zoological) or accepted (botanical) name for the scientificName.
higherTaxon: The taxon that is the parent of the scientificName.
originalTaxon: The basionym (botany), basonym (bacteriology), or replacement of the scientificName.
higherClassification: A list (concatenated and separated) of the names for the taxonomic ranks less specific than that given in the scientificName.
kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet - all unchanged.
taxonRank: The taxonomic rank of the scientificName. Recommended best practice is to use a controlled vocabulary.
verbatimTaxonRank: The verbatim original taxonomic rank of the scientificName.
scientificNameAuthorship, nomenclaturalCode - unchanged
taxonPublicationID: A unique identifier for the publication of the Taxon.
taxonPublication: A reference for the publication of the Taxon.
taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName - unchanged.
Stan, thanks for the very clear summary. I think you are pretty much right, though I'm not sure if I understand you correctly when you say we should differentiate between describing taxa and using them for identifications. Do you suggest having a completely separate set of terms for the two purposes?
In case we have 2 ID terms, one for the name (scientificNameID) and one for the concept (taxonConceptID), this maps nicely to TCS. The reason why we favoured nameUsageID over taxonID was that a taxon by definition is accepted. And it would therefore be awkward to use this term as the ID for a synonym or misapplied name - unless it refers to the accepted taxon and is not a "primary key". If we call it taxonConceptID this might be less of a problem. TCS does the same and uses the taxon concept ID as a primary key for classic synonyms aka nominal taxon concepts.
With regard to GUIDs: yes, I would hope that all IDs are GUIDs, but I wouldn't mandate them to be, just recommend it. The only requirement should be that they are consistent within a dataset.
My biggest concern with going for actually two classes (name & taxon concept) and having 2 IDs for them is what do other ID terms like originalNameID, higherTaxonID, acceptedTaxonID refer to? It might cause confusion in case the IDs are not GUIDs - which I think we will have to still live with for quite some time. On the other hand this might be the price to pay for being exact and the terms quite clearly are called either xxxNameID or xxxTaxonID...
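One way to read "consistent within a dataset" for local, non-GUID identifiers is that every referring ID resolves to a record in the same dataset. The following Python sketch checks that. It is an assumption about what such a check could look like, not something prescribed in this thread: the function name `dangling_references`, the dictionary layout and the sample records are invented, and the term names are simply the ones mentioned above.

```python
# Terms whose values are expected to point at another record's taxonID.
REFERENCE_TERMS = ("acceptedTaxonID", "higherTaxonID", "originalNameID")


def dangling_references(records):
    """Return (taxonID, term, value) for every reference that does not resolve locally."""
    known = {r["taxonID"] for r in records}
    problems = []
    for r in records:
        for term in REFERENCE_TERMS:
            value = r.get(term)
            if value and value not in known:
                problems.append((r["taxonID"], term, value))
    return problems


# Made-up dataset with local (non-GUID) identifiers.
dataset = [
    {"taxonID": "1", "scientificName": "Aus"},
    {"taxonID": "2", "scientificName": "Aus bus", "higherTaxonID": "1"},
    {"taxonID": "3", "scientificName": "Xus bus",
     "acceptedTaxonID": "2", "originalNameID": "9"},
]

problems = dangling_references(dataset)
```

With the sample data only the originalNameID "9" on record "3" is reported, because no record with that taxonID exists. If names and taxon concepts had separate ID spaces, the check would need to know which space each referring term points into, which is the confusion raised above.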
Markus
On Aug 25, 2009, at 10:45, Blum, Stan wrote:
-----Original Message----- From: "Markus Döring (GBIF)" [mailto:mdoering@gbif.org] Sent: Tuesday, August 25, 2009 2:31 AM To: Blum, Stan Cc: tuco@berkeley.edu; tdwg-content@lists.tdwg.org Subject: Re: [tdwg-content] DwC taxonomic terms
Stan, I'm not sure if I understand you correctly when you say we should differentiate between describing taxa and using them for identifications. Do you suggest having a completely separate set of terms for the two purposes?
Taxon(Concept)ID would be the same in both contexts, description and identification. My point was that TaxonNameID should NOT be included in an identification. Include the name, yes, along with nameAuthor, date, accordingTo elements, but not the NameID. They are all just strings to be processed in determining what taxonomic concept was intended. I think a TaxonNameID is useful only in documenting the derivation of names (if it's useful at all). That could be an extreme view, but I tossed it out to see if others agreed or not.
In case we have 2 ID terms, one for the name (scientificNameID) and one for the concept (taxonConceptID), this maps nicely to TCS. The reason why we favoured nameUsageID over taxonID was that a taxon by definition is accepted.
In the context of an identification, yes, a taxon is asserted to be valid/accepted by the identifier (at the time), but not all identifications are accepted by the data manager, so that last statement isn't always true. Also not all taxa are accepted/valid within a classification (if it includes synonymous taxa).
I think "nameUsageID" is too obscure for general use. It makes sense, but only after understanding the "name-according-to" concept. I think TaxonID along with (and therefore different from) TaxonNameID make a better combination. TaxonConceptID also works well.
And it would therefore be awkward to use this term as the ID for a synonym or misapplied name - unless it refers to the accepted taxon and is not a "primary key". If we call it taxonConceptID this might be less of a problem. TCS does the same and uses the taxon concept ID as a primary key for classic synonyms aka nominal taxon concepts.
I don't follow the "primary key" point, but I think we agree about the taxonConceptID as the preferred (or acceptable) name for this term.
My biggest concern with going for actually two classes (name & taxon concept) and having 2 IDs for them is what do other ID terms like originalNameID, higherTaxonID, acceptedTaxonID refer to? It might cause confusion in case the IDs are not GUIDs - which I think we will have to still live with for quite some time. On the other hand this might be the price to pay for being exact and the terms quite clearly are called either xxxNameID or xxxTaxonID...
If we are moving from distributed query to harvesting and querying against a large integrated repository, the need for most of the higher taxonomic names goes away. We put them in the original DwC to support querying. In a large integrated repository, the higher classification used by the provider will be stripped off, won't it? All that will matter is the lowest level (most precise) taxon used in the identification(s).
Are we still trying to support the distributed query function?
-Stan
Stan, I'm not sure if I understand you correctly when you say we should differentiate between describing taxa and using them for identifications. Do you suggest having a completely separate set of terms for the two purposes?
Taxon(Concept)ID would be the same in both contexts, description and identification. My point was that TaxonNameID should NOT be included in an identification. Include the name, yes, along with nameAuthor, date, accordingTo elements, but not the NameID. They are all just strings to be processed in determining what taxonomic concept was intended. I think a TaxonNameID is useful only in documenting the derivation of names (if it's useful at all). That could be an extreme view, but I tossed it out to see if others agreed or not.
Agreed. To me a name ID is only good for linking to a richer definition at a nomenclator, an IPNI LSID for example. But I'm not sure if that adds any value to a taxon definition or even an identification.
In case we have 2 ID terms, one for the name (scientificNameID) and one for the concept (taxonConceptID), this maps nicely to TCS. The reason why we favoured nameUsageID over taxonID was that a taxon by definition is accepted.
In the context of an identification, yes, a taxon is asserted to be valid/accepted by the identifier (at the time), but not all identifications are accepted by the data manager, so that last statement isn't always true. Also not all taxa are accepted/valid within a classification (if it includes synonymous taxa).
I think "nameUsageID" is too obscure for general use. It makes sense, but only after understanding the "name-according-to" concept. I think TaxonID along with (and therefore different from) TaxonNameID make a better combination. TaxonConceptID also works well.
And it would therefore be awkward to use this term as the ID for a synonym or misapplied name - unless it refers to the accepted taxon and is not a "primary key". If we call it taxonConceptID this might be less of a problem. TCS does the same and uses the taxon concept ID as a primary key for classic synonyms aka nominal taxon concepts.
I don't follow the "primary key" point, but I think we agree about the taxonConceptID as the preferred (or acceptable) name for this term.
By primary key I meant uniqueness for every record in a taxonomic dataset that also includes synonyms and other non-accepted "taxa". Some people would argue a synonym doesn't have a concept and therefore only accepted taxa have a taxonID. If we agree that this is not the case, then all is fine and I would actually prefer to stick with taxonID instead of the unnecessarily longer taxonConceptID.
My biggest concern with going for actually two classes (name & taxon concept) and having 2 IDs for them is what do other ID terms like originalNameID, higherTaxonID, acceptedTaxonID refer to? It might cause confusion in case the IDs are not GUIDs - which I think we will have to still live with for quite some time. On the other hand this might be the price to pay for being exact and the terms quite clearly are called either xxxNameID or xxxTaxonID...
If we are moving from distributed query to harvesting and querying against a large integrated repository, the need for most of the higher taxonomic names goes away. We put them in the original DwC to support querying. In a large integrated repository, the higher classification used by the provider will be stripped off, won't it? All that will matter is the lowest level (most precise) taxon used in the identification(s).
The higher classification is very useful for both occurrence identifications and a taxonomic dataset. In both cases it helps to identify canonical homonyms. And for taxonomic datasets the classification is a central part of the information. Having explicit terms for higher ranks like dwc:kingdom, dwc:higherTaxon for the verbatim name string of a higher taxon at any rank, and dwc:higherTaxonID to use unique IDs to transmit this hierarchy are just several options for the same thing. It helps the publishers while the consumers have to deal with more complexity, but to a tolerable degree I would think. In my applications so far I tend to implement a precedence higherTaxonID > higherTaxon > KPCOFGS in case several of those terms are present.
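The precedence higherTaxonID > higherTaxon > KPCOFGS could be sketched as follows. This is a minimal Python illustration, not the code of any actual application: the function `resolve_parent`, the plain-dictionary record and the sample values are assumptions. As a simplification it only looks at kingdom through genus and does not check whether the lowest filled rank is the record's own rank.

```python
from typing import Optional, Tuple

# Flattened rank terms from most to least inclusive (kingdom down to genus).
RANK_TERMS = ("kingdom", "phylum", "class", "order", "family", "genus")


def resolve_parent(record: dict) -> Optional[Tuple[str, str]]:
    """Return (term, value) for the best available pointer to the parent taxon."""
    # 1) an explicit ID wins, 2) then the verbatim higher taxon name
    for term in ("higherTaxonID", "higherTaxon"):
        value = record.get(term)
        if value:
            return term, value
    # 3) otherwise fall back to the most specific flattened rank that is filled in
    for term in reversed(RANK_TERMS):
        value = record.get(term)
        if value:
            return term, value
    return None


# Made-up records, for illustration only.
with_id = {"higherTaxonID": "t1", "higherTaxon": "Aus", "family": "Aidae"}
with_name = {"higherTaxon": "Aus", "family": "Aidae"}
ranks_only = {"kingdom": "Animalia", "family": "Aidae"}
```

Here `resolve_parent(with_id)` picks the ID, `resolve_parent(with_name)` picks the verbatim name, and `resolve_parent(ranks_only)` falls back to the family.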
Are we still trying to support the distributed query function?
no
-Stan
Upon re-read of your post I see that taxonAccordingTo has been retained. Therefore I can interpret taxonPublication to simply be a change in the name of namePublishedIn.
I have worked up various examples using the terms in combination with some extensions we (Markus and I) have been drafting. They are listed below with comments. Note that given the instability around the taxon identifier names they might not be congruent with the current terminology.
Euro+Med Example (http://spreadsheets.google.com/pub?key=tnoVriNunOOMzYp709vtauQ&output=ht... )
We hit a snag with this example when the source database did not provide identifiers for the misapplied names and we therefore had to manufacture local identifiers for them. In this example namePublishedIn holds the unparsed primary citation and taxonAccordingTo holds the misapplied name reference.
Peabody Museum Zoological and Botanical Synonyms Example Source Document (http://www.peabody.yale.edu/other/PROTEM/TAXSIG/taxonomy_synonyms_examples.p... ) Transformed Document (http://spreadsheets.google.com/pub?key=ts5YVtLnXCBvv8X-prpprOg&output=ht... ) Transformed Document with Comments (http://code.google.com/p/gbif-ecat/wiki/GNAsynonymsExample )
These examples best illustrate my reasoning for the use of the term "taxon Reference". There is a comments part of the wiki that highlights some of the issues I hit.
Lastly, this is a mapping of the DwC terms to the Catalogue of Life Standard Dataset http://code.google.com/p/gbif-ecat/wiki/CoL_Comparison
I refer to a "GNA standard" in this document to refer to our use of the draft terms in combination with other draft terms structured as extensions according to the text guidelines. In this case I used taxonAccordingTo to reference the latest taxonomic scrutiny property of the standard dataset.
Cheers, David
On Aug 25, 2009, at 7:00 AM, John R. WIECZOREK wrote:
Right, that all makes sense now, and is exactly the kind of simplification that was already in place in the Location class, where the locationID refers to the Location as a whole, not some part of it, such as a country in one case or a city in another case. So, I agree, remove the taxonConceptID.
I've been struggling with trying to come up with a better term name than nameUsage. After reading the arguments again with every alternative I can come up with (scientificName, taxonName, taxon_name, nameAsUsed, nameAsPublished, publishedName, publishedTaxon) I'm not sure I can really do any better for a name that states specifically what you are trying to encompass with that term. Nevertheless, the term seems awkward, especially on first encounter. The terms would have to be very carefully described (but I guess all terms should be). The problem is, I think the same problem with recognizing what the term is for would happen on the second encounter as well ("What was that term for again?"). I don't think that would happen with terms that were more familiar, even if their meaning is broad. To me, "taxon" works, because it could be a name or a concept - exactly what we're trying to encompass.
So here's what I'd do in an attempt to be clear, concise, and consistent.
Given that the Class is Taxon (which captures the idea of a name as well as it does a concept), consistency would argue that the id term for a record of the class should be taxonID. The list of terms under this scenario would be: taxonID, acceptedTaxonID, higherTaxonID, originalTaxonID, scientificName, acceptedTaxon, higherTaxon, originalTaxon, higherClassification, kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, verbatimTaxonRank, scientificNameAuthorship, nomenclaturalCode, taxonPublicationID, taxonPublication, taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName.
I retained "scientificName " for two big reasons. First, the obvious alternative "taxon" would be too easily confused with the name of the Class "Taxon". Second, scientificName has broad current usage and will immediately suggest the appropriate content for most users. An additional minor reason is that the term contrasts with and is nicely consistent with "vernacularName".
The rest is all dependent on good definitions. Here are some drafts for new definitions for terms that need them. Please suggest any necessary revisions.
taxonID: An identifier for a specific taxon-related name usage (a Taxon record). May be a global unique identifier or an identifier specific to the data set.
acceptedTaxonID: A unique identifier for the acceptedTaxon.
higherTaxonID: A unique identifier for the taxon that is the parent of the scientificName.
originalTaxonID: A unique identifier for the basionym (botany), basonym (bacteriology), or replacement of the scientificName.
scientificName: The taxon name (with date and authorship information if applicable). When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain Identification qualifications, which should instead be supplied in the IdentificationQualifier term.
acceptedTaxon: The currently valid (zoological) or accepted (botanical) name for the scientificName.
higherTaxon: The taxon that is the parent of the scientificName.
originalTaxon: The basionym (botany), basonym (bacteriology), or replacement of the scientificName..
higherClassification: A list (concatenated and separated) of the names for the taxonomic ranks less specific than that given in the scientificName.
kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet - all unchanged.
taxonRank: The taxonomic rank of the scientificName. Recommended best practice is to use a controlled vocabulary.
verbatimTaxonRank: The verbatim original taxonomic rank of the scientificName.
scientificNameAuthorship, nomenclaturalCode - unchanged
taxonPublicationID: A unique identifier for the publication of the Taxon.
taxonPublication: A reference for the publication of the Taxon.
taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName - unchanged.
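To make the draft definitions above concrete, here is a minimal sketch of how the ID terms could point from one Taxon record to another. The record layout, the local identifiers (t1, t2, t3) and the helper resolve() are assumptions made for illustration only; they are not part of the draft terms.

```python
# Hypothetical Taxon records keyed by a data-set-local taxonID.
# Only the term names come from the draft list; the IDs are invented.
records = {
    "t1": {
        "taxonID": "t1",
        "scientificName": "Puma Jardine, 1834",
        "taxonRank": "genus",
    },
    "t2": {
        "taxonID": "t2",
        "scientificName": "Puma concolor (Linnaeus, 1771)",
        "taxonRank": "species",
        "higherTaxonID": "t1",    # parent taxon
        "originalTaxonID": "t3",  # original combination (basionym)
    },
    "t3": {
        "taxonID": "t3",
        "scientificName": "Felis concolor Linnaeus, 1771",
        "taxonRank": "species",
        "acceptedTaxonID": "t2",  # synonym pointing to the accepted taxon
    },
}


def resolve(record, id_term):
    """Return the scientificName of the record that id_term points to, or None."""
    target = records.get(record.get(id_term))
    return target["scientificName"] if target else None


print(resolve(records["t3"], "acceptedTaxonID"))
print(resolve(records["t2"], "higherTaxonID"))
print(resolve(records["t2"], "originalTaxonID"))
```

Because every ID term refers to the same kind of thing (a taxonID of another record), one lookup function is enough for all of them.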
On Mon, Aug 24, 2009 at 4:15 PM, "Markus Döring (GBIF)" mdoering@gbif.org wrote:
John, I think this stems from our different understandings of the other IDs. If ScientificNameID is purely for the name, as the term suggests, I do agree with you that taxonConceptID is still needed. But as David and I have argued, we would prefer a wider definition closer to the originally suggested taxonID (which was turned into scientificNameID at some point): an identifier for anything that is described by the taxonomic terms, be it a name, a taxon (concept) or any other use of a name. So the same name can effectively have different IDs if it has been used in different places, thereby representing different taxonomic concepts. This would make the conceptID superfluous. If the taxon(Concept)ID is to take on this role and the scientificNameID is purely a nomenclatural name identifier, I am with you.
One thing I would very much like to avoid, though, is that some ID terms would refer to the scientificNameID (like originalNameID) while others like the higherTaxonID would reference the taxonConceptID. I think it all becomes a lot simpler if there is a single taxon/nameID for all purposes. Similarly, I don't think we would want a separate occurrenceID, specimenID and fossilID.
Markus
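The idea that the same name can carry different IDs depending on where it was used can be sketched as follows. This is only an illustration under stated assumptions: the function usage_id(), the "usage-N" identifier scheme and the placeholder name and references are all hypothetical.

```python
from itertools import count

# Hypothetical registry: one identifier per (name, according-to) pair,
# i.e. per name usage rather than per name string.
_next_number = count(1)
_usage_ids = {}


def usage_id(scientific_name, taxon_according_to):
    """Return a stable local identifier for one usage of a name."""
    key = (scientific_name, taxon_according_to)
    if key not in _usage_ids:
        _usage_ids[key] = f"usage-{next(_next_number)}"
    return _usage_ids[key]


# "Aus bus" and both references are placeholders, not real data.
first = usage_id("Aus bus", "Smith 1900")
second = usage_id("Aus bus", "Jones 1950")
print(first != second)                            # same name, two usages
print(usage_id("Aus bus", "Smith 1900") == first)  # same usage, same ID
```

With identifiers assigned per usage, a separate concept identifier adds nothing for records that already distinguish usages this way.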
On Aug 25, 2009, at 0:55, John R. WIECZOREK wrote:
While thinking further in trying to implement the suggested changes another question occurred to me. The recommendation was made in Issue #48 to remove taxonConceptID. If it is removed, how would anyone be able to capture the proposition that a given specimen was a member of a circumscription identified by a registered (having a resolvable GUID) taxon concept? I pose that one could not, because we would be left only with name terms. Unless I'm getting something wrong, I believe this term cannot be removed.
On Thu, Aug 6, 2009 at 5:31 AM, Markus Döring m.doering@mac.com wrote:
Dear John & DwC friends,
after finally having time to review the current dwc terms again I came across a couple of issues I'd like to see discussed or even changed. I have been working with the new terms for nearly a year now during their development, especially with the new and modified taxonomic terms. So far they work very well in practice, but there are a few improvements I can think of, mostly related to the latest changes shortly before the public review started. I have added them as separate issues to the google code site, but list them here in one go. The number of issues is larger than I hoped for, but most of them are minor terminology issues for consistency and do not touch the core meaning of the terms.
Markus
#47 rename basionym(ID) to originalName(ID)
http://code.google.com/p/darwincore/issues/detail?id=47
The intent of this term is really to reflect where a name originally comes from in case it is a recombination. The term basionym is mostly used by botanists and covers only the cases where an epithet remains the same, i.e. not replacement names. The best matching, broader term is therefore originalName, I think. Changes have to be made to both the verbatim name and the ID.
Good examples for synonyms, basionyms, replaced names etc can be found in this document:
http://www.peabody.yale.edu/other/PROTEM/TAXSIG/taxonomy_synonyms_examples.p...
#48 remove taxonConceptID
http://code.google.com/p/darwincore/issues/detail?id=48
The conceptID is intended to state that 2 name usages / potential taxa are the same, even if they use a different name. This is a special case of true concept relations and I would much prefer to see this covered in a dedicated extension treating all concept relations, especially frequent cases such as includes, overlaps, etc. I am more than willing to define such an extension.
#49 rename scientificNameID, acceptedScientificNameID and higherTaxonNameID
http://code.google.com/p/darwincore/issues/detail?id=49
No matter what the final term names are, I think the 3 should be consistent. Originally it was intended to call them taxonID, acceptedTaxonID and higherTaxonID with a loose definition of a taxon, based more on the idea that all terms here are taxonomic terms and therefore contain taxon in their name. The current version, scientificNameID, acceptedScientificNameID and higherTaxonNameID, intends to do the same I believe, but from what I have seen so far in practice the terminology invites people to use them without referring to each other.
Concrete recommendations:
#49a replace scientificNameID with nameUsageID
There is a need to uniquely identify a taxon concept with a given name, a name usage. A nameID suggests the name is unique, which it isn't if combined with a sec reference aka taxonAccordingTo. A taxonID suggests a reference to a distinct taxon concept. A name usage seems the smallest entity and can therefore act as a sort of unique key for names, taxa, taxon concepts or just usages of a name. All other taxonomic dwc ID terms can and should then point to a name usage id. This makes me wonder whether most/all other IDs should reflect this in their names, see below.
It could make sense to keep scientificNameID as an ID to the name as defined by a nomenclator. But this ID can also be used as a name usage id, so in order to gain clarity I would prefer to have the term removed.
#49b rename acceptedScientificName(ID) to acceptedNameUsage(ID)
This term should point to the name usage that reflects the "accepted" taxon in case of synonyms, no matter if they are objective or subjective. AcceptedScientificName sounds more like a nomenclatural exercise and in accordance with #3 (nameUsageID) the term acceptedNameUsage(ID) would be the best fit in my eyes.
#49c rename higherTaxonName(ID) to higherNameUsage(ID) in consistency with nameUsage & acceptedNameUsage
#50 remove recommendation to concatenate multiple values, especially for higherTaxonName/higherNameUsage
http://code.google.com/p/darwincore/issues/detail?id=50
Similar to originalName or acceptedNameUsage, this term is meant to be a verbatim pointer to the higher taxon as an alternative way of using higherTaxonNameID. Therefore it should only contain a single name, the direct parent, in my eyes. There are also already the 7 major ranks as separate terms that can be used to express a flattened hierarchy. I am aware DwC suggests to use concatenated lists in a single term in other places, e.g. , but I believe it would be better to keep the meaning singular and use multiple instances of that term to express multiple values. Dublin Core also recommends using multiple XML elements for multiple values, see recommendation 5 in http://dublincore.org/documents/dc-xml-guidelines/
#51 rename namePublicationID to namePublishedInID
http://code.google.com/p/darwincore/issues/detail?id=51
for consistency with namePublishedIn
#52 rename (verbatim)scientificNameRank to (verbatim)rank
http://code.google.com/p/darwincore/issues/detail?id=52
to avoid discussions about whether the rank belongs to the name or the taxon, and also because it's nice and short and there is no clash in biological terminology.
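On issue #50 above, the difference between one concatenated value and repeated single values can be sketched like this. The separator ";" and the helper names are assumptions for illustration; the thread does not fix a separator, which is part of the problem a consumer of concatenated values faces.

```python
def split_concatenated(value, separator=";"):
    """Split a concatenated term value into single, trimmed values.

    The separator is an assumption: a consumer has to know or guess it.
    """
    return [part.strip() for part in value.split(separator) if part.strip()]


def as_repeated_terms(term, values):
    """Express multiple values as repeated (term, value) pairs instead."""
    return [(term, value) for value in values]


concatenated = "Animalia; Chordata; Mammalia"
names = split_concatenated(concatenated)
print(names)
print(as_repeated_terms("higherTaxon", names))
```

With repeated terms each value is already singular, so no parsing convention is needed on the consuming side.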
tdwg-content mailing list tdwg-content@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-content
On Tue, Aug 25, 2009 at 2:04 AM, David Remsen (GBIF) dremsen@gbif.org wrote:
Upon re-read of your post I see that taxonAccordingTo has been retained. Therefore I can interpret taxonPublication to simply be a change in the name of namePublishedIn.
I have worked up various examples using the terms in combination with some extensions we (Markus and I) have been drafting. They are listed below with comments. Note that given the instability around the taxon identifier names they might not be congruent with the current terminology.
Euro+Med Example (http://spreadsheets.google.com/pub?key=tnoVriNunOOMzYp709vtauQ&output=ht...)
We hit a snag with this example when the source database did not provide identifiers for the misapplied names and we therefore had to manufacture local identifiers for them. In this example namePublishedIn holds the unparsed primary citation and taxonAccordingTo holds the misapplied name reference.
Peabody Museum Zoological and Botanical Synonyms Example
Source Document (http://www.peabody.yale.edu/other/PROTEM/TAXSIG/taxonomy_synonyms_examples.p...)
Transformed Document (http://spreadsheets.google.com/pub?key=ts5YVtLnXCBvv8X-prpprOg&output=ht...)
Transformed Document with Comments (http://code.google.com/p/gbif-ecat/wiki/GNAsynonymsExample)
These examples best illustrate my reasoning for the use of the term "taxon Reference". There is a comments part of the wiki that highlights some of the issues I hit.
Lastly, this is a mapping of the DwC terms to the Catalogue of Life Standard Dataset: http://code.google.com/p/gbif-ecat/wiki/CoL_Comparison
I refer to a "GNA standard" in this document to refer to our use of the draft terms in combination with other draft terms structured as extensions according to the text guidelines. In this case I used taxonAccordingTo to reference the latest taxonomic scrutiny property of the standard dataset.
Cheers,
David
On Aug 25, 2009, at 7:00 AM, John R. WIECZOREK wrote:
Right, that all makes sense now, and is exactly the kind of simplification that was already in place in the Location class, where the locationID refers to the Location as a whole, not some part of it, such as a country in one case or a city in another case. So, I agree, remove the taxonConceptID.
I've been struggling with trying to come up with a better term name than nameUsage. After reading the arguments again with every alternative I can come up with (scientificName, taxonName, taxon_name, nameAsUsed, nameAsPublished, publishedName, publishedTaxon) I'm not sure I can really do any better for a name that states specifically what you are trying to encompass with that term. Nevertheless, the term seems awkward, especially on first encounter. The terms would have to be very carefully described (but I guess all terms should be). The problem is, I think the same problem with recognizing what the term is for would happen on the second encounter as well ("What was that term for again?"). I don't think that would happen with terms that were more familiar, even if their meaning is broad. To me, "taxon" works, because it could be a name or a concept - exactly what we're trying to encompass.
So here's what I'd do in an attempt to be clear, concise, and consistent.
Given that the Class is Taxon (which captures the idea of a name as well as it does a concept), consistency would argue that the id term for a record of the class should be taxonID. The list of terms under this scenario would be: taxonID, acceptedTaxonID, higherTaxonID, originalTaxonID, scientificName, acceptedTaxon, higherTaxon, originalTaxon, higherClassification, kingdom, phylum, class, order, family, genus, subgenus, specificEpithet, infraspecificEpithet, taxonRank, verbatimTaxonRank, scientificNameAuthorship, nomenclaturalCode, taxonPublicationID, taxonPublication, taxonomicStatus, nomenclaturalStatus, taxonAccordingTo, taxonRemarks, vernacularName.
participants (5)
- "Markus Döring (GBIF)"
- Blum, Stan
- David Remsen (GBIF)
- John R. WIECZOREK
- Markus Döring