TaxonOccurrence non-standard (Spinning off from something Peter DeVries mentioned)
Hi Peter,
The TaxonOccurrence standard is not actually a standard. It is part of the vocabulary that was put together two years ago and maps well to DwC.
Meanwhile, as Markus points out, DwC has gone its own way.
The whole TDWG ontology needs some clarification and sorting out and as Bob points out this is now done on a voluntary basis.
I am working on the TaxonName and TaxonConcept vocabularies because I need to use them. I hope I will provide what Markus requests (good return type definitions for GUIDs) but I will only do it for taxon names and concepts because that is all I am working on. I hope to take this to a stage where it is solid, documented and standardizable.
This will entail doing some work on the way the TDWG ontology/vocabulary is presented, but really it is not my department to do that alone.
This is very important for TDWG (as Markus points out).
What really needs to happen is for people who need things to happen in the ontology to roll up their sleeves and make them happen.
For occurrence stuff I recommend you join forces with John Wieczorek.
All the best,
Roger
Roger, if you are working on that already, what do you think are the chances of having a one-to-one mapping from the taxonomic dwc terms to the names & taxon classes in the ontology?
We are using the dwc terms not only for occurrence records, but are already encoding pure taxonomic or nomenclatural checklists with simple dwc terms. Within ECAT and also the Global Names Index / Architecture we have started sharing such checklists encoded as text files which are far easier to handle than TCS-RDF, especially when dealing with hundreds of thousands of names. The expressiveness is of course limited, but it allows us to easily share all of the basic taxonomic/nomenclatural information such as synonyms, basionyms, taxonomic hierarchies, taxon concept reference, original publication, nomenclatural code, nomenclatural & taxonomic status, rank, atomised name parts and the classic higher dwc ranks for each name. I would be super glad to see a convergence with the ontology here.
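To make the idea of a checklist as a plain text file concrete, here is a minimal sketch of reading one and resolving a synonym to its accepted name and classification. It is only an illustration: the column names follow the current Darwin Core taxon terms (taxonID, parentNameUsageID, acceptedNameUsageID and so on), which is an assumption and not necessarily the layout used in ECAT or the Global Names Index, and the names in the sample data are invented placeholders.

```python
import csv
import io

# Invented, tab-delimited checklist. Column names are an assumption modelled on
# current Darwin Core taxon terms; the taxon names are placeholders.
CHECKLIST = (
    "taxonID\tscientificName\ttaxonRank\ttaxonomicStatus\tparentNameUsageID\tacceptedNameUsageID\n"
    "1\tAidae\tfamily\taccepted\t\t\n"
    "2\tAus\tgenus\taccepted\t1\t\n"
    "3\tAus bus\tspecies\taccepted\t2\t\n"
    "4\tAus cus\tspecies\tsynonym\t\t3\n"
)


def load_checklist(text):
    """Index the rows of a tab-delimited checklist by taxonID."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["taxonID"]: row for row in reader}


def accepted_record(records, taxon_id):
    """Return the accepted record for a name (the row itself if not a synonym)."""
    row = records[taxon_id]
    accepted_id = row["acceptedNameUsageID"]
    return records[accepted_id] if accepted_id else row


def classification(records, taxon_id):
    """Walk parent links from the accepted record up to the root."""
    path = []
    seen = set()  # guards against accidental cycles in the parent links
    current = accepted_record(records, taxon_id)
    while current is not None and current["taxonID"] not in seen:
        seen.add(current["taxonID"])
        path.append(current["scientificName"])
        parent_id = current["parentNameUsageID"]
        current = records.get(parent_id) if parent_id else None
    return list(reversed(path))


records = load_checklist(CHECKLIST)
print(classification(records, "4"))
```

The point of the sketch is that synonymy and hierarchy both reduce to ID columns pointing at other rows in the same file, which is why such files stay easy to handle at large sizes.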
Markus
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
Hi Markus,
My plan is to produce an abstract model (a text description of the classes and properties) and a series of example concrete models. One would be the OWL ontology (perhaps this would be normative, I am not sure yet), another would be a CSV file format, and a third would be a simple (one namespace) XML file format. I will define mappings between all of these (possibly XSLT to get from XML to CSV and OWL).
I also plan to do a mapping from the IPT checklist format to one or all of these things.
It would be good to write a couple of tools to do some auto conversion.
If I cut out all the rubbish and only include what is actually used, we are only talking about two classes and fewer than thirty properties, so I should be able to beat the thing into submission quite quickly.
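As a rough illustration of one abstract model with several concrete serialisations, the sketch below defines a single class and writes it out both as CSV and as one-namespace XML. Everything named here is hypothetical: the namespace URI, the class name and the five properties are placeholders chosen for the example, not the actual TaxonName vocabulary, and Python stands in for the XSLT mentioned above.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass, fields

# Hypothetical namespace, used only for this illustration.
NS = "http://example.org/taxon-name#"


@dataclass
class TaxonName:
    # Placeholder properties; the real vocabulary may differ.
    id: str
    name_complete: str
    rank: str
    authorship: str
    nomenclatural_code: str


def to_csv(names):
    """Serialise the abstract model as CSV, one column per property."""
    buffer = io.StringIO()
    columns = [f.name for f in fields(TaxonName)]
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for name in names:
        writer.writerow(asdict(name))
    return buffer.getvalue()


def from_csv(text):
    """Read the CSV form back into the abstract model."""
    return [TaxonName(**row) for row in csv.DictReader(io.StringIO(text))]


def to_xml(names):
    """Serialise the same model as XML using a single namespace."""
    ET.register_namespace("tn", NS)
    root = ET.Element(f"{{{NS}}}TaxonNames")
    for name in names:
        element = ET.SubElement(root, f"{{{NS}}}TaxonName")
        for key, value in asdict(name).items():
            ET.SubElement(element, f"{{{NS}}}{key}").text = value
    return ET.tostring(root, encoding="unicode")


names = [TaxonName("n1", "Aus bus L.", "species", "L.", "ICZN")]
print(to_csv(names))
print(to_xml(names))
```

Because both writers are driven by the same list of fields, adding or dropping a property in the abstract model changes every concrete format at once, which is the main attraction of defining the mappings mechanically.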
All the best,
Roger
Roger Hyam
Roger@BiodiversityCollectionsIndex.org
http://www.BiodiversityCollectionsIndex.org
Royal Botanic Garden Edinburgh
20A Inverleith Row, Edinburgh, EH3 5LR, UK
Tel: +44 131 552 7171 ext 3015 Fax: +44 131 248 2901
http://www.rbge.org.uk/
Anything I should do on the DwC side in anticipation of harmony?
http://rs.tdwg.org/dwc/terms/index.htm#theterms
From: tdwg-tag-bounces@lists.tdwg.org on behalf of John R. WIECZOREK Sent: Fri 2009-04-24 8:58 AM To: Roger Hyam Cc: Technical Architecture Group mailing list Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Anything I should do on the DwC side in anticipation of harmony?
http://rs.tdwg.org/dwc/terms/index.htm#theterms
===========================================================
John,
At some point, all or (most) of the DarwinCore terms need to be added to the TDWG ontology.
But having said that, I also need to say that I'm uncomfortable with:
1) The current state of the TDWG ontology (primarily the naming conventions; let's just use term names), and our understanding of the role it plays in TDWG and how it will be managed (entry of terms, integration of terms into the conceptual [is-a / has-a] relationships to other terms); and
2) the fact that the new DarwinCore straddles or overlaps the roles of an ontology and an application schema.
I understood the past TAG roadmaps to indicate that we were adopting an approach in which the TDWG Ontology would be a repository for data concepts that are present in (or implied by) TDWG standards; and that real data transmission would be accomplished with application schemas. The ontology itself would not be a standard, but would be a tool that helps integrate standards. I thought our standards would be created to function as application schemas or components of application schemas (as in the DwC and its extensions). I am now pretty confused. I'd like to hear the rationale for combining taxonomic name/concept with organism occurrence. I haven't gone over all the existing docs, so apologies if I've missed that, but I think it's confusing that a (new) DarwinCore record could be either a taxonomic name or an organism occurrence, or maybe something else. Maybe I'm too attached to object orientation and just don't GET the semantic web, but it feels to me like we are stepping into squishy ground.
Also, I think the DCMI maintenance procedures are more appropriately applied to the ontology than to a TDWG standard. The existing process for ratifying TDWG standards and the procedure in the DwC seem to be pretty explicitly in conflict; one can change, the other cannot (without becoming another thing).
Is anyone else having these same trepidations? I don't think I've been as much of a Rip Van Winkle as Jim Croft, but I clearly missed some important shifts.
-Stan
I see the ontology as a model of ALL (hopefully, eventually all) the data in our domain of biodiversity informatics. I would love to see it as a standard (at the least it might give it a bit more clout). I agree that the ontology is useful to tie other TDWG schemas together, using it as a core/master model. I would be happy to see it used for ALL tasks within TDWG, but I understand the usefulness of the more specific schemas/standards - horses for courses.
If I understand Stan here, I agree with him about the dubious use of DwC for representing Taxon Concepts/Names. As far as I know, it was really intended as a transfer standard for observation records?? It contains very limited taxon information! It really is not an overly difficult job to use a more suitable schema/ontology. I think the popularity of Darwin Core is due to its simplicity - and I wonder if what Roger is proposing will help with this - i.e. an XML implementation of the ontology as well as an RDF version. This will allow people to create very simple XML documents with reasonably simple/flat data, e.g. an XML document of TaxonName entities, with perhaps 6 or 7 or so key fields - even simpler than DwC. :-)
Kevin
________________________________ Please consider the environment before printing this email Warning: This electronic message together with any attachments is confidential. If you receive it in error: (i) you must not read, use, disclose, copy or retain it; (ii) please contact the sender immediately by reply email and then delete the emails. The views expressed in this email may not be those of Landcare Research New Zealand Limited. http://www.landcareresearch.co.nz
Kevin, can you tell me what the limitations are on being able to exchange taxonomic information with the DwC terms? As far as I can tell, you can exchange fairly complex taxonomic information short of concept-to-concept relations, and I find the DwC-with-extensions approach we are using to exchange information tied to taxa (not instances of taxa) to be a nice and practical compromise between complexity and practicality. My understanding is that the IPT can output TCS/RDF for those who want it. I am personally very happy to see the DwC taxon terms added. Finally I can provide format specifications that biologists can understand.
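A minimal sketch of what "DwC-with-extensions" amounts to in practice, assuming a flat core table of taxon rows plus a separate extension table whose rows point back to the core by ID. The rows, the vernacular-name extension and the helper function are all invented for illustration and are not the IPT's actual code or file layout.

```python
from collections import defaultdict

# Invented core rows, one per taxon. Column names are an assumption.
core = [
    {"taxonID": "1", "scientificName": "Aus bus", "taxonRank": "species"},
    {"taxonID": "2", "scientificName": "Aus cus", "taxonRank": "species"},
]

# Invented extension rows; each one refers to a core row through taxonID.
vernacular_extension = [
    {"taxonID": "1", "vernacularName": "common bus", "language": "en"},
    {"taxonID": "1", "vernacularName": "bus commun", "language": "fr"},
]


def attach_extension(core_rows, extension_rows, key, label):
    """Nest extension rows under the core row that shares the same key."""
    grouped = defaultdict(list)
    for row in extension_rows:
        # Drop the key column from the nested row; it is implied by the parent.
        grouped[row[key]].append({k: v for k, v in row.items() if k != key})
    return [{**row, label: grouped.get(row[key], [])} for row in core_rows]


taxa = attach_extension(core, vernacular_extension, "taxonID", "vernacularNames")
for taxon in taxa:
    print(taxon["scientificName"], taxon["vernacularNames"])
```

The core stays a simple table that a biologist can read in a spreadsheet, while repeating information lives in extension tables that consumers can join in or ignore.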
David Remsen
I tend to agree with David ... but only to a point.
Are we not discussing the equivalent here of database normalization ... where two diametrically opposite opinions hold - fully normalized and it don't work (performance issue) or fully de-normalized and it don't work (integrity issues - accepting that a well designed UI can protect database content (and integrity) from the user)? Here we have DwC-with-extensions on the one hand and something like TCS/CDM? on the other. By taxonomic information I assume we mean that a taxonomic opinion on a name has been expressed (usually by a person). If so, the requirements are fairly basic - name with nomenclatorLSID, status, who, when and where. It gets complex, and it's what I do not think DwC supports, when everything in a complex taxonomic opinion (homotypic names, heterotypic names, misapplications, pro-parte synonyms etc) is placed in one 'object' - the equivalent of recursive joins to go back to a database analogy.
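To illustrate the contrast drawn here, the sketch below keeps each taxonomic opinion as one flat record (name identifier, status, who, when, where) and only assembles the full synonymy on demand by following the links, which plays the part of the recursive join. The record layout, the field names and the example LSIDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    # Hypothetical flat record for a single taxonomic opinion on a name.
    name_lsid: str              # nomenclator identifier of the name
    status: str                 # e.g. "accepted", "homotypic synonym"
    target_lsid: Optional[str]  # name this one is placed under, if any
    who: str
    when: str
    where: str


def synonymy(opinions, accepted_lsid):
    """Collect every name placed, directly or transitively, under a name."""
    children = {}
    for opinion in opinions:
        if opinion.target_lsid:
            children.setdefault(opinion.target_lsid, []).append(opinion)
    collected = []
    seen = {accepted_lsid}  # avoids looping on circular placements
    stack = [accepted_lsid]
    while stack:
        current = stack.pop()
        for opinion in children.get(current, []):
            if opinion.name_lsid not in seen:
                seen.add(opinion.name_lsid)
                collected.append((opinion.name_lsid, opinion.status))
                stack.append(opinion.name_lsid)
    return collected


# Invented example data; the LSIDs are placeholders.
PREFIX = "urn:lsid:example.org:names:"
opinions = [
    Opinion(PREFIX + "1", "accepted", None, "A. Author", "2009", "Example Flora"),
    Opinion(PREFIX + "2", "homotypic synonym", PREFIX + "1", "A. Author", "2009", "Example Flora"),
    Opinion(PREFIX + "3", "heterotypic synonym", PREFIX + "1", "A. Author", "2009", "Example Flora"),
    Opinion(PREFIX + "4", "homotypic synonym", PREFIX + "3", "A. Author", "2009", "Example Flora"),
]
print(sorted(synonymy(opinions, PREFIX + "1")))
```

Each record on its own is simple to exchange; the complexity only appears when a consumer wants the whole opinion as a single object and has to walk the links to build it.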
In haste,
Paul
************************************************************************ The information contained in this e-mail and any files transmitted with it is confidential and is for the exclusive use of the intended recipient. If you are not the intended recipient please note that any distribution, copying or use of this communication or the information in it is prohibited.
Whilst CAB International trading as CABI takes steps to prevent the transmission of viruses via e-mail, we cannot guarantee that any e-mail or attachment is free from computer viruses and you are strongly advised to undertake your own anti-virus precautions.
If you have received this communication in error, please notify us by e-mail at cabi@cabi.org or by telephone on +44 (0)1491 829199 and then delete the e-mail and any copies of it.
CABI is an International Organization recognised by the UK Government under Statutory Instrument 1982 No. 1071.
**************************************************************************
Paul - I'll post some examples as soon as I can (I'm in a GBIF Science Committee meeting the next two days) that illustrate how we can use the DwC terms to create simple schemas for representing rich taxonomic data. Markus and I have modelled complex zoological and botanical taxonomies using it for testing the GBIF IPT and it works. I'll get these examples served in the next day or so. They include some Euro+Med Plants data and a Smithsonian taxonomic bulletin on Cetacean taxonomy (zoological + botanical examples). The approach covers all of the components of taxonomic opinion to which you refer and more. It can represent all that I know is contained in Index Fungorum and Species Fungorum unless you are hiding a lot more, which you may well be. The only thing it doesn't do well is the explicit concept-to-concept assertions enabled by TCS, but I have other strategies for achieving that. There are some homonym cases as well which I'm not sure we do well but which Markus thinks we can, so I need to dig in there a bit.
If you have some sample taxonomic or nomenclatural data that you think will break it, I would be happy to take it for a ride within the DwC-with-extensions approach that is employed within the IPT and see if it can't be effectively addressed. I don't mind being wrong about it so long as I or someone else can provide some clues as to where to go from here.
DR.
On Apr 27, 2009, at 4:20 PM, Paul Kirk wrote:
I tend to agree with David ... but only to a point.
Are we not discussing the equivalent here of database normalization ... where two diametrically opposite opinions hold - fully normalized and it don't work (performance issue) or fully de- normalized and it don't work (integrity issues - accepting that a well designed UI can protect database content (and integrity) from the user)? Here we have DwC-with-extensions on the one hand and something like TCS/CDM? on the other. By taxonomic information I assume we mean that a taxonomic opinion on a name has been expressed (usually by a person). If so, the requirements are fairly basic - name with nomenclatorLSID, status, who, when and where. It gets complex, and it's what I do not think DwC supports, when everything in a complex taxonomic opinion (homotypic names, heterotypic names, misapplications, pro-parte synonyms etc) is placed in one 'object' - the equivalent of recursive joins to go back to a database analogy.
In haste,
Paul
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of David Remsen
Sent: 27 April 2009 14:47
To: Kevin Richards
Cc: Technical Architecture Group mailing list; exec@tdwg.org
Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Kevin, can you tell me what the limitations are on being able to exchange taxonomic information with the DwC terms? As far as I can tell, you can exchange fairly complex taxonomic information short of concept-to-concept relations, and I find the DwC-with-extensions approach we are using to exchange information tied to taxa (not instances of taxa) to be a nice and practical compromise between complexity and practicality. My understanding is that the IPT can output TCS/RDF for those who want it. I am personally very happy to see the DwC taxon terms added. Finally I can provide format specifications that biologists can understand.
David Remsen
On Apr 25, 2009, at 9:52 AM, Kevin Richards wrote:
I see the ontology as a model of ALL (hopefully, eventually all) the data in our domain of biodiversity informatics. I would love to see it as a standard (at the least it might give it a bit more clout). I agree that the ontology is useful to tie other TDWG schemas together, using it as a core/master model. I would be happy to see it used for ALL tasks within TDWG, but I understand the usefulness of the more specific schemas/standards - horses for courses.
If I understand Stan here, I agree with him about the dubious use of DwC for representing Taxon Concepts/Names. As far as I know, it was really intended as a transfer standard for observation records?? It contains very limited taxon information! It really is not an overly difficult job to use a more suitable schema/ontology. I think the popularity of Darwin Core is due to its simplicity - and I wonder if what Roger is proposing will help with this - i.e. an XML implementation of the ontology as well as an RDF version. This will allow people to create very simple XML documents with reasonably simple/flat data, e.g. an XML document of TaxonName entities, with perhaps 6 or 7 or so key fields - even simpler than DwC. :-)
Kevin
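As a rough picture of the "very simple XML document of TaxonName entities" idea, the sketch below writes a flat list of name records to XML with Python's standard library. The element names (TaxonNames, TaxonName) and the field names in the usage example are illustrative assumptions, not taken from the TDWG ontology or from any proposal by Roger.

```python
import xml.etree.ElementTree as ET


def taxon_names_to_xml(names):
    """Serialise a list of flat dicts as one <TaxonName> element per record.

    Every key of a record becomes a child element, so the document stays
    as flat as the input data. Element names here are hypothetical.
    """
    root = ET.Element("TaxonNames")
    for record in names:
        entity = ET.SubElement(root, "TaxonName")
        for field, value in record.items():
            ET.SubElement(entity, field).text = value
    return ET.tostring(root, encoding="unicode")


# Usage with placeholder data: seven key fields per name.
example = [
    {
        "nameComplete": "Aus bus",
        "genusPart": "Aus",
        "specificEpithet": "bus",
        "authorship": "Smith",
        "year": "1900",
        "rank": "species",
        "nomenclaturalCode": "ICZN",
    }
]
document = taxon_names_to_xml(example)
```

A document like this is about as simple as a DwC text file; the open question in the thread is whether the extra structure of the ontology can be kept without losing that simplicity.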
From: tdwg-tag-bounces@lists.tdwg.org [tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Blum, Stan [sblum@calacademy.org]
Sent: Saturday, 25 April 2009 6:12 a.m.
To: Technical Architecture Group mailing list; exec@tdwg.org
Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
From: tdwg-tag-bounces@lists.tdwg.org on behalf of John R. WIECZOREK
Sent: Fri 2009-04-24 8:58 AM
To: Roger Hyam
Cc: Technical Architecture Group mailing list
Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Anything I should do on the DwC side in anticipation of harmony?
http://rs.tdwg.org/dwc/terms/index.htm#theterms
John,
At some point, all or (most) of the DarwinCore terms need to be added to the TDWG ontology.
But having said that, I also need to say that I'm uncomfortable with:
- The current state of the TDWG ontology (primarily the naming conventions; let's just use term names), and our understanding of the role it plays in TDWG and how it will be managed (entry of terms, integration of terms into the conceptual [is-a / has-a] relationships to other terms); and
- the fact that the new DarwinCore straddles or overlaps the roles of an ontology and an application schema.
I understood the past TAG roadmaps to indicate that we were adopting an approach in which the TDWG Ontology would be a repository for data concepts that are present in (or implied by) TDWG standards; and that real data transmission would be accomplished with application schemas. The ontology itself would not be a standard, but would be a tool that helps integrate standards. I thought our standards would be created to function as application schemas or components of application schemas (as in the DwC and its extensions). I am now pretty confused. I'd like to hear the rationale for combining taxonomic name/concept with organism occurrence. I haven't gone over all the existing docs, so apologies if I've missed that, but I think it's confusing that a (new) DarwinCore record could be either a taxonomic name or an organism occurrence, or maybe something else. Maybe I'm too attached to object orientation and just don't GET the semantic web, but it feels to me like we are stepping into squishy ground.
Also, I think the DCMI maintenance procedures are more appropriately applied to the ontology than to a TDWG standard. The existing process for ratifying TDWG standards and the procedure in the DwC seem to be pretty explicitly in conflict; one can change, the other cannot (without becoming another thing).
Is anyone else having these same trepidations? I don't think I've been as much of a Rip Van Winkle as Jim Croft, but I clearly missed some important shifts.
-Stan
Please consider the environment before printing this email Warning: This electronic message together with any attachments is confidential. If you receive it in error: (i) you must not read, use, disclose, copy or retain it; (ii) please contact the sender immediately by reply email and then delete the emails. The views expressed in this email may not be those of Landcare Research New Zealand Limited. http://www.landcareresearch.co.nz _______________________________________________ tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
Think Green - don't print this email unless you really need to
The information contained in this e-mail and any files transmitted with it is confidential and is for the exclusive use of the intended recipient. If you are not the intended recipient please note that any distribution, copying or use of this communication or the information in it is prohibited.
Whilst CAB International trading as CABI takes steps to prevent the transmission of viruses via e-mail, we cannot guarantee that any e-mail or attachment is free from computer viruses and you are strongly advised to undertake your own anti-virus precautions.
If you have received this communication in error, please notify us by e-mail at cabi@cabi.org or by telephone on +44 (0)1491 829199 and then delete the e-mail and any copies of it.
CABI is an International Organization recognised by the UK Government under Statutory Instrument 1982 No. 1071.
Sorry, I'm not a taxonomic expert, I just feel uncomfortable using an observation schema for a taxonomic job. But if you have examples of representing complex taxonomies using DwC, then good oh. It will probably be fine if you have basic taxon name strings and a simple hierarchy, but issues may come up when you have, for example, some names with authors and year and some without, and need to work out whether they are the same name ... and, as Paul says, homonyms, misapplications, etc. It depends on what you want to do with the data. As I said, probably "horses for courses".
Why not use the TDWG ontology? That contains more structure.
Kevin
I guess I should take a look at the DwC+Ext to see if it has all the appropriate 'containers and pointers' to handle complex botanical taxonomic synonymy. But is the DwC+Ext designed to handle taxonomy or nomenclature (or both)? From your comments it appears that it's both. But if it's just the former doesn't it need just the minimum I noted below - with the proviso that 'status' would be either accepted (no more needed) OR synonym (with the accepted name and nomenclatorLSID appended)?
Paul
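The minimum described above (name with nomenclatorLSID, status, who, when and where, plus the accepted name and its LSID when the status is synonym) can be sketched as a small record type. This is one reading of the requirement, not an agreed schema: the Python field names are made up for illustration, and "who/when/where" is assumed to mean who expressed the opinion, when, and in which publication.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaxonomicOpinion:
    """One taxonomic opinion on one name (hypothetical field names)."""

    name: str
    nomenclator_lsid: str
    status: str  # "accepted" or "synonym"
    who: str     # assumed: the person expressing the opinion
    when: str    # assumed: when the opinion was expressed
    where: str   # assumed: where it was published
    accepted_name: Optional[str] = None
    accepted_lsid: Optional[str] = None

    def __post_init__(self):
        if self.status not in ("accepted", "synonym"):
            raise ValueError(f"unknown status: {self.status!r}")
        has_target = self.accepted_name is not None and self.accepted_lsid is not None
        if self.status == "synonym" and not has_target:
            # A synonym must point at the accepted name and its LSID.
            raise ValueError("a synonym needs the accepted name and its LSID")
        if self.status == "accepted" and (self.accepted_name or self.accepted_lsid):
            # An accepted name needs nothing more.
            raise ValueError("an accepted name must not carry an accepted-name pointer")
```

Each opinion is a single flat row. The harder cases raised earlier in the thread (misapplications, pro-parte synonyms, several names bundled into one opinion) would need more than this record can hold.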
Again, it depends on what the data will be used for, i.e. how structured it has to be to be useful.
Your minimum fields (?) (and mappings based on DwC at http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard) :
name - dwc:ScientificName
nomenclatorLSID - dwc:GlobalUniqueIdentifier
status - ?? curatorialExt:TypeStatus ?
who - dwc:AuthorYearOfScientificName (not componentised) ?
when - dwc:AuthorYearOfScientificName (not componentised) ?
where - ?? dwc:ScientificName ??
Does this look ok??
Kevin
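The tentative mapping above can be written down as data, which makes its weak spots easy to list. The sketch below only restates the mapping as given in this message; the True/False flag records whether the entry carried a question mark, and the two helper functions are hypothetical names introduced for illustration.

```python
from collections import defaultdict

# field -> (DwC term as proposed above, whether it was given without a "?")
FIELD_MAPPING = {
    "name": ("dwc:ScientificName", True),
    "nomenclatorLSID": ("dwc:GlobalUniqueIdentifier", True),
    "status": ("curatorialExt:TypeStatus", False),
    "who": ("dwc:AuthorYearOfScientificName", False),
    "when": ("dwc:AuthorYearOfScientificName", False),
    "where": ("dwc:ScientificName", False),
}


def uncertain_fields(mapping):
    """Fields whose target term was marked with a question mark."""
    return [field for field, (_, certain) in mapping.items() if not certain]


def shared_terms(mapping):
    """Terms that more than one field maps to, with the fields involved."""
    by_term = defaultdict(list)
    for field, (term, _) in mapping.items():
        by_term[term].append(field)
    return {term: fields for term, fields in by_term.items() if len(fields) > 1}
```

Listing the shared terms shows that who and when land in a single non-componentised term, and that name and where both land in dwc:ScientificName, so a round trip through this mapping could not recover the original six fields.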
From: Paul Kirk [mailto:p.kirk@cabi.org]
Sent: Tuesday, 28 April 2009 10:21 a.m.
To: Kevin Richards; David Remsen
Cc: Technical Architecture Group mailing list; exec@tdwg.org
Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology
I guess I should take a look at the DwC+Ext to see if it has all the appropriate 'containers and pointers' to handle complex botanical taxonomic synonymy. But is the DwC+Ext designed to handle taxonomy or nomenclature (or both)? From your comments it appears that it's both. But if it's just the former doesn't it need just the minimum I noted below - with the proviso that 'status' would be either accepted (no more needed) OR synonym (with the accepted name and nomenclatorLSID appended)?
Paul
________________________________
From: Kevin Richards [mailto:RichardsK@landcareresearch.co.nz]
Sent: Mon 27/04/2009 22:54
To: David Remsen; Paul Kirk
Cc: Technical Architecture Group mailing list; exec@tdwg.org
Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology
Sorry, I'm not a taxonomic expert, I just feel uncomfortable using an observation schema for a taxonomic job. But if you have examples of representing complex taxonomies using DwC, then good oh. It will probably be fine if you have basic taxon name strings and a simple hierarchy, but issues may come up when you have, for example, some names with authors and year and some without, and need to work out whether they are the same name ... and, as Paul says, homonyms, misapplications, etc. It depends on what you want to do with the data. As I said, probably "horses for courses".
Why not use the TDWG ontology? That contains more structure.
Kevin
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of David Remsen
Sent: Tuesday, 28 April 2009 9:05 a.m.
To: Paul Kirk
Cc: Kevin Richards; Technical Architecture Group mailing list; exec@tdwg.org
Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Paul - I'll post some examples as soon as I can (I'm in a GBIF Science Committee meeting the next two days) that illustrate how we can use the DwC terms to create simple schemas for representing rich taxonomic data. Markus and I have modelled complex zoological and botanical taxonomies using it for testing the GBIF IPT and it works. I'll get these examples served in the next day or so. They include some Euro+Med Plants data and a Smithsonian taxonomic bulletin on Cetacean taxonomy (zoological + botanical examples). It covers all of the components of taxonomic opinion to which you refer and more. It can represent everything that I know is contained in Index Fungorum and Species Fungorum unless you are hiding a lot more, which you may well be. The only thing it doesn't do well is the explicit concept-to-concept assertions enabled by TCS but I have other strategies for achieving that. There are some homonym cases as well which I'm not sure we do well but which Markus thinks we can so I need to dig in there a bit.
If you have some sample taxonomic or nomenclatural data that you think will break it I would be happy to take it for a ride within the DwC-with-extensions approach that is employed within the IPT and see if it can't be effectively addressed. I don't mind being wrong about it so long as I or someone else can provide some clues as to where to go from here.
DR.
On Apr 27, 2009, at 4:20 PM, Paul Kirk wrote:
I tend to agree with David ... but only to a point.
Are we not discussing the equivalent here of database normalization ... where two diametrically opposite opinions hold - fully normalized and it don't work (performance issue) or fully de-normalized and it don't work (integrity issues - accepting that a well designed UI can protect database content (and integrity) from the user)? Here we have DwC-with-extensions on the one hand and something like TCS/CDM? on the other. By taxonomic information I assume we mean that a taxonomic opinion on a name has been expressed (usually by a person). If so, the requirements are fairly basic - name with nomenclatorLSID, status, who, when and where. It gets complex, and it's what I do not think DwC supports, when everything in a complex taxonomic opinion (homotypic names, heterotypic names, misapplications, pro-parte synonyms etc) is placed in one 'object' - the equivalent of recursive joins to go back to a database analogy.
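To make the database analogy concrete, here is a minimal Python sketch of a flat, de-normalized list of name records in which each row carries pointer columns, and of the "recursive join" needed to recover synonymy and hierarchy from it. The column names (`id`, `accepted_id`, `parent_id`) and the placeholder names "Aus", "Aus bus" and "Xus bus" are hypothetical and chosen only for illustration; they are not DwC draft terms and not anything the IPT is confirmed to use.

```python
# Hypothetical flat checklist: one row per name, with pointer columns.
ROWS = [
    {"id": "1", "name": "Aus", "rank": "genus", "accepted_id": None, "parent_id": None},
    {"id": "2", "name": "Aus bus", "rank": "species", "accepted_id": None, "parent_id": "1"},
    {"id": "3", "name": "Xus bus", "rank": "species", "accepted_id": "2", "parent_id": None},
]
BY_ID = {row["id"]: row for row in ROWS}


def accepted(row_id):
    """Follow accepted_id pointers until a row with no pointer is reached."""
    seen = set()
    row = BY_ID[row_id]
    while row["accepted_id"] is not None:
        if row["id"] in seen:
            raise ValueError("cycle in accepted_id pointers")
        seen.add(row["id"])
        row = BY_ID[row["accepted_id"]]
    return row


def classification(row_id):
    """Resolve to the accepted name, then walk parent_id pointers to the root."""
    row = accepted(row_id)
    path = [row["name"]]
    while row["parent_id"] is not None:
        row = BY_ID[row["parent_id"]]
        path.append(row["name"])
    return list(reversed(path))


def synonyms_of(row_id):
    """Names whose accepted_id points directly at the given row."""
    return sorted(r["name"] for r in ROWS if r["accepted_id"] == row_id)
```

A single `accepted_id` pointer per row handles simple synonymy. Pro-parte synonyms and misapplications would need more than one pointer per name, which is the kind of case raised above as the hard part.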
In haste,
Paul
________________________________
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org] On Behalf Of David Remsen Sent: 27 April 2009 14:47 To: Kevin Richards Cc: Technical Architecture Group mailing list; exec@tdwg.org Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Kevin, Can you tell me what the limitations are on being able to exchange taxonomic information with the DwC terms? As far as I can tell, you can exchange fairly complex taxonomic information short of concept-to-concept relations, and I find the DwC-with-extensions approach we are using for exchanging information tied to taxa (not instances of taxa) to be a nice compromise between complexity and practicality. My understanding is that the IPT can output TCS/RDF for those who want it. I am personally very happy to see the DwC taxon terms added. Finally I can provide format specifications that biologists can understand.
David Remsen
On Apr 25, 2009, at 9:52 AM, Kevin Richards wrote:
I see the ontology as a model of ALL (hopefully, eventually all) the data in our domain of biodiversity informatics. I would love to see it as a standard (at the least it might give it a bit more clout). I agree that the ontology is useful to tie other TDWG schemas together, using it as a core/master model. I would be happy to see it used for ALL tasks within TDWG, but I understand the usefulness of the more specific schemas/standards - horses for courses.
If I understand Stan here, I agree with him about the dubious use of DwC for representing Taxon Concepts/Names. As far as I know, it was really intended as a transfer standard for observation records?? It contains very limited taxon information! It really is not an overly difficult job to use a more suitable schema/ontology. I think the popularity of Darwin Core is due to its simplicity - and I wonder if what Roger is proposing will help with this - i.e. an XML implementation of the ontology as well as an RDF version. This will allow people to create very simple XML documents with reasonably simple/flat data, e.g. an XML document of TaxonName entities, with perhaps 6 or 7 or so key fields - even simpler than DwC. :-)
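As a rough idea of what such a flat XML document of TaxonName entities might look like, here is a small Python sketch that writes one. The element names (`TaxonNames`, `TaxonName`, `nameComplete`, `genusPart`, `specificEpithet`, `authorship`, `year`, `rank`) are assumptions made up for this example, not the agreed XML form of the TDWG ontology; the sample name is the one quoted elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical "six or so key fields" for a flat TaxonName entity.
FIELDS = ("nameComplete", "genusPart", "specificEpithet", "authorship", "year", "rank")


def taxon_names_to_xml(records):
    """Serialise a list of dicts as a flat XML document of TaxonName elements."""
    root = ET.Element("TaxonNames")
    for record in records:
        entity = ET.SubElement(root, "TaxonName")
        for field in FIELDS:
            value = record.get(field)
            if value:  # leave out fields the provider did not supply
                ET.SubElement(entity, field).text = value
    return ET.tostring(root, encoding="unicode")


EXAMPLE = [
    {
        "nameComplete": "Blakeslea trispora",
        "genusPart": "Blakeslea",
        "specificEpithet": "trispora",
        "authorship": "Thaxt.",
        "year": "1914",
        "rank": "species",
    }
]

xml_text = taxon_names_to_xml(EXAMPLE)
```

Keeping author and year in their own elements would also make the "same name with and without authors" comparison easier than it is with a single name string.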
Kevin
________________________________
From: tdwg-tag-bounces@lists.tdwg.org [tdwg-tag-bounces@lists.tdwg.org] On Behalf Of Blum, Stan [sblum@calacademy.org] Sent: Saturday, 25 April 2009 6:12 a.m. To: Technical Architecture Group mailing list; exec@tdwg.org Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
From: tdwg-tag-bounces@lists.tdwg.org on behalf of John R. WIECZOREK Sent: Fri 2009-04-24 8:58 AM To: Roger Hyam Cc: Technical Architecture Group mailing list Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Anything I should do on the DwC side in anticipation of harmony?
http://rs.tdwg.org/dwc/terms/index.htm#theterms
===========================================================
John,
At some point, all (or most) of the DarwinCore terms need to be added to the TDWG ontology.
But having said that, I also need to say that I'm uncomfortable with:
1) The current state of the TDWG ontology (primarily the naming conventions; let's just use term names), and our understanding of the role it plays in TDWG and how it will be managed (entry of terms, integration of terms into the conceptual [is-a / has-a] relationships to other terms); and
2) the fact that the new DarwinCore straddles or overlaps the roles of an ontology and an application schema.
I understood the past TAG roadmaps to indicate that we were adopting an approach in which the TDWG Ontology would be a repository for data concepts that are present in (or implied by) TDWG standards; and that real data transmission would be accomplished with application schemas. The ontology itself would not be a standard, but would be a tool that helps integrate standards. I thought our standards would be created to function as application schemas or components of application schemas (as in the DwC and its extensions). I am now pretty confused. I'd like to hear the rationale for combining taxonomic name/concept with organism occurrence. I haven't gone over all the existing docs, so apologies if I've missed that, but I think it's confusing that a (new) DarwinCore record could be either a taxonomic name or an organism occurrence, or maybe something else. Maybe I'm too attached to object orientation and just don't GET the semantic web, but it feels to me like we are stepping into squishy ground.
Also, I think the DCMI maintenance procedures are more appropriately applied to the ontology than to a TDWG standard. The existing process for ratifying TDWG standards and the procedure in the DwC seem to be pretty explicitly in conflict; one can change, the other cannot (without becoming another thing).
Is anyone else having these same trepidations? I don't think I've been as much of a Rip Van Winkle as Jim Croft, but I clearly missed some important shifts.
-Stan
________________________________
Please consider the environment before printing this email Warning: This electronic message together with any attachments is confidential. If you receive it in error: (i) you must not read, use, disclose, copy or retain it; (ii) please contact the sender immediately by reply email and then delete the emails. The views expressed in this email may not be those of Landcare Research New Zealand Limited. http://www.landcareresearch.co.nz
_______________________________________________
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
nope ... I was thinking purely taxonomic.
e.g. Blakeslea trispora Thaxt. 1914 - accepted name - fide Kirk 1984 (Mycological Papers 152)
Perhaps I need to look at the draft standard again and wait for some of the examples David will produce
________________________________
From: Kevin Richards [mailto:RichardsK@landcareresearch.co.nz] Sent: Mon 27/04/2009 23:27 To: Paul Kirk; David Remsen Cc: Technical Architecture Group mailing list; exec@tdwg.org Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology
Again, it depends on what the data will be used for, ie how structured does it have to be, to be useful.
Your minimum fields (?) (and mappings based on DwC at http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard) :
name - dwc:ScientificName
nomenclatorLSID - dwc:GlobalUniqueIdentifier
status - ?? curatorialExt:TypeStatus ?
who - dwc:AuthorYearOfScientificName (not componentised) ?
when - dwc:AuthorYearOfScientificName (not componentised) ?
where - ?? dwc:ScientificName ??
Does this look ok??
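The tentative mapping above can be written down as a small lookup table, which makes it easy to see which fields are still in doubt and which ones would end up sharing a term. This is only a sketch: the boolean flag is an assumption that simply records whether a question mark was attached to that line, and nothing here checks the term names against the draft standard.

```python
# field -> (proposed term, True if proposed without a question mark)
FIELD_TO_TERM = {
    "name": ("dwc:ScientificName", True),
    "nomenclatorLSID": ("dwc:GlobalUniqueIdentifier", True),
    "status": ("curatorialExt:TypeStatus", False),
    "who": ("dwc:AuthorYearOfScientificName", False),
    "when": ("dwc:AuthorYearOfScientificName", False),
    "where": ("dwc:ScientificName", False),
}


def questionable_fields():
    """Fields whose proposed term was marked as doubtful."""
    return sorted(f for f, (_, sure) in FIELD_TO_TERM.items() if not sure)


def shared_terms():
    """Terms that more than one field maps onto, so their values would be merged."""
    by_term = {}
    for field, (term, _) in FIELD_TO_TERM.items():
        by_term.setdefault(term, []).append(field)
    return {term: fields for term, fields in by_term.items() if len(fields) > 1}
```

Four of the six fields come out as questionable, and both name-related terms are shared by two fields each, so who/when/where of a taxonomic opinion (as opposed to the authorship of the name) have no slot of their own in this mapping.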
Kevin
From: Paul Kirk [mailto:p.kirk@cabi.org] Sent: Tuesday, 28 April 2009 10:21 a.m. To: Kevin Richards; David Remsen Cc: Technical Architecture Group mailing list; exec@tdwg.org Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology
I guess I should take a look at the DwC+Ext to see if it has all the appropriate 'containers and pointer' to handle complex botanical taxonomic synonymy. But is the DwC+Ext designed to handle taxonomy or nomenclature (or both)? From your comments it appears that it's both. But if it's just the former doesn't it need just the minimum I noted below - with the proviso that 'status' would be either accepted (no more needed) OR synonym (with the accepted name and nomenclatorLSID appended)?
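A minimal Python sketch of that minimum record, with the proviso on 'status' enforced, might look like the following. The class and attribute names are hypothetical, the LSID in the usage note is a placeholder rather than a real identifier, and only the two status values mentioned above are allowed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaxonomicOpinion:
    """Hypothetical minimum record: name, nomenclatorLSID, status, who, when, where."""

    name: str
    nomenclator_lsid: str
    status: str  # "accepted" or "synonym"
    who: str
    when: str
    where: str
    accepted_name: Optional[str] = None
    accepted_lsid: Optional[str] = None

    def __post_init__(self):
        if self.status == "accepted":
            # An accepted name needs nothing more.
            if self.accepted_name or self.accepted_lsid:
                raise ValueError("an accepted name must not point at another name")
        elif self.status == "synonym":
            # A synonym must carry the accepted name and its nomenclatorLSID.
            if not (self.accepted_name and self.accepted_lsid):
                raise ValueError("a synonym needs accepted_name and accepted_lsid")
        else:
            raise ValueError(f"unknown status: {self.status!r}")
```

For instance, the example given in this thread would be `TaxonomicOpinion(name="Blakeslea trispora Thaxt. 1914", nomenclator_lsid="urn:lsid:example.org:names:1", status="accepted", who="Kirk", when="1984", where="Mycological Papers 152")`.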
Paul
Using Darwin Core to describe a complete specimen or observation (or any biologically relevant) record is perhaps analogous to using the Dublin Core to describe a book. Sure you get some valuable information, but the content can be vastly different and depending on the end use, far less satisfying.
Hence, it has always been my position to consider DwC as the biological equivalent of DC (which was the original intent - I never imagined it would still be argued a decade later). That is, it provides metadata about some biological thing. Very helpful for discovering relevant information, and in some situations sufficiently rich to act as a surrogate for the actual data (e.g. putting dots on a map, or transmitting small sets of facts), and indeed, some of the terms are sufficiently concise that they can be used as data. In other situations though, far more complete information is required, and that is the role of more complex and perhaps conceptually complete, though more specialized data representations.
Ideally, one would consider a mixed model, choosing sufficiently concise definitions drawn from an array of appropriate vocabularies and ontologies, and structured according to agreed rules and constraints to accurately capture and convey the desired intent.
Dave V.
On Apr 27, 2009, at 18:27 , Kevin Richards wrote:
Again, it depends on what the data will be used for, ie how structured does it have to be, to be useful.
Your minimum fields (?) (and mappings based on DwC athttp://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard) :
name - dwc:ScientificName nomenclatorLSID - dwc:GlobalUniqueIdentifier status - ?? curatorialExt:TypeStatus ? who - dwc:AuthorYearOfScientificName (not componentised) ? when - dwc:AuthorYearOfScientificName (not componentised) ? where - ?? dwc:ScientificName ??
Does this look ok??
Kevin
From: Paul Kirk [mailto:p.kirk@cabi.org] Sent: Tuesday, 28 April 2009 10:21 a.m. To: Kevin Richards; David Remsen Cc: Technical Architecture Group mailing list; exec@tdwg.org Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology
I guess I should take a look at the DwC+Ext to see if it has all the appropriate 'containers and pointer' to handle complex botanical taxonomic synonymy. But is the DwC+Ext designed to handle taxonomy or nomenclature (or both)? >From your comments it appears that it's both. But if it's just the former doesn't it need just the minimum I noted below - with the proviso that 'status' would be either accepted (no more needed) OR synonym (with the accepted name and nomenclatorLSID appended)?
Paul
From: Kevin Richards [mailto:RichardsK@landcareresearch.co.nz] Sent: Mon 27/04/2009 22:54 To: David Remsen; Paul Kirk Cc: Technical Architecture Group mailing list; exec@tdwg.org Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology Sorry, I'm not a taxonomic expert, I just feel uncomfortable using an observation schema for a taxonomic job. But if you have examples of representing complex taxonomies using DwC, then good oh. It will probably be fine if you have basic taxon name strings and a simple hierarchy, but issues may come up when you have for example, some names with authors and year and some without, etc, and working out if they are the same name … and as Paul says homonyms, misapplications, etc, etc - depends on what you want to do with the data. As I said probably "horses for courses".
Why not use the TDWG ontology? That contains more structure
Kevin
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org ] On Behalf Of David Remsen Sent: Tuesday, 28 April 2009 9:05 a.m. To: Paul Kirk Cc: Kevin Richards; Technical Architecture Group mailing list; exec@tdwg.org Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Paul - I'll post some examples as soon as I can (I'm in a GBIF Science Committee meeting the next two days) that illustrate how we can use the DwC terms to create simple schemas for representing rich taxonomic data. Markus and I have modelled complex zoological and botanical taxonomies using it for testing the GBIF IPT and it works. I'll get these examples served in the next day or so. It includes some Euro+Med Plants data and a Smithsonian taxonomic bulletin on Cetacean taxonomy (zoological + botanical examples). It covers all of the components of taxonomic opinion to which you refer and more. It can represent all that I know are contained in Index Fungorum and Species Fungorum unless you are hiding a lot more, which you may well be. The only thing it doesn't do well are the explicit concept-to-concept assertions enabled by TCS but I have other strategies for achieving that. There are some homonym cases as well which I'm not sure we do well but which Markus thinks we can so I need to dig in there a bit.
If you have some sample taxonomic or nomenclatural data that you think will break it I would be happy to take it for a ride within the DwC-with-extensions approach that is employed within the IPT and see if it can't be effectively addressed. I don't mind being wrong about it so long so as I or someone else can provide some clues as where to go from here.
DR.
On Apr 27, 2009, at 4:20 PM, Paul Kirk wrote:
I tend to agree with David ... but only to a point.
Are we not discussing the equivalent here of database normalization ... where two diametrically opposite opinions hold - fully normalized and it don't work (performance issue) or fully de- normalized and it don't work (integrity issues - accepting that a well designed UI can protect database content (and integrity) from the user)? Here we have DwC-with-extensions on the one hand and something like TCS/CDM? on the other. By taxonomic information I assume we mean that a taxonomic opinion on a name has been expressed (usually by a person). If so, the requirements are fairly basic - name with nomenclatorLSID, status, who, when and where. It gets complex, and it's what I do not think DwC supports, when everything in a complex taxonomic opinion (homotypic names, heterotypic names, misapplications, pro-parte synonyms etc) is placed in one 'object' - the equivalent of recursive joins to go back to a database analogy.
In haste,
Paul
From: tdwg-tag-bounces@lists.tdwg.org [mailto:tdwg-tag-bounces@lists.tdwg.org ] On Behalf Of David Remsen Sent: 27 April 2009 14:47 To: Kevin Richards Cc: Technical Architecture Group mailing list; exec@tdwg.org Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology Kevin, Can you tell me what the limitations are on being able to exchange taxonomic information with the DwC terms? As far as I can tell, you can exchange fairly complex taxonomic information short of concept-to-concept relations and I find the DwC-with-extensions approach we are using to exchanging information tied to taxa (not instances of taxa) to be a nice and practical compromise between complexity and practicality. My understanding is that the IPT can output TCS/RDF for those who want it. I am personally very happy to see the DwC taxon terms added. Finally I can provide format specifications that biologists can understand.
David Remsen
On Apr 25, 2009, at 9:52 AM, Kevin Richards wrote:
I see the ontology as a model of ALL (hopefully, eventually all) the data in our domain of biodiversity informatics. I would love to see it as a standard (at the least it might give it a bit more clout). I agree that the ontology is useful to tie other TDWG schemas together, using it as a core/master model. I would be happy to see it used for ALL tasks within TDWG, but I understand the usefulnes of the more specific schemas/standards - horses for courses.
If I understand Stan here, I agree with him about the dubious use of DwC for representing Taxon Concepts/Names. As far as I know, it was really intended as a transfer standard for observation records?? It contains very limited taxon information! It really is not a overly difficult job to use a more suitable schema/ontology. I think the popularity of Darwin Core is due to its simplicity - and I wonder if what Roger is proposing will help with this - ie an XML implementation of the ontology as well as an RDF version. This will allow people to create very simple XML documents with reasonably simple/flat data, eg an xml document of TaxonName entities, with perhaps 6 or 7 or so key fields - even simpler than DwC. :-)
Kevin
From: tdwg-tag-bounces@lists.tdwg.org [tdwg-tag- bounces@lists.tdwg.org] On Behalf Of Blum, Stan [sblum@calacademy.org] Sent: Saturday, 25 April 2009 6:12 a.m. To: Technical Architecture Group mailing list; exec@tdwg.org Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology From: tdwg-tag-bounces@lists.tdwg.org on behalf of John R. WIECZOREK Sent: Fri 2009-04-24 8:58 AM To: Roger Hyam Cc: Technical Architecture Group mailing list Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology Anything I should do on the DwC side in anticipation of harmony?
http://rs.tdwg.org/dwc/terms/index.htm#theterms
John,
At some point, all or (most) of the DarwinCore terms need to be added to the TDWG ontology.
But having said that, I also need to say that I'm uncomfortable with:
- The current state of the TDWG ontology (primarily the naming
conventions; lets just use terms names), and our understanding of the role it plays in TDWG and how it will be managed (entry of terms, integration of terms into the conceptual [is-a / has-a] relationships to other terms); and
- the fact that the new DarwinCore straddles or overlaps the roles
of an ontology and an application schema.
I understood the past TAG roadmaps to indicate that we were adopting an approach in which the TDWG Ontology would be a repository for data concepts that are present in (or implied by) TDWG standards; and that real data transmission would be accomplished with application schemas. The ontology itself would not be a standard, but would be a tool that helps integrate standards. I thought our standards would be created to function as application schemas or components of application schemas (as in the DwC and its extensions). I am now pretty confused. I'd like to hear the rationale for combining taxonomic name/concept with organism occurrence. I haven't gone over all the existing docs, so apologies if I've missed that, but I think it's confusing that a (new) DarwinCore record could be either a taxonomic name or an organism occurrence, or maybe something else. Maybe I'm too attached to object orientation and just don't GET the semantic web, but it feels to me like we are stepping into squishy ground.
Also, I think the DCMI maintenance procedures are more appropriately applied to the ontology than to a TDWG standard. The existing process for ratifying TDWG standards and the procedure in the DwC seem to be pretty explicitly in conflict; one can change, the other cannot (without becoming another thing).
Is anyone else having these same trepidations? I don't think I've been as much of a Rip Van Winkle as Jim Croft, but I clearly missed some important shifts.
-Stan
Please consider the environment before printing this email Warning: This electronic message together with any attachments is confidential. If you receive it in error: (i) you must not read, use, disclose, copy or retain it; (ii) please contact the sender immediately by reply email and then delete the emails. The views expressed in this email may not be those of Landcare Research New Zealand Limited. http://www.landcareresearch.co.nz _______________________________________________ tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
Think Green - don't print this email unless you really need to
The information contained in this e-mail and any files transmitted with it is confidential and is for the exclusive use of the intended recipient. If you are not the intended recipient please note that any distribution, copying or use of this communication or the information in it is prohibited.
Whilst CAB International trading as CABI takes steps to prevent the transmission of viruses via e-mail, we cannot guarantee that any e-mail or attachment is free from computer viruses and you are strongly advised to undertake your own anti-virus precautions.
If you have received this communication in error, please notify us by e-mail at cabi@cabi.org or by telephone on +44 (0)1491 829199 and then delete the e-mail and any copies of it.
CABI is an International Organization recognised by the UK Government under Statutory Instrument 1982 No. 1071.
Here's my 2-cents worth...
An example of important taxon concept content missing from DwC and its extensions (unless I missed it - which is very possible) ...
The identification history of a specimen is a vital audit trail of information. It is the history of taxon concepts within the context of a single specimen. A harvested knowledge of that history across specimens allows you to make important inferences about taxonomic stability. For example, if the taxon concept has wandered over a spread of names which are heterotypic in origin then it's telling you that you are dealing with a taxonomically fuzzy entity. If you are a biosecurity officer deciding whether a new organism should be released into the environment or allowed across the border then that is a vital piece of information. It's a 'kind of' taxon concept within organism occurrence.
It's in ABCD of course.
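A minimal Python sketch of that inference, for illustration only. The record layout is hypothetical: each determination carries the identifier of the type its name is based on, so names sharing a type are homotypic and names with different types are heterotypic. None of the field names or data come from DwC, ABCD or any real collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Determination:
    """One entry in a specimen's identification history (hypothetical layout)."""
    year: int
    name: str
    type_id: str  # assumed identifier of the name-bearing type


def heterotypic_spread(history):
    """Count the distinct types behind the names applied to one specimen.

    1 means every name applied is homotypic (a purely nomenclatural
    change); a larger number means the specimen has moved between
    heterotypic names, hinting at an unstable concept.
    """
    return len({d.type_id for d in history})


# Placeholder data, invented for the example.
stable = [
    Determination(1950, "Agenus alpha", "type-A"),
    Determination(2000, "Bgenus alpha", "type-A"),
]
fuzzy = [
    Determination(1950, "Agenus alpha", "type-A"),
    Determination(1975, "Agenus beta", "type-B"),
    Determination(2001, "Cgenus gamma", "type-C"),
]

print(heterotypic_spread(stable))  # 1
print(heterotypic_spread(fuzzy))   # 3
```

Harvested across many specimens, a count like this is one crude signal of the "fuzziness" described above; a real implementation would need agreed identifiers for types or basionyms.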
So let's add another extension to DwC. But then what is the difference between DwC + extensions and ABCD within the context of specimens? One important difference I see is that ABCD is scary to end users because it is humungous.
IMHO DwC should be a bunch of simple core metadata concepts for simple observations/occurrences (not appropriate for surveys!) that facilitate resource discovery. If you wish to 'get data' within this context then one approach is to wrap the package into 'metadata + data extensions'. And one way of ensuring the data wrapped by the 'data extensions' are integrable across domains is to build them from concepts defined within an overarching agreed standard ontology. Likewise I see the observation/occurrence metadata concepts as the potential entry points to the semantic RDF triple-trails also defined by the same ontology.
That's the way I look at it, which is probably too simplistic.
And like Paul I'd be intrigued to see DwC deal with some of the tortuous nomenclatural data complexities we recently had to untangle in IndexFungorum to port it into Rich Pyle's 'all codes of life' Global Name Usage Bank (GNUB) data structure. It was distantly removed from anything currently in DwC.
Jerry
From: David Remsen [mailto:dremsen@mbl.edu]
Sent: Tuesday, 28 April 2009 1:47 a.m.
To: Kevin Richards
Cc: David Remsen; Blum, Stan; Technical Architecture Group mailing list; exec@tdwg.org
Subject: Re: [tdwg-tag] darwin core terms inside tdwg ontology
Kevin, Can you tell me what the limitations are on being able to exchange taxonomic information with the DwC terms? As far as I can tell, you can exchange fairly complex taxonomic information short of concept-to-concept relations, and I find the DwC-with-extensions approach we are using to exchange information tied to taxa (not instances of taxa) to be a nice and practical compromise between complexity and practicality. My understanding is that the IPT can output TCS/RDF for those who want it. I am personally very happy to see the DwC taxon terms added. Finally I can provide format specifications that biologists can understand.
David Remsen
On Apr 25, 2009, at 9:52 AM, Kevin Richards wrote:
I see the ontology as a model of ALL (hopefully, eventually all) the data in our domain of biodiversity informatics. I would love to see it as a standard (at the least it might give it a bit more clout). I agree that the ontology is useful to tie other TDWG schemas together, using it as a core/master model. I would be happy to see it used for ALL tasks within TDWG, but I understand the usefulness of the more specific schemas/standards - horses for courses.
If I understand Stan here, I agree with him about the dubious use of DwC for representing Taxon Concepts/Names. As far as I know, it was really intended as a transfer standard for observation records?? It contains very limited taxon information! It really is not an overly difficult job to use a more suitable schema/ontology. I think the popularity of Darwin Core is due to its simplicity - and I wonder if what Roger is proposing will help with this - i.e. an XML implementation of the ontology as well as an RDF version. This will allow people to create very simple XML documents with reasonably simple/flat data, e.g. an XML document of TaxonName entities, with perhaps 6 or 7 or so key fields - even simpler than DwC. :-)
Kevin
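To make the "6 or 7 key fields" idea concrete, here is a rough Python sketch that writes a flat XML document of TaxonName entities. The element names and sample values are illustrative assumptions, not the actual terms of the TDWG ontology or of any proposal by Roger.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; the real ontology terms may differ.
FIELDS = ("nameComplete", "genusPart", "specificEpithet",
          "authorship", "rank", "nomenclaturalCode")


def taxon_names_to_xml(records):
    """Serialise a list of dicts as a flat <TaxonNames> document."""
    root = ET.Element("TaxonNames")
    for record in records:
        entity = ET.SubElement(root, "TaxonName", id=record["id"])
        for field in FIELDS:
            if field in record:  # skip fields the record does not supply
                ET.SubElement(entity, field).text = record[field]
    return ET.tostring(root, encoding="unicode")


xml_text = taxon_names_to_xml([{
    "id": "example-1",
    "nameComplete": "Ochlerotatus triseriatus",
    "genusPart": "Ochlerotatus",
    "specificEpithet": "triseriatus",
    "rank": "species",
    "nomenclaturalCode": "ICZN",
}])
print(xml_text)
```

Each entity is a single level of child elements with no nesting, which is the kind of flatness that keeps such a document about as easy to produce as a DwC record.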
I think there may be a difference in how people think about species identifiers:
1) Some, mainly conservationists, ecologists, medical entomologists etc., want a stable identifier that they can use to connect their data and publications to.
2) Others, often taxonomists, want an identifier that conveys information about the different ways that a species may connect to the phylogenetic tree.
Another way of thinking about this is that the taxonomic hypothesis for a given species is subordinate and ephemeral compared to the species itself.
The people in group 1 do not think of the species description as the species, they think of it as a human attempt to describe and delimit a species.
One of the problems I had with the current system is that records tagged by name and uBio LSID will see *Aedes triseriatus* and *Ochlerotatus triseriatus* as different things when people in group 1 see them as the same thing with two different taxonomic hypotheses.
Here is a page that explains my thinking: http://about.geospecies.org/
Here is the RDF showing my current way of representing this species concept: http://species.geospecies.org/specs/Ochlerotatus_triseriatus.rdf
(Safari will download the RDF, Firefox will display it)
I link to both of the uBio LSIDs because I want to tie to the different information that is about the same species but linked via different LSIDs.
Here is the species page http://species.geospecies.org/specs/Ochlerotatus_triseriatus
Here is the URI for that concept:
http://species.geospecies.org/spec_concept_uuid/84b8badd-b899-40ea-8f89-f39f...
I don't think that I have all the answers; I just needed something that worked like TDWG/uBio for the people in group 1.
The way I think about it is to first create a species concept and then point all the potential taxonomic hypotheses at it. These different hypotheses could have different weights depending on current opinion.
Later you can shore up the species concept by combining morphological, genetic and population genetic data into a more robust, and hopefully more stable, understanding of that species.
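The arrangement described above can be sketched in Python as one stable concept identifier with several name-based hypotheses pointing at it. The class names, the identifier and the weights are made up for illustration; this is not the GeoSpecies data model, and the weights do not reflect any real taxonomic opinion.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomicHypothesis:
    name: str
    weight: float  # assumed measure of current support, 0..1


@dataclass
class SpeciesConcept:
    concept_id: str  # stable identifier that data and publications link to
    hypotheses: list = field(default_factory=list)

    def preferred_name(self):
        """Name of the hypothesis with the highest weight."""
        return max(self.hypotheses, key=lambda h: h.weight).name


def build_name_index(concepts):
    """Map every name to the concept it points at."""
    return {h.name: c.concept_id for c in concepts for h in c.hypotheses}


concept = SpeciesConcept("concept-0001", [
    TaxonomicHypothesis("Aedes triseriatus", 0.4),          # arbitrary weight
    TaxonomicHypothesis("Ochlerotatus triseriatus", 0.6),   # arbitrary weight
])
index = build_name_index([concept])

# Records tagged with either name resolve to the same concept.
print(index["Aedes triseriatus"] == index["Ochlerotatus triseriatus"])  # True
```

Changing the weights changes which name is displayed, but not the identifier that group 1 users attach their data to.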
Just my two cents :-)
- Pete
P.S. I had some trouble finding the current ontology information. I now know that the DarwinCore ontology is on GoogleCode at: http://darwincore.googlecode.com/svn/trunk/index.htm
Where is the latest TDWG ontology? Is it here: http://wiki.tdwg.org/twiki/bin/view/TAG/TDWGOntology
Dear all,
A note of caution: Don't make any arguments based on the Draft Darwin Core Standard 1.4 and extensions. Dave Remsen's comments re Taxonomy and Nomenclature were not based on that. They were based on the Darwin Core as submitted for peer review, the first step of which is now concluded. I am in the process of addressing all reviewer comments (four days straight into it, in fact) with probably one day left to go before I publish the update and send my revision back to Gail, who is acting as editor. I'm loath to point anyone at the currently moving target of documentation, so be patient for a day or two and then you can all argue about the new version. The intent is not to have an informal second peer review, but to give you all another target for your already loaded guns, and to prepare you to thrash it in public review, which hopefully is the next step. ;-)
Going back through the history of it all, I believe Stan was aware of the state of the standard up to peer review when he voiced his concerns, and so his comments are probably still pertinent.
Cheers,
John
Kevin,
I agree with you and Stan that the ontology is useful to all schemas. It seems to me that a "TDWG Ontology" is a totally new and different kind of thing than all the data exchange standards of the prior 10 years - DwC, SDD, TCS, etc. But, it is a very useful and important new kind of thing that should be part of the TDWG standards architecture. It challenges prior thinking about the nature of TDWG standards to grasp what standardizing on an ontology means. But, I think it's what is needed.
If TDWG standardized on one Ontology, then the vocabulary of all data exchange could be standardized on it. Then all TDWG standards could be revised over time to comply with that vocabulary standard, including DwC.
Stan said: "I'd like to hear the rationale for combining taxonomic name/concept with organism occurrence." An occurrence record generally has an organism's name associated with it in the real world. It is necessary and inevitable that vocabulary about organism names will be used in an occurrence data exchange schema like DwC. We have been stymied with this idea for years. A standard Ontology/vocabulary for the elements of name information needed to be associated with an occurrence, or a description, or a taxon concept would go a long way toward solving this duality. The "standard vocabulary" would not be standardized within DwC but it would be used in DwC.
Of course there is the problem of the hundreds of installations of DiGIR that use DwC "classic" and are no doubt not going to change for a long time. I think they just have to be accepted and worked around going forward. It's impractical to think of anything else. But, the past should not roadblock the future and we need to get moving toward that future.
Stan thinks that the Ontology is not appropriate for TDWG ratification. Why not? Change has to start somewhere. Yes, other standards would probably be in conflict if the Ontology were ratified, but I think we want to ultimately have consistency across all the standards and that means there has to be change going forward. I think a ratified TDWG Ontology would provide the foundation upon which to start building those changes.
Chuck
Please note, the most current draft of the DarwinCore is here http://code.google.com/p/darwincore/ and here http://rs.tdwg.org/dwc/index.htm (not here http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard).
My primary concern with the latest draft is the absence of an explicit class identifier (and an implicit class definition) that indicates what kind of data object the sender is transmitting. If an indexer/aggregator is indexing multiple kinds of resources, as GBIF is, and a publisher provides a record with these elements [ScientificName, ScientificNameAuthorship, NamePublishedIn, and Country], how should the indexer interpret this record? Is it an organism occurrence record, an authoritative taxonomic record (the country name indicating the entire known range of the taxon), or part of a taxonomic checklist for that country? The term/element [BasisOfRecord] is the first step in narrowing the possible meanings, but it's the only step and it appears not to be a required step. (I interpret the "Status: recommended" to mean that it's optional.) At a minimum, BasisOfRecord should be required. It would still be possible to publish garbage (or at least hard-to-interpret records) because our tools don't constrain structure, and there isn't (yet) any guidance controlling the structures for different classes of objects.
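The ambiguity can be illustrated with a small Python sketch of what an indexer might do if BasisOfRecord were mandatory. The function name and the set of accepted values are assumptions made up for the example; the draft's actual controlled vocabulary may well differ.

```python
# Hypothetical vocabulary; consult the current draft for the real values.
KNOWN_BASES = {"Occurrence", "Taxon", "Checklist"}


def classify_record(record):
    """Return the record class, or raise if it cannot be determined.

    Without BasisOfRecord an indexer can only guess from whichever
    terms happen to be present, which is the ambiguity described above.
    """
    basis = record.get("BasisOfRecord")
    if basis is None:
        raise ValueError("BasisOfRecord missing: record class is ambiguous")
    if basis not in KNOWN_BASES:
        raise ValueError(f"unrecognised BasisOfRecord: {basis!r}")
    return basis


# Placeholder values; only the element names come from the message above.
ambiguous = {
    "ScientificName": "Genus species",
    "ScientificNameAuthorship": "Author, year",
    "NamePublishedIn": "Some publication",
    "Country": "Some country",
}

try:
    classify_record(ambiguous)
except ValueError as err:
    print(err)

print(classify_record({**ambiguous, "BasisOfRecord": "Checklist"}))
```

Note that this only rejects unlabelled records; it does nothing to check that the other elements make sense for the declared class, which is the missing structural guidance.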
The new GBIF Internet Publishing Toolkit (IPT) supports one-to-many relationships among a series of flat tables and looks like it's going to make it easier to transmit more complicated data than we were doing with DiGIR and TAPIR. In conjunction with this new expanded bag of elements we could see a lot more complex data get published. If I may use a skiing metaphor, we've been on the bunny slopes (for beginners) up until now. The new DarwinCore looks like a nicely groomed black diamond slope, but we haven't had any lessons or even watched anyone else do what we're going to attempt. I think we're going to end up in a heap at the bottom of the hill, and the aggregators are going to have to sort it out. ( Woohoo! Extreme biodiversity informatics! )
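For anyone who has not seen the one-to-many layout, a rough Python sketch: one core table plus an extension table whose rows point back at a core row. The column names (`id`, `coreid`, `vernacularName`) and the data are assumptions for illustration and are not taken from the IPT documentation.

```python
import csv
import io
from collections import defaultdict

# Two flat tables as they might arrive in text form (placeholder data).
CORE = """id,ScientificName
1,Genus alpha
2,Genus beta
"""

EXTENSION = """coreid,vernacularName
1,first common name
1,second common name
"""


def join_core_with_extension(core_text, ext_text):
    """Attach extension rows to the core row they reference."""
    by_core = defaultdict(list)
    for row in csv.DictReader(io.StringIO(ext_text)):
        by_core[row["coreid"]].append(row["vernacularName"])
    return [
        {**row, "vernacularNames": by_core.get(row["id"], [])}
        for row in csv.DictReader(io.StringIO(core_text))
    ]


for record in join_core_with_extension(CORE, EXTENSION):
    print(record)
```

The join itself is trivial; the hard part raised in this thread is that nothing in the tables says what kind of thing a core row is.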
Finally, I did not mean to say that the Ontology should not be ratified, as in not supported; what I meant was that it should not be a standard because we will want to change its contents without versioning the ontology as a whole. (Also, it's not an application schema, so it can't be used directly unless we venture into somewhat uncharted territory.) Its role is to help us keep our application schemas coordinated.
The earlier versions of the DarwinCore (or our protocols, or the way we used them together) were too limiting (see Greg's comments); this version allows terms to be combined in nonsensical ways. It could make life very difficult for data integrators.
I think the bottom line is that we URGENTLY need a similar concerted effort to advance the TDWG (biodiversity informatics) ontology, and a companion set of application schemas coming forward from the collections, marine biology, paleo, observation, taxon-name-concept groups as soon as possible.
-Stan
Please note again that at this moment the resources Stan mentioned are not in sync because I am finishing the revision based on peer review. When that is done (in a day or two) they will be in sync, and that state will be versioned and sent as a package back to the editor.
On Mon, Apr 27, 2009 at 6:22 PM, Blum, Stan sblum@calacademy.org wrote:
Please note, the most current draft of the DarwinCore is here http://code.google.com/p/darwincore/ and here http://rs.tdwg.org/dwc/index.htm (not here http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard).
My primary concern with the latest draft is the absence of an explicit class identifier (and an implicit class definition) that indicates what kind of data object the sender is transmitting. If an indexer/aggregator is indexing multiple kinds of resources, as GBIF is, and a publisher provides a record with these elements [ScientificName, ScientificNameAuthorship, NamePublishedIn, and Country], how should the indexer interpret this record? Is it an organism occurrence record, an authoritative taxonomic record (the country name indicating the entire known range of the taxon), or part of a taxonomic checklist for that country? The term/element [BasisOfRecord] is the first step in narrowing the possible meanings, but it's the only step and it appears not to be a required step. (I interpret the "Status: recommended" to mean that it's optional.) At a minimum, BasisOfRecord should be required. It would still be possible to publish garbage (or at least hard-to-interpret records) because our tools don't constrain structure, and there isn't (yet) any guidance controlling the structures for different classes of objects.
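To make the ambiguity concrete, here is a minimal sketch of what an indexer can and cannot decide from such a record. The function name, the category labels and the BasisOfRecord values are illustrative assumptions, not a controlled vocabulary taken from the draft, and the record values are placeholders.

```python
# Hypothetical sketch: an aggregator trying to decide what kind of record it received.
# The value sets below are assumptions made for illustration only.
OCCURRENCE_BASES = {"PreservedSpecimen", "HumanObservation", "FossilSpecimen"}
TAXON_BASES = {"Taxon", "NomenclaturalChecklist"}


def classify_record(record):
    """Return 'occurrence', 'taxon' or 'ambiguous' for a dict of DwC terms."""
    basis = record.get("BasisOfRecord")
    if basis in OCCURRENCE_BASES:
        return "occurrence"
    if basis in TAXON_BASES:
        return "taxon"
    # Without BasisOfRecord (or with an unknown value) the terms alone do not
    # say whether this is an occurrence, a taxon record or a checklist entry.
    return "ambiguous"


# The four elements from the example above, with placeholder values.
record = {
    "ScientificName": "Aus bus",
    "ScientificNameAuthorship": "Author, 1900",
    "NamePublishedIn": "Placeholder citation",
    "Country": "Placeholder country",
}

print(classify_record(record))
print(classify_record({**record, "BasisOfRecord": "PreservedSpecimen"}))
```

With BasisOfRecord missing, the only honest answer the sketch can give is "ambiguous", which is the case for making the term required.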
The new GBIF Internet Publishing Toolkit (IPT) supports one-to-many relationships among a series of flat tables and looks like it's going to make it easier to transmit more complicated data than we were doing with DiGIR and TAPIR. In conjunction with this new expanded bag of elements we could see a lot more complex data get published. If I may use a skiing metaphor, we've been on the bunny slopes (for beginners) up until now. The new DarwinCore looks like a nicely groomed black diamond slope, but we haven't had any lessons or even watched anyone else do what we're going to attempt. I think we're going to end up in a heap at the bottom of the hill, and the aggregators are going to have to sort it out. ( Woohoo! Extreme biodiversity informatics! )
Finally, I did not mean to say that the Ontology should not be ratified, as in not supported; what I meant was that it should not be a standard because we will want to change its contents without versioning the ontology as a whole. (Also, it's not an application schema, so it can't be used directly unless we venture into somewhat uncharted territory.) Its role is to help us keep our application schemas coordinated.
The earlier versions of the DarwinCore (or our protocols, or the way we used them together) were too limiting (see Greg's comments); this version allows terms to be combined in nonsensical ways. It could make life very difficult for data integrators.
I think the bottom line is that we URGENTLY need a similar concerted effort to advance the TDWG (biodiversity informatics) ontology, and a companion set of application schemas coming forward from the collections, marine biology, paleo, observation, taxon-name-concept groups as soon as possible.
-Stan
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
OK, so it took four days, not one or two, but it wasn't for lack of effort.
The version of DwC addressing the peer review comments is now on the official site at http://rs.tdwg.org/dwc/index.htm and has been submitted to the standards track.
If I didn't still have to pack to go to Argentina in the morning I might be tempted to address some of the concerns that have arisen. That'll have to wait. Have fun with it.
On Mon, Apr 27, 2009 at 6:25 PM, John R. WIECZOREK tuco@berkeley.edu wrote:
Please note again that at this moment the resources Stan mentioned are not in sync because I am finishing the revision based on peer review. When that is done (in a day or two) they will be in sync, and that state will be versioned and sent as a package back to the editor.
I'm trying my best to catch up with all the mails in this thread lately, but I have surely missed something. I've ended up with a rather long post, please excuse me, but I think the new dwc draft needs some explanation...
The idea of darwin core is to provide a simple list of terms which are useful in different contexts - just like dublin core does. We have been thinking about giving the terms a "domain", the dublin core terminology for binding a property to a class. Although we did declare a term's domain in the description, we finally decided that this is no more than a grouping of terms. Assigning a domain is a rather controversial task, as properties can belong to several classes and the granularity of class definitions depends strongly on your views. I believe this is likely the major obstacle to getting an agreed ontology on the road too.
But in order to express the context of a collection of darwin core terms we also defined class terms representing a taxon/name, an occurrence/specimen, a site, a collecting event and more. Actually there are two ways of expressing the type of a thing in the new terms: using a class term (e.g. as the parent xml element instead of just record, or inside the rowType attribute of the text archive meta file), or using the basis of record when using the simple/classic darwin core with a meaning-free DarwinRecord parent element.
During the last TDWG I got very attracted to the KISS idea that was present all over, but especially in Chuck's talk. Why has darwin core been so much more successful than other tdwg formats? Why can't we use the same approach to share taxonomic or nomenclatural data? In the end the core properties for a taxon or name are rather limited, and if you take TCS apart there is not much more you can do with it than what the taxonomic dwc terms provide - leaving concept relations aside. We also decided not to separate a name from a taxon. That happens in the context and due to the presence of a taxonomic status or taxonomic classification. In order to save you from looking up the latest draft, here is the list of the taxonomic terms with a few quick annotations from myself:
[ taken from http://darwincore.googlecode.com/svn/trunk/terms/index.htm ]
taxonID # the taxon/name ID, preferably a GUID, but local ids are permitted too. Any ID
scientificName # the full name incl authorship
higherTaxonID # ID pointing to the next higher taxon
higherTaxon # full scientific name of the next higher taxon. Can be used alternatively or in addition to the ID above
kingdom # the classic dwc higher ranks to simplify transfer in many cases and useful to disambiguate homonyms
phylum
class
order
family
genus
subgenus
specificEpithet # atomised epitheton part as you would guess
taxonRank # any rank string, preferably taken from a controlled list such as the TCS one
infraspecificEpithet
scientificNameAuthorship # the authorship alone. Similar to the other atomised parts this might not be needed as the complete sciname string is good enough, but it seemed desired by many
nomenclaturalCode # the code to disambiguate homonyms across the codes
taxonAccordingTo # the taxon concept sec reference
namePublishedIn # the nomenclatural reference, original publication of the name
taxonomicStatus # a status such as accepted, (heterotypic) synonym, misapplied name. Ideally also a vocabulary such as the ontology one for nomenclatural note types
nomenclaturalStatus # similar to above, but only nomenclaturally relevant statuses such as nom. illeg.
acceptedTaxonID # pointer to accepted taxon in case of synonyms or misapplied names
acceptedTaxon # explicit string version of above
basionymID # pointer to the basionym
basionym # explicit string version of above
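As an illustration of how little is needed to work with such a checklist, here is a minimal Python sketch that reads a tab-delimited text file using a handful of the terms above and follows acceptedTaxonID from a synonym to its accepted taxon. The rows are made-up placeholder names and the helper functions are hypothetical; only the column names come from the term list.

```python
import csv
import io

# Made-up checklist rows; only the column names are taken from the term list.
CHECKLIST = (
    "taxonID\tscientificName\ttaxonRank\ttaxonomicStatus\tacceptedTaxonID\thigherTaxonID\n"
    "1\tAus Author, 1900\tgenus\taccepted\t\t\n"
    "2\tAus bus Author, 1900\tspecies\taccepted\t\t1\n"
    "3\tXus bus Other, 1950\tspecies\tsynonym\t2\t\n"
)


def load_checklist(text):
    """Parse tab-delimited checklist text into a dict keyed by taxonID."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["taxonID"]: row for row in reader}


def accepted_name(taxa, taxon_id):
    """Return the scientificName of the accepted taxon for any taxonID."""
    row = taxa[taxon_id]
    # An empty acceptedTaxonID is taken to mean the record is itself accepted.
    accepted_id = row["acceptedTaxonID"] or taxon_id
    return taxa[accepted_id]["scientificName"]


taxa = load_checklist(CHECKLIST)
print(accepted_name(taxa, "3"))  # the synonym resolves to the name of taxon 2
```

Treating an empty acceptedTaxonID as "this record is the accepted one" is an assumption of this sketch, not something the draft prescribes.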
So frankly I think there is a lot you can do with these terms. Every synonym will be a name on its own and link to the accepted taxon, having its own taxonomic status that can provide you with details about the reason for it being a synonym or misapplied name. If combined with 1 to many extensions, you could even exchange pretty detailed species information in a *very* simple model. Imagine you have data based on extensions for species such as distribution (multiple distribution statuses per area per species), descriptions (just a simple title, body, and description type à la SPM classes would be great), multimedia, alternative links/guids, types data. As an exercise I have also translated parts of the GISIN schema into extensions based around species described by darwin core terms and it works fine.
Surely darwin core terms can leave you in doubt about the exact context, just as dublin core does. But it immediately allows you to share data in a simple way that people are comfortable with, and it is not tied to a technology per se. I see this as the biggest advantage of all. It plays nicely in the world of xml, rdf, plain text, xhtml, microformats - just anything, as it is based on just a list of terms with an associated globally unique URI. And by decoupling the recommendations on how to use the dwc terms in the context of different technologies, we can keep the term definitions stable no matter what comes next.
Also I wanted to avoid confusion about IPT and the dwc text archives. The text archive, extensions and vocabularies idea implemented in the IPT is not IPT specific. It's just a very simple exchange format based on the dwc recommendations for how to use darwin core in the context of text/csv files. For those interested, I am currently writing client code, so there will be a java archive reader in a few days that allows you to easily iterate over the core dwc records in an archive and pull out the related extension records together with the core record.
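The iteration pattern described here (one core record delivered together with its related extension records) can be sketched in a few lines. This is a Python illustration under stated assumptions, not the Java reader mentioned above: the function name iter_star, the in-memory row lists, the "distribution" extension columns and all values are hypothetical.

```python
from collections import defaultdict

# Placeholder core and extension rows, as they might look after parsing text files.
core = [
    {"taxonID": "2", "scientificName": "Aus bus Author, 1900"},
    {"taxonID": "4", "scientificName": "Aus cus Author, 1900"},
]
distribution = [
    {"taxonID": "2", "area": "Area A", "status": "native"},
    {"taxonID": "2", "area": "Area B", "status": "introduced"},
]


def iter_star(core_rows, extensions, key="taxonID"):
    """Yield (core_row, {extension_name: [rows]}) for every core record."""
    # Index each extension once by the shared key so lookups are cheap.
    indexes = {}
    for name, rows in extensions.items():
        index = defaultdict(list)
        for row in rows:
            index[row[key]].append(row)
        indexes[name] = index
    for row in core_rows:
        related = {name: index.get(row[key], []) for name, index in indexes.items()}
        yield row, related


star = list(iter_star(core, {"distribution": distribution}))
for record, related in star:
    print(record["scientificName"], len(related["distribution"]))
```

A core record without any extension rows simply comes back with empty lists, so consumers do not need a special case for it.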
I am glad to see this list being so alive again! Markus
Hi Markus, This is very cool. :-)
I had some ideas, suggestions:
Add: TaxonConceptID => unique identifier of taxon concept, some sort of resolvable GUID
Change: TaxonID to TaxonNameID since it seems to link to something like a uBio namebankid
Question: higherTaxonID, higherTaxon are these the accepted names or the names on the collection label?
If accepted, change to higherTaxonNameID, higherTaxonName
Add: HigherTaxonConceptID, this would link to the taxon concept for the next highest group with a more stable identifier than the family name / lsid.
Question: scientificNameAuthorship does this include the year? also I replace @ with "and" to avoid encoding, decoding e.g. "Smith and Jones"
Question: taxonAccordingTo is this a DOI? or linked identifier?
Comment: namePublishedIn should these be a subproperty of dcterms:BibliographicCitation?
Change: acceptedTaxonID => acceptedTaxonNameID since it appears this would be a uBio namebank ID type identifier
acceptedTaxon => acceptedTaxonName since this is the current accepted name.
These changes should not affect any of your goals, but they would help disambiguate a species concept, i.e. this is a real species, from the current taxonomic hypothesis that this species belongs in this genus and family.
The recognition that something is a real species changes very little over time, but the taxonomic hypothesis implicit in the genus and species names seems to change fairly often.
It also would have all the synonyms point to a relatively stable GUID for the species concept, not the latest accepted taxonomic hypothesis, i.e. "name".
The rationale is that this makes it easier to find all records of the same species that happen to be labeled either *Aedes triseriatus* or *Ochlerotatus triseriatus* via names and the LSIDs for those names.
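A minimal sketch of that lookup, assuming each name has already been mapped to a concept identifier. The mapping table, the "concept:1" identifier and the record ids are placeholders invented for the example, not real GUIDs or data.

```python
from collections import defaultdict

# Hypothetical name-to-concept table; "concept:1" is a placeholder identifier.
NAME_TO_CONCEPT = {
    "Aedes triseriatus": "concept:1",
    "Ochlerotatus triseriatus": "concept:1",
}

# Placeholder occurrence records labeled with different names.
records = [
    {"id": "r1", "scientificName": "Aedes triseriatus"},
    {"id": "r2", "scientificName": "Ochlerotatus triseriatus"},
    {"id": "r3", "scientificName": "Aus bus"},
]


def group_by_concept(rows, name_to_concept):
    """Group record ids by concept identifier; unmapped names go under None."""
    groups = defaultdict(list)
    for row in rows:
        concept = name_to_concept.get(row["scientificName"])
        groups[concept].append(row["id"])
    return dict(groups)


groups = group_by_concept(records, NAME_TO_CONCEPT)
print(groups)
```

Both labels land in the same group because they share one concept identifier, regardless of which genus the name currently sits in.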
[Caution Red Herring]
I will also throw this idea out there, as a red herring. Is there any advantage to differentiating between different taxon levels, for instance species, family, order etc?
For instance having
SpeciesTaxon, speciesTaxonNameID speciesScientificName ... speciesInFamilyTaxonID speciesInFamily
FamilyTaxon familyTaxonNameID familyScientificName familyinOrderTaxonID familyinOrderName
...
This would make it clearer as to what attributes apply to which records. For instance, an order would have a kingdom, phylum, class, but not a family. Another way to achieve this is to assume that Taxon is a species, or paraspecies entity like subspecies, variety, and not one of the higher level groupings. For example, a Taxon is an OTU operating at or near the species level. I think most of the DarwinCore observation records will be of species or paraspecies anyway.
These are just some things to think about,
Respectfully,
- Pete
On Tue, Apr 28, 2009 at 12:15 AM, "Markus Döring (GBIF)" mdoering@gbif.orgwrote:
Im trying my best to catch up with all the mails in this thread lately, but surely missed something. Ive ended up with a rather long post, please excuse, but I think the new dwc draft needs some explanations...
The idea of darwin core is to provide a simple list of terms which are useful in different contexts - just like dublin core does. We have been thinking about giving the terms a "domain", the dublin core terminology for binding a property to a class. Although we did declare a terms domain in the description, we finally decided that this is not more than a grouping of terms. Assigning a domain is a rather controversary task as properties can belong to several classes and the granularity of class definitions is strongly depending on your views. This is likely the mayor obstacle in getting an agreed ontology on the road too I believe.
But in order to express the context of a collection of darwin core terms we also defined class terms representing a taxon/name, an occurrence/specimen, a site, a collecting event and more. Actually there are 2 ways of expressing the type of a thing in the new terms - using a class term (e.g. as the parent xml element instead of just record or inside the rowType attribute of the text archive meta file) or using the basis of record when using the simple/classic darwin core with a meaningfree DarwinRecord parent element.
During last TDWG I got very attracted to the KISS idea that was present all over, but especially in Chucks talk. Why has darwin core been so much more successful than other tdwg formats? Why can't we use the same approach to share taxonomic or nomenclatural data? In the end the core properties for a taxon or name are rather limited and if you take TCS apart there is not much more you can do with it than what the taxonomic dwc terms provide - leaving concept relations aside. We also decided not to separate a name from a taxon. That happens in the context and due to the present of a taxonomic status or taxonomic classification. In order to save you from looking up the latest draft, here is the list of the taxonomic terms with a few quick annoations from myself:
[ taken from http://darwincore.googlecode.com/svn/trunk/terms/ index.htm ]
taxonID # the taxon/name ID, preferrably a GUID but local ids are permitted too. Any ID scientificName # the full name incl authorship higherTaxonID # ID pointing to the next higher taxon higherTaxon # full scientific name of the next higher taxon. Can be used alternatively or in addition to the ID above kingdom # the classic dwc higher ranks to simplify transfer in many cases and useful to disambiguate homonyms phylum class order family genus subgenus specificEpithet # atomised epitheton part as you would guess taxonRank # any rank string, preferrably taken from a controlled list such as the TCS one infraspecificEpithet scientificNameAuthorship # the authorship alone. Similar to the other atomised parts this might not be needed as the complete sciname string is good enough, but it seemed desired by many nomenclaturalCode # the code to disambiguate homonyms across the codes taxonAccordingTo # the taxon concept sec reference namePublishedIn # the nomenclatural reference, original publication of the name taxonomicStatus # a status such as accepted, (heterotypic) synonym, missapplied name. Ideally also a vocabulary such as the ontology one for nomenclatural note types nomenclaturalStatus # similar to above, but only nomenclaturaly relevant statuses such as nom. illeg. acceptedTaxonID # pointer to accepted taxon in case of synonyms or misapplied names acceptedTaxon # explicit string version of above basionymID # pointer to the basionym basionym # explicit string version of above
So frankly I think there is a lot you can do with these terms. Every synonym will be a name on its own and link to the accepted taxon, having its own taxonomic status that can provide you with details about the reason for it being a synonym or misapplied name. If combined with 1 to many extensions, you could even exchange pretty detailed species information in a *very* simple model. Imagine you have datat based on extensions for species such as distribution (multiple distribution status per area per species), descriptions (just a simple title, body, and description type ala SPM classes would be great), multimedia, alternative links/guids, types data. As an exercise I have also translated parts of the GISIN schema into extensions based around species described by darwin core terms and it works fine.
Surely darwin core terms can leave you in doubt about the exact context, just as dublin core does. But it immediately allows you to share data in a simple way that people are comfortable with, and it is not tied to a technology per se. I see this as the biggest advantage of all. It plays nicely in the world of xml, rdf, plain text, xhtml, microformats - just anything, as it is based on just a list of terms, each with an associated globally unique URI. And by decoupling the recommendations on how to use the dwc terms in the context of different technologies, we can keep the term definitions stable no matter what comes next.
Also I wanted to avoid confusion about IPT and the dwc text archives. The text archive, extensions and vocabularies idea implemented in the IPT is not IPT specific. It's just a very simple exchange format based on the dwc recommendations for how to use darwin core in the context of text/csv files. For those interested, I am currently writing client code, so there will be a Java archive reader in a few days that lets you easily iterate over the core dwc records in an archive and pull out the related extension records together with each core record.
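The iteration pattern described here (core records, each delivered together with its related extension records) can be sketched in a few lines of Python. This is not the Java reader mentioned above; the file contents, the "distribution" extension and its "area" and "status" columns are assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical core file and one hypothetical extension file.
CORE = """taxonID,scientificName
1,Aus bus Smith 1900
2,Aus dus Jones 1950
"""
DISTRIBUTION = """taxonID,area,status
1,Area A,native
1,Area B,introduced
"""


def iterate_core_with_extensions(core_text, extensions, id_field="taxonID"):
    """Yield (core_row, {extension_name: [rows]}) for every core record."""
    grouped = {}
    for name, text in extensions.items():
        by_id = defaultdict(list)
        for row in csv.DictReader(io.StringIO(text)):
            by_id[row[id_field]].append(row)  # one core ID, many extension rows
        grouped[name] = by_id
    for core_row in csv.DictReader(io.StringIO(core_text)):
        related = {
            name: by_id.get(core_row[id_field], [])
            for name, by_id in grouped.items()
        }
        yield core_row, related


for core, related in iterate_core_with_extensions(CORE, {"distribution": DISTRIBUTION}):
    print(core["scientificName"], len(related["distribution"]))
```

Grouping the extension rows by the core identifier once, up front, keeps the per-record work to a dictionary lookup, which matters when a checklist holds hundreds of thousands of names.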
I am glad to see this list being so alive again! Markus
On Apr 28, 2009, at 3:22, Blum, Stan wrote:
Please note, the most current draft of the DarwinCore is here: http://code.google.com/p/darwincore/ and here: http://rs.tdwg.org/dwc/index.htm, not here: http://wiki.tdwg.org/twiki/bin/view/DarwinCore/DarwinCoreDraftStandard
My primary concern with the latest draft is the absence of an explicit class identifier (and an implicit class definition) that indicates what kind of data object the sender is transmitting. If an indexer/aggregator is indexing multiple kinds of resources, as GBIF is, and a publisher provides a record with these elements [ScientificName, ScientificNameAuthorship, NamePublishedIn, and Country], how should the indexer interpret this record? Is it an organism occurrence record, an authoritative taxonomic record (the country name indicating the entire known range of the taxon), or part of a taxonomic checklist for that country? The term/element [BasisOfRecord] is the first step in narrowing the possible meanings, but it's the only step and it appears not to be a required step. (I interpret the "Status: recommended" to mean that it's optional.) At a minimum, BasisOfRecord should be required. It would still be possible to publish garbage (or at least hard-to-interpret records) because our tools don't constrain structure, and there isn't (yet) any guidance controlling the structures for different classes of objects.
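The ambiguity can be illustrated with a small Python sketch of what an indexer might do. The mapping from BasisOfRecord values to record classes is a hypothetical placeholder, as are the sample values in the record; neither is taken from the draft.

```python
# Hypothetical mapping from BasisOfRecord values to the kind of record an
# indexer would assume. The values are placeholders, not a vocabulary from the draft.
RECORD_CLASSES = {
    "PreservedSpecimen": "occurrence",
    "HumanObservation": "occurrence",
    "Taxon": "taxonomic record",
    "Checklist": "checklist entry",
}


def classify_record(record):
    """Return the assumed record class, or flag the record as uninterpretable."""
    basis = record.get("BasisOfRecord", "").strip()
    if not basis:
        return "ambiguous"  # nothing says what kind of object this is
    return RECORD_CLASSES.get(basis, "unknown BasisOfRecord")


# The four-element record from the paragraph above, with invented values.
record = {
    "ScientificName": "Aus bus",
    "ScientificNameAuthorship": "Smith 1900",
    "NamePublishedIn": "Hypothetical Journal 1: 1-10",
    "Country": "Exampleland",
}
print(classify_record(record))
print(classify_record({**record, "BasisOfRecord": "PreservedSpecimen"}))
```

Without the extra element the indexer has no basis for choosing among the three readings; with it, the choice is a lookup, though the structure of the rest of the record is still unconstrained.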
The new GBIF Internet Publishing Toolkit (IPT) supports one-to-many relationships among a series of flat tables and looks like it's going to make it easier to transmit more complicated data than we were doing with DiGIR and TAPIR. In conjunction with this new expanded bag of elements we could see a lot more complex data get published. If I may use a skiing metaphor, we've been on the bunny slopes (for beginners) up until now. The new DarwinCore looks like a nicely groomed black diamond slope, but we haven't had any lessons or even watched anyone else do what we're going to attempt. I think we're going to end up in a heap at the bottom of the hill, and the aggregators are going to have to sort it out. ( Woohoo! Extreme biodiversity informatics! )
Finally, I did not mean to say that the Ontology should not be ratified, as in not supported; what I meant was that it should not be a standard because we will want to change its contents without versioning the ontology as a whole. (Also, it's not an application schema, so it can't be used directly unless we venture into somewhat uncharted territory.) Its role is to help us keep our application schemas coordinated.
The earlier versions of the DarwinCore (or our protocols, or the way we used them together) were too limiting (see Greg's comments); this version allows terms to be combined in nonsensical ways. It could make life very difficult for data integrators.
I think the bottom line is that we URGENTLY need a similar concerted effort to advance the TDWG (biodiversity informatics) ontology, and a companion set of application schemas coming forward from the collections, marine biology, paleo, observation, taxon-name-concept groups as soon as possible.
-Stan
-----Original Message----- From: Chuck Miller [mailto:Chuck.Miller@mobot.org] Sent: Monday, April 27, 2009 7:22 AM To: Kevin Richards; Blum, Stan; Technical Architecture Group mailinglist; exec@tdwg.org Subject: RE: [tdwg-tag] darwin core terms inside tdwg ontology
Kevin,
I agree with you and Stan that the ontology is useful to all schemas. It seems to me that a “TDWG Ontology” is a totally new and different kind of thing than all the data exchange standards of the prior 10 years – DwC, SDD, TCS, etc. But, it is a very useful and important new kind of thing that should be part of the TDWG standards architecture. It challenges prior thinking about the nature of TDWG standards to grasp what standardizing on an ontology means. But, I think it’s what is needed.
If TDWG standardized on one Ontology, then the vocabulary of all data exchange could be standardized on it. Then all TDWG standards could be revised over time to comply with that vocabulary standard, including DwC.
Stan said: “ I'd like to hear the rationale for combining taxonomic name/concept with organism occurrence.” An occurrence record generally has an organism’s name associated with it in the real world. It is necessary and inevitable that vocabulary about organism names will be used in an occurrence data exchange schema like DwC. We have been stymied with this idea for years. A standard Ontology/ vocabulary for the elements of name information needed to be associated with an occurrence, or a description, or a taxon concept would go a long way toward solving this duality. The “standard vocabulary” would not be standardized within DwC but it would be used in DwC.
Of course there is the problem of the hundreds of installations of DiGIR that use DwC “classic” and are no doubt not going to change for a long time. I think they just have to be accepted and worked around going forward. It’s impractical to think of anything else. But, the past should not roadblock the future and we need to get moving toward that future.
Stan thinks that the Ontology is not appropriate for TDWG ratification. Why not? Change has to start somewhere. Yes, other standards would probably be in conflict if the Ontology were ratified, but I think we want to ultimately have consistency across all the standards and that means there has to be change going forward. I think a ratified TDWG Ontology would provide the foundation upon which to start building those changes.
Chuck
tdwg-tag mailing list tdwg-tag@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/tdwg-tag
There is an OMG Metamodel for ontology description. At least one implementation of it, http://www.eclipse.org/m2m/atl/usecases/ODMImplementation/, claims to support designing OWL ontologies with UML.
I think it was Markus who pointed me at a commercial product that has OWL<-->UML<-->XML Schema or something like that. I think I was underwhelmed by it at Bratislava when I last tried it, but maybe it is better now.
One gotcha about OWL as normative that always worries me is that a lot of non-specialist folks think RDF means RDF/XML. That wouldn't stop me from using it though. After all, it's a golden road(*) to machine reasoning.... :-)
Bob
(*)Not to be confused with a yellow-brick road.
On Fri, Apr 24, 2009 at 9:38 AM, Roger Hyam rogerhyam@mac.com wrote:
Hi Markus,
My plan is to produce an abstract model (a text description of the classes and properties) and a series of example concrete models. One would be the OWL ontology (perhaps this would be normative, not sure yet), another would be a CSV file format, and a third would be a simple (one namespace) XML file format. I will define mappings between all these (possibly XSLT to get from XML to CSV and OWL).
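As a rough illustration of one abstract model with several concrete serialisations, here is a Python sketch that writes the same records as CSV and as a simple XML document. The three property names, the element names "TaxonNames" and "TaxonName", and the sample record are assumptions for the example; they are not the model described above, and the plan in this message mentions XSLT rather than Python for the mappings.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical subset of properties shared by both serialisations.
FIELDS = ["taxonID", "scientificName", "taxonRank"]


def to_csv(records):
    """Serialise the records as CSV with one header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_xml(records):
    """Serialise the same records as a flat XML document without namespaces."""
    root = ET.Element("TaxonNames")
    for record in records:
        element = ET.SubElement(root, "TaxonName")
        for field in FIELDS:
            ET.SubElement(element, field).text = record[field]
    return ET.tostring(root, encoding="unicode")


records = [{"taxonID": "1", "scientificName": "Aus bus", "taxonRank": "species"}]
print(to_csv(records))
print(to_xml(records))
```

Because both writers read from the same list of property names, adding a property to the abstract model changes every concrete format in one place.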
I also plan to do a mapping from the IPT checklist format to one or all of these things.
It would be good to write a couple of tools to do some auto conversion.
If I cut out all the rubbish and only include what is actually used, we are only talking two classes and < thirty properties, so I should be able to beat the thing into submission quite quickly.
All the best,
Roger
On 24 Apr 2009, at 12:07, Markus Döring (GBIF) wrote:
Roger, if you are working on that already, what do you think are the chances to have a one to one mapping from the taxonomic dwc terms to the names & taxon classes in the ontology?
We are using the dwc terms not only for occurrence records, but are already encoding pure taxonomic or nomenclatural checklists with simple dwc terms. Within ECAT and also the Global Names Index / Architecture we have started sharing such checklists encoded as text files which are far easier to handle than TCS-RDF, especially when dealing with hundreds of thousands of names. The expressiveness is of course limited, but it allows us to easily share all of the basic taxonomic/nomenclatural information such as synonyms, basionyms, taxonomic hierarchies, taxon concept reference, original publication, nomenclatural code, nomenclatural & taxonomic status, rank, atomised name parts and the classic higher dwc ranks for each name. I would be super glad to see a convergence with the ontology here.
Markus
On Apr 24, 2009, at 11:54, Roger Hyam wrote:
Hi Peter,
The TaxonOccurrence standard is not actually a standard. It is part of the vocabulary that was put together two years ago and maps well to DwC.
Meanwhile, as Markus points out, DwC has gone its own way.
The whole TDWG ontology needs some clarification and sorting out and as Bob points out this is now done on a voluntary basis.
I am working on the TaxonName and TaxonConcept vocabularies because I need to use them. I hope I will provide what Markus requests (good return type definitions for GUIDs) but I will only do it for taxon names and concepts because that is all I am working on. I hope to take this to a stage where it is solid, documented and standardizable.
This will entail doing some stuff on the way the TDWG ontology/ vocabulary is presented but really it is not my department to do that alone.
This is very important for TDWG (as Markus points out).
What really needs to happen is for people who need things to happen in the ontology to roll up their sleeves and make them happen.
For occurrence stuff I recommend you join forces with John Wieczorek.
All the best,
Roger
Roger Hyam Roger@BiodiversityCollectionsIndex.org http://www.BiodiversityCollectionsIndex.org
Royal Botanic Garden Edinburgh 20A Inverleith Row, Edinburgh, EH3 5LR, UK Tel: +44 131 552 7171 ext 3015 Fax: +44 131 248 2901 http://www.rbge.org.uk/
Like Stan, I was hoping for something more accessible that allowed for open participation in the process. Even tools like web protege are too complicated for what we need to achieve. A 300+ page OMG metamodel specification is not a good place to start.
I was also thinking of something based on DCMI made available via a simple editable tabular structure using the TDWG wiki. A Poor man's RDFS to function as a common repository and cross reference for our Ontologies. One concept per page using the wiki page graph to establish relationships. Extraction from this place to Metamodel compliant implementations or to application level schema should be trivial. The aim being to turn the process around so that those of us required to work at the more esoteric end of this spectrum are aware of and building against community level requirements.
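A minimal Python sketch of the extraction step, assuming each wiki page has already been reduced to a small record: the page titles, the "kind", "domain" and "range" keys, and the "hasName" property are all hypothetical. Only the RDF/RDFS terms in the output (rdf:type, rdfs:Class, rdf:Property, rdfs:domain, rdfs:range) are real.

```python
# Hypothetical wiki content: one concept per page, with relationships taken
# from the links between pages.
PAGES = {
    "TaxonName": {"kind": "class"},
    "TaxonConcept": {"kind": "class"},
    "hasName": {"kind": "property", "domain": "TaxonConcept", "range": "TaxonName"},
}


def pages_to_triples(pages):
    """Turn the page records into RDFS-style (subject, predicate, object) triples."""
    triples = []
    for title, page in sorted(pages.items()):
        if page["kind"] == "class":
            triples.append((title, "rdf:type", "rdfs:Class"))
        else:
            triples.append((title, "rdf:type", "rdf:Property"))
            triples.append((title, "rdfs:domain", page["domain"]))
            triples.append((title, "rdfs:range", page["range"]))
    return triples


for triple in pages_to_triples(PAGES):
    print(triple)
```

The same triples could then be written out in whatever form an implementation needs, which is what would make the wiki the common cross reference rather than any one serialisation.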
I personally believe this to be within our current capabilities and certainly achievable before the next TDWG meeting.
greg
2009/4/25 Bob Morris morris.bob@gmail.com:
-- Robert A. Morris Professor of Computer Science UMASS-Boston ram@cs.umb.edu http://bdei.cs.umb.edu/ http://www.cs.umb.edu/~ram http://www.cs.umb.edu/~ram/calendar.html phone (+1)617 287 6466
participants (13)
- "Markus Döring (GBIF)"
- Blum, Stan
- Bob Morris
- Chuck Miller
- Dave Vieglais
- David Remsen
- greg whitbread
- Jerry Cooper
- John R. WIECZOREK
- Kevin Richards
- Paul Kirk
- Peter DeVries
- Roger Hyam