Thanks, Dan.
Quite correct. What I had in mind was a twin approach that supports versioning of data. For many resources, no one is making further updates to the primary data. Placing it in a repository with stable management of identifiers and annotation services would allow the whole community to collaborate in enhancing the data. For data sets with active management at source, good practice with identifiers would allow issuing of new versions at regular intervals. If every data set has a DOI and every record has a unique id within that data set, we should be able to handle these issues. Hybrid approaches with active offline management and community annotation tools are more complex but not unachievable. FilteredPush has been working on some of this. Standardising approaches to identifiers and annotations would make it all work.
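As a rough illustration of the identifier idea above (a data set DOI plus a record id that is unique within that data set), here is a minimal Python sketch. The class name, the integer version counter, the `#` separator and the example DOI are all assumptions made for illustration; nothing here is an existing GBIF or Darwin Core API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RecordRef:
    """One record as it appears in one issued version of a data set."""
    dataset_doi: str  # hypothetical: one DOI per data set
    record_id: str    # unique only within that data set
    version: int      # hypothetical counter, incremented per release

    def key(self) -> str:
        # Version-independent key. Annotations attached to this key
        # remain addressable when a new version is issued.
        return f"{self.dataset_doi}#{self.record_id}"


def latest(refs: List[RecordRef]) -> Dict[str, RecordRef]:
    """Keep only the most recent version of each record."""
    newest: Dict[str, RecordRef] = {}
    for ref in refs:
        current = newest.get(ref.key())
        if current is None or ref.version > current.version:
            newest[ref.key()] = ref
    return newest
```

With keys of this kind, a harvester holding several snapshots can tell which one is current for each record, and an annotation made against an older snapshot can still be routed to the same record in a newer one.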
Donald
----------------------------------------------------------------------
Donald Hobern - GBIF Director - dhobern@gbif.org
Global Biodiversity Information Facility http://www.gbif.org/
GBIF Secretariat, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark
Tel: +45 3532 1471 Mob: +45 2875 1471 Fax: +45 2875 1480
----------------------------------------------------------------------
From: Daniel Janzen [mailto:djanzen@sas.upenn.edu]
Sent: Sunday, October 13, 2013 8:37 PM
To: Donald Hobern [GBIF]
Cc: 'Richard Pyle'; 'TDWG Content Mailing List'; 'Chuck Miller'
Subject: Re: [tdwg-content] A plea around basisOfRecord (Was: Proposed new Darwin Core terms - abundance, abundanceAsPercent)
Ahem. These zipped up archives will be obsolete approximately 1 minute after zipping, if not earlier.
How is it planned to update them as names change, names of localities change, date and code and lat and long and collector and and and and and field contents are corrected over the coming days, months, weeks and years? And then when the new version of the database, field instrument output, etc. is zipped up tomorrow, how will that replace the one zipped up the day before, and already fed into the user community, along with its previous versions of the field contents? Which version of "my" database will the harvester use? My most recently zipped up and submitted, or the one from a week ago or month ago or year ago? If the db is zipped up, how will the user annotate it and that annotation get back to the parent db?
Just a thought from out in a mud hole.
Smile.
Dan and Winnie
On Oct 13, 2013, at 1:24 PM, Donald Hobern [GBIF] <dhobern@gbif.org> wrote:
Thanks, Rich.
Very pleased to see this. With this encouragement, I'll say just a little bit more about why I think this is a critical need.
I see the model I describe as the perfect real-world realisation of most of the key components in the GBIO Framework (http://www.biodiversityinformatics.org/), as follows:
1. Everyone zips up whatever data they have from each resource (databases, field instruments, sequencers, data extracted from literature, checklists, whatever) into a DwC Archive using whatever DwC elements they can for data elements and describing other elements not currently recognised in DwC (the GBIO DATA layer)
2. These archives should be placed in repositories that offer basic services (DOIs, annotation services, etc.) (the GBIO CULTURE layer)
3. Harvesters assess the contents of each archive and determine what views can be supported from the supplied elements (occurrence records for GBIF, name usage records, species interactions, etc.) and catalogue these views in relevant discovery indexes (GBIF, Catalogue of Life, TraitBank, etc.) (the GBIO EVIDENCE layer)
4. Users can at any time annotate elements in the archives to provide mappings for (potentially more recently defined) DwC or other properties, opening up new options for reuse
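Steps 3 and 4 above can be sketched roughly as follows. This is a minimal Python illustration only: the view names and the terms each view requires are assumptions chosen for the example, not rules used by GBIF, Catalogue of Life or any real harvester, and the local column names in the usage note are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Illustrative assumption: the DwC terms each view needs.
VIEW_REQUIREMENTS: Dict[str, FrozenSet[str]] = {
    "occurrence": frozenset(
        {"scientificName", "eventDate", "decimalLatitude", "decimalLongitude"}
    ),
    "name_usage": frozenset({"scientificName", "taxonRank"}),
}


def apply_mappings(columns: Iterable[str], mappings: Dict[str, str]) -> Set[str]:
    """Step 4: rename local columns to DwC terms using user annotations.

    Columns without a mapping are passed through unchanged.
    """
    return {mappings.get(column, column) for column in columns}


def supported_views(terms: Iterable[str]) -> List[str]:
    """Step 3: list the views whose required terms are all present."""
    available = set(terms)
    return sorted(
        view
        for view, needed in VIEW_REQUIREMENTS.items()
        if needed <= available
    )
```

For example, an archive with columns `scientificName`, `eventDate`, `lat` and `lon` supports no view as supplied, but once a user annotation maps `lat` to `decimalLatitude` and `lon` to `decimalLongitude`, the occurrence view becomes available without the provider re-issuing the archive.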
Donald
-----Original Message-----
From: Richard Pyle [mailto:deepreef@bishopmuseum.org]
Sent: Sunday, October 13, 2013 6:49 PM
To: 'Donald Hobern [GBIF]'; 'TDWG Content Mailing List'
Cc: 'Chuck Miller'
Subject: RE: [tdwg-content] A plea around basisOfRecord (Was: Proposed new Darwin Core terms - abundance, abundanceAsPercent)
Hi Donald,
MANY thanks for this! And you are certainly not alone in your concerns about these issues. In fact, we have planned a Symposium for Documenting DarwinCore (https://mbgserv18.mobot.org/ocs/index.php/tdwg/2013/schedConf/trackPolicies#track11), and one of the four sessions (Session 3, to be precise) of the symposium focuses exactly on this issue of basisOfRecord/dcterms:type/etc.
Another session (Session 2) will focus on proposed and perhaps-to-be-proposed new classes (Individual, MaterialSample, Evidence), and will start out with a series of graphs illustrating the existing high-level ontology and possible alternative high-level ontologies, as you indicate in your items 3 & 4.
Aloha,
Rich
_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content