[dwc-mixs] Draft DwC extension updated

Thomas Stjernegaard Jeppesen tsjeppesen at gbif.org
Mon Apr 19 08:07:55 UTC 2021

Dear all

I have now applied all the changes to the extension that I can. Could you please review it<https://tdwg.github.io/gbwg/dwc-mixs/dwc/extension/dna_derived_data.xml>?
I updated this pilot dataset<https://github.com/tdwg/gbwg/issues/53#issuecomment-806515780> accordingly.

We discussed the use of the extension for both Event and Occurrence cores, so I set dwc:subject to <nothing>, which makes the extension available for all cores. It might also be useful for a reference sequence for a Taxon.

We wonder if it would be prudent to resolve the outstanding issues and then move this into use in GBIF with a clear statement that the extension is pending ratification by TDWG and subject to change. GBIF can then coordinate early adopters so that we will get real datasets that we can refer to when seeking ratification. We worry that a standard should not be ratified without significant testing, but this approach would alleviate that concern and would strengthen our task report.

We believe the outstanding tasks and decisions to be made are:

  1.  The extension is called “DNA Derived Data”. Is this reasonable?
  2.  Is this the correct format for the URIs, e.g. https://w3id.org/gensc/terms/MIXS:0000074 ?
  3.  We have added atomized (forward, reverse) fields for pcr_primers – these are currently GBIF namespaced (e.g. http://rs.gbif.org/terms/pcr_primer_name_forward). Should we rather have MIXS IRIs for those, and what would they be?
  4.  We have also added the following primer-related fields, which are currently GBIF namespaced. Should they rather have MIXS IRIs, and what would they be?
     *   pcr_primer_name_forward
     *   pcr_primer_name_reverse
     *   pcr_primer_reference
  5.  Based on outputs from GBIF's draft data publishing guide<https://docs.gbif-uat.org/publishing-dna-derived-data/1.0/en/>, we have additionally included the following GGBN terms. Is this reasonable?

     *   concentration
     *   concentrationUnit
     *   methodDeterminationConcentrationAndRatios
     *   ratioOfAbsorbance260_230
     *   ratioOfAbsorbance260_280
  6.  Based on outputs from GBIF's draft data publishing guide<https://docs.gbif-uat.org/publishing-dna-derived-data/1.0/en/>, we have additionally included the following MIQE terms. Is this reasonable?

     *   annealingTemp
     *   annealingTempUnit
     *   probeReporter
     *   probeQuencher
     *   ampliconSize
     *   thresholdQuantificationCycle
     *   baselineValue
     *   quantificationCycle
     *   automaticThresholdQuantificationCycle
     *   automaticBaselineValue
     *   contaminationAssessment
     *   partitionVolume
     *   partitionVolumeUnit
     *   estimatedNumberOfCopies
     *   amplificationReactionVolume
     *   amplificationReactionVolumeUnit
     *   pcr_analysis_software
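
To make questions 2 to 4 concrete, here is a minimal Python sketch of the two IRI shapes under discussion. The GBIF base and the MIXS example are taken from the items above. Everything else is an assumption for illustration only: the helper names are hypothetical, the seven-digit MIXS pattern is inferred from the single example, and I assume all three primer fields share the same GBIF base.

```python
import re

# Base taken from the example in item 3 of this message.
GBIF_TERMS_BASE = "http://rs.gbif.org/terms/"

# ASSUMPTION: pattern inferred from the one example in item 2
# (https://w3id.org/gensc/terms/MIXS:0000074); not a confirmed format.
MIXS_IRI_PATTERN = re.compile(r"^https://w3id\.org/gensc/terms/MIXS:\d{7}$")

# Primer-related fields listed in item 4.
PRIMER_FIELDS = [
    "pcr_primer_name_forward",
    "pcr_primer_name_reverse",
    "pcr_primer_reference",
]


def gbif_term_iri(name):
    """Build a GBIF-namespaced term IRI (hypothetical helper)."""
    return GBIF_TERMS_BASE + name


def looks_like_mixs_iri(iri):
    """Check an IRI against the assumed MIXS pattern (hypothetical helper)."""
    return MIXS_IRI_PATTERN.match(iri) is not None


# Map each primer field to its current GBIF-namespaced IRI.
primer_iris = {name: gbif_term_iri(name) for name in PRIMER_FIELDS}

for name, iri in primer_iris.items():
    print(name, iri, looks_like_mixs_iri(iri))
```

If MIXS IRIs are assigned to these fields, only the mapping would change; the field names themselves could stay as they are.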


