Hi Chris

This is explained in the task group report on page 8, section “Outcomes > DwC Extensions > Variations of the MIxS-DwC Extension”:

 

Additionally, the DNA-derived data extension also takes measures to optimise the formatting and machine-readability of keys from MIxS. This stems from the fact that some MIxS key-value pairs are not atomic, i.e. they include multiple values in the same field (e.g. the MIxS key “pcr_primers” requires the user to enter a value which comprises a string that represents both the forward and reverse primer sequence, separated by a semicolon). This value-level formatting creates a bespoke data structure which then requires custom software or code to parse, limiting interoperability with external systems. Thus, in the case of pcr_primers, the DNA derived data extension uses alternative keys, based on the MIxS key, which are associated with atomic values: pcr_primer_forward and pcr_primer_reverse. This allows for more efficient and unambiguous data ingestion into search indices, relational databases, or similar solutions with minimal processing.
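
To illustrate the kind of custom parsing the paragraph above refers to, here is a minimal Python sketch that splits a combined pcr_primers value into the two atomic keys. It assumes the combined value follows a "FWD:<sequence>;REV:<sequence>" layout; the function name split_pcr_primers, the regular expression, and the example primer sequences are illustrative assumptions, not part of the extension or of any GBIF tooling.

```python
import re

# Assumed layout of the combined value: "FWD:<sequence>;REV:<sequence>".
# The pattern is a hypothetical helper written for this illustration only.
_PRIMER_PATTERN = re.compile(
    r"^\s*FWD:\s*([A-Za-z]+)\s*;\s*REV:\s*([A-Za-z]+)\s*$"
)


def split_pcr_primers(value):
    """Split a combined pcr_primers string into two atomic values."""
    match = _PRIMER_PATTERN.match(value)
    if match is None:
        # Anything that does not fit the assumed layout is rejected, which
        # is exactly the ambiguity that atomic fields avoid.
        raise ValueError(f"unrecognised pcr_primers value: {value!r}")
    forward, reverse = match.groups()
    return {
        "pcr_primer_forward": forward.upper(),
        "pcr_primer_reverse": reverse.upper(),
    }


# Example primer pair used purely as sample input.
record = split_pcr_primers("FWD:GTGCCAGCMGCCGCGGTAA;REV:GGACTACHVGGGTWTCTAAT")
print(record)
```

With the atomic keys pcr_primer_forward and pcr_primer_reverse, a consumer can skip this step entirely and load each value straight into its own column or index field.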

 

We fully acknowledge that resolvable references are wanted, and we anticipate providing them at some point. However, the notion of using URIs (not URLs) to identify concepts has been around, and used by GBIF, for decades (a URI doesn't necessarily have to be digitally accessible; it's just an identifier format).

Out of curiosity: has MIxS 6 been released, or is there any news on when it will be?

 

Best

Thomas

 

From: Chris Mungall <cjmungall@lbl.gov>
Date: Wednesday, 29 September 2021 at 20:25
To: Chris H <only1chunts@gmail.com>
Cc: Thomas Stjernegaard Jeppesen <tsjeppesen@gbif.org>, "dwc-mixs@lists.tdwg.org" <dwc-mixs@lists.tdwg.org>, gensc-cig <gensc-cig@googlegroups.com>
Subject: Re: [gensc-cig] Use of the MIxS-based DWC extension

 

Excellent!

 

What is the status of mapping the remaining fields here:

 

  <extension encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n" fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/DNADerivedData">
    <files>
      <location>dnaderiveddata.txt</location>
    </files>
    <coreid index="0" />
    <field index="1" term="https://w3id.org/gensc/terms/MIXS:0000012"/>
    <field index="2" term="https://w3id.org/gensc/terms/MIXS:0000013"/>
    <field index="3" term="https://w3id.org/gensc/terms/MIXS:0000014"/>
    <field index="4" term="https://w3id.org/gensc/terms/MIXS:0000044"/>
    <field index="5" term="https://w3id.org/gensc/terms/MIXS:0000090"/>
    <field index="6" term="http://rs.gbif.org/terms/pcr_primer_forward"/>
    <field index="7" term="http://rs.gbif.org/terms/pcr_primer_reverse"/>
    <field index="8" term="http://rs.gbif.org/terms/pcr_primer_name_forward"/>
    <field index="9" term="http://rs.gbif.org/terms/pcr_primer_name_reverse"/>
    <field index="10" term="http://rs.gbif.org/terms/pcr_primer_reference"/>
    <field index="11" term="http://rs.gbif.org/terms/dna_sequence"/>
  </extension>

 

Some are in MIxS, but not 1:1; e.g. MIxS has a combined field for primers, whereas many databases separate these into two. Also, it looks like URIs such as http://rs.gbif.org/terms/pcr_primer_forward don't resolve?
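
One way to see at a glance which fields already use MIxS identifiers and which still use GBIF-specific ones is to group the term URIs from the extension element quoted above by namespace. The Python sketch below does this with the standard library only; the helper name group_terms_by_namespace is hypothetical, and the snippet is parsed exactly as quoted here (a complete meta.xml typically declares a default XML namespace, in which case the tag lookup would need to include it). Note that the URIs are only compared as strings and never dereferenced.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The <extension> element as quoted in this thread (raw string, so the
# backslash sequences in the attributes are kept literally).
EXTENSION_XML = r"""
<extension encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n" fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/DNADerivedData">
  <files>
    <location>dnaderiveddata.txt</location>
  </files>
  <coreid index="0" />
  <field index="1" term="https://w3id.org/gensc/terms/MIXS:0000012"/>
  <field index="2" term="https://w3id.org/gensc/terms/MIXS:0000013"/>
  <field index="3" term="https://w3id.org/gensc/terms/MIXS:0000014"/>
  <field index="4" term="https://w3id.org/gensc/terms/MIXS:0000044"/>
  <field index="5" term="https://w3id.org/gensc/terms/MIXS:0000090"/>
  <field index="6" term="http://rs.gbif.org/terms/pcr_primer_forward"/>
  <field index="7" term="http://rs.gbif.org/terms/pcr_primer_reverse"/>
  <field index="8" term="http://rs.gbif.org/terms/pcr_primer_name_forward"/>
  <field index="9" term="http://rs.gbif.org/terms/pcr_primer_name_reverse"/>
  <field index="10" term="http://rs.gbif.org/terms/pcr_primer_reference"/>
  <field index="11" term="http://rs.gbif.org/terms/dna_sequence"/>
</extension>
"""


def group_terms_by_namespace(extension_xml):
    """Map each term namespace to a list of (column index, local name)."""
    root = ET.fromstring(extension_xml.strip())
    groups = defaultdict(list)
    for field in root.iter("field"):
        # Split the term URI at its last slash: everything before it is
        # treated as the namespace, the rest as the local term name.
        namespace, local_name = field.get("term").rsplit("/", 1)
        groups[namespace].append((int(field.get("index")), local_name))
    return dict(groups)


groups = group_terms_by_namespace(EXTENSION_XML)
for namespace, terms in groups.items():
    print(namespace, [name for _, name in terms])
```

For this snippet the grouping yields five terms under https://w3id.org/gensc/terms and six under http://rs.gbif.org/terms, the latter being the fields without a 1:1 MIxS counterpart.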

 

On Wed, Sep 29, 2021 at 7:44 AM Chris H <only1chunts@gmail.com> wrote:

Good news! Thank you Thomas for sharing.

I am forwarding this on to the GSC-MIxS group, as I’m sure they will be pleased to see real-world usage of the work we do.

 

Thanks

Chris

 

 

From: Thomas Stjernegaard Jeppesen
Sent: 29 September 2021 15:09
To: dwc-mixs@lists.tdwg.org
Subject: [dwc-mixs] Use of the MIxS-based DWC extension

 

Hi All

I just wanted to let you know that GBIF is now starting to see incoming data using the MIxS-based DNA derived data extension for DwC.

A few examples:

  1. https://www.gbif.org/dataset/e0b59ee7-19ae-4eb0-9217-33317fb50d47
    1. Example occurrence record: https://www.gbif.org/occurrence/3357191905 (scroll to the bottom to see the MIxS / DNA data)
  2. https://www.gbif.org/dataset/9e29a2fe-d780-48a8-a93f-9ce041f9202f
    1. Example occurrence record: https://www.gbif.org/occurrence/3356905303
  3. https://www.gbif.org/dataset/040c5662-da76-4782-a48e-cdea1892d14c
    1. Example occurrence: https://www.gbif.org/occurrence/2979635351

 

Best wishes

 

Thomas Stjernegaard Jeppesen

Web developer

orcid.org/0000-0003-1691-239X

 

Global Biodiversity Information Facility

Universitetsparken 15, 2100 København Ø

www.gbif.org | www.catalogueoflife.org/

 

 

 
