Hi Chris
This is explained in the task group report on page 8, section “Outcomes > DwC Extensions > Variations of the MIxS-DwC Extension”:
Additionally, the DNA-derived data extension also takes measures to optimise the formatting
and machine-readability of keys from MIxS. This stems from the fact that some MIxS
key-value pairs are not atomic, i.e. they include multiple values in the same field (e.g. the
MIxS key “pcr_primers” requires the user to enter a value which comprises a string that
represents both the forward and reverse primer sequence, separated by a semicolon). This
value-level formatting creates a bespoke data structure which then requires custom software
or code to parse, limiting interoperability with external systems. Thus, in the case of
pcr_primers, the DNA derived data extension uses alternative keys, based on the MIxS key,
which are associated with atomic values: pcr_primer_forward and pcr_primer_reverse. This
allows for more efficient and unambiguous data ingestion into search indices, relational
databases, or similar solutions with minimal processing.
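To make the atomicity point concrete, here is a minimal sketch of the custom parsing a combined value needs before it can be stored as two atomic fields. It assumes the combined string looks like `FWD:<sequence>;REV:<sequence>`; that layout, the function name `split_pcr_primers`, and the example sequences are illustrative assumptions, not taken from the report.

```python
def split_pcr_primers(value):
    """Split a combined primer string into atomic forward/reverse values.

    Assumes the combined value looks like "FWD:<seq>;REV:<seq>"
    (an assumption made for this sketch).
    """
    key_for_label = {
        "FWD": "pcr_primer_forward",
        "REV": "pcr_primer_reverse",
    }
    result = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        label, sep, sequence = part.partition(":")
        if not sep:
            raise ValueError("missing FWD:/REV: label in %r" % part)
        key = key_for_label.get(label.strip().upper())
        if key is None:
            raise ValueError("unexpected primer label %r" % label)
        result[key] = sequence.strip()
    return result


# Hypothetical example value, for illustration only.
combined = "FWD:GTGCCAGCMGCCGCGGTAA;REV:GGACTACHVGGGTWTCTAAT"
atomic = split_pcr_primers(combined)
print(atomic["pcr_primer_forward"])
print(atomic["pcr_primer_reverse"])
```

With the atomic keys, none of this parsing is needed on ingestion: each value can be indexed directly.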
We fully acknowledge that resolvable references are wanted, and we anticipate providing them at some point. However, the notion of using URIs (not URLs) as a means of identifying concepts has been around, and used by GBIF, for decades. A URI doesn't necessarily have to be a digitally accessible thing; it's just an identifier format.
Out of curiosity: has MIxS 6 been released, or is there any news on when it will be?
Best
Thomas
From: Chris Mungall <cjmungall@lbl.gov>
Date: Wednesday, 29 September 2021 at 20:25
To: Chris H <only1chunts@gmail.com>
Cc: Thomas Stjernegaard Jeppesen <tsjeppesen@gbif.org>, "dwc-mixs@lists.tdwg.org" <dwc-mixs@lists.tdwg.org>, gensc-cig <gensc-cig@googlegroups.com>
Subject: Re: [gensc-cig] Use of the MIxS-based DWC extension
Excellent!
What is the status of mapping the remaining fields here:
<extension encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"
           fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/DNADerivedData">
  <files>
    <location>dnaderiveddata.txt</location>
  </files>
  <coreid index="0" />
  <field index="1" term="https://w3id.org/gensc/terms/MIXS:0000012"/>
  <field index="2" term="https://w3id.org/gensc/terms/MIXS:0000013"/>
  <field index="3" term="https://w3id.org/gensc/terms/MIXS:0000014"/>
  <field index="4" term="https://w3id.org/gensc/terms/MIXS:0000044"/>
  <field index="5" term="https://w3id.org/gensc/terms/MIXS:0000090"/>
  <field index="6" term="http://rs.gbif.org/terms/pcr_primer_forward"/>
  <field index="7" term="http://rs.gbif.org/terms/pcr_primer_reverse"/>
  <field index="8" term="http://rs.gbif.org/terms/pcr_primer_name_forward"/>
  <field index="9" term="http://rs.gbif.org/terms/pcr_primer_name_reverse"/>
  <field index="10" term="http://rs.gbif.org/terms/pcr_primer_reference"/>
  <field index="11" term="http://rs.gbif.org/terms/dna_sequence"/>
</extension>
Some are in MIxS but not 1:1; e.g. MIxS has a combined field for primers, whereas many databases separate these into two. Also, it looks like URIs such as http://rs.gbif.org/terms/pcr_primer_forward don't resolve?
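One way to see which fields still sit outside the MIxS namespace is to group the `term` URIs in the fragment above by their prefix. The following is only a sketch: the helper name `group_terms_by_namespace` is hypothetical, and it assumes the fragment is parsed on its own, without the XML namespace that a complete meta.xml would normally declare (in which case the tag names would need qualifying).

```python
import xml.etree.ElementTree as ET

# The extension fragment quoted above, embedded as a raw string.
META_FRAGMENT = r"""
<extension encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"
           fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/DNADerivedData">
  <files>
    <location>dnaderiveddata.txt</location>
  </files>
  <coreid index="0" />
  <field index="1" term="https://w3id.org/gensc/terms/MIXS:0000012"/>
  <field index="2" term="https://w3id.org/gensc/terms/MIXS:0000013"/>
  <field index="3" term="https://w3id.org/gensc/terms/MIXS:0000014"/>
  <field index="4" term="https://w3id.org/gensc/terms/MIXS:0000044"/>
  <field index="5" term="https://w3id.org/gensc/terms/MIXS:0000090"/>
  <field index="6" term="http://rs.gbif.org/terms/pcr_primer_forward"/>
  <field index="7" term="http://rs.gbif.org/terms/pcr_primer_reverse"/>
  <field index="8" term="http://rs.gbif.org/terms/pcr_primer_name_forward"/>
  <field index="9" term="http://rs.gbif.org/terms/pcr_primer_name_reverse"/>
  <field index="10" term="http://rs.gbif.org/terms/pcr_primer_reference"/>
  <field index="11" term="http://rs.gbif.org/terms/dna_sequence"/>
</extension>
"""


def group_terms_by_namespace(xml_text):
    """Map each term namespace to a list of (column index, local name)."""
    root = ET.fromstring(xml_text.strip())
    groups = {}
    for field in root.iter("field"):
        # Split the URI at its last slash: prefix vs. local term name.
        namespace, _, name = field.get("term").rpartition("/")
        groups.setdefault(namespace + "/", []).append(
            (int(field.get("index")), name)
        )
    return groups


groups = group_terms_by_namespace(META_FRAGMENT)
for namespace, terms in groups.items():
    print(namespace, len(terms))
    for index, name in terms:
        print("  column", index, name)
```

For this fragment, the grouping puts columns 1 to 5 under the MIxS prefix and columns 6 to 11 under the GBIF prefix, which are the fields the mapping question is about.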
On Wed, Sep 29, 2021 at 7:44 AM Chris H <only1chunts@gmail.com> wrote:
Good news! Thank you Thomas for sharing.
I am forwarding this on to the GSC-MIxS group, as I’m sure they will be pleased to see real-world usage of the work we do.
Thanks
Chris
From: Thomas Stjernegaard Jeppesen
Sent: 29 September 2021 15:09
To: dwc-mixs@lists.tdwg.org
Subject: [dwc-mixs] Use of the MIxS-based DWC extension
Hi All
I just wanted to let you know that GBIF is now starting to see incoming data that uses the MIxS-based DNA derived data extension for DwC.
A few examples:
- Example occurrence record: https://www.gbif.org/occurrence/3357191905 (scroll to the bottom to see the MIxS / DNA data)
- Example occurrence record: https://www.gbif.org/occurrence/3356905303
- Example occurrence record: https://www.gbif.org/occurrence/2979635351
Best wishes
Thomas Stjernegaard Jeppesen
Web developer
Global Biodiversity Information Facility
Universitetsparken 15, 2100 København Ø
www.gbif.org | www.catalogueoflife.org/