Use of the MIxS-based DWC extension
Hi All,
I just wanted to let you know that GBIF is now starting to see incoming data using the MIxS-based DNA derived data extension for DwC. A few examples:
1. https://www.gbif.org/dataset/e0b59ee7-19ae-4eb0-9217-33317fb50d47
   * Example occurrence record: https://www.gbif.org/occurrence/3357191905 (scroll to the bottom to see the MIxS / DNA data)
2. https://www.gbif.org/dataset/9e29a2fe-d780-48a8-a93f-9ce041f9202f
   * Example occurrence record: https://www.gbif.org/occurrence/3356905303
3. https://www.gbif.org/dataset/040c5662-da76-4782-a48e-cdea1892d14c
   * Example occurrence: https://www.gbif.org/occurrence/2979635351
Best wishes
Thomas Stjernegaard Jeppesen
Web developer
https://orcid.org/0000-0003-1691-239X
Global Biodiversity Information Facility
Universitetsparken 15, 2100 København Ø
https://www.gbif.org/ | https://www.catalogueoflife.org/
Excellent! Congratulations!
On Wed, Sep 29, 2021 at 11:09 AM Thomas Stjernegaard Jeppesen <tsjeppesen@gbif.org> wrote:
dwc-mixs mailing list dwc-mixs@lists.tdwg.org http://lists.tdwg.org/mailman/listinfo/dwc-mixs
Excellent!
What is the status of mapping the remaining fields here:
<extension encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n" fieldsEnclosedBy="" ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/DNADerivedData">
  <files>
    <location>dnaderiveddata.txt</location>
  </files>
  <coreid index="0" />
  <field index="1" term="https://w3id.org/gensc/terms/MIXS:0000012"/>
  <field index="2" term="https://w3id.org/gensc/terms/MIXS:0000013"/>
  <field index="3" term="https://w3id.org/gensc/terms/MIXS:0000014"/>
  <field index="4" term="https://w3id.org/gensc/terms/MIXS:0000044"/>
  <field index="5" term="https://w3id.org/gensc/terms/MIXS:0000090"/>
  <field index="6" term="http://rs.gbif.org/terms/pcr_primer_forward"/>
  <field index="7" term="http://rs.gbif.org/terms/pcr_primer_reverse"/>
  <field index="8" term="http://rs.gbif.org/terms/pcr_primer_name_forward"/>
  <field index="9" term="http://rs.gbif.org/terms/pcr_primer_name_reverse"/>
  <field index="10" term="http://rs.gbif.org/terms/pcr_primer_reference"/>
  <field index="11" term="http://rs.gbif.org/terms/dna_sequence"/>
</extension>
Some are in MIxS, but not 1:1; e.g. MIxS has a combined field for primers, whereas many databases separate these into two. Also, it looks like URIs such as http://rs.gbif.org/terms/pcr_primer_forward don't resolve?
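To make the question concrete, here is a minimal Python sketch that reads the extension block quoted above and separates the terms that already carry a MIxS identifier from those that do not. The function name `group_terms_by_namespace` and the constant `MIXS_PREFIX` are hypothetical names chosen for illustration, and the snippet is pasted in as a string rather than read from a real archive.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# The extension block from the message, embedded as a string for illustration.
META_SNIPPET = """
<extension encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"
           fieldsEnclosedBy="" ignoreHeaderLines="1"
           rowType="http://rs.gbif.org/terms/1.0/DNADerivedData">
  <files><location>dnaderiveddata.txt</location></files>
  <coreid index="0" />
  <field index="1" term="https://w3id.org/gensc/terms/MIXS:0000012"/>
  <field index="2" term="https://w3id.org/gensc/terms/MIXS:0000013"/>
  <field index="3" term="https://w3id.org/gensc/terms/MIXS:0000014"/>
  <field index="4" term="https://w3id.org/gensc/terms/MIXS:0000044"/>
  <field index="5" term="https://w3id.org/gensc/terms/MIXS:0000090"/>
  <field index="6" term="http://rs.gbif.org/terms/pcr_primer_forward"/>
  <field index="7" term="http://rs.gbif.org/terms/pcr_primer_reverse"/>
  <field index="8" term="http://rs.gbif.org/terms/pcr_primer_name_forward"/>
  <field index="9" term="http://rs.gbif.org/terms/pcr_primer_name_reverse"/>
  <field index="10" term="http://rs.gbif.org/terms/pcr_primer_reference"/>
  <field index="11" term="http://rs.gbif.org/terms/dna_sequence"/>
</extension>
"""

# Assumption: a term counts as "mapped to MIxS" when its URI uses this prefix.
MIXS_PREFIX = "https://w3id.org/gensc/terms/"


def group_terms_by_namespace(meta_xml: str) -> Dict[str, List[str]]:
    """Group field terms into MIxS-identified terms and all others."""
    root = ET.fromstring(meta_xml.strip())
    groups: Dict[str, List[str]] = {"mixs": [], "other": []}
    for field in root.findall("field"):
        # Strip stray whitespace, which can creep into attribute values in email.
        term = field.get("term", "").strip()
        key = "mixs" if term.startswith(MIXS_PREFIX) else "other"
        groups[key].append(term)
    return groups


groups = group_terms_by_namespace(META_SNIPPET)
for term in groups["other"]:
    print("not yet a MIxS identifier:", term)
```

On this snippet the "other" group holds the six rs.gbif.org terms (the primer fields and dna_sequence), which are the ones the question is about.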
On Wed, Sep 29, 2021 at 7:44 AM Chris H only1chunts@gmail.com wrote:
Good news! Thank you Thomas for sharing.
I am forwarding this onto the GSC-MIxS group as I’m sure they will be pleased to see real-world usage of the work we do.
Thanks
Chris
From: Thomas Stjernegaard Jeppesen <tsjeppesen@gbif.org>
Sent: 29 September 2021 15:09
To: dwc-mixs@lists.tdwg.org
Subject: [dwc-mixs] Use of the MIxS-based DWC extension
-- You received this message because you are subscribed to the Google Groups "gensc-cig" group. To unsubscribe from this group and stop receiving emails from it, send an email to gensc-cig+unsubscribe@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/gensc-cig/E6030553-7655-47AA-AE08-D98DDF4B... https://groups.google.com/d/msgid/gensc-cig/E6030553-7655-47AA-AE08-D98DDF4B7BCD%40hxcore.ol?utm_medium=email&utm_source=footer .
Hi Chris,
This is explained in the task group report on page 8, section “Outcomes > DwC Extensions > Variations of the MIxS-DwC Extension”:
Additionally, the DNA-derived data extension also takes measures to optimise the formatting and machine-readability of keys from MIxS. This stems from the fact that some MIxS key-value pairs are not atomic, i.e. they include multiple values in the same field (e.g. the MIxS key “pcr_primers” requires the user to enter a value which comprises a string that represents both the forward and reverse primer sequence, separated by a semicolon). This value-level formatting creates a bespoke data structure which then requires custom software or code to parse, limiting interoperability with external systems. Thus, in the case of pcr_primers, the DNA derived data extension uses alternative keys, based on the MIxS key, which are associated with atomic values: pcr_primer_forward and pcr_primer_reverse. This allows for more efficient and unambiguous data ingestion into search indices, relational databases, or similar solutions with minimal processing.
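As a rough illustration of the parsing step the report excerpt describes, the following Python sketch splits a combined primer string into the two atomic values. It assumes the combined value is laid out as "FWD:<sequence>;REV:<sequence>"; the function name `split_pcr_primers` is hypothetical, and the primer sequences are sample values used only for illustration.

```python
from typing import Dict


def split_pcr_primers(value: str) -> Dict[str, str]:
    """Split a combined primer string into atomic forward/reverse values.

    Assumes a 'FWD:<seq>;REV:<seq>' layout; other layouts yield empty strings.
    """
    result = {"pcr_primer_forward": "", "pcr_primer_reverse": ""}
    for part in value.split(";"):
        label, sep, sequence = part.partition(":")
        if not sep:
            continue  # skip fragments that carry no 'LABEL:' prefix
        label = label.strip().upper()
        if label == "FWD":
            result["pcr_primer_forward"] = sequence.strip()
        elif label == "REV":
            result["pcr_primer_reverse"] = sequence.strip()
    return result


# Sample combined value (illustrative sequences, not taken from the thread).
combined = "FWD:GTGCCAGCMGCCGCGGTAA;REV:GGACTACHVGGGTWTCTAAT"
atomic = split_pcr_primers(combined)
print(atomic["pcr_primer_forward"])
print(atomic["pcr_primer_reverse"])
```

Every consumer of the combined field needs some version of this function, which is the interoperability cost the extension avoids by publishing pcr_primer_forward and pcr_primer_reverse as separate columns.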
We fully acknowledge that resolvable references are wanted, and we anticipate providing them at some point. However, the notion of using URIs (not URLs) as a means of identifying concepts has been around, and used by GBIF, for decades (a URI doesn't necessarily have to be a digitally accessible thing; it's just an identifier format). Out of curiosity: has MIxS 6 been released, or is there any news on when it will be?
Best
Thomas
From: Chris Mungall <cjmungall@lbl.gov>
Date: Wednesday, 29 September 2021 at 20:25
To: Chris H <only1chunts@gmail.com>
Cc: Thomas Stjernegaard Jeppesen <tsjeppesen@gbif.org>, "dwc-mixs@lists.tdwg.org" <dwc-mixs@lists.tdwg.org>, gensc-cig <gensc-cig@googlegroups.com>
Subject: Re: [gensc-cig] Use of the MIxS-based DWC extension
participants (4)
- Chris H
- Chris Mungall
- John Wieczorek
- Thomas Stjernegaard Jeppesen