[dwc-mixs] DwC extension in the GBWG repository

Provoost, Pieter p.provoost at unesco.org
Thu Mar 11 13:28:44 UTC 2021


Hi Pier Luigi,

Maybe I'm missing the point, but I assume we are not questioning the existence of Darwin Core extensions, as creating a Darwin Core extension is exactly what this group set out to do? If the concern is with where some of the earlier discussions have happened, I can't speak for other communities, but from the OBIS perspective we are happy with how things have progressed and we have had ample opportunity to contribute. I'm not sure this creates another silo: as far as I'm aware, data formatting is outside the scope of the Darwin Core standard (although it offers some guidelines, https://dwc.tdwg.org/text/), and Darwin Core archives merely complement the standardized vocabularies by offering a flexible way to package vocabulary-aligned data elements and metadata documents.

Our main interest in this group is to be able to bring MIxS-described datasets into our data model, which is mostly Darwin Core based, so from a practical standpoint it makes sense to include in the extension only those MIxS terms which have no Darwin Core equivalent (hence "extension").

Best,
Pieter

Pieter Provoost
OBIS Data Manager
Intergovernmental Oceanographic Commission (IOC) of UNESCO
IOC Project Office for IODE
Wandelaarkaai 7/61 - 8400 Oostende - Belgium
+32 478 574420

On 08/03/2021, 23:15, "dwc-mixs on behalf of Pier Luigi Buttigieg" <dwc-mixs-bounces at lists.tdwg.org on behalf of pier.buttigieg at awi.de> wrote:


    Hi Tim

    > [...]
    > The intention of this extension is to allow data exchange within the GBIF, OBIS and ALA infrastructures specifically using the Darwin Core Archive standard.

    Thanks for the clarification; however, does this occur outside of the
    standards space (i.e. TDWG/GSC)? This is then ad hoc?

    And the DC-A standard is more about how to package (meta)data rather
    than the specification of the fields, right (asking from ignorance here)?

    I still can't really understand the long-term role/utility of extensions
    if the fields they specify are not coordinated with the standards groups
    / used to extend the core standards "officially".

    > Being an extension to DwC this only draws in the terms not covered by Darwin Core, and also only brings in fields that would complement the kinds of information that the Darwin Core provides

    Brings them in from where? These are made up ad hoc?

    > Therefore the cross-mapping work of the group is not fully relevant to this extension, although any changes in MIxS would be followed now, or in the future. The background for how this extension came to be is described in section 2 of the forthcoming guide (in draft) for exchanging DNA-derived data in GBIF https://doi.org/10.35035/doc-vf1a-nr22

    Is this an internal way of handling this information outside of TDWG and
    GSC processes?

    This sets off all kinds of alarm bells, especially if they are marketed
    widely as a sort of parallel de facto standard.

    > The question on ratification is really one for the members of the task group to consider. It would be useful to have the task group approve that this was a sensible route for Darwin Core Archive use.

    If it's outside the standardisation processes of TDWG/GSC (as imperfect
    as they are), I don't really see how that's sensible in a global sense.

    In a local sense, working against time constraints, it does make sense;
    however, only with a declared intent and plan to fold advancements into
    the global processes/standards.

    I don't fully understand the nuances, probably, but this just doesn't
    sound like good strategy.

    > Ratification by TDWG isn't strictly necessary for GBIF/ALA/OBIS but would be desirable. GBIF have committed to having DwC-A support during Q2 2021 so there are time pressures to consider and we believe this is nearly ready.

    Indeed, it isn't - anyone can do anything anytime - the question here is
    if this is creating more silos and not going through a process that the
    community at large (aside from GBIF users) can also use.

    It feels like this is creating more work downstream, where we will then
    have three entities to map (GBIF/TDWG/MIxS) all with different ways of
    doing things. Does OBIS also want to create its own thing?

    > What GBIF are really seeking from the group is guidance on:
    >
    > 1. Is it correct use of MIxS in this specific application profile?

    Given that the MIxS standards are mainlined into the INSDC, it makes
    sense to use this for anything omic.

    My hesitation about using MIxS, because it was basically a spreadsheet,
    is more or less removed thanks to Bill et al.'s work moving the terms
    into the linked open data world and giving each term an IRI.

    That being said, many of the MIxS terms in the environmental packages
    (many being biogeochemical parameters etc) should be replaced by IRIs of
    terms from standards bodies in those communities, once we find good
    parallels.

    > 2. Are there considerations that OBIS would like to bring forward?

    This would be very good to know. I think my concerns above would carry
    over.

    > 3. Is there scope to split the MIxS fields Thomas identified?

    MIxS v6 is almost out, but we can lodge issues on the GSC tracker to
    this effect, cross-linking them to those in our GBWG tracker.

    > 4. What should the name of this extension be? (bearing in mind 5 below)

    I'm not sure what you're referring to.

    > 5. Is it reasonable to supplement the MIxS fields with the additional ones to accommodate more use cases

    The best way to go about this is to post issues on the MIxS tracker to
    get them in there (they accept new environmental packages or extensions
    all the time, recently one from a global consortium of food agencies and
    one for the COVID response).

    >
    > We'll open github issues specifically for some of these, but I thought I'd share here for context.

    Thanks - I still feel like I don't get the relationship between these
    actors over the archives, vs the core standards, vs the unilateral move,
    etc.

    Is there somewhere where these things are explained?

    Best,
    Pier Luigi

    --
    https://orcid.org/0000-0002-4366-3088

    _______________________________________________
    dwc-mixs mailing list
    dwc-mixs at lists.tdwg.org
    http://lists.tdwg.org/mailman/listinfo/dwc-mixs


