Dear all
In preparation for the expert review and the subsequent public review, we have been focussing on key areas of the MIDS standard where questions remain. One of these is the approach to handling specimens with missing data and how these specimens would be scored.
We are therefore planning a special focus session of the regular MIDS meetings to be held on Thursday 28th August at 14:00 UTC.
There are two main points here which relate to the users of MIDS - in particular, curators/collection managers and researchers.
The proposed option in MIDS would enable any specimen to have the potential to reach MIDS3, meaning that a collection manager/curator could aim to have an entire collection digitised to MIDS3.
However, researchers may expect all specimens at a certain MIDS level to have 'informative' data in each required field, rather than an indication that the data are missing. It would therefore need to be made clear that reaching MIDS3 is not a guarantee of the quality of the data in the record, but rather a statement on the level of digitisation that has been carried out. This would be in alignment with the view that MIDS is not a measure of quality but of data presence or absence.
Option 1
Specimens with missing data would reach MIDS3 if the values entered for the missing data do not match any of the proposed excluded values.
Option 2
Specimens with missing data could potentially never achieve MIDS3 if the data were not available on the specimen label or in associated literature.
The proposed management of missing data in MIDS is presented below.
Handling of unknown and incomplete data
Best practice dictates that, wherever possible, data should not be published with empty field values, as this is misleading for both human users and machines. There are many reasons why data can be missing, unknown, incomplete or explicitly withheld (Groom et al. 2019), and various tactics have been used in the past to deal with such situations. However, with the increasing use of machines to interpret and act upon data, more consistent practices should be promoted.
Entering values to provide information relating to the unknown or incomplete data available enables curators and researchers to make decisions about the relevant records. For curators and collection managers, knowing why the data are unknown may help the digitisation process, potentially indicating a different workflow from basic transcription. For researchers, it can help in highlighting records that need additional research to determine an unknown value or that may not be possible to include in a desired analysis.
Unknown values for data information elements
If information is missing or incomplete in the specimen record for any field mapped to a MIDS information element, then it is recommended to enter one of the terms for missing data values proposed by Groom et al. (2019) (Table 5). When calculating the MIDS level, it would be possible to exclude some values, treating them as insufficient non-null values that do not count towards the level. For example: "unknown", "unknown:undigitised", "known:undigitised", "NULL", whitespace only.
There will be some questions that will need to be considered, and one of the largest of these relates to the translation of insufficient non-null values into different languages. In addition, some systems may not allow empty fields and an enforced missing data value would be automatically entered. In this case, the default value could be flagged for exclusion.
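To make the exclusion idea concrete for the discussion, here is a minimal Python sketch of how a MIDS calculation might decide whether a field value counts. It is an illustration only, not part of the proposed standard: the function names, the field names in the sample record, the "unknown:missing" value and the case-insensitive matching are all assumptions, and the excluded values are simply the examples quoted above.

```python
# Illustrative only: these are the example values quoted above,
# not an agreed MIDS exclusion list.
EXCLUDED_VALUES = {"unknown", "unknown:undigitised", "known:undigitised", "null"}


def is_informative(value, extra_excluded=()):
    """Return True if a field value would count towards a MIDS level.

    extra_excluded is a hypothetical hook for system-enforced defaults
    or translated terms that an institution flags for exclusion.
    """
    if value is None:
        return False
    normalised = value.strip().casefold()  # assumption: matching ignores case
    if not normalised:  # empty or whitespace only
        return False
    excluded = EXCLUDED_VALUES | {v.strip().casefold() for v in extra_excluded}
    return normalised not in excluded


def elements_met(record, required_fields, extra_excluded=()):
    """Map each required field to whether it holds a countable value."""
    return {
        field: is_informative(record.get(field), extra_excluded)
        for field in required_fields
    }


# Hypothetical record; the field names are placeholders.
record = {
    "collector": "unknown:missing",   # hypothetical value, not on the excluded list
    "country": "unknown",             # excluded value
    "eventDate": "   ",               # whitespace only
    "scientificName": "Quercus robur",
}
result = elements_met(
    record, ["collector", "country", "eventDate", "scientificName"]
)
```

In this sketch, a value that explains why the data are missing counts as present as long as it is not on the excluded list, which corresponds to Option 1. A system that automatically enters a default value could pass that default through extra_excluded so that it is not counted.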
We have a GitHub Discussion for everyone to participate in, even if you can't join the meeting: https://github.com/tdwg/mids/discussions/159
There is also a Google Doc where people can add comments and notes:
https://docs.google.com/document/d/1Wlku5hoKLrFKWq9dL29D_cChB9PWw_fzQBJNwv5…
With best wishes
Elspeth and Cat
MIDS Task Group Convenors
Dr Elspeth Haston
Deputy Herbarium Curator
Tel 00 44 (0)131 248 2800
20a Inverleith Row, Edinburgh, EH3 5LR Scotland
https://rbge.org.uk/
@emhaston | Google Scholar | https://orcid.org/0000-0001-9144-2848