Fwd: Humboldt Extension Public Review preparations
From Wesley!
---------- Forwarded message ---------
From: Wesley M. Hochachka <wmh6@cornell.edu>
Date: Wed, Aug 16, 2023 at 4:26 AM
Subject: Re: [tdwg-humboldt] Humboldt Extension Public Review preparations
To: Yanina Sica <yanina.sica@gmail.com>, tuco@berkeley.edu <tuco@berkeley.edu>, Humboldt Core TG <tdwg-humboldt@lists.tdwg.org>
CC: Yi Ming Gan <ymgan@naturalsciences.be>, Markus Döring (GBIF) <mdoering@gbif.org>, Peter Desmet <peter.desmet.work@gmail.com>
Hi everyone,
I expect that, as usual, my attempt to send this message to the Humboldt Extension listserv will fail, so someone please forward this on to the list.
I think that I will *not* be attending this Wednesday's meeting, unfortunately, because Wednesday is the last of my 3 days in Ithaca before travelling again...and I do not feel that I have everything under control.
I was able to look at the new "bycatch" terms, because I think that this is the document that would be most useful for me to read.
As general thoughts, I think that these terms are not absolutely essential, because the same information could be derived from looking at the taxonomic scope. However, I also think that they are useful "convenience" terms. Basically, I do not have a strong opinion either way about whether to include the terms. If the number of terms balloons beyond these three, because the same information is added for other scopes, then I'm thinking that I would tend towards not wanting to include any new terms.
Regarding the terms themselves, I have added a small number of questions to the document about the descriptions of the terms, such as whether eco:hasNonTargetTaxa maybe could be logically turned into a true Boolean, because leaving the term blank when no taxonomic scope is specified is the same as saying that every reported taxon is outside of the bounds of the taxonomic scope. There are also multiple places in which it looks like terms from the TDWG controlled vocabulary (i.e. "SHOULD", "MUST") need to be added to replace the general English-language words that are in the current text. Finally, the current wording is vague enough to suggest that if the eco:Event is an observation of an individual species, then the new terms could/should be applied to each individual record of each independent taxon (i.e. to each observationID). Was this the intent, or would it be appropriate to only apply these terms within the structure at levels above the observationID (i.e. the first "parent" above the level of the individual observation)? I expect that there is no correct answer to this question, so I am only asking it to see whether this issue had been considered yet.
I think that these are the main questions/comments that came to mind while I was reading the descriptions of the new terms.
Wesley
P.S. While I will be reading my e-mail at least twice daily for the remainder of August, I probably will not have any larger blocks of time to devote to any work until 1 September, when I finally stop moving for a little while and I'm no longer in darkened conference rooms.
*******************
Wesley Hochachka
Senior Research Associate
Cornell Lab of Ornithology
ph. (607) 254-2484
*******************
------------------------------
*From:* Yanina Sica <yanina.sica@gmail.com>
*Sent:* Tuesday, August 15, 2023 04:57
*To:* tuco@berkeley.edu <tuco@berkeley.edu>; Humboldt Core TG <tdwg-humboldt@lists.tdwg.org>
*Cc:* Yi Ming Gan <ymgan@naturalsciences.be>; Wesley M. Hochachka <wmh6@cornell.edu>; Markus Döring (GBIF) <mdoering@gbif.org>; Peter Desmet <peter.desmet.work@gmail.com>
*Subject:* Re: [tdwg-humboldt] Humboldt Extension Public Review preparations
Thank you all for this lively discussion!
It seems we are moving towards including "bycatch" terms. To keep up with the proposed timeline for ratification, I will urge *everybody to attend tomorrow's meeting, if possible. Otherwise, provide your comments regarding these terms before tomorrow's meeting.*
The rest of the Humboldt Extension documents should be reviewed by the end of this week (*let's try to finish everything by Wed 22!*).
Agenda:
- Discuss 'bycatch' terms. Please share your opinion here https://docs.google.com/document/d/184BjPL-Kd9lv_x39sVGNxFKFlb4hY_Xsli5_dmnBMqs/edit, I included and suggested edits for the terms proposed by John
- State of the Hierarchical document https://docs.google.com/document/d/1r_XMEgB7p7OI7a5Ouq6G9oa7LmQFPcFhZZCLD9gWOIE/edit#heading=h.a0hj80sigtm6 :
  - discuss examples provided by Ming. Please see section 4
  - discuss comments in the text. Please see section 3
- Review List of Authors https://github.com/tdwg/hc/blob/main/build/authors_configuration.yaml (thanks Steve!)
- Assign people responsible* for:
  - Reviewing isLeastSpecificTargetCategoryQuantityInclusive https://github.com/tdwg/hc/blob/main/docs/inclusive/index.md
  - Reviewing Implementation report https://docs.google.com/document/d/1RFdSHoyzWCQk9qO6uup4xQjWOMzPyBb-A0mcjj98hbk/edit#heading=h.l89l0grun7cp, making sure latest advances are captured in the text (e.g. Bycatch or non-target species recorded section)
  - Reviewing Landing Page https://github.com/tdwg/hc/blob/main/docs/index.md
  - Reviewing User Guide https://docs.google.com/document/d/1rX4m94rtZDR_8iIe3RvRnNYKDJcmSX3ii4S5hCznEA0/edit#heading=h.rr2lqlf80rn0
*If you are not able to attend the meeting but are willing to critically review any document please reply to this email saying so.
If people are available and willing, I suggest we consider extending the meeting for 30-45min.
Final push!! we got this!
Yani
On Mon, Aug 14, 2023 at 7:14 PM John Wieczorek tuco@berkeley.edu wrote:
Herein lies the problem of not having eco:nonTargetTaxa:
"Assuming everything that is not the target is non-target..."
That assumption isn't supportable because the taxonomic opinions applied to the taxa (target, excluded, and nonTarget) can differ over time and between people. Thus, a taxon that might have been out of scope at the time of the Event could come into scope in a later analysis by virtue of changes in taxonomy. It seems to me never safe to make any assumptions in this respect, and that would mean that eco:nonTargetTaxa is actually required in order to make inferences if any non-target Occurrences are reported. These terms are in support of post-facto use of the data, and it's becoming clear, thinking about it, that inventory data can be very fragile for some purposes, and the downstream researcher will have to do a lot of due diligence to be able to use them wisely.
On Mon, Aug 14, 2023 at 11:48 AM Yi Ming Gan ymgan@naturalsciences.be wrote:
Thank you so much everyone! John I appreciate your attempt to jump-start the term definition.
In my humboldt opinion, eco:hasNontargetTaxa and eco:areNontargetOrganismsFullyReported are certainly helpful! eco:hasNontargetTaxa + eco:isTaxonomicScopeFullyReported can also help to differentiate whether an Event caught something vs nothing.
Assuming everything that is not the target is non-target, eco:nonTargetTaxa is helpful, but I feel like it could get complicated, for example when taking lifeStageScope etc. into account (adult fish are *not* a target, but larvae and juveniles are). Users can search the Occurrences for whatever is not in eco:targetTaxonomicScope (and other scopes) to determine which of those Organisms are non-target. So I suggest keeping this implicit, instead of listing every non-target taxon in eco:nonTargetTaxa. It is much easier to keep the non-target organisms in Occurrence, because they have their own lifeStage, degreeOfEstablishment etc. fields. I hope this makes sense.
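For illustration only, the implicit approach described above could look roughly like the following Python sketch. The record layout is an assumption made for this example: the dictionary keys `targetTaxonomicScope`, `targetLifeStageScope` and `higherTaxa`, the ` | ` list separator, and the sample records are all hypothetical, and the matching is plain string comparison that ignores any differences in taxonomic opinion.

```python
def split_list(value):
    """Split a ' | '-separated list into stripped, non-empty items."""
    return [item.strip() for item in value.split("|") if item.strip()]


def is_non_target(occurrence, event):
    """Return True if the occurrence falls outside the Event's declared scopes.

    Hypothetical rule: a scope that is not declared (empty) never excludes
    anything, so an Event with no scopes has no non-target occurrences.
    """
    target_taxa = set(split_list(event.get("targetTaxonomicScope", "")))
    target_stages = set(split_list(event.get("targetLifeStageScope", "")))

    # Names the occurrence is known by: its own name plus its higher taxa.
    names = {occurrence["scientificName"], *occurrence.get("higherTaxa", [])}

    outside_taxa = bool(target_taxa) and not (names & target_taxa)
    outside_stage = (
        bool(target_stages) and occurrence.get("lifeStage") not in target_stages
    )
    return outside_taxa or outside_stage


# Made-up Event: fish are targeted, but only larvae and juveniles.
event = {
    "targetTaxonomicScope": "Actinopterygii",
    "targetLifeStageScope": "larva | juvenile",
}
occurrences = [
    {"scientificName": "Gadus morhua", "higherTaxa": ["Actinopterygii"], "lifeStage": "larva"},
    {"scientificName": "Gadus morhua", "higherTaxa": ["Actinopterygii"], "lifeStage": "adult"},
    {"scientificName": "Carcinus maenas", "higherTaxa": ["Malacostraca"], "lifeStage": "adult"},
]
flags = [is_non_target(occ, event) for occ in occurrences]
print(flags)
```

In this sketch the adult fish is flagged because of its life stage alone, which is the kind of case that a plain list of non-target taxa would not capture.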
Thanks again!
Cheers
Ming
------------------------------
*From:* tdwg-humboldt <tdwg-humboldt-bounces@lists.tdwg.org> on behalf of Kate Ingenloff <kathryn.ingenloff@gmail.com>
*Sent:* 14 August 2023 16:00
*To:* tuco@berkeley.edu <tuco@berkeley.edu>
*Cc:* Wesley M. Hochachka <wmh6@cornell.edu>; Markus Döring (GBIF) <mdoering@gbif.org>; Peter Desmet <peter.desmet.work@gmail.com>; Humboldt Core TG <tdwg-humboldt@lists.tdwg.org>
*Subject:* Re: [tdwg-humboldt] Humboldt Extension Public Review preparations
Ahh, good point on that John.
I will amend my vote to support option 2. :)
On Mon, Aug 14, 2023 at 3:38 PM John Wieczorek tuco@berkeley.edu wrote:
Hi folks,
Thanks for this input, Kate.
It seems our opinions are divided more about when to address non-target organisms than about whether to do so. Assuming it will be done eventually, here is an attempt to get the issues on the table.
Kate is proposing terms from the Occurrence perspective in the sense that they would be populated for child-most Occurrence Events (I'll use eco:isNonTargetOrganism). That is, the Event record associated with the Occurrence would have a Humboldt extension term eco:isNonTargetOrganism populated.
Last Wednesday Anahita was proposing terms at a parent Event level (I'll use eco:hasNontargetTaxa, eco:areNontargetOrganismsFullyReported, and eco:nonTargetTaxa). This parent Event perspective alerts a user about what to expect among the children. It should be possible to confirm the expectations among the children by investigating them.
These two approaches are very different. The first approach seems like a bit of a contortion to me in that the term really ought to be an Occurrence term, and yet it would be going into the Event extension. A second problem I see is that it would be a challenge for users to know whether a data set, or a particular Event within a data set, is potentially fit for their purpose because all of the child Occurrences would have to be investigated to see if there were relevant non-target taxa.
The second approach circumvents the issues mentioned above. The term eco:hasNontargetTaxa for an Event would be a single boolean that could immediately alert a user about what to expect among the child Occurrences. It could also be populated for the Occurrence Events, serving the role intended for eco:isNonTargetOrganism. The term eco:nonTargetTaxa could actually list those taxa so the user would know what was considered non-target. The term eco:areNontargetOrganismsFullyReported seems extremely important, but it is also extremely problematic, the problem being that it seems like the value of this term would almost ALWAYS have to be 'false'.
So, there are clearly things to discuss, but here is an attempt to jump-start term definitions at least.
eco:hasNontargetTaxa
- definition: One or more dwc:Organisms of taxa outside the target taxonomic and organismal scopes were detected and reported for this dwc:Event.
- comments: Should be empty if no taxonomic scope is declared. Should be 'true' if Occurrences of taxa outside the taxonomic and organismal scopes as defined at the time of the dwc:Event are included for the dwc:Event. Should be 'false' if no Occurrences of taxa outside the taxonomic and organismal scopes as defined at the time of the dwc:Event are included for the dwc:Event.
- examples: 'true'; 'false'
eco:nonTargetTaxa
- definition: A list (concatenated and separated) of taxa reported during the dwc:Event that are outside of the eco:targetTaxonomicScope.
- comments: Non-target taxa can be reported at any taxonomic level. Recommended best practice is to separate multiple values in a list with space vertical bar space ( | ).
- examples: `Parabuteo unicinctus | Geranoaetus melanoleucus`; `Cetoniinae | Aclopinae | Cyclocephala modesta`
eco:areNontargetOrganismsFullyReported
- definition: Every dwc:Organism that was outside the eco:targetTaxonomicScope, and was detected during the dwc:Event, and was detectable using the given protocol, was reported.
- comments: This term is only relevant if the dwc:Event used restricted search or open search methods and the value should otherwise be empty. If all dwc:Organisms not included within the eco:targetTaxonomicScope and detected during the dwc:Event were reported, the value should be 'true'. If not all dwc:Organisms outside the eco:targetTaxonomicScope and detected during the dwc:Event were reported, the value should be 'false'.
- examples: 'true'; 'false'
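As a rough illustration of how the draft comments for eco:hasNontargetTaxa and eco:nonTargetTaxa might translate into values, here is a minimal Python sketch. Everything about the data layout is an assumption made for the example: the function name `summarize_non_target`, the `(name, higher_taxa)` tuples, the choice to leave eco:nonTargetTaxa empty when no scope is declared, and the sample Event are hypothetical, and scope membership is decided by simple name matching.

```python
def summarize_non_target(target_scope, reported):
    """Derive draft values for eco:hasNontargetTaxa and eco:nonTargetTaxa.

    target_scope: set of taxon names declared as the target taxonomic scope
                  (empty if no scope was declared).
    reported:     list of (scientific_name, higher_taxa) tuples for the
                  Occurrences reported for the Event.
    """
    # Per the draft comments: empty if no taxonomic scope is declared.
    if not target_scope:
        return {"hasNontargetTaxa": "", "nonTargetTaxa": ""}

    non_target = []
    for name, higher_taxa in reported:
        in_scope = name in target_scope or any(t in target_scope for t in higher_taxa)
        if not in_scope and name not in non_target:
            non_target.append(name)  # keep first-seen order, no duplicates

    return {
        "hasNontargetTaxa": "true" if non_target else "false",
        # Recommended separator from the draft: space vertical bar space.
        "nonTargetTaxa": " | ".join(non_target),
    }


# Made-up Event targeting passerines, with two raptors also reported.
reported = [
    ("Turdus rufiventris", ["Passeriformes"]),
    ("Parabuteo unicinctus", ["Accipitriformes"]),
    ("Geranoaetus melanoleucus", ["Accipitriformes"]),
]
summary = summarize_non_target({"Passeriformes"}, reported)
print(summary)
```

Nothing comparable is attempted for eco:areNontargetOrganismsFullyReported, since its value depends on what was detected but not recorded, which cannot be derived from the reported Occurrences.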
Cheers,
John
On Mon, Aug 14, 2023 at 5:09 AM Kate Ingenloff kathryn.ingenloff@gmail.com wrote:
Hi John, Steve, Yani, et al.,
Thank you so much John and Steve. The landing page looks great! I'll be happy to help look over documentation this week.
As for bycatch, as Yani said, we've discussed this several times and I think it's important to include a couple of the obvious terms for data providers to denote if individual occurrences within a survey are bycatch (e.g., isBycatch or isNonTargetOrganism) or perhaps to identify if a separate dataset (Event) of some level is nothing but bycatch (e.g., allBycatchReported or allNonTargetOrganismsReported). I think we would be remiss to not include a couple of relevant terms prior to public review. Discussion shouldn't have to take too long and feedback from the first round of review can help fill in the rest (if more than those two terms are necessary).
Just my two cents :)
Cheers, Kate
On Sat, Aug 12, 2023 at 2:43 AM John Wieczorek tuco@berkeley.edu wrote:
Hi folks,
Steve and I have been working through and finished (to the extent we can) the preparations of the documents needed for the public review of the Humboldt Extension. The idea is that the basic entry point to the review would be this landing page https://github.com/tdwg/hc/blob/main/docs/index.md and that everything to review would be accessible from there.
We need the Task Group to finalize all documents to be included and to authorize the Darwin Core Maintenance Group to initiate the review. When authorized, the Darwin Core Maintenance Group will send a message introducing the submission and how people should review it. It would be great to have a brief statement presenting the proposal from the Task Group to have at the beginning of that message. The DwC Maintenance Group will also solicit the TDWG Outreach folks to publicize the public review via various channels and social media. Anyone will be welcome to further publicize it in any community that TDWG misses.
The issue of new terms for by-catch came up late in last Wednesday's meeting after several people had to leave. I don't feel comfortable including anything official from that conversation without the Task Group making decisions. There are a few reasonable options.
The first option for the "by-catch" terms is to add those terms now and include them in the proposal. That means work up front to make sure the terms are well-defined and thought through. Think of this ratification process very much as if it was the publication of a manuscript with peer review. As such, an important goal is to try to avoid avoidable public discussion, which has the potential to slow things down or even derail ratification.
A second option might be to propose the new terms during public review and see if there is buy-in. This strategy is likely to make the ratification process slower, and runs a risk (that I might be inventing) that if such an added proposal came from people in the Task Group, reviewers might view that our work was submitted unfinished.
A third option might be to leave the proposal as is without additional terms, get it through ratification, and sometime afterwards propose new terms. This follows the normal evolution process of Darwin Core, so there would not be anything odd about it. It would also guarantee that there is demand for such terms, as that is a prerequisite for accepting new term proposals.
It isn't for the Darwin Core Maintenance Group to decide the strategy the Task Group should take, but rather to advise and facilitate in the search for a successful proposal.
I hope this feels like we are getting close.
Cheers,
John and Steve on behalf of the Darwin Core Maintenance Group
_______________________________________________
tdwg-humboldt mailing list
tdwg-humboldt@lists.tdwg.org
https://lists.tdwg.org/mailman/listinfo/tdwg-humboldt