Hi all,
I wanted to forward these updates on the absence use case to everyone, especially Kate and Peter Brenton, to keep you all posted.
Peter - Your datasets from ALA for the Humboldt use case for the new data model were mentioned in the meeting. Yani is looking into adding them for the absence use case?
Related to this, I also wanted to bring up an earlier email thread with John Wieczorek about keeping the challenges to be solved in separate use cases. I had asked whether the same dataset could be used for combined use cases (community measurements + absence). This is his reply, which makes a lot of sense: "
From our (GBIF's use case) perspective, keeping the challenges that are to be solved in separate use cases will make for easier review, not only by us, but by others interested in the proposed solutions. For that reason I would keep an absence use case distinct from a community measurements use case, though both of them can use the same underlying dataset as the example. It sounds like there will be many variations on the absence challenge, and it might end up that putting several of those together in a single use case might eventually be of benefit (how to deal with every kind of absence), but for now I would like to identify the distinct variations and solve each of them. Does that make sense to you all who have the more intimate knowledge?
"
Other than that, it was also acknowledged that taxonomic knowledge changes over time and that this is a challenge for any taxonomic data. You can follow the discussions in the Google Doc here: https://docs.google.com/document/d/1xQGLHQW3hlMapT5utYi6kWXE7XbFjAGvoWI_V5IfbFA/edit?usp=sharing. I hope this is sufficient to keep everyone updated.
Cheers,
Ming
Begin forwarded message:
From: John Wieczorek <tuco@berkeley.edu>
Subject: Re: Absence Use Case Meetings Summary
Date: 12 January 2023 at 21:28:36 CET
To: "Benson, Abigail L" <albenson@usgs.gov>
Cc: Yi Ming Gan <ymgan@naturalsciences.be>, Anton Van de Putte <avandeputte@naturalsciences.be>, Wesley Hochachka <wmh6@cornell.edu>, "debora.arlt@slu.se" <debora.arlt@slu.se>, Tim Robertson <trobertson@gbif.org>, Yanina Sica <yanina.sica@yale.edu>, Rob Stevenson <rdstevenson10@gmail.com>, Ruben Perez <ruben.perez@vliz.be>, "Provoost, Pieter" <p.provoost@unesco.org>
Reply-To: tuco@berkeley.edu
Thank you for the summary, Abby, and sorry to everyone for not making the meetings yesterday.
But this will not work for something like eBird where a matrix would still be too unwieldy so we will still need a way to report taxon lists for events or datasets. The Humboldt extension targetTaxonomicScope (https://tdwg.github.io/hc/terms/#hc:targetTaxonomicScope) could be the answer for this. One concern I had from yesterday is providing these lists pipe delimited in that field but the new model should give us the ability to make terms expandable (John do I have that correct?).
Yes, that is correct. By expandable we mean that there could be many different targetTaxonomicScope entries for an Event, each with one Taxon, rather than only one targetTaxonomicScope entry for an Event with a list in a single |-delimited string.
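A minimal sketch of that difference, in Python, may help make it concrete. All names here are hypothetical: the `ScopeEntry` class, the `expand_scope` helper, the event ID, and the example taxa are assumptions for illustration only, and are not part of the Humboldt extension or the new data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeEntry:
    """One targetTaxonomicScope entry: a single taxon for a single Event (hypothetical shape)."""
    event_id: str
    target_taxonomic_scope: str


def expand_scope(event_id: str, delimited: str, sep: str = "|") -> list:
    """Split one |-delimited scope string into one entry per taxon.

    Whitespace around each taxon is trimmed and empty pieces are dropped.
    """
    taxa = [piece.strip() for piece in delimited.split(sep)]
    return [ScopeEntry(event_id, taxon) for taxon in taxa if taxon]


# Single-string form: one entry holding a pipe-delimited list.
single_string = "Aves | Mammalia|Reptilia"

# Expanded form: many entries for the same Event, each with one taxon.
entries = expand_scope("event-1", single_string)
for entry in entries:
    print(entry.event_id, entry.target_taxonomic_scope)
```

The expanded form carries the same information as the delimited string, but each taxon becomes its own record that can be linked or queried without string parsing.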
Cheers,
John
From: "Benson, Abigail L" <albenson@usgs.gov>
Subject: Absence Use Case Meetings Summary
Date: 12 January 2023 at 20:58:46 CET
To: Yi Ming Gan <ymgan@naturalsciences.be>, Anton Van de Putte <avandeputte@naturalsciences.be>, Wesley Hochachka <wmh6@cornell.edu>, "debora.arlt@slu.se" <debora.arlt@slu.se>, Tim Robertson <trobertson@gbif.org>, John Wieczorek <tuco@berkeley.edu>, Yanina Sica <yanina.sica@yale.edu>, Rob Stevenson <rdstevenson10@gmail.com>, "'Ruben Perez'" <ruben.perez@vliz.be>, "Provoost, Pieter" <p.provoost@unesco.org>
Hi All,
Thank you for the time yesterday and today to continue working through this topic. Yesterday we reviewed the mini use cases we have so far and agreed that a similarity is the need to infer absences to reduce data size and a major difference is the need to document life stage / sex absences. We determined that while changing taxonomy is an issue that needs to be considered, it's bigger than the absence use case can address. Also, there is no need to determine the taxon list a priori. It was great to get an overview of the progress to date for the Humboldt extension. Thanks, Yani!
I met with John today and have a few updates to share with you all from that meeting.
1. Occurrence as an event has been added to the unified model (https://docs.google.com/document/d/1QpXwole_j32QZAg6ddqOrAB5OOdqVJKdoKKzz06CK-o/edit#heading=h.z3p24m5xv4v9). I don't think that affects what we are doing with this use case, because we would prefer that absences are inferred rather than made explicit, but it's good to know that we have the option of continuing to use occurrenceStatus if needed.
2. A species site matrix data format is being considered for inclusion in the new data model. This would probably simplify our ability to report the sampling event type of data that we are usually working with, where taxa are provided as column headers.
3. But this will not work for something like eBird where a matrix would still be too unwieldy so we will still need a way to report taxon lists for events or datasets. The Humboldt extension targetTaxonomicScope (https://tdwg.github.io/hc/terms/#hc:targetTaxonomicScope) could be the answer for this. One concern I had from yesterday is providing these lists pipe delimited in that field but the new model should give us the ability to make terms expandable (John do I have that correct?).
4. One use case we should keep in mind for the true absence concept is folks who may want to report that something is not in a geographic area yet, e.g. an invasive species is not on a particular island yet.
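The idea of inferring absences from a taxon list, rather than storing them explicitly, can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not a description of how the new model does it: the `infer_absences` function, the event IDs, and the placeholder taxon names are all hypothetical. The only terms taken from this thread are targetTaxonomicScope and occurrenceStatus, with the Darwin Core values "present" and "absent".

```python
def infer_absences(scope_by_event, detections_by_event):
    """Derive an occurrenceStatus for every targeted taxon in every event.

    scope_by_event: dict mapping event ID -> set of taxa in the event's
        targetTaxonomicScope (what was looked for).
    detections_by_event: dict mapping event ID -> set of taxa actually
        detected (only presences are stored).
    Returns a list of (event_id, taxon, occurrenceStatus) tuples, sorted
    by event ID and then taxon.
    """
    records = []
    for event_id in sorted(scope_by_event):
        detected = detections_by_event.get(event_id, set())
        for taxon in sorted(scope_by_event[event_id]):
            # A targeted taxon with no detection is an inferred absence.
            status = "present" if taxon in detected else "absent"
            records.append((event_id, taxon, status))
    return records


# Hypothetical data: two events, each targeting the same three taxa.
scope = {
    "event-1": {"Taxon A", "Taxon B", "Taxon C"},
    "event-2": {"Taxon A", "Taxon B", "Taxon C"},
}
# Only presences are recorded; event-2 detected nothing.
detections = {"event-1": {"Taxon A", "Taxon C"}}

for record in infer_absences(scope, detections):
    print(record)
```

Only the presences and one scope list per event need to be stored; the absent rows are generated on demand, which is where the reduction in data size would come from.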
The plan going forward:
1. It would still be good to develop definitions for the concepts we have identified related to absences and share that out to the community for feedback. We should continue to work on this over the next month or two.
2. John will take the two mini-use cases we have (BROKE-west trawl and eBird) and work them up. Hopefully this will be ready for us to review before the end of the month I believe. But if not it may not be ready until sometime closer to May or June I'm guessing.
Hopefully that gives some sense of where we are headed. I'll schedule a meeting for us soon to work on developing the definitions for the concepts.
Thank you! Abby
------------
Abby Benson (she/her)
Biologist
U.S. Geological Survey
Science Analytics and Synthesis
Lakewood, CO
303.202.4087
ORCID: orcid.org/0000-0002-4391-107X