Hi all,

Thank you Debbie and John!

I am almost done mapping the dataset to the updated list of terms, but I need a bit more time because certain fields (e.g. eco:isLifeStageScopeComplete) are very difficult to populate: they sit at the dwc:Event level (a limitation of the star schema). I am also writing a script to infer non-detection from presence-only data using the Humboldt Extension for this dataset. This is still a work in progress. I have what I need and the logic in mind, but my R is rusty. I am doing this because it is otherwise difficult for me to see how I should populate the fields (and what not to do) so that the data can be interpreted, i.e. so that non-detection can be inferred.

With this exercise, I came to the conclusion that all of the Humboldt terms related to scope and completeness MUST be populated at the lowest level of dwc:Event in order to infer non-detection within a dataset. The results of the exercise and the best practices are, I believe, secondary and not the most important thing for the ratification process, but I hope my conclusion above can help shape the Hierarchical Events document.
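
A minimal sketch of that inference logic, written in Python rather than R. Everything in it is illustrative: the event and occurrence records, the taxon names and the function name infer_non_detections are hypothetical, and it assumes that eco:isTaxonomicScopeComplete = true means every detected taxon within eco:targetTaxonomicScope was reported. Only lowest-level events with both terms populated yield an inference.

```python
# Hypothetical, simplified records. Real data would come from the Event core,
# the Humboldt extension and the Occurrence extension of a Darwin Core Archive.
events = {
    "E1": {"parentEventID": None},
    "E1-1": {
        "parentEventID": "E1",
        "eco:targetTaxonomicScope": ["Taxon A", "Taxon B"],
        "eco:isTaxonomicScopeComplete": True,
    },
    "E1-2": {
        "parentEventID": "E1",
        "eco:targetTaxonomicScope": ["Taxon A", "Taxon B"],
        # Completeness is not stated here, so nothing can be inferred.
    },
}

occurrences = [
    {"eventID": "E1-1", "scientificName": "Taxon A"},
    {"eventID": "E1-2", "scientificName": "Taxon A"},
]


def infer_non_detections(events, occurrences):
    """Return {eventID: sorted target taxa that have no occurrence record}."""
    # Any event that is referenced as a parent is not a lowest-level event.
    parents = {e["parentEventID"] for e in events.values() if e["parentEventID"]}

    # Collect the taxa detected in each event.
    detected = {}
    for occ in occurrences:
        detected.setdefault(occ["eventID"], set()).add(occ["scientificName"])

    result = {}
    for event_id, event in events.items():
        if event_id in parents:
            continue  # only lowest-level events are considered
        if event.get("eco:isTaxonomicScopeComplete") is not True:
            continue  # assumption: no completeness statement, no inference
        targets = set(event.get("eco:targetTaxonomicScope", []))
        result[event_id] = sorted(targets - detected.get(event_id, set()))
    return result


non_detections = infer_non_detections(events, occurrences)
print(non_detections)
```

In this toy data only E1-1 produces an inferred non-detection, because E1 is a parent event and E1-2 carries no completeness statement.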

Regarding the __ID fields that Debbie mentioned, I formatted the table with the dwciri: prefix for the Humboldt terms, “pretending” that this feature is available in the IPT: https://github.com/gbif/ipt/issues/1947
This is especially useful for the scope terms (eco:targetTaxonomicScope, eco:targetLifeStageScope). This leads to my question: how should list/array values be handled for terms like dwciri:samplingPerformedBy? I noted that Ben Norton worked on a white paper about complex values/arrays (https://github.com/gbif/ipt/issues/1947#issuecomment-1447101388); does anyone know what happened to it?
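
To illustrate the question, here is one possible workaround sketched in Python: packing several IRIs into a single cell with the " | " separator that Darwin Core recommends for lists in text fields, then splitting them on read. This is only an assumption for discussion, not an agreed practice for dwciri: terms; the function name split_list_value and the example.org IRIs are placeholders.

```python
def split_list_value(value, separator="|"):
    """Split a delimited cell into a list of trimmed, non-empty items."""
    return [part.strip() for part in value.split(separator) if part.strip()]


# Placeholder IRIs, not real identifiers.
raw_cell = "https://example.org/person/1 | https://example.org/person/2"
performers = split_list_value(raw_cell)
print(performers)
```

The drawback is that the cell itself is no longer a single IRI, which is exactly what a dwciri: term is expected to hold.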


Cheers
Ming


On 31 Jul 2023, at 16:52, Paul, Deborah <dlpaul@illinois.edu> wrote:

Greetings John, et al.

Stepping in gently here (having not been at all these conversations), so please feel free to ignore this if it is just too off-base.

I'm wondering why not add 

Identification,identifiedByID? to

Identification,identifiedBy
Identification,identificationReferences

And coherently then, 

Sampling Effort,samplingPerformedByID
to go with 
Sampling Effort,samplingPerformedBy

Same for Sampling and Vouchers?

Material Collected,hasVouchers
Material Collected,voucherInstitutions

Who vouchered? (recordedBy, recordedByID?)

And, for all of the above information outlined in the XML list of groups and related terms, where does the information go about who did this sampling, as part of what project? grant numbers? etc. The EML file? (Where's the metadata?). Thanks for clueing me in.

Exciting to see the work getting to the place where it can be implemented 🙂

Debbie

From: tdwg-humboldt <tdwg-humboldt-bounces@lists.tdwg.org> on behalf of John Wieczorek <tuco@berkeley.edu>
Sent: Monday, July 31, 2023 9:18 AM
To: Humboldt Core TG <tdwg-humboldt@lists.tdwg.org>
Cc: Wesley M. Hochachka <wmh6@cornell.edu>; tdwg-humboldt <tdwg-humboldt-bounces@lists.tdwg.org>
Subject: Re: [tdwg-humboldt] Humboldt next steps!
 
Hi folks,

I hope the listserv is working.

I have made the necessary edits to the eco: terms in preparation for a new extension XML file. I set up the script so that the extension XML can be generated from the term_versions.csv file (the canonical management file where the term versions are archived). The script allows the XML file to contain sections in which to group terms so that they can be seen and mapped together in the IPT. I have proposed some new group names and rearranged the terms in them in the hope that it might make the mapping process flow better in the IPT. The names of the groups, the order of the groups and the order of terms within the groups can all be set easily in a configuration file separate from the term_versions.csv file. It would be nice if we could decide as a Task Group on these groupings and orders before making a release. It will help people assess the proposal too, I believe. The groupings and term order I propose follow:

group,term
Site,siteCount
Site,siteNestingDescription
Site,verbatimSiteDescriptions
Site,verbatimSiteNames
Site,geospatialScopeAreaInSquareKilometers
Site,totalAreaSampledInSquareKilometers
Site,reportedWeather
Site,reportedExtremeConditions

Habitat Scope,targetHabitatScope
Habitat Scope,excludedHabitatScope

Temporal Scope,eventDuration
Temporal Scope,eventDurationUnit

Taxonomic Scope,targetTaxonomicScope
Taxonomic Scope,excludedTaxonomicScope
Taxonomic Scope,taxonCompletenessReported
Taxonomic Scope,taxonCompletenessProtocols
Taxonomic Scope,isTaxonomicScopeComplete
Taxonomic Scope,isAbsenceReported
Taxonomic Scope,absentTaxa

Organismal Scope,targetLifeStageScope
Organismal Scope,excludedLifeStageScope
Organismal Scope,isLifeStageScopeComplete
Organismal Scope,targetDegreeOfEstablishmentScope
Organismal Scope,excludedDegreeOfEstablishmentScope
Organismal Scope,isDegreeOfEstablishmentScopeComplete
Organismal Scope,targetGrowthFormScope
Organismal Scope,excludedGrowthFormScope
Organismal Scope,isGrowthFormScopeComplete

Identification,identifiedBy
Identification,identificationReferences

Methodology Description,compilationType
Methodology Description,compilationSourceTypes
Methodology Description,inventoryTypes
Methodology Description,protocolNames
Methodology Description,protocolDescription
Methodology Description,protocolReferences
Methodology Description,isAbundanceReported
Methodology Description,isAbundanceCapReported
Methodology Description,abundanceCap
Methodology Description,isVegetationCoverReported
Methodology Description,isLeastSpecificTargetCategoryQuantityInclusive

Material Collected,hasVouchers
Material Collected,voucherInstitutions
Material Collected,hasMaterialSamples
Material Collected,materialSampleTypes

Sampling Effort,samplingPerformedBy
Sampling Effort,isSamplingEffortReported
Sampling Effort,samplingEffortProtocol
Sampling Effort,samplingEffortValue
Sampling Effort,samplingEffortUnit
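
For anyone who wants to work with a listing like the one above, here is a minimal Python sketch that reads group,term rows into ordered groups. It is not the script that generates the extension XML; the function name group_terms and the shortened sample listing are illustrative only.

```python
import csv
import io

# A shortened sample in the same group,term layout as the listing above.
listing = """group,term
Habitat Scope,targetHabitatScope
Habitat Scope,excludedHabitatScope
Temporal Scope,eventDuration
Temporal Scope,eventDurationUnit
"""


def group_terms(text):
    """Map each group name to its terms, keeping the order of appearance."""
    groups = {}  # dicts preserve insertion order, so group order is kept
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row["group"], []).append(row["term"])
    return groups


groups = group_terms(listing)
for name, terms in groups.items():
    print(name, terms)
```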

The original groups and term order were:

General Dataset & Identification,samplingPerformedBy
General Dataset & Identification,identificationReferences
General Dataset & Identification,identifiedBy

Geospatial Scope & Habitat,geospatialScopeAreaInSquareKilometers
Geospatial Scope & Habitat,totalAreaSampledInSquareKilometers
Geospatial Scope & Habitat,siteNestingDescription
Geospatial Scope & Habitat,siteCount
Geospatial Scope & Habitat,verbatimSiteDescriptions
Geospatial Scope & Habitat,verbatimSiteNames
Geospatial Scope & Habitat,targetHabitatScope
Geospatial Scope & Habitat,excludedHabitatScope
Geospatial Scope & Habitat,reportedWeather
Geospatial Scope & Habitat,reportedExtremeConditions

Temporal Scope,eventDuration
Temporal Scope,eventDurationUnit

Taxonomic & organismal Scope,targetTaxonomicScope
Taxonomic & organismal Scope,targetDegreeOfEstablishmentScope
Taxonomic & organismal Scope,targetGrowthFormScope
Taxonomic & organismal Scope,targetLifeStageScope
Taxonomic & organismal Scope,excludedTaxonomicScope
Taxonomic & organismal Scope,excludedDegreeOfEstablishmentScope
Taxonomic & organismal Scope,excludedGrowthFormScope
Taxonomic & organismal Scope,excludedLifeStageScope

Methodology Description,compilationType
Methodology Description,compilationSourceTypes
Methodology Description,inventoryTypes
Methodology Description,protocolNames
Methodology Description,protocolDescription
Methodology Description,protocolReferences
Methodology Description,isAbundanceReported
Methodology Description,isAbundanceCapReported
Methodology Description,abundanceCap
Methodology Description,isVegetationCoverReported
Methodology Description,hasVouchers
Methodology Description,voucherInstitutions
Methodology Description,hasMaterialSamples
Methodology Description,materialSampleTypes
Methodology Description,isLeastSpecificTargetCategoryQuantityInclusive

Completeness & Effort,isSamplingEffortReported
Completeness & Effort,samplingEffortProtocol
Completeness & Effort,samplingEffortValue
Completeness & Effort,samplingEffortUnit
Completeness & Effort,taxonCompletenessReported
Completeness & Effort,taxonCompletenessProtocols
Completeness & Effort,isTaxonomicScopeComplete
Completeness & Effort,isDegreeOfEstablishmentScopeComplete
Completeness & Effort,isGrowthFormScopeComplete
Completeness & Effort,isLifeStageScopeComplete
Completeness & Effort,isAbsenceReported
Completeness & Effort,absentTaxa

On Wed, Jul 26, 2023 at 12:14 PM Yanina Sica <yanina.sica@gmail.com> wrote:
Hi all, 

Sorry about the listserv errors, duplicate emails, and event cancellations... the universe conspired against us today. Despite this, we managed to meet and had a very good and productive meeting.

We have resolved our latest doubts regarding the principles in the hierarchical doc and summarized our next steps:
- John will finish some small edits to the list of terms and update the IPT
- Ming will map again the BROKE-West campaign with the latest eco: terms and guiding principles
- John and Ming will populate some good examples in the hierarchical document
- Dmitry will try to attract GBIF's attention (and funding?) to push the Humboldt User Guide to the next level and transform it into a nice format that allows easy updates.

I cleaned and edited some of the previous documents and organized the Drive.
Next week, we will take a look at the examples from Ming and John and, if we all agree, we could formally submit the extension! We are getting there!

Thank you all for your last efforts!

All the best!

PS: You should have received a calendar invite for next week. In case you did not, please let me know. You can also find the info below:

When to meet
Wednesdays 8 am EST (see your time here)
Where to meet

Yani

_______________________________________________
tdwg-humboldt mailing list
tdwg-humboldt@lists.tdwg.org
https://lists.tdwg.org/mailman/listinfo/tdwg-humboldt