Hi Joel and all,

The current MIxS specification (attached) does indeed specify the use of a single value for biome, feature and material. I think this is an error, and probably arose from MIxS's history in microorganisms, where it was not common to think about multiple values for those fields. I will file an issue on the MIxS tracker, suggesting that they change that.

In an application I am working on, we allow multiple values for those fields, which I guess is officially a violation of MIxS.
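
A minimal sketch of the single-value question, assuming a record stores each of the three fields as a list of strings. The field keys, the record layout, and the helper name are hypothetical, and the values are plain illustrative strings rather than checked ENVO classes.

```python
# Hypothetical record layout: each environment field holds a list of values.
ENV_FIELDS = ("biome", "feature", "material")


def multi_valued_fields(record):
    """Return the environment fields that carry more than one value."""
    return [field for field in ENV_FIELDS if len(record.get(field, [])) > 1]


# Illustrative values only, borrowed from habitat phrases used in this thread.
record = {
    "biome": ["forest"],
    "feature": ["rocky outcrop", "bog"],
    "material": ["leaf litter"],
}

# Fields listed here would break a strict one-value-per-field reading.
flagged = multi_valued_fields(record)
```

An application that permits multiple values could report `flagged` as a warning rather than reject the record.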

Ramona

------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden

On Fri, May 29, 2015 at 10:10 AM, joel sachs <jsachs@csee.umbc.edu> wrote:
Hi Pier,

I'll be happy to put my comments in the issue tracker. I also wanted to follow up on your comment that

"ENVO's guidelines suggest that there should be *at least* one class from
each hierarchy used. Indeed, multiple feature and material classes can and should be used to fully characterize an entity's environment."

I remember now that the ENVO paper mentions this, and gives, as an example, a duck in the water. Such a duck would have properties like
<dorsally_surrounded_by> <water>
<ventrally_surrounded_by> <air>

However, the sample data records provided as supplemental material to the MIXS paper all have one and only one value for each of "environmental feature", "environmental material", and "biome". The MIXS paper itself appears to be silent on the question of whether multiple values are allowed. The prototype annotation interface that my colleagues at AAFC are developing currently allows only a single value for each field. Do you know if this is right? wrong? unspecified? Maybe the real question is not what the standard permits/requires, but what will the repositories allow? Do you know what the repositories are thinking on this?

As an aside, I notice that MIXS uses the terms "biome", "environmental_material", and "environmental_feature" as properties, while ENVO uses them as classes. I like that, because I believe that we should allow semantics to be derived from syntax, and other forms of context.

I attended the Open Data conference here in Ottawa this week, and was a bit pressed for time, but would be happy to join a call about habitats.

Best,
Joel.



On Thu, 21 May 2015, Pier Luigi Buttigieg wrote:

Hi Joel,

These are valuable thoughts.
Could you post them to the ENVO issue tracker? It can be found here:
https://github.com/EnvironmentOntology/envo/issues

We can develop from there more effectively.

The habitat issue is one I plan to work on with Grant Godden next week in
Claremont.
I'm in Berkeley/Oakland right now working with a group interested in urban
environments.
Shall we perhaps arrange a call?

Best,
Pier


On Wed, 20 May 2015 14:28:40 -0700, Ramona Walls <rlwalls2008@gmail.com>
wrote:
Hi Joel,

I completely agree about the need to unpack the multiple dimensions of
"habitat" in DWC, while not making DWC into an ontology. I suggest the
discussion of ENVO terms should probably move to the ENVO list (then come
back here, if/when we are ready to deal with adding them to DWC).

I'd be very happy to work with you on BCO/DWC stuff and hear your needs.
Please feel free to just email me off list or email
bco-discuss@googlegroups.com. We have bi-weekly calls that are open to
anyone who wants to participate.

Ramona
[...]
On Mon, May 18, 2015 at 5:50 AM, joel sachs <jsachs@csee.umbc.edu> wrote:

Ramona,

Thanks for engaging. My thinking is this:

Our goal is to address the mess that is “habitat”. So much - environmental
conditions, associated taxa, geography, geology - gets crammed into that
one poor term.

We want to either replace or augment or subclass “habitat” with some new
terms, with each new term capturing some key dimension of information that
“people” are interested in. Before talking specifically about the terms
under consideration, let me disclose some of my biases:

<statement of biases>
i. I’d like to see terms that will help organize the thousands of habitat
terms used by the authors of taxonomic treatments in the Flora of North
America. You can see some of these terms here - http://bit.ly/1Fnow3E

We’re currently categorizing these terms to enable semantic search over
habitat; as we do so, we are also thinking about the user interface of
treatment templates for future contributors. Our opinion is that we should
look at existing habitat terms, answer the question “what have authors been
trying to tell us?”, and then design interfaces that make it easier for
them to tell us those things. (This isn’t to say that we shouldn’t also
coax new types of information out of treatment authors, based on our
understanding of what data consumers want to know.)

The habitat dimensions that, for plants, keep coming up are: wet vs dry;
open canopy vs closed canopy; soil type; sloped vs. flat; and associated
taxa. We also see a range of geography/geology related dimensions.

ii. I think the semantics of Darwin Core terms should be clear from their
natural language definitions. Ontologies that can be used to provide values
for the terms should not be relied upon as documentation for the terms
themselves.

iii. I’ve long argued that Darwin Core should not make commitments to any
particular upper ontologies. I explained some of my reasons in my 2013 TDWG
talk (http://bit.ly/1FjCimJ). I recall that you spoke with me about this
last year, and expressed support for the idea of creating ontological
layers on top of Darwin Core. I’ve since tried to do this for BCO, but
struggled. I’ve been meaning to ask for your help with this.
</statement of biases>

Back to the proposed terms:

        Biome. As has been noted by others, the currently proposed
definition is at odds with common understanding of the term, and is
somewhat confusing due to its dependence on evolution. I think I understand
what the definition is getting at - namely, expanding the traditional
notion of biome to include microbiomes.

        Environmental feature. We often see habitat terms such as “rocky
outcrops”; “arroyos”; “bogs”; etc. So I can see a lot of utility in a
“feature” term. My preference would be to call it “geographic feature” or
“geologic feature”, since I think that’s how most people think of such
features (see, e.g. geonames). The currently proposed definition is “A
material entity which determines an environmental system.” This requirement
that the feature “determine” the environmental system strikes me as too
strong.

        Environmental material. The currently proposed definition is “A
portion of environmental material is a fiat object which forms the medium
or part of the medium of an environmental system.”  A much clearer
definition is one that I’ve heard Pierre give, which was along the lines of
“The substance that was displaced by the sample prior to its being removed
from its environment”. In addition to being clearer, this second definition
has the advantage of not relying on the BFO notion of “fiat object”.


Happy Victoria Day to all my fellow British subjects!
Joel.



On Wed, 13 May 2015, Ramona Walls wrote:

 This is my first chance to reply to this thread, but I think several of
Joel's comments need to be addressed.

1. re.: ENVO terms: "As far as I can tell, no one knows how to use them."
 -- I know how to use them, and I know a community of people who know how to
use them. True, just like Darwin Core terms, they are often used incorrectly,
but further documentation and outreach can help with that.

2. "There was a lot of confusion over whether particular aspects of an
environment constituted an environmental feature, an environmental material,
or a biome. The correct answer was often dependent on context. For example
if a small mammal were found in leaf litter, then "leaf litter" would be the
environmental material, and the biome would be "forest". But if a microbe
were sampled from the same leaf litter, then "leaf litter" would be the
biome, and I'm not sure what the environmental material would be."
 -- ENVO very clearly distinguishes between a biome, a feature, and a
material. It is never the case that the same ENVO class can be used as both a
biome and a feature or a feature and a material. Although the same entity,
depending on its role, may serve as either a biome or material (or feature
for that matter), in that case, it would be an instance of two different
classes in ENVO. Take the leaf litter example. A correct annotation would
need to point to both a "leaf litter biome" class and a "leaf litter
material" class. It is really crucial not to confuse material entities in
the world with the roles they take on as instances of classes in ENVO.
 -- Joel and I seem to remember outcomes of that RCN meeting quite
differently (probably we were in different break-out groups). As I recall,
the major problem was that people couldn't use ENVO because the classes they
needed were not in there, not because they didn't know how. This is a
problem that would actually be helped by DwC adopting ENVO, because it would
create more users, and therefore more contributors to the ontology. Another
major problem was that people often want to describe environments in terms
of parameters like light level, salinity, temperature, etc. ENVO does not
currently include classes like this, but a movement is afoot to perhaps
add such a branch to ENVO.
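
The leaf litter case in point 2 can be sketched as a small check, assuming each class label belongs to exactly one hierarchy. The labels and the lookup table below are placeholders taken from this thread, not verified ENVO labels or identifiers, and the helper name is hypothetical.

```python
# Hypothetical lookup: class label -> the single hierarchy it belongs to.
# Labels are placeholders from this thread, not verified ENVO labels.
CLASS_HIERARCHY = {
    "forest": "biome",
    "leaf litter biome": "biome",
    "leaf litter material": "material",
}


def misplaced_slots(annotation):
    """Return the slots whose class does not belong to that slot's hierarchy."""
    return [
        slot
        for slot, label in annotation.items()
        if CLASS_HIERARCHY.get(label) != slot
    ]


# Small mammal found in leaf litter: the litter is the material.
mammal = {"biome": "forest", "material": "leaf litter material"}
# Microbe sampled from the same litter: the litter acts as the biome.
microbe = {"biome": "leaf litter biome"}
# Reusing the material class in the biome slot is what the check rejects.
wrong = {"biome": "leaf litter material"}

results = [misplaced_slots(a) for a in (mammal, microbe, wrong)]
```

The same physical litter appears under two different classes, one per role, so no single class ever fills two slots.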

3. "Creating tripartite (biome/feature/material) decompositions of
habitats
sometimes makes sense. Certainly, it made sense for some of the early
metagenomic assays that gave rise to ENVO. But it doesn't always make
sense, and there are often better ways to characterize an environment."
 -- True, there are cases when you cannot specify a biome, feature, and
material for an organism, but usually you can provide at least one or two of
them, which goes a long way toward standardizing environmental records and
making large-scale queries possible. I have not yet seen a better way to
classify environment on this scale. As I mentioned above, when it comes to
describing environments in terms of their physico-chemical parameters, ENVO
does not serve, but that does not negate the utility of ENVO-style
descriptions. Furthermore, as with most DwC terms, these are optional, and
people/institutions don't have to provide them if they are not relevant.

4. "The terms "env_biome", "env_feature", and "env_material" already exist in
the MIxS Sample extension to Darwin Core (along with "submitted to INSDC",
etc.). Why do they need to be moved into the core?"
 -- The main reason I can see is that they have a lot of applicability
outside of MIXS, that is, for occurrences that do not have any sequences
associated with them, and should not be hidden away in a place that suggests
they can only be applied to sequence data.

Ramona


On Fri, Apr 24, 2015 at 3:00 AM, <tdwg-content-request@lists.tdwg.org>
wrote:

      Message: 1
      Date: Thu, 23 Apr 2015 13:29:47 -0400 (EDT)
      From: joel sachs <jsachs@csee.umbc.edu>
      Subject: Re: [tdwg-content] Darwin Core Proposal - environment
      terms
      To: John Wieczorek <tuco@berkeley.edu>
      Cc: TDWG Content Mailing List <tdwg-content@lists.tdwg.org>
      Message-ID:

      <Pine.LNX.4.64.1504231321240.18117@linuxserver1.cs.umbc.edu>
      Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

      John,

      I have some concerns with these terms. As far as I can tell, no one
      knows how to use them. I was at a phenotype RCN meeting last year where
      the theme was environmental ontologies. The attendees were pretty savvy
      in terms of both ontologies and environmental terminology. We were
      given an overview of ENVO, and then, as an experiment, we broke into
      groups, and each group tried to use ENVO to describe particular
      environments. I don't recall any group being successful. There was a
      lot of confusion over whether particular aspects of an environment
      constituted an environmental feature, an environmental material, or a
      biome. The correct answer was often dependent on context. For example,
      if a small mammal were found in leaf litter, then "leaf litter" would
      be the environmental material, and the biome would be "forest". But if
      a microbe were sampled from the same leaf litter, then "leaf litter"
      would be the biome, and I'm not sure what the environmental material
      would be.

      Due to the confusion, Pier Luigi gave us a more in-depth tutorial when
      we re-convened. We didn't break back out into groups, but I wish we
      had, because I wonder if we would have had much more success.

      Creating tripartite (biome/feature/material) decompositions of habitats
      sometimes makes sense. Certainly, it made sense for some of the early
      metagenomic assays that gave rise to ENVO. But it doesn't always make
      sense, and there are often better ways to characterize an environment.
      I think it was a mistake for these terms to be made mandatory in
      MIxS/MIMARKS.

      But the question isn't "What should MIxS have done four years ago?",
      but "What should TDWG do now?". One wrinkle is that dwc:Habitat already
      exists. Will it stay in the core? Is the idea to create usage guides
      that explain when to use dwc:Habitat and when and how to use biome,
      feature, and material? Such an approach could work, but I'd like to see
      our usage guides differ from current ENVO/MIxS guidelines which mandate
      one and only one value for each of the terms. "Environmental feature",
      in particular, often merits multiple uses within the same record, and I
      think disallowing such usage would impede uptake of the term set. (As
      far as I can see from browsing metagenomic sampling metadata, it *has*
      impeded uptake of the term set.)

      So I'm not necessarily opposed to the addition of these terms, but I do
      wonder why we need them.

      You wrote that "there is currently no possibility of a Darwin Core
      PreservedSpecimen or MaterialSample record to meet the minimum
      requirements of a Mimarks Specimen record[6], as there is currently no
      way to share required environment terms." But MIMARKS specimen records
      are also required to have the fields "Submitted to INSDC",
      "Investigation-type", "Project name", "Nucleic acid sequence source",
      "Target gene or locus", and "Sequencing method". So won't it still be
      the case that there will be no possibility of a Darwin Core record
      being MIMARKS compliant, without appropriate augmentation?
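
A minimal sketch of that point, assuming a record is a flat mapping from field name to value. The required list below contains only the fields named in this message (it is not the full MIMARKS checklist), and the helper name, record layout, and sample values are hypothetical.

```python
# Only the fields named in this message; NOT the complete MIMARKS checklist.
REQUIRED_FIELDS = [
    "biome",
    "environmental feature",
    "environmental material",
    "Submitted to INSDC",
    "Investigation-type",
    "Project name",
    "Nucleic acid sequence source",
    "Target gene or locus",
    "Sequencing method",
]


def missing_required(record):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# A record carrying only the three proposed environment terms
# (values are illustrative strings).
record = {
    "biome": "forest",
    "environmental feature": "bog",
    "environmental material": "leaf litter",
}

still_missing = missing_required(record)
```

Under these assumptions, adding the three environment terms leaves the six sequence-related fields unfilled.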

      The terms "env_biome", "env_feature", and "env_material" already exist
      in the MIxS Sample extension to Darwin Core (along with "submitted to
      INSDC", etc.). Why do they need to be moved into the core?

      Cheers,
      Joel.



      On Thu, 26 Mar 2015, John Wieczorek wrote:

     > Dear all,
     >
     > This message pertains to a proposal[1] set forth in September 2013
     > concerning the environment terms biome, environmentalFeature, and
     > environmentalMaterial. I'm renewing the proposal because so much time has
     > passed and the original proposal was not carried through to completion.
     > There were no objections to the addition of those terms during the initial
     > public commentary. Discussion revolved around the recommendations for
     > how to populate them.
     >
     > The recommendations for all three terms will suggest using a controlled
     > vocabulary such as ENVO. The examples will be based on the set of
     > subclasses of the corresponding ENVO terms for biome[2],
     > environmentalFeature[3], and environmentalMaterial[4]. As with all Darwin
     > Core terms, the constraints on content are not part of the definition -
     > they are only illustrative recommendations.
     >
     > The importance of these terms was recognized anew at a Darwin Core and MIxS
     > Hackathon in Florence in Sep 2014[5]. One important outcome of that
     > workshop was the realization that there is currently no possibility of
     > a Darwin Core PreservedSpecimen or MaterialSample record to meet the
     > minimum requirements of a Mimarks Specimen record[6], as there is currently
     > no way to share required environment terms. This creates a huge and easy to
     > solve barrier to integration of data across the collection, sample, and
     > sequence realms.
     >
     > This proposal is not substantively different from the one discussed in
     > 2013. It differs from the final amended previous proposal in two ways, 1)
     > only the three terms biome, environmentalFeature, and environmentalMaterial
     > are proposed here (the proposal to change to the term 'habitat' has been
     > dropped), and 2) the term definitions have been updated to agree with those
     > in ENVO. The terms will be in the Darwin Core namespace (following the TDWG
     > community consensus in the previous discussion as well as the consensus to
     > coin the MaterialSample class in the Darwin Core namespace rather than use
     > obi:specimen, with the equivalency being made on the ontology side in
     > BCO[7]).
     >
     > The complete definitions of the three proposed terms are given below the
     > following references. This reopens the 30-day public commentary period for
     > the addition of new terms as described in the Darwin Core Namespace
     > Policy[8].
     >
     > [1] Original tdwg-content proposal for environment terms.
     > http://lists.tdwg.org/pipermail/tdwg-content/2013-September/003066.html
     > [2] ENVO biome. http://purl.obolibrary.org/obo/ENVO_00000428
     > [3] ENVO environmentalFeature. http://purl.obolibrary.org/obo/ENVO_00002297
     > [4] ENVO environmentalMaterial. http://purl.obolibrary.org/obo/ENVO_00010483
     > [5] DwC MIxS Meeting Notes.
     > https://docs.google.com/document/d/1Zexgsiol6WC83vDzMTCF3uUB7DcFmKL15DFEPbw5w6c/edit?usp=sharing
     > [6] Table of the core items of Mimarks checklists.
     > http://www.nature.com/nbt/journal/v29/n5/fig_tab/nbt.1823_T1.html
     > [7] Biological Collections Ontology. https://github.com/tucotuco/bco
     > [8] Darwin Core Namespace Policy.
     > http://rs.tdwg.org/dwc/terms/namespace/index.htm#classesofchanges
     >
     >
     > Term Name: biome
     > Identifier: http://rs.tdwg.org/dwc/terms/biome
     > Namespace: http://rs.tdwg.org/dwc/terms/
     > Label: Biome
     > Definition: An environmental system to which resident ecological
     > communities have evolved adaptations.
     > Comment: Recommended best practice is to use a controlled vocabulary such
     > as defined by the biome class of the Environment Ontology (ENVO). Examples:
     > "flooded grassland biome",
     > "http://purl.obolibrary.org/obo/ENVO_01000195".
     > Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
     > Refines:
     > Status: proposed
     > Date Issued: 2013-09-26
     > Date Modified: 2015-03-26
     > Has Domain:
     > Has Range:
     > Refines:
     > Version: biome-2015-03-26
     > Replaces:
     > IsReplaceBy:
     > Class: http://rs.tdwg.org/dwc/terms/Event
     > ABCD 2.0.6: not in ABCD
     >
     > Term Name: environmentalFeature
     > Identifier: http://rs.tdwg.org/dwc/terms/environmentalFeature
     > Namespace: http://rs.tdwg.org/dwc/terms/
     > Label: Environmental Feature
     > Definition: A material entity which determines an environmental system.
     > Comment: Recommended best practice is to use a controlled vocabulary such
     > as defined by the environmental feature class of the Environment Ontology
     > (ENVO). Examples: "meadow",
     > "http://purl.obolibrary.org/obo/ENVO_00000108".
     > Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
     > Refines:
     > Status: proposed
     > Date Issued: 2013-09-26
     > Date Modified: 2015-03-26
     > Has Domain:
     > Has Range:
     > Refines:
     > Version: environmentalFeature-2015-03-26
     > Replaces:
     > IsReplaceBy:
     > Class: http://rs.tdwg.org/dwc/terms/Event
     > ABCD 2.0.6: not in ABCD
     >
     > Term Name: environmentalMaterial
     > Identifier: http://rs.tdwg.org/dwc/terms/environmentalMaterial
     > Namespace: http://rs.tdwg.org/dwc/terms/
     > Label: Environmental Material
     > Definition: A portion of environmental material is a fiat object which
     > forms the medium or part of the medium of an environmental system.
     > Comment: Recommended best practice is to use a controlled vocabulary such
     > as defined by the environmental material class of the Environment Ontology
     > (ENVO). Examples: "scum",
     > "http://purl.obolibrary.org/obo/ENVO_00003930".
     > Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
     > Refines:
     > Status: proposed
     > Date Issued: 2013-09-26
     > Date Modified: 2015-03-26
     > Has Domain:
     > Has Range:
     > Refines:
     > Version: environmentalMaterial-2015-03-26
     > Replaces:
     > IsReplaceBy:
     > Class: http://rs.tdwg.org/dwc/terms/Event
     > ABCD 2.0.6: not in ABCD
     >
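
As a rough illustration of the three proposed terms, the sketch below builds one record keyed by the term identifiers given above, using the ENVO IRIs from the Comment fields as values. The flat dictionary layout and the helper name are assumptions; the proposal leaves the range of each term unspecified, so a label such as "meadow" would be equally acceptable as a value.

```python
# Namespace and ENVO IRIs are copied from the term definitions above;
# the dictionary layout and helper are assumptions for illustration.
DWC = "http://rs.tdwg.org/dwc/terms/"
ENVO_PREFIX = "http://purl.obolibrary.org/obo/ENVO_"

record = {
    DWC + "biome": ENVO_PREFIX + "01000195",                  # "flooded grassland biome"
    DWC + "environmentalFeature": ENVO_PREFIX + "00000108",   # "meadow"
    DWC + "environmentalMaterial": ENVO_PREFIX + "00003930",  # "scum"
}


def uses_envo_iri(value):
    """Return True when the value looks like an ENVO class IRI."""
    return value.startswith(ENVO_PREFIX)


all_envo = all(uses_envo_iri(value) for value in record.values())
```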