tdwg-content
04 Sep '14
> clear need to distinguish between processes that generate a specimen as an output (a material entity) and those that instead generate only information or data (what we call an information content entity).
Absolutely. O&M has two distinct "Process" classes - one used in observations to generate results, and one used to transform specimens. I also have resisted all suggestions that these be merged, since, despite them both being 'processes', the type of output is critically different!
Somehow the wires got crossed in this conversation and it appears the wrong impression has been received. Not sure how that happened.
Simon Cox | Research Scientist
CSIRO Land and Water
PO Box 56, Highett Vic 3190, Australia
Tel +61 3 9252 6342 | Mob +61 403 302 672
simon.cox(a)csiro.au | http://csiro.au/people/SimonCox
----------------------------------------------------------------------
Message: 1
Date: Wed, 3 Sep 2014 10:54:06 -0700
From: Ramona Walls <rlwalls2008(a)gmail.com>
Subject: [tdwg-content] Darwin Core: proposed news terms for
expressing sample data
To: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>
Message-ID:
<CAJYF1k5k=AMKQgK1a7eZ7nyLZtoD+DDcgdFJYoCxm7o=yZy=bA(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Simon,
It is an interesting perspective to say that "the only real reason to
collect and curate a specimen is to support observations". Although I don't
disagree, this does depend on a very broad definition of "observation".
In the BCO, following OBI, we define the process of specimen collection as
contingent on the material collected potentially being used for some
current or future investigation. I don't think it is much of a stretch to
see the overlap between "use in some investigation" and "support
observations", so I don't think we are at odds here.
Nonetheless, as Rob mentioned, there is a clear need (for data tracking
purposes) to distinguish between processes that generate a specimen as an
output (a material entity) and those that instead generate only information
or data (what we call an information content entity). We should probably
start a new email thread if we want to continue this discussion on the TDWG
list, but I am glad we got your input on this and hope we can continue to
coordinate.
Ramona
------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden
OK - There were actually two points hidden in my email.
The first (relationship of sampling and observations) was really just a throwaway.
Probably shouldn't have even gone there in the context of this conversation.
But please note that sampling and observations are very clearly separated in O&M. So is the feature-of-interest (usually a sample) from an observation, and so is the notion of 'result' from 'observation' (the latter being an action or event which generates the former).
These are the key features of the O&M model, the development of which was triggered by conversations where these things were confused.
O&M was written up as an attempt to disentangle things (it is rooted in earlier "patterns" work from Fowler and O'Dea).
Please don't surmise what might be said in O&M without taking a look - there is no conflation.
(The perversity I referred to is rather that a model for sampling could be established without describing the relationship with observations, but I was being a smart-aleck and probably shouldn't have even mentioned it.
In O&M there is merely a loose link between them, in that a SamplingFeature has an (optional) property 'relatedObservation' to link a sample to observations made on it.)
The second point is that, in the fields where I have experience, the key sampling metadata relates to provenance.
Since there is now a standard ontology for provenance, maybe a sample model can be derived from this.
Hence
sam:Specimen rdfs:subClassOf prov:Entity , sam:SamplingFeature .
Simon
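As a rough illustration of the separation described in the message above (a sketch only, not taken from the O&M specification), the following Python keeps the observation event, its result, and the feature of interest as distinct objects, with only an optional link from a sampling feature back to the observations made on it. Apart from the idea behind 'relatedObservation', every class name, field name, and value here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List


# eq=False keeps identity-based equality, which avoids recursive
# comparisons through the circular specimen <-> observation link.
@dataclass(eq=False)
class SamplingFeature:
    """Something created to support observation, e.g. a specimen."""
    identifier: str
    # Optional, loose link to observations made on this feature.
    related_observation: List["Observation"] = field(default_factory=list)


@dataclass(eq=False)
class Observation:
    """An act or event; it generates a result but is not the result."""
    feature_of_interest: SamplingFeature
    observed_property: str
    result: Any


specimen = SamplingFeature("specimen-001")  # hypothetical identifier
obs = Observation(specimen, "leaf length (mm)", 42.0)
specimen.related_observation.append(obs)
```

The specimen exists and can be described whether or not any observation is ever attached to it, which is the "loose link" mentioned above.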
----------------------------------------------------------------------
Message: 1
Date: Tue, 2 Sep 2014 08:41:50 -0600
From: Robert Guralnick <Robert.Guralnick(a)colorado.edu>
Subject: Re: [tdwg-content] Darwin Core: proposed news terms for
expressing sample data
To: Simon.Cox(a)csiro.au
Cc: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>, Ramona
Walls <rlwalls2008(a)gmail.com>
Message-ID:
<CADAgxGWFmcKgEfYhfQ6u9=7gEzRgSSDdguaaOiPVcwK=ftwz=g(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Simon --- Perhaps your email also illustrates Ramona's point about the complexity of the landscape right now when it comes to the _details_ of how we might model samples and sampling processes. Darwin Core recently added a materialSample term, which was a refinement of OBI's "specimen" (http://purl.obolibrary.org/obo/OBI_0100051), with the description: "A resource describing the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed" (http://rs.tdwg.org/dwc/dwctype/MaterialSample).
Maybe what you and what others are developing are equivalent expressions and perhaps they are not, but my initial reaction was that separating specimens and observations is a remarkably important and clarifying idea, and not perverse at all. I think the difference is that what you might call an observation is what the Biocollections Ontology community would refer to as a "sampling process" and what we call an "observation" is more related to some documentation of a "thing" in nature.
I say this recognizing that others, even those working on the Biocollections Ontology or Darwin Core, may have radically different notions of those same concepts. I hope not, but this is not simple to model. What I would like to see is some clarity among those putting in all this effort to do so more strategically, with bridges built across efforts and locations.
Best, Rob
On Mon, Sep 1, 2014 at 10:31 PM, <Simon.Cox(a)csiro.au> wrote:
> Hi Ramona -
>
> I understand your concern, though I would counterpoint that the only
> real reason to collect and curate a specimen is to support
> observations, either contemporaneously or at some future time.
> So it could be seen as slightly perverse to suggest that a model for
> specimens and samples could be divorced from the notion of observations.
>
> FWIW I'm right now trying to develop a simplified SamplingFeatures
> ontology, still conceptually based on the ISO 19156 model, but with no
> commitments to marginal ontologies (i.e. lift it out of the ISO 19100
> ghetto). This has led me to consider re-use of more standard ontologies.
> W3C Prov-O is interesting. Since a lot of the information that you
> would want to record about a specimen concerns its provenance, then it
> probably makes sense to align with prov. Currently I have
>
> sam:Specimen a owl:Class ;
> rdfs:comment "A Specimen is a physical sample, obtained for
> observation(s) normally carried out ex-situ, sometimes in a
> laboratory."^^xsd:string ;
> rdfs:label "Specimen"@en ;
> rdfs:subClassOf prov:Entity , sam:SamplingFeature ;
> rdfs:subClassOf [ a owl:Restriction ;
> owl:cardinality "1"^^xsd:nonNegativeInteger ;
> owl:onProperty sam:samplingTime
> ] .
>
> Simon
>
Simon,
It is an interesting perspective to say that "the only real reason to
collect and curate a specimen is to support observations". Although I don't
disagree, this does depend on a very broad definition of "observation".
In the BCO, following OBI, we define the process of specimen collection as
contingent on the material collected potentially being used for some
current or future investigation. I don't think it is much of a stretch to
see the overlap between "use in some investigation" and "support
observations", so I don't think we are at odds here.
Nonetheless, as Rob mentioned, there is a clear need (for data tracking
purposes) to distinguish between processes that generate a specimen as an
output (a material entity) and those that instead generate only information
or data (what we call an information content entity). We should probably
start a new email thread if we want to continue this discussion on the TDWG
list, but I am glad we got your input on this and hope we can continue to
coordinate.
Ramona
------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden
Thank you, Simon, for that explanation and the links. They were very
helpful. Amen to the point: "There is no 'sample' class, because it is such
an overloaded word (noun, verb, statistical sample vs ex-situ sample,
etc)." The documents you shared highlight the very important point that OGC
and OBO-E were designed specifically to describe observations.
Darwin Core, on the other hand, was designed to capture information about
"taxa, their occurrence in nature as documented by observations, specimens,
samples, and related information" [1]. As such, observations are not
central to Darwin Core, but rather are included as evidence of the
occurrence of a taxon in nature. It works for communicating basic
information about an observation or other evidence of a taxon's occurrence,
but I think it would be mis-using and abusing DwC to try to shoe-horn the
complexity of observation data/metadata into it. It also does some
dis-service to the communities who have spent so much time developing OGC
and OBO-E.
Eamonn, this is not meant to discredit the work that you and your
colleagues have done to develop a DwC archive schema for sampling data. I
think it is an important step toward developing a comprehensive framework
for biodiversity data, and just by proposing it, we have moved a step in
the right direction (even if I disagree about adopting it). Your point that
OBO-E is far more complex is true, and we may have to adopt more terms if
we accurately want to describe observation data in DwC. On the other hand,
we do not need to necessarily adopt every aspect of OBO-E to exchange
observation data.
What the BCO participants -- and thanks to all the GBIF people who have
participated! -- are trying to do is build a framework that can work across
many (not necessarily all) types of biodiversity data, including specimen
collection and observations, while considering existing efforts such as
DwC, MIxS, OBO-E, and OBO Foundry ontologies. We started with specimens,
but the intention has always been to link to observation data as well [2].
Although the full BCO model probably will be large and complex, we fully
intend to offer views that are basically subsets of the ontology filtered
for applications. This is regular practice now in application ontologies.
Views make it possible to provide a controlled vocabulary for data
annotation without burdening annotators with a confusing array of terms and
logical definitions.
However, the point that BCO is not yet ready for your needs is correct, and
I would never tell anyone to just "hold on to your data until the ontology
is ready". Did you examine the possibility of using EML as an exchange
format for the sampling/survey related data? DwC-A already has an EML
component, so I wonder if some combination of an occurrence core with an
extended EML document (based on OGC) would work.
Ramona
[1] http://rs.tdwg.org/dwc/index.htm
[2]
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0089606
------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden
On Fri, Aug 29, 2014 at 3:00 AM, <tdwg-content-request(a)lists.tdwg.org>
wrote:
> Today's Topics:
>
> 1. Re: Darwin Core: proposed news terms for expressing sample
> data (Simon.Cox(a)csiro.au)
> 2. Re: tdwg-content Digest, Vol 63, Issue 6 (sigh) (Éamonn)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 29 Aug 2014 07:58:14 +0000
> From: <Simon.Cox(a)csiro.au>
> Subject: Re: [tdwg-content] Darwin Core: proposed news terms for
> expressing sample data
> To: <tdwg-content(a)lists.tdwg.org>
> Message-ID:
> <
> 2A7346E8D9F62D4CA8D78387173A054A5FFF5801(a)exmbx04-cdc.nexus.csiro.au>
> Content-Type: text/plain; charset="utf-8"
>
> G'day again TDWGers:
>
> Matt passed on links to this thread to me and suggested I comment, as I
> was the author of the O&M standard (published as ISO 19156:2011 and OGC
> Abstract Spec Topic 20).
>
> For those who are not aware of this work, there is a short Wikipedia page
> http://en.wikipedia.org/wiki/Observations_and_Measurements whose main
> value is it has links to a number of more detailed resources.
> Probably the richest of these is another Wiki page at CSIRO
> https://www.seegrid.csiro.au/wiki/AppSchemas/ObservationsAndSampling
> which hasn't been updated much recently, but at least has some diagrams
> embedded.
> As Matt and others have hinted, as a result of a workshop at NCEAS a few
> years ago, there were some tweaks to allow it to meet some of the
> requirements identified in OBOE, just in time to beat the ISO deadline!
>
> O&M includes a generic model for 'Sampling Features', being those
> artefacts that are created to assist the observation process, but would not
> exist or be of much interest otherwise.
> Things like specimens, transects, sections, quadrats, scenes and swaths,
> drillholes, flightlines, trajectories, ships tracks, etc.
> Because it is a generic standard, you won't find things with names
> familiar to any particular discipline, and there are a lot of stub classes
> for supporting information which need filling out for specific applications.
> But the intention is that it provides a framework for a discipline or
> community to specialize for their purposes, while retaining some topology
> and perhaps terminology (maybe just as super-classes) that help with
> information sharing across discipline boundaries.
> The main properties of a sampling feature are
>
> - The sampledFeature - being the domain object which it is being
> used to characterize
>
> - Related sampling features - other features related to the
> observational strategy
>
> - Related observations - observation events that use this
> sampling feature (for which another generic model is provided)
> We've generally found it helpful in teasing apart observational records
> and protocols in a variety of environmental science applications, and others
> have applied it in oceans, meteorology, even air-traffic control!
>
> The primary classification of sampling features in O&M is by topological
> dimension (point, curve, surface, solid), because these are commonly used
> and afford common processing methods.
> 'Specimen' is the other concrete sampling-feature type.
> There is no 'sample' class, because it is such an overloaded word (noun,
> verb, statistical sample vs ex-situ sample, etc).
>
> O&M and its Sampling Feature model were designed in UML.
> As Matt notes, the original implementation in the OGC context was in
> XML, using GML
> http://schemas.opengis.net/samplingSpecimen/2.0/specimen.xsd and
> http://schemas.opengis.net/samplingSpatial/2.0/spatialSamplingFeature.xsd.
> However, it has been implemented other ways: there is an OWL2/RDFS
> representation at
> http://def.seegrid.csiro.au/isotc211/iso19156/2011/sampling which is
> linked in with OWL versions of a bunch of the other ISO standards, and
> therefore probably makes too many commitments for the faint hearted - see
> paper from ISWC 2013 here http://ceur-ws.org/Vol-1063/paper1.pdf
> O&M was also one of the core inputs to the W3C Semantic Sensor Network
> ontology, reported here:
> http://www.w3.org/2005/Incubator/ssn/wiki/Incubator_Report though that
> focussed on the sensors and observations side of the equation, and hardly
> deals with sampling.
>
> Hope this helps.
>
> >> Date: Thu, 21 Aug 2014 18:52:06 -0800
> >> From: Matt Jones <jones(a)nceas.ucsb.edu>
> >> Subject: Re: [tdwg-content] Darwin Core: proposed news terms for
> >> expressing sample data
> >> To: Éamonn Ó Tuama [GBIF] <eotuama(a)gbif.org>
> >> Cc: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>
> >> Message-ID:
> >> <CAFSW8xkx7uRP9PC2g3=JT_VJanqujH8nPXoz8GXwh+JwKw5Ccw(a)mail.gmail.com>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >> This proposal is treading on ground that is quite similar to other
> >> observations and measurements standards for data exchange that are
> >> already mature, in particular:
> >>
> >> * OGC Observations and Measurements (
> >> http://www.opengeospatial.org/standards/om)
> >> * Extensible Observation Ontology (OBOE;
> >> https://semtools.ecoinformatics.org/oboe)
> >>
> >> The former is a standard and broadly deployed, whereas the latter is
> >> part of a research program in the use of ontologies for measurements.
> >> Through collaboration between the two projects, they've been modified
> >> to be reasonably isomorphic, but O&M uses an XML serialization while
> >> OBOE uses an OWL-DL serialization. They largely express the same
> >> measurements and sampling model once one gets beyond the terminology
> >> differences.
> >>
> >> So, I'm wondering if it makes much sense to extend Darwin Core, which
> >> is at heart an Occurrence exchange syntax, into this measurements area
> >> that is well represented by these other existing specifications? I'm
> >> curious to hear why people would even want to do this. And if we do go
> >> down this path, won't we just end up with a new syntax that does
> >> essentially what O&M and OBOE do now?
> >>
> >> Matt
>
>
>
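The three main properties of a sampling feature listed in the quoted message above, together with the classification by topological dimension, can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the class layout, the use of plain strings for the sampled feature and for observation identifiers, and all example values are hypothetical and are not part of O&M.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Shape(Enum):
    """Topological dimension used to classify spatial sampling features."""
    POINT = 0
    CURVE = 1
    SURFACE = 2
    SOLID = 3


# eq=False keeps identity-based equality between features.
@dataclass(eq=False)
class SamplingFeature:
    identifier: str
    sampled_feature: str           # domain object being characterized
    shape: Optional[Shape] = None  # assumption of this sketch: None marks a specimen
    related_sampling_features: List["SamplingFeature"] = field(default_factory=list)
    related_observations: List[str] = field(default_factory=list)


# Hypothetical example: a specimen taken along a transect of the same plot.
transect = SamplingFeature("transect-7", "Example Forest plot", Shape.CURVE)
specimen = SamplingFeature("specimen-001", "Example Forest plot")
specimen.related_sampling_features.append(transect)
specimen.related_observations.append("obs-2014-09-01")
```

A discipline-specific model would specialize these stubs with its own names, which is the intent described in the message.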
Re: [tdwg-content] Darwin Core: proposed news terms for expressing sample data
by Simon.Cox@csiro.au 02 Sep '14
Hi Ramona -
I understand your concern, though I would counterpoint that the only real reason to collect and curate a specimen is to support observations, either contemporaneously or at some future time.
So it could be seen as slightly perverse to suggest that a model for specimens and samples could be divorced from the notion of observations.
FWIW I'm right now trying to develop a simplified SamplingFeatures ontology, still conceptually based on the ISO 19156 model, but with no commitments to marginal ontologies (i.e. lift it out of the ISO 19100 ghetto). This has led me to consider re-use of more standard ontologies. W3C Prov-O is interesting. Since a lot of the information that you would want to record about a specimen concerns its provenance, then it probably makes sense to align with prov. Currently I have
sam:Specimen a owl:Class ;
    rdfs:comment "A Specimen is a physical sample, obtained for observation(s) normally carried out ex-situ, sometimes in a laboratory."^^xsd:string ;
    rdfs:label "Specimen"@en ;
    rdfs:subClassOf prov:Entity , sam:SamplingFeature ;
    rdfs:subClassOf [ a owl:Restriction ;
        owl:cardinality "1"^^xsd:nonNegativeInteger ;
        owl:onProperty sam:samplingTime
    ] .
Simon
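The restriction in the class definition above says that a Specimen has exactly one sam:samplingTime. OWL uses open-world semantics, so a reasoner treats this as an axiom to infer from rather than as a validation rule. The Python below is therefore only a closed-world approximation of the same idea, assuming records are held as plain dictionaries; the function name and the sample records are hypothetical.

```python
from typing import Dict, List


def has_exactly_one(record: Dict[str, List[str]], prop: str) -> bool:
    """Closed-world check: the record lists exactly one value for prop."""
    return len(record.get(prop, [])) == 1


# Hypothetical specimen records keyed by property name.
ok = {"samplingTime": ["2014-09-01T10:31:00Z"]}
missing: Dict[str, List[str]] = {}
repeated = {"samplingTime": ["2014-09-01", "2014-09-02"]}

results = [has_exactly_one(r, "samplingTime") for r in (ok, missing, repeated)]
```

Only the first record passes; the second lacks a sampling time and the third has two.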
===============
Date: Mon, 1 Sep 2014 20:38:33 -0700
From: Ramona Walls <rlwalls2008(a)gmail.com>
Subject: Re: [tdwg-content] tdwg-content Digest, Vol 63, Issue 14
To: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>
Message-ID:
<CAJYF1k6wcwPWtUt6ZHr8OgEcBJVYHUq0jiYc0hqEJeMY_kQ=1A(a)mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Thank you, Simon, for that explanation and the links. They were very
helpful. Amen to the point: "There is no 'sample' class, because it is such
an overloaded word (noun, verb, statistical sample vs ex-situ sample,
etc)." The documents you shared highlight the very important point that OGC
and OBO-E were designed specifically to describe observations.
Darwin Core, on the other hand, was designed to capture information about
"taxa, their occurrence in nature as documented by observations, specimens,
samples, and related information" [1]. As such, observations are not
central to Darwin Core, but rather are included as evidence of the
occurrence of a taxon in nature. It works for communicating basic
information about an observation or other evidence of a taxon's occurrence,
but I think it would be mis-using and abusing DwC to try to shoe-horn the
complexity of observation data/metadata into it. It also does some
dis-service to the communities who have spent so much time developing OGC
and OBO-E.
Eamonn, this is not meant to discredit the work that you and your
colleagues have done to develop a DwC archive schema for sampling data. I
think it is an important step toward developing a comprehensive framework
for biodiversity data, and just by proposing it, we have moved a step in
the right direction (even if I disagree about adopting it). Your point that
OBO-E is far more complex is true, and we may have to adopt more terms if
we accurately want to describe observation data in DwC. On the other hand,
we do not need to necessarily adopt every aspect of OBO-E to exchange
observation data.
What the BCO participants -- and thanks to all the GBIF people who have
participated! -- are trying to do is build a framework that can work across
many (not necessarily all) types of biodiversity data, including specimen
collection and observations, while considering existing efforts such as
DwC, MIxS, OBO-E, and OBO Foundry ontologies. We started with specimens,
but the intention has always been to link to observation data as well [2].
Although the full BCO model probably will be large and complex, we fully
intend to offer views that are basically subsets of the ontology filtered
for applications. This is regular practice now in application ontologies.
Views make it possible to provide a controlled vocabulary for data
annotation without burdening annotators with a confusing array of terms and
logical definitions.
However, the point that BCO is not yet ready for your needs is correct, and
I would never tell anyone to just "hold on to your data until the ontology
is ready". Did you examine the possibility of using EML as an exchange
format for the sampling/survey related data? DwC-A already has an EML
component, so I wonder if some combination of an occurrence core with an
extended EML document (based on OGC) would work.
Ramona
[1] http://rs.tdwg.org/dwc/index.htm
[2]
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0089606
------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden
Hi Markus,
Much of the "muddle" that I was referring to has been exposed in a series
of recent hackathons (to which we have tried to always invite someone from
GBIF), and has not yet been published, although [1,2,3] touch on the issue.
I will try to describe some of the problems we have encountered here.
For those of you who are tired of this thread already, the short answer is
that the problem lies in the lack of underlying clarity and semantics in
Darwin Core, not DwC-A. I think Darwin Core archives do a pretty good job
of capturing the content of Darwin Core terminology, but as those terms are
often (sometimes intentionally) ambiguous and imprecise, those aspects are
captured too.
Below are a few of the specific issues. I don't think much of this will be
new to the readers of this list.
1. Ambiguously defined classes: Darwin Core only has a handful of classes
(Occurrence, Event, Location, etc.), and those are intended to be used as
grouping classes, rather than strictly defined entities. Therefore, if
someone uses the corresponding properties (occurrenceID, eventID,
locationID), they are more or less free to attach those IDs to whatever
they want. In many cases, the type of entity referred to by occurrenceID can
be inferred from the object of basisOfRecord, but not always, and it is
still an inference with some error. *IF* DwC does go in the direction of
capturing sample/survey data, one thing that would help would be to have it
specified via a set of controlled vocabulary terms in basisOfRecord. I
won't even start with Taxon.
2. Domain- and range-less properties: DwC properties (the bulk of the terms
in DwC) are intentionally left without domains and ranges to make them
maximally re-usable. For traditional museum specimen data, this works
reasonably well, and the intended meaning is often clear. We recently
completed the exercise of specifying domains and ranges for the whole DwC
vocabulary, as interpreted for an Occurrence Core archive, and found that
many of them do not refer to DwC classes as their domain. This makes it
impossible to infer their meaning without a set of external assumptions.
Our interpreted ranges of DwC properties are a mix of data values
(literals) and potential classes in other ontologies. {I will share this
mapping on request - it is still under review}. The proposed properties
also have no domains or ranges. For example, the suggested range of
'quantity type' is a heterogeneous mix of entities (individuals, biomass,
%species, scale type) that sets off a huge red flag to my ontological self.
And what about the domain of 'quantity type' and 'quantity'? As they are in
the Occurrence extension table, they probably are meant to describe whatever
is represented by occurrenceID, but this is not specified.
Vague domains and ranges may work for simple sampling schemas that could be
interpreted as "taxon W (value of occurrenceID) has quantity X (value of
quantity) with unit Y (value of samplingUnit) of quantity type Z (value of
quantityType)", but even then it is likely to be ambiguous. Now suppose
W=123someID (with taxonName as Bufo bufo), X=9, Y=individuals, and Z is
left blank, because that is just what people do sometimes. Let's be
generous and suppose that there is also some sampling geometry and location
information. How to interpret this? Does this mean that there is a jar in a
museum with 9 toads in it, all of which came from a single collecting event
at that location? Does it mean that there was a survey done of the
location described, and 9 toads were counted but not collected? Does it
mean that they surveyed the area and estimated toad abundance at 9 toads
m^-2? Even if they are conscientious and put in m^2 as the sampling unit,
the meaning is still unclear. Did they measure an area of 1 m2 and find 9
toads, or an area of 10 m2 and find 90 toads, or did they measure an area of
90 m2 and find 9 toads? All of these would be valid interpretations. Of course,
many sampling schemas are much more complex than this, with nested sampling
protocols or repeat sampling of the same area. There is just no way to
capture that without more semantics than DwC can provide.
3. Using the wrong property for information: Even if the larger community
interprets a DwC property with some certainty, that doesn't stop people
from using it wrongly. This is often not the fault of DwC or DwC archives, if
people don't take the time to make sure they are filling out the fields
properly. Even with very clearly defined ontology terms, some curators just
don't take the time to read and understand the definitions. However, the
ambiguity and vagueness of DwC terms surely contributes to this.
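
To make the ambiguity in point 2 concrete, here is a minimal Python sketch. It is only an illustration: the FlatRecord class and the possible_readings function are hypothetical names invented for this example, and the list of readings is just the handful discussed above, not an exhaustive or authoritative interpretation of the proposed terms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlatRecord:
    """Hypothetical flat row holding the proposed terms (not a real DwC schema)."""
    occurrence_id: str
    scientific_name: str
    quantity: float
    sampling_unit: str
    quantity_type: Optional[str] = None  # often left blank in practice


def possible_readings(rec: FlatRecord) -> List[str]:
    """List interpretations that the flat row cannot tell apart."""
    readings = []
    if rec.quantity_type is None:
        # Without a quantity type, a specimen lot and a field count look the same.
        readings.append(f"{rec.quantity:g} preserved specimens in one lot")
        readings.append(f"{rec.quantity:g} individuals counted in a survey, none collected")
    if rec.sampling_unit in {"m^2", "metre^2"}:
        # An area unit does not say whether the quantity is a density or a total.
        readings.append(f"density of {rec.quantity:g} individuals per {rec.sampling_unit}")
        readings.append(f"{rec.quantity:g} individuals in total over an area given in {rec.sampling_unit}")
    return readings


record = FlatRecord("123someID", "Bufo bufo", 9, "m^2")
readings = possible_readings(record)
for text in readings:
    print(text)
```

Even when quantity_type is filled in, the sketch still returns more than one reading for an area unit, which is the point being made above.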
-----
I think that sampling or survey data is just too structured to describe well
with a flat schema like DwC. That is why most of it to date has been stored
in relational databases. To repeat myself, I think it would make more sense
to first develop a sound semantic model for the data then extract a Darwin
Core archive format for transmitting it, rather than build the DwC archive
first and later try to map it to a semantic model.
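
As a rough sketch of the "model first, flat format second" order, assuming a deliberately tiny model: the SamplingEvent and TaxonCount classes and the to_flat_row function are hypothetical names for this example only. The keys quantity, quantityType and samplingUnit are the proposed terms under discussion; eventID, scientificName and samplingProtocol are existing DwC terms.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class SamplingEvent:
    """Structured description of one sampling event (hypothetical model)."""
    event_id: str
    area_m2: float
    protocol: str


@dataclass(frozen=True)
class TaxonCount:
    """A count of one taxon, explicitly tied to the event that produced it."""
    event: SamplingEvent
    scientific_name: str
    individuals: int
    collected: bool


def to_flat_row(count: TaxonCount) -> Dict[str, Union[str, float]]:
    """Derive a flat row from the model; the density is computed, not guessed."""
    return {
        "eventID": count.event.event_id,
        "scientificName": count.scientific_name,
        "quantity": count.individuals / count.event.area_m2,
        "quantityType": "individuals",
        "samplingUnit": "metre^2",
        "samplingProtocol": count.event.protocol,
    }


event = SamplingEvent("plot-1", 10.0, "quadrat count")
row = to_flat_row(TaxonCount(event, "Bufo bufo", 90, collected=False))
print(row)
```

Going in this direction, the flat row is a lossy view of a model that still records the sampled area and whether anything was collected. Going the other way, that information would have to be guessed.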
Ramona
[1] http://www.biomedcentral.com/1471-2105/15/257
[2]
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0089606
[3]http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3746421/
------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden
On Sat, Aug 30, 2014 at 3:00 AM, <tdwg-content-request(a)lists.tdwg.org>
wrote:
> Today's Topics:
>
>    1. Re: tdwg-content Digest, Vol 63, Issue 6 (sigh) (Markus Döring)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 29 Aug 2014 12:02:40 +0200
> From: Markus Döring <m.doering(a)mac.com>
> Subject: Re: [tdwg-content] tdwg-content Digest, Vol 63, Issue 6
> (sigh)
> To: Ramona Walls <rlwalls2008(a)gmail.com>
> Cc: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>, John Deck
> <jdeck(a)berkeley.edu>
> Message-ID: <A6B43FE0-A595-4526-BD13-EE196D950435(a)mac.com>
> Content-Type: text/plain; charset="windows-1252"
>
> Dear Ramona,
>
> could you point me to the evidence for muddled semantic mappings in
> existing dwc archives? I would like to better understand the problem. Is it
> a general issue with Darwin Core terms and their loose definition or just
> their application in dwc archives?
>
> best,
> Markus
>
>
>
> On 29 Aug 2014, at 06:35, Ramona Walls <rlwalls2008(a)gmail.com> wrote:
>
> > I also really do appreciate GBIF's pressing need to serve survey/sample
> data, and I don't have much of a problem with the idea of an Event core
> or even adding new terms to DwC (in principle). Rather, I am urging caution
> in how we proceed with it.
> >
> > Éamonn, in response to your statement "Once the BCO model is available
> for uptake, it should be possible to develop a mapping between it and the
> simple DwC sample model" I will respond: possible, maybe; easy, no way;
> unambiguous, probably impossible. The problem I foresee is that once the
> "simple DwC sample model" is in place, people will start using it to do all
> kinds of not so simple things, and the mapping will become muddled. We have
> ample evidence that this is the case with existing Darwin Core archives.
> >
> > Going back to the five new terms that Éamonn proposed, I would like to
> see if we can link them NOW to existing ontology terms (as others have
> proposed), thus making their semantics explicit from the start, but still
> allowing GBIF/EU-BON to proceed with the work they need to do. This will
> not prevent people from misusing terms, but may at least help make mapping
> easier later. In cases where the terms can't be mapped to an existing term,
> BCO curators would be willing to help develop a term or set of terms that
> can convey the meaning required, or work with other ontology developers to get
> the terms added elsewhere.
> >
> > Trying to be constructive, I attempted to do a quick and dirty,
> preliminary mapping of the five terms (quantity, quantity type, sampling
> geometry, sampling unit, and event series ID), bearing in mind that I am
> not an authority on OBOE or OGC ontologies. [Aside: Based on what I know,
> OGC ontologies are not yet sufficiently developed to provide the semantics
> we need, but I would love for someone to show me otherwise.]
> >
> > A serious problem with mapping these terms to existing ontologies is
> that some of them do NOT map to a single ontology term (namely, quantity,
> quantity type, and sampling geometry). This is evidence that the proposed
> terms could indeed be interpreted in multiple ways and further supports the
> argument that it would not be easy to retrospectively add them to a
> semantic framework at some later date.
> >
> > I think there is a path forward that would allow for both the
> expressiveness of OBOE and other ontologies and convenience of standard
> exchange formats.
> >
> > Ramona
> > ------------------------------------------------------
> > Ramona L. Walls, Ph.D.
> > Scientific Analyst, The iPlant Collaborative, University of Arizona
> > Research Associate, Bio5 Institute, University of Arizona
> > Laboratory Research Associate, New York Botanical Garden
> >
> >
> > On Thu, Aug 28, 2014 at 8:58 AM, Robert Guralnick <
> Robert.Guralnick(a)colorado.edu> wrote:
> >
> > Hi all --- Ok, I think the scope of the issue is quite clear. Let me
> summarize: 1)  As Éamonn and the rest of GBIF have made quite clear, "GBIF
> is faced with the immediate task of making sample-based data discoverable
> and accessible using its current ecosystem of tools" given a funding
> mandate from EU-BON. 2) The solution for this problem is to develop an
> Event-core and to promote new terms to the Darwin Core to make this
> happen.   I will note a small inconsistency here:  the current ecosystem of
> standards and tools is Darwin Core (as it stands) and publishing systems
> such as IPT. That ecosystem of tools includes mechanisms to extend Darwin
> Core where needed, via extensions. The current ecosystem of tools doesn't
> include new Cores or new DwC terms, does it?
> >
> > So this leads in nicely to the contentious issue(s) and places where
> there seems to be discussions --- these have to do with the nature of the
> changes suggested and the scope of those changes, both in terms of an Event
> core and DwC term additions. Leaving aside the Event-core for now, the key
> questions simply about term additions to the Darwin Core that seem to be at
> heart here are: 1) Is the intent of the Darwin Core to model surveys,
> which usually involve multiple kinds and types of sampling over multiple
> sites using multiple methods?  2) Is the solution to invent new terms for
> the Darwin Core? If there are already terms from other efforts, shouldn't we
> work with those existing efforts to assure interoperability?
> >
> > I appreciate the efforts of GBIF here fully, and am personally torn
> because on the one hand, I fully agree with the goal of extending Darwin
> Core to better represent richer biodiversity data. On the other hand, I
> worry about process here and how to make that happen in a way that isn't
> too hasty or locks us into just the opposite of what I think many of us
> want with regards to sharing data more broadly than within just one
> ecosystem of tools.
> >
> > Best, Rob
> >
> >
> >
> >
> >
> > On Thu, Aug 28, 2014 at 6:30 AM, John Deck <jdeck(a)berkeley.edu> wrote:
> > I see the rationale for enabling this in Darwin Core Archives and adding
> the new terms. However, back to what Matt Jones brought up: "won't we just
> end up with a new syntax that does essentially what O&M and OBOE do now?".
> >
> > We should include explicit references to existing terms/definitions that
> encapsulate what we're talking about, e.g. in our MaterialSample proposal
> last year we linked to an existing term in OBI, which has a much richer
> description and context for MaterialSample than what we considered (
> https://code.google.com/p/darwincore/issues/detail?id=167)
> >
> > Have we explored the possibility of doing this with OBOE? I'm not
> suggesting we adopt OBOE wholesale, but it seems like we have a good
> opportunity to enable better semantic linking with that effort.
> >
> > John
> >
> > On Thu, Aug 28, 2014 at 4:23 AM, Éamonn Ó Tuama [GBIF] <eotuama(a)gbif.org>
> wrote:
> > Thanks, Ramona and Rob.
> >
> > I'd like to add a few points following on Markus's reply.
> >
> > I think your pressing of the need for a robust semantic model for
> > biodiversity sample/survey data is incontestable - we do need one and it
> > should enable rich data integration once it is defined and the tools and
> > data standards to support it become available. However, GBIF is faced
> with
> > the immediate task of making sample-based data discoverable and
> accessible
> > using its current ecosystem of tools (IPT) and exchange standards (DwC;
> > EML). Waiting for a functional, implementable semantic model and the
> tools
> > and support services for it is just not an option for us right now.
> >
> > We have already spent considerable time in analysing the merits of
> > Occurrence core vs Event core and have opted for an Event core for
> reasons
> > previously given. I don't believe we are trying to reconfigure Event ("an
> > action that occurs at a place and during a period of time") and
> regardless
> > of whether we use Occurrence or Event, the need for some additional terms
> > arises (e.g., quantity, quantityType, samplingGeometry, samplingUnit).
> Once
> > the BCO model is available for uptake, it should be possible to develop a
> > mapping between it and the simple DwC sample model.
> >
> > So GBIF's stance is that we need to take a two-pronged approach by
> exploring
> > how the IPT and DwC-A can be adapted for publishing sample-based data in
> the
> > near term while supporting the work of TDWG and groups such as the BCO in
> > advancing biodiversity informatics. GBIF has already engaged in the work
> of
> > the BCO and will continue to do so.
> >
> > Éamonn
> >
> > -----Original Message-----
> > From: tdwg-content-bounces(a)lists.tdwg.org
> > [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Markus Döring
> > Sent: 28 August 2014 12:44
> > To: Ramona Walls
> > Cc: TDWG Content Mailing List
> > Subject: Re: [tdwg-content] tdwg-content Digest, Vol 63, Issue 6
> >
> > Hi Ramona & Rob,
> >
> > The Event proposal does not try to change the semantics of an Event, it
> just
> > uses the existing Darwin Core Event "class" at the core in Darwin Core
> > archives. The actual change proposed is simply adding 3 new terms to the
> > Event "group" to better share information about sampling methods &
> efforts,
> > extending the existing limited capabilities of Darwin Core which already
> has
> > the terms dwc:samplingProtocol and dwc:samplingEffort. It also proposes 2
> > new terms for dealing with quantity of Occurrences, something that has
> been
> > discussed since 2012 now, when I had proposed a new abundance term [2].
> >
> >
> > In general application of Darwin Core is not at all limited to specimens
> and
> > observations. It is used for sharing taxonomic datasets already and its
> > definition and goal is broad. Let me cite some of the introduction to
> Darwin
> > Core [1]:
> >
> > What is the Darwin Core?
> > The Darwin Core is body of standards. It includes a glossary of terms (in
> > other contexts these might be called properties, elements, fields,
> columns,
> > attributes, or concepts) intended to facilitate the sharing of
> information
> > about biological diversity by providing reference definitions, examples,
> and
> > commentaries. The Darwin Core is primarily based on taxa, their
> occurrence
> > in nature as documented by observations, specimens, samples, and related
> > information.
> >
> > Motivation: The Darwin Core standard was originally conceived to
> facilitate
> > the discovery, retrieval, and integration of information about modern
> > biological specimens, their spatiotemporal occurrence, and their
> supporting
> > evidence housed in collections (physical or digital). The Darwin Core
> today
> > is broader in scope and more versatile. It is meant to provide a stable
> > standard reference for sharing information on biological diversity. As a
> > glossary of terms, the Darwin Core is meant to provide stable semantic
> > definitions with the goal of being maximally reusable in a variety of
> > contexts.
> >
> >
> > Markus
> >
> >
> > [1] http://rs.tdwg.org/dwc/index.htm
> > [2] https://code.google.com/p/darwincore/issues/detail?id=142
> >
> >
> > --
> > Markus Döring
> > Software Developer
> > Global Biodiversity Information Facility (GBIF)
> > mdoering(a)gbif.org
> > http://www.gbif.org
> >
> >
> >
> > >> On Tue, Aug 19, 2014 at 6:11 AM, Éamonn Ó Tuama [GBIF] <
> eotuama(a)gbif.org>
> > wrote:
> > >>
> > >> Dear All,
> > >>
> > >>
> > >>
> > >> GBIF is committed to exploring ways in which the IPT and Darwin Core
> > Archive format can be extended for publishing sample-based data sets. In
> > association with the EU BON project [1], a customised version of the IPT
> [2]
> > has been deployed to test this using a special type of Darwin Core
> Archive
> > in which the core is an "Event" with associated taxon occurrences in an
> > "Occurrence" extension.
> > >>
> > >>
> > >>
> > >> The Darwin Core vocabulary already provides a rich set of terms with
> many
> > relevant for describing sample-based data. Synthesising several sources
> of
> > input (GBIF organised workshop on sample data, May 2013 [3], discussions
> on
> > the TDWG mailing list in late 2013; internal discussion among EU BON
> project
> > partners), five new terms relating to sample data were identified as
> > essential. The complete model including these new terms is fully
> described
> > with examples in the online document "Publishing sample data using the
> GBIF
> > IPT" [4].
> > >>
> > >>
> > >>
> > >> As a first step towards ratification, we would like to register the
> new
> > terms in the DwC Google Code tracker [5] if there are no major
> objections on
> > this list. The five terms are:
> > >>
> > >>
> > >>
> > >> 1. quantity: the number or enumeration value of the quantityType
> > (e.g., individuals, biomass, biovolume, BraunBlanquetScale) per
> samplingUnit
> > or a percentage measure recorded for the sample.
> > >>
> > >>
> > >>
> > >> 2. quantityType: the entity being referred to by quantity,
> e.g.,
> > individuals, biomass, %species, scale type.
> > >>
> > >>
> > >>
> > >> 3. samplingGeometry: an indication of what kind of space was
> > sampled; select from point, line, area or volume.
> > >>
> > >>
> > >>
> > >> 4. samplingUnit: the unit of measurement used for reporting the
> > quantity in the sample, e.g., minute, hour, day, metre, metre^2, metre^3.
> > It is combined with quantity and quantityType to provide the complete
> > measurement, e.g., 9 individuals per day, 4 biomass-gm per metre^2.
> > >>
> > >>
> > >>
> > >> 5. eventSeriesID: an identifier for a set of events that are
> > associated in some way, e.g., a monitoring series; may be a global unique
> > identifier or an identifier specific to the series.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> Best regards,
> > >>
> > >>
> > >>
> > >> Éamonn
> > >>
> > >>
> > >>
> > >> [1] http://eubon.eu <http://eubon.eu/>
> > >>
> > >> [2] http://eubon-ipt.gbif.org <http://eubon-ipt.gbif.org/>
> > >>
> > >> [3]
> >
> http://www.standardsingenomics.org/index.php/sigen/article/view/sigs.4898640
> > >>
> > >> [4] <http://links.gbif.org/sample_data_model>
> > http://links.gbif.org/sample_data_model
> > >>
> > >> [5] https://code.google.com/p/darwincore/issues/list
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> ____________________________________________________
> > >>
> > >> Éamonn Ó Tuama, M.Sc., Ph.D. (eotuama(a)gbif.org),
> > >>
> > >> Senior Programme Officer for Interoperability,
> > >>
> > >> Global Biodiversity Information Facility Secretariat,
> > >>
> > >> Universitetsparken 15, DK-2100, Copenhagen Ø, DENMARK
> > >>
> > >> Phone: +45 3532 1494 <tel:%2B45%203532%201494> ; Fax: +45 3532 1480
> > <tel:%2B45%203532%201480>
>
> >
>
> End of tdwg-content Digest, Vol 63, Issue 15
> ********************************************
>
Dear all,
Not sure if this is appropriate use of this list, but I don't know where else to go. I was hoping I could prevail on someone to help me find out how to use extensions in the IPT provider for data from an SQL database, without creating duplicate rows in the core file or empty rows (apart from the core ID) in the extension file(s). I think I tried everything, but I must be overlooking something.
I found this, http://lists.gbif.org/pipermail/ipt/2013-October/000545.html, during my googling for an answer, which I think has the solution, but I can't work out how to implement it, as my IPT installation doesn't seem to let me create a mapping to just an extension and only lets me use one data mapping source for any resource.
Niels
___________________________________________________________
Niels Klazenga
Programmer Information Technology, Biodiversity Information Officer
National Herbarium of Victoria, Royal Botanic Gardens
Birdwood Avenue
South Yarra, VIC 3141
Australia
Tel: (03) 9252 2346
Fax: (03) 9252 2444
e-mail: niels.klazenga(a)rbg.vic.gov.au
--
This email and any attachments may contain information that is personal, confidential, legally privileged and/or copyright. No part of it should be reproduced, adapted or communicated without the prior written consent of the sender and/or copyright owner.
It is the responsibility of the recipient to check for and remove viruses.
If you have received this email in error, please notify the sender by return email, delete it from your system and destroy any copies. You are not authorised to use, communicate or rely on the information contained in this email.
Please consider the environment before printing this email.
This email was Anti Virus checked by Sophos UTM.
Dear Niels,
You're on the right track. You have to start by configuring several sources of the type "database as a source". There is information about how to do this in the IPT User Manual here: https://code.google.com/p/gbif-providertoolkit/wiki/IPT2ManualNotes?tm=6#So…
Then you need to add a mapping (selecting the extension), and then choose the source you want to map to on the next page. You may select from your multiple sources, and there is more information about this in the IPT User Manual here: https://code.google.com/p/gbif-providertoolkit/wiki/IPT2ManualNotes?tm=6#Da…
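
As an illustration of why separate sources help, here is a small sketch using Python's built-in sqlite3 module. The specimen and determination tables and their contents are invented for this example, and the IPT itself is configured through its web interface, not through code like this. The snippet only shows the shape of the queries you might give to each source: one query per core record, and one query that returns only the extension rows that actually exist.

```python
import sqlite3

# Hypothetical example database: three specimens, two determinations for one of them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE specimen (id TEXT PRIMARY KEY, scientific_name TEXT);
CREATE TABLE determination (specimen_id TEXT, identified_by TEXT);
INSERT INTO specimen VALUES
    ('S1', 'Acacia dealbata'), ('S2', 'Eucalyptus regnans'), ('S3', 'Banksia marginata');
INSERT INTO determination VALUES ('S1', 'A. Smith'), ('S1', 'B. Jones');
""")

# A single joined source repeats S1 and pads S2 and S3 with empty extension values.
joined = conn.execute("""
    SELECT s.id, s.scientific_name, d.identified_by
    FROM specimen s LEFT JOIN determination d ON d.specimen_id = s.id
    ORDER BY s.id
""").fetchall()

# Source for the core mapping: exactly one row per specimen.
core_rows = conn.execute(
    "SELECT id, scientific_name FROM specimen ORDER BY id"
).fetchall()

# Source for the extension mapping: only existing rows, keyed by the core ID.
extension_rows = conn.execute(
    "SELECT specimen_id, identified_by FROM determination "
    "ORDER BY specimen_id, identified_by"
).fetchall()

print(len(joined), len(core_rows), len(extension_rows))
```

With two sources, the core file has no duplicate rows and the extension file has no rows that contain only the core ID.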
For your information, there is an IPT mailing list ( http://lists.gbif.org/mailman/listinfo/ipt ) you can use for IPT questions in the future. Further information about the IPT is available from the GBIF website http://www.gbif.org/ipt or from the IPT Google Code site: https://code.google.com/p/gbif-providertoolkit/
Kind regards,
Kyle
>
>> From: tdwg-content-bounces(a)lists.tdwg.org [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Niels Klazenga
>> Sent: 01 September 2014 09:11
>> To: tdwg-content(a)lists.tdwg.org
>> Subject: [tdwg-content] Help with using extensions in IPT
>>
>>
>> Dear all,
>> Not sure if this is appropriate use of this list, but I don't know where else to go. I was hoping I could prevail on someone to help me find out how to use extensions in the IPT provider for data from an SQL database, without creating duplicate rows in the core file or empty rows (apart from the core ID) in the extension file(s). I think I tried everything, but I must be overlooking something.
>>
>> I found this, http://lists.gbif.org/pipermail/ipt/2013-October/000545.html, during my googling for an answer, which I think has the solution, but I can't work out how to implement it, as my IPT installation doesn't seem to let me create a mapping to just an extension and only lets me use one data mapping source for any resource.
>>
>> Niels
>>
>>
>>
>> ___________________________________________________________
>>
>> Niels Klazenga
>> Programmer Information Technology, Biodiversity Information Officer
>>
>> National Herbarium of Victoria, Royal Botanic Gardens
>> Birdwood Avenue
>> South Yarra, VIC 3141
>> Australia
>> Tel: (03) 9252 2346
>> Fax: (03) 9252 2444
>>
>> e-mail: niels.klazenga(a)rbg.vic.gov.au
>>
>> _______________________________________________
>> tdwg-content mailing list
>> tdwg-content(a)lists.tdwg.org
>> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
29 Aug '14
I also really do appreciate GBIF's pressing need to serve survey/sample
data, and I don't have much of a problem with the idea of an Event core
or even adding new terms to DwC (in principle). Rather, I am urging caution
in how we proceed with it.
Éamonn, in response to your statement "Once the BCO model is available for
uptake, it should be possible to develop a mapping between it and the
simple DwC sample model" I will respond: possible, maybe; easy, no way;
unambiguous, probably impossible. The problem I foresee is that once the
"simple DwC sample model" is in place, people will start using it to do all
kinds of not so simple things, and the mapping will become muddled. We have
ample evidence that this is the case with existing Darwin Core archives.
Going back to the five new terms that Éamonn proposed, I would like to see
if we can link them NOW to existing ontology terms (as others have
proposed), thus making their semantics explicit from the start, but still
allowing GBIF/EU-BON to proceed with the work they need to do. This will
not prevent people from misusing terms, but may at least help make mapping
easier later. In cases where the terms can't be mapped to an existing term,
BCO curators would be willing to help develop a term or set of terms that
can convey the meaning required, or work with other ontology developers to get
the terms added elsewhere.
Trying to be constructive, I attempted to do a quick and dirty, preliminary
mapping of the five terms (quantity, quantity type, sampling geometry,
sampling unit, and event series ID), bearing in mind that I am not an
authority on OBOE or OGC ontologies. [Aside: Based on what I know, OGC
ontologies are not yet sufficiently developed to provide the semantics we
need, but I would love for someone to show me otherwise.]
A serious problem with mapping these terms to existing ontologies is that
some of them do NOT map to a single ontology term (namely, quantity,
quantity type, and sampling geometry). This is evidence that the proposed
terms could indeed be interpreted in multiple ways and further supports the
argument that it would not be easy to retrospectively add them to a
semantic framework at some later date.
I think there is a path forward that would allow for both the
expressiveness of OBOE and other ontologies and convenience of standard
exchange formats.
Ramona
------------------------------------------------------
Ramona L. Walls, Ph.D.
Scientific Analyst, The iPlant Collaborative, University of Arizona
Research Associate, Bio5 Institute, University of Arizona
Laboratory Research Associate, New York Botanical Garden
On Thu, Aug 28, 2014 at 8:58 AM, Robert Guralnick <
Robert.Guralnick(a)colorado.edu> wrote:
>
> Hi all --- Ok, I think the scope of the issue is quite clear. Let me
> summarize: 1)  As Éamonn and the rest of GBIF have made quite clear, "GBIF
> is faced with the immediate task of making sample-based data discoverable
> and accessible using its current ecosystem of tools" given a funding
> mandate from EU-BON. 2) The solution for this problem is to develop an
> Event-core and to promote new terms to the Darwin Core to make this happen.
> I will note a small inconsistency here: the current ecosystem of standards
> and tools is Darwin Core (as it stands) and publishing systems such as
> IPT. That ecosystem of tools includes mechanisms to extend Darwin Core
> where needed, via extensions. The *current* ecosystem of tools doesn't
> include new Cores or new DwC terms, does it?
>
> So this leads in nicely to the contentious issue(s) and places where
> there seems to be discussion --- these have to do with the nature of the
> changes suggested and the scope of those changes, both in terms of an Event
> core and DwC term additions. Leaving aside the Event-core for now, the key
> questions about term additions to the Darwin Core that seem to be at the
> heart of this are: 1) Is the intent of the Darwin Core to model surveys,
> which usually involve multiple kinds and types of sampling over multiple
> sites using multiple methods? 2) Is the solution to invent new terms for
> the Darwin Core? If there are already terms from other efforts, shouldn't we
> work with those existing efforts to assure interoperability?
>
> I appreciate the efforts of GBIF here fully, and am personally torn
> because on the one hand, I fully agree with the goal of extending Darwin
> Core to better represent richer biodiversity data. On the other hand, I
> worry about process here and how to make that happen in a way that isn't
> too hasty or locks us into just the opposite of what I think many of us
> want with regards to sharing data more broadly than within just one
> ecosystem of tools.
>
> Best, Rob
>
>
>
>
>
> On Thu, Aug 28, 2014 at 6:30 AM, John Deck <jdeck(a)berkeley.edu> wrote:
>
>> I see the rationale for enabling this in Darwin Core Archives and adding
>> the new terms. However, back to what Matt Jones brought up: "won't we
>> just end up with a new syntax that does essentially what O&M and OBOE do
>> now?".
>>
>> We should include explicit references to existing terms/definitions that
>> encapsulate what we're talking about, e.g. in our MaterialSample proposal
>> last year we linked to an existing term in OBI, which has a much richer
>> description and context for MaterialSample than what we considered (
>> https://code.google.com/p/darwincore/issues/detail?id=167)
>>
>> Have we explored the possibility of doing this with OBOE? I'm not
>> suggesting we adopt OBOE wholesale, but it seems like we have a good
>> opportunity to enable better semantic linking with that effort.
>>
>> John
>>
>> On Thu, Aug 28, 2014 at 4:23 AM, Éamonn Ó Tuama [GBIF] <eotuama(a)gbif.org>
>> wrote:
>>
>>> Thanks, Ramona and Rob.
>>>
>>> I'd like to add a few points following on Markus's reply.
>>>
>>> I think your pressing of the need for a robust semantic model for
>>> biodiversity sample/survey data is incontestable – we do need one and it
>>> should enable rich data integration once it is defined and the tools and
>>> data standards to support it become available. However, GBIF is faced
>>> with the immediate task of making sample-based data discoverable and
>>> accessible using its current ecosystem of tools (IPT) and exchange
>>> standards (DwC; EML). Waiting for a functional, implementable semantic
>>> model and the tools and support services for it is just not an option
>>> for us right now.
>>>
>>> We have already spent considerable time analysing the merits of
>>> Occurrence core vs Event core and have opted for an Event core for
>>> reasons previously given. I don’t believe we are trying to reconfigure
>>> Event (“an action that occurs at a place and during a period of time”)
>>> and, regardless of whether we use Occurrence or Event, the need for some
>>> additional terms arises (e.g., quantity, quantityType, samplingGeometry,
>>> samplingUnit). Once the BCO model is available for uptake, it should be
>>> possible to develop a mapping between it and the simple DwC sample model.
>>>
>>> So GBIF’s stance is that we need to take a two-pronged approach by
>>> exploring how the IPT and DwC-A can be adapted for publishing
>>> sample-based data in the near term while supporting the work of TDWG and
>>> groups such as the BCO in advancing biodiversity informatics. GBIF has
>>> already engaged in the work of the BCO and will continue to do so.
>>>
>>> Éamonn
>>>
>>> -----Original Message-----
>>> From: tdwg-content-bounces(a)lists.tdwg.org
>>> [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of Markus Döring
>>> Sent: 28 August 2014 12:44
>>> To: Ramona Walls
>>> Cc: TDWG Content Mailing List
>>> Subject: Re: [tdwg-content] tdwg-content Digest, Vol 63, Issue 6
>>>
>>> Hi Ramona & Rob,
>>>
>>> The Event proposal does not try to change the semantics of an Event, it
>>> just uses the existing Darwin Core Event "class" at the core in Darwin
>>> Core archives. The actual change proposed is simply adding 3 new terms to
>>> the Event "group" to better share information about sampling methods &
>>> efforts, extending the existing limited capabilities of Darwin Core which
>>> already has the terms dwc:samplingProtocol and dwc:samplingEffort. It
>>> also proposes 2 new terms for dealing with quantity of Occurrences,
>>> something that has been discussed since 2012 now, when I had proposed a
>>> new abundance term [2].
>>>
>>>
>>> In general, application of Darwin Core is not at all limited to specimens
>>> and observations. It is used for sharing taxonomic datasets already and
>>> its definition and goal are broad. Let me cite some of the introduction
>>> to Darwin Core [1]:
>>>
>>> What is the Darwin Core?
>>> The Darwin Core is body of standards. It includes a glossary of terms (in
>>> other contexts these might be called properties, elements, fields,
>>> columns, attributes, or concepts) intended to facilitate the sharing of
>>> information about biological diversity by providing reference
>>> definitions, examples, and commentaries. The Darwin Core is primarily
>>> based on taxa, their occurrence in nature as documented by observations,
>>> specimens, samples, and related information.
>>>
>>> Motivation: The Darwin Core standard was originally conceived to
>>> facilitate the discovery, retrieval, and integration of information about
>>> modern biological specimens, their spatiotemporal occurrence, and their
>>> supporting evidence housed in collections (physical or digital). The
>>> Darwin Core today is broader in scope and more versatile. It is meant to
>>> provide a stable standard reference for sharing information on
>>> biological diversity. As a glossary of terms, the Darwin Core is meant to
>>> provide stable semantic definitions with the goal of being maximally
>>> reusable in a variety of contexts.
>>>
>>>
>>> Markus
>>>
>>>
>>> [1] http://rs.tdwg.org/dwc/index.htm
>>> [2] https://code.google.com/p/darwincore/issues/detail?id=142
>>>
>>>
>>> --
>>> Markus Döring
>>> Software Developer
>>> Global Biodiversity Information Facility (GBIF)
>>> mdoering(a)gbif.org
>>> http://www.gbif.org
>>>
>>>
>>>
>>>
>>>
>>> On 27 Aug 2014, at 18:57, Ramona Walls <rlwalls2008(a)gmail.com> wrote:
>>>
>>> > I think it is important to consider the purpose of both Darwin Core and
>>> > DwC archives in deciding whether or not to expand them, but we should
>>> > use that consideration to address the question at hand, which is whether
>>> > or not to add an Event core and additional properties to describe events.
>>> >
>>> > Describing the exchange format before the semantics is the wrong way to
>>> > go, given that we now have a framework for developing semantics.
>>> > Expanding Darwin Core before we adequately model survey data is bound to
>>> > lead to problems later, when we try to retro-fit the semantics to Darwin
>>> > Core Event archives. This is exactly the problem we are running into now
>>> > with Occurrence archives, and we have the opportunity to avoid it.
>>> >
>>> > I suggest we first use existing ontologies to model survey data, then
>>> > deal with if and how to exchange that information in DwC-A. This is what
>>> > I was hinting at in my first email, but should have said more explicitly.
>>> >
>>> > Ramona
>>> >
>>> > ------------------------------------------------------
>>> > Ramona L. Walls, Ph.D.
>>> > Scientific Analyst, The iPlant Collaborative, University of Arizona
>>> > Research Associate, Bio5 Institute, University of Arizona
>>> > Laboratory Research Associate, New York Botanical Garden
>>> >
>>> >
>>> > On Wed, Aug 27, 2014 at 9:14 AM, Robert Guralnick
>>> > <Robert.Guralnick(a)colorado.edu> wrote:
>>> >
>>> > It may be a sensible view for Darwin Core Archives and their intended
>>> > use, but Tim's email suggests we should be putting the method of
>>> > delivery ahead of the standard that delivers that content. If this was
>>> > just about DwC-As, why not develop a survey extension that links each
>>> > occurrence to information about the survey process using the existing
>>> > star-schema methods we have in place? Why are we discussing adding terms
>>> > to the Darwin Core or trying to fully reconfigure what we call an Event?
>>> > That is what is on the table, not DwC-As and how we use them. Or am I
>>> > missing something?
>>> >
>>> > Best, Rob
>>> >
>>> >
>>> >
>>> > On Wed, Aug 27, 2014 at 10:01 AM, Ramona Walls <rlwalls2008(a)gmail.com>
>>> > wrote:
>>> > Thanks, Tim, and yes, DwC-A as a view (but not necessarily the primary
>>> > archive) of data seems like the right point of view.
>>> >
>>> > Ramona
>>> >
>>> > ------------------------------------------------------
>>> > Ramona L. Walls, Ph.D.
>>> > Scientific Analyst, The iPlant Collaborative, University of Arizona
>>> > Research Associate, Bio5 Institute, University of Arizona
>>> > Laboratory Research Associate, New York Botanical Garden
>>> >
>>> >
>>> > On Wed, Aug 27, 2014 at 1:58 AM, Tim Robertson <trobertson(a)gbif.org>
>>> > wrote:
>>> > Hi Ramona,
>>> >
>>> > Those are good points, and I’d like to come back to the original
>>> > thinking behind the DwC-A.
>>> >
>>> > It was designed and intended to be a simple way of exposing a complete
>>> > view of a dataset, primarily for building sophisticated indexes,
>>> > inventories and allowing basic analytics (e.g. GBIF.org being one
>>> > sophisticated index). We found that the star schema provided the
>>> > flexibility to do a lot, and with the bundled metadata (e.g. EML) was
>>> > enough to trace provenance and allow users to determine if the dataset
>>> > might be fit for various uses. In many cases this represents the
>>> > complete (e.g. lossless) view of a dataset.
>>> >
>>> > What we are discussing here are far richer datasets, where shoe-horning
>>> > content into the star schema becomes lossy for some, although we’re
>>> > finding other cases where it is indeed lossless. I believe we should be
>>> > looking to harmonise ontologies / models etc as you mention but in
>>> > parallel we should define one or more star schema views that can still
>>> > be used for discovery / reporting / basic analytical purposes, and not
>>> > long term archival of the dataset. The dataset would then have the
>>> > canonical rich form and an additional DwC-A view. What I write here is
>>> > applicable to all content types of course.
>>> >
>>> > Please also note that many people put supplementary files in the DwC-A
>>> > which are ignored by DwC-A readers but could be a way of keeping the
>>> > richer view in the bundle. If one wished, one could describe those
>>> > supplementary files in the EML document.
>>> >
>>> > Does this gel with the view of others as well?
>>> >
>>> > Cheers,
>>> > Tim
>>> >
>>> >
>>> >
>>> > On 27 Aug 2014, at 02:55, Ramona Walls <rlwalls2008(a)gmail.com> wrote:
>>> >
>>> >> I think Matt hit the nail on the head. Although Darwin Core can be
>>> >> used to exchange survey data, it lacks the semantics and structure
>>> >> necessary to archive the data without loss of information. I think the
>>> >> biodiversity community would be better served devoting energy to
>>> >> harmonizing existing technologies such as OGC, OBOE, and BCO, not to
>>> >> mention the many databases for storing plot or survey data. The goal
>>> >> should be to preserve the data in the most informative manner possible.
>>> >>
>>> >> There is a strong case for wanting to search across all evidence for
>>> >> occurrences, including surveys and point occurrences, so I can see
>>> >> possible demand for a tool that would extract occurrences from survey
>>> >> data to a DwC archive. However, I am very concerned that making a DwC
>>> >> archive the primary exchange format for survey or plot data commits us
>>> >> to a path of losing information from the start, for all but the
>>> >> simplest sampling schemas.
>>> >>
>>> >> Ramona
>>> >>
>>> >> ------------------------------------------------------
>>> >> Ramona L. Walls, Ph.D.
>>> >> Scientific Analyst, The iPlant Collaborative, University of Arizona
>>> >> Research Associate, Bio5 Institute, University of Arizona
>>> >> Laboratory Research Associate, New York Botanical Garden
>>> >>
>>> >>
>>> >> On Fri, Aug 22, 2014 at 3:00 AM, <tdwg-content-request(a)lists.tdwg.org>
>>> >> wrote:
>>> >>
>>> >> Today's Topics:
>>> >>
>>> >> 1. Re: Darwin Core: proposed news terms for expressing sample
>>> >> data (Matt Jones)
>>> >> 2. Re: Darwin Core: proposed news terms for expressing sample
>>> >> data (Donald Hobern [GBIF])
>>> >>
>>> >>
>>> >> ----------------------------------------------------------------------
>>> >>
>>> >> Message: 1
>>> >> Date: Thu, 21 Aug 2014 18:52:06 -0800
>>> >> From: Matt Jones <jones(a)nceas.ucsb.edu>
>>> >> Subject: Re: [tdwg-content] Darwin Core: proposed news terms for
>>> >> expressing sample data
>>> >> To: Éamonn Ó Tuama [GBIF] <eotuama(a)gbif.org>
>>> >> Cc: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>
>>> >> Message-ID:
>>> >>
>>> <CAFSW8xkx7uRP9PC2g3=JT_VJanqujH8nPXoz8GXwh+JwKw5Ccw(a)mail.gmail.com>
>>> >> Content-Type: text/plain; charset="utf-8"
>>> >>
>>> >> This proposal is treading on ground that is quite similar to other
>>> >> observations and measurements standards for data exchange that are
>>> >> already mature, in particular:
>>> >>
>>> >> * OGC Observations and Measurements (
>>> >> http://www.opengeospatial.org/standards/om)
>>> >> * Extensible Observation Ontology (OBOE;
>>> >> https://semtools.ecoinformatics.org/oboe)
>>> >>
>>> >> The former is a standard and broadly deployed, whereas the latter is
>>> >> part of a research program in the use of ontologies for measurements.
>>> >> Through collaboration between the two projects, they've been modified to
>>> >> be reasonably isomorphic, but O&M uses an XML serialization while OBOE
>>> >> uses an OWL-DL serialization. They largely express the same measurements
>>> >> and sampling model once one gets beyond the terminology differences.
>>> >>
>>> >> So, I'm wondering if it makes much sense to extend Darwin Core, which
>>> >> is at heart an Occurrence exchange syntax, into this measurements area
>>> >> that is well represented by these other existing specifications? I'm
>>> >> curious to hear why people would even want to do this. And if we do go
>>> >> down this path, won't we just end up with a new syntax that does
>>> >> essentially what O&M and OBOE do now?
>>> >>
>>> >> Matt
>>> >>
>>> >>
>>> >>
>>> >> On Thu, Aug 21, 2014 at 12:22 AM, Éamonn Ó Tuama [GBIF]
>>> >> <eotuama(a)gbif.org> wrote:
>>> >>
>>> >> > Hi Rob, Anne, Rich,
>>> >> >
>>> >> >
>>> >> >
>>> >> > I think Markus has answered your question as to why we opted for an
>>> >> > Event core which is being used in the sense described by Anne and
>>> >> > Rich. For any event, you can have a list of species in an Occurrence
>>> >> > extension and for each species, you can include quantity and
>>> >> > quantityType, e.g., biomass, etc. The proposed term eventSeriesID was
>>> >> > intended for linking together related events, although it now looks
>>> >> > like parentEventID might be a better, more flexible term. The
>>> >> > measurementOrFact extension is a good fit for capturing environmental
>>> >> > information relating to an event. See, e.g., the Gialova Lagoon
>>> >> > brackish water invertebrate test data set [1] where a set of 18
>>> >> > environmental variables, including temp, pH, Rdx, particulate organic
>>> >> > matter, dissolved oxygen, salinity, chlorophyll-a were measured for
>>> >> > each sampling station-sampling period combination. An example mapping
>>> >> > is:
>>> >> >
>>> >> >
>>> >> >
>>> >> > Id | measurementType | measurementValue | measurementUnit | measurementRemarks
>>> >> > IA | Tmp (sed) | 21.5 | degree C | Tmp (sed): temperature at the bottom surface
>>> >> >
>>> >> >
>>> >> > **Controlled vocabularies**
>>> >> >
>>> >> > Ideally, the values for samplingUnit and quantityType would be
>>> >> > selected from controlled vocabularies. This is, effectively, what we
>>> >> > do by presenting a small list of values in a drop-down menu. The
>>> >> > current values are what we derived for example data sets and
>>> >> > discussion but they can undoubtedly be extended and improved.
>>> >> >
>>> >> >
>>> >> >
>>> >> > We capture "bucket" type measures through a combination of
>>> >> > samplingEffort, samplingGeometry and samplingUnit. For example, a
>>> >> > pitfall trap (in a point location) left out for 16 days might have
>>> >> > samplingEffort: 16, samplingGeometry: point and samplingUnit: day.
>>> >> > Three m^2 quadrats in a shore survey might have samplingEffort: 3,
>>> >> > samplingGeometry: area and samplingUnit: m^2.
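
The two examples above can be sketched as records. This is only an illustration: the SamplingEvent class, its describe method and the event identifiers are hypothetical, and the three field values are the ones given in the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingEvent:
    """One Event-core row reduced to the three sampling terms under discussion."""
    event_id: str            # hypothetical identifier, not from the thread
    sampling_effort: float   # samplingEffort
    sampling_geometry: str   # samplingGeometry (proposed term)
    sampling_unit: str       # samplingUnit (proposed term)

    def describe(self) -> str:
        """Render effort, unit and geometry as one human-readable string."""
        return (f"{self.sampling_effort:g} {self.sampling_unit} "
                f"({self.sampling_geometry})")


# Pitfall trap at a point location, left out for 16 days.
pitfall = SamplingEvent("pitfall-1", 16, "point", "day")
# Three m^2 quadrats in a shore survey.
quadrats = SamplingEvent("shore-1", 3, "area", "m^2")

for event in (pitfall, quadrats):
    print(event.event_id, "->", event.describe())
```

The sketch makes one point visible: the number in samplingEffort only has meaning when read together with samplingUnit and samplingGeometry, which is why controlled vocabularies for the latter two matter.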
>>> >> >
>>> >> >
>>> >> >
>>> >> > It would be very useful to see your compilation of scope, effort and
>>> >> > completeness measures to see if we can express them in our model
>>> >> > and/or if we need to reconsider our approach.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Éamonn
>>> >> >
>>> >> >
>>> >> >
>>> >> > [1] http://eubon-ipt.gbif.org/resource.do?r=ionian-brackish-lagoon
>>> >> >
>>> >> >
>>> >> >
>>> >> > *From:* tdwg-content-bounces(a)lists.tdwg.org [mailto:
>>> >> > tdwg-content-bounces(a)lists.tdwg.org] *On Behalf Of *Markus Döring
>>> >> > *Sent:* 20 August 2014 23:47
>>> >> > *To:* Robert Guralnick
>>> >> >
>>> >> > *Cc:* TDWG Content Mailing List
>>> >> > *Subject:* Re: [tdwg-content] Darwin Core: proposed news terms for
>>> >> > expressing sample data
>>> >> >
>>> >> >
>>> >> >
>>> >> > Rob,
>>> >> >
>>> >> >
>>> >> >
>>> >> > this proposal is for monitoring surveys really, not to be confused
>>> >> > with material samples like environmental or tissue samples which have
>>> >> > a distinct new dwc class MaterialSample.
>>> >> >
>>> >> >
>>> >> >
>>> >> > We tend to overload the term sampling a lot and it helps to treat
>>> >> > material samples differently from pure observational "sampling". That
>>> >> > is why the existing Event class was used as the core and classic
>>> >> > Occurrence records as extensions. A classic example is a vegetation
>>> >> > survey where each plot represents an Event record and each recorded
>>> >> > species in that plot will be an Occurrence extension record with a
>>> >> > given quantity. Darwin Core already offers individualCount to specify
>>> >> > quantity, but it is a very specific way of measuring "abundance"
>>> >> > restricted to only some use cases. Abiotic measurements about the plot
>>> >> > (e.g. soil type, pH, temperature) can be published using the
>>> >> > measurements or facts extension linked to the Event core.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Markus
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On 20 Aug 2014, at 20:08, Robert Guralnick
>>> >> > <Robert.Guralnick(a)colorado.edu> wrote:
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > Anne -- I don't know the answers! These are questions for Eamonn. I
>>> >> > would presume that a sample could be a jumble of species or even just
>>> >> > water or soil samples, and biomass would refer to that sample - but
>>> >> > maybe that isn't a use case being considered? The examples given in
>>> >> > the longer document all link an event_id to species name and some
>>> >> > measure of quantity for that species (to the species, not an
>>> >> > individual specimen), so I assume that is the prevailing (or only)
>>> >> > case?
>>> >> >
>>> >> > Best, Rob
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Wed, Aug 20, 2014 at 11:56 AM, Anne Thessen
>>> >> > <annethessen(a)gmail.com> wrote:
>>> >> >
>>> >> > Hi Rob
>>> >> > I would like to respond to your item number 2.
>>> >> > From my perspective, I deal with lots of published descriptions of
>>> >> > taxa. The text might say something like "I saw species A in the
>>> >> > Chesapeake Bay, the Adriatic Sea and the Indian Ocean and the biomass
>>> >> > is 5 - 9 grams". The biomass range obviously corresponds to at least
>>> >> > three different occurrences, but how to divide the biomass data? I
>>> >> > would love to be able to have an *event* to attach it all to. There
>>> >> > are almost two different levels of events - a sampling event and a
>>> >> > "study event". The "study event" would correspond to the type of event
>>> >> > I would like to use in the above example. It may not be ideal, but for
>>> >> > the old literature that might be the best we can do.
>>> >> > I have to admit that I don't know enough about trawl data to
>>> >> > understand why an event core would be a problem. It seems that the
>>> >> > trawl would be an event and each biomass measure (of each fish) would
>>> >> > be attached to a separate occurrence which is attached to that event.
>>> >> > Am I understanding this wrong?
>>> >> > btw - I found a workaround for the example I gave, so it's not
>>> >> > impossible to model with the current structure....
>>> >> > Anne
>>> >> >
>>> >> >
>>> >> >
>>> >> > On 8/20/2014 1:16 PM, Robert Guralnick wrote:
>>> >> >
>>> >> >
>>> >> >
>>> >> > Éamonn et al. --- Thanks for the clarifications. I think these help a
>>> >> > ton but it raises a couple more questions for me.
>>> >> >
>>> >> >
>>> >> >
>>> >> > 1) I am surprised that you plan to use the MeasurementOrFact
>>> >> > extension in relation to the Event core, which seems like a novel (or
>>> >> > perhaps awkward or unintended?) mechanism for capturing environmental
>>> >> > data, but the same extension was not seen as relevant for describing
>>> >> > samples? Can you explain more about the thinking there?
>>> >> >
>>> >> >
>>> >> >
>>> >> > 2) There may be a subtle issue here extending "Event" to be more what
>>> >> > you call a "Sampling Event Core". My read of this is that Darwin Core
>>> >> > serves as a way to deal with point occurrences and Event reflects the
>>> >> > context of a single capture event (whether a single observation, or a
>>> >> > bulk sample capture). The changes recommended seem to dramatically
>>> >> > extend and change that meaning? It's simply a question that I don't
>>> >> > have an answer to, but is Darwin Core the right vehicle to start
>>> >> > capturing repeated measures of biomass values from trawls? I don't
>>> >> > have an answer but man, terms like quantityType (as a property of
>>> >> > occurrence?) give me pause.
>>> >> >
>>> >> >
>>> >> >
>>> >> > 3) Is Sampling Unit a controlled vocabulary? For another project, I
>>> >> > have looked through - and captured scope, effort and completeness
>>> >> > measures from - a large number of published biotic area inventories.
>>> >> > The vast majority of these are measured in units like bucket hours,
>>> >> > or trap nights. Is a "bucket" part of SamplingGeometry or Sampling
>>> >> > Unit? I'd be happy to send along all the many examples of how biotic
>>> >> > inventories of an area are completed and perhaps it might be good to
>>> >> > see how those might be represented using the terms you are proposing?
>>> >> >
>>> >> >
>>> >> >
>>> >> > Best, Rob
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Wed, Aug 20, 2014 at 10:16 AM, Richard Pyle
>>> >> > <deepreef(a)bishopmuseum.org> wrote:
>>> >> >
>>> >> > Same here - Events are central to the work that we do.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Aloha,
>>> >> >
>>> >> > Rich
>>> >> >
>>> >> >
>>> >> >
>>> >> > *From:* tdwg-content-bounces(a)lists.tdwg.org [mailto:
>>> >> > tdwg-content-bounces(a)lists.tdwg.org] *On Behalf Of *Anne Thessen
>>> >> > *Sent:* Wednesday, August 20, 2014 2:59 AM
>>> >> > *To:* tdwg-content(a)lists.tdwg.org
>>> >> >
>>> >> >
>>> >> > *Subject:* Re: [tdwg-content] Darwin Core: proposed news terms for
>>> >> > expressing sample data
>>> >> >
>>> >> >
>>> >> >
>>> >> > Hello
>>> >> > I would just like to comment on *event core*.
>>> >> > I've been doing a lot of work translating published data into Darwin
>>> >> > Core. During that process I've wished several times that I could use
>>> >> > Event as core. I am happy to hear about that proposed change. It will
>>> >> > make it easier to model the data I am working with.
>>> >> > Anne
>>> >> >
>>> >> > On 8/20/2014 7:04 AM, Éamonn Ó Tuama [GBIF] wrote:
>>> >> >
>>> >> > Hi Rob,
>>> >> >
>>> >> >
>>> >> >
>>> >> > Thank you for the feedback. I have tried to address the two main
>>> >> > issues you raise below. At the outset, I would like to emphasise that
>>> >> > much of this work is taking place in the context of the EU BON project
>>> >> > which includes a task on developing/enhancing tools and standards for
>>> >> > data sharing with a particular focus on the IPT for publishing
>>> >> > sample-based data. So, we were constrained by the need to publish
>>> >> > sample-based data sets in the Darwin Core Archive format and to
>>> >> > demonstrate practical application using a working prototype. When the
>>> >> > discussion on the TDWG list faded out, we took it to our EU BON
>>> >> > partners whose requirements were essential input to further
>>> >> > development. We recognise that these discussions took place away from
>>> >> > TDWG (although the TDWG/EU BON contributors overlapped) and this is
>>> >> > the reason we are presenting the outcomes here for further
>>> >> > consideration.
>>> >> >
>>> >> >
>>> >> >
>>> >> > **Event core**
>>> >> >
>>> >> > As the SIGS report indicated, sample data can be modelled in Darwin
>>> >> > Core Archives using either Occurrence or Event as core. This was the
>>> >> > starting point for our evaluation but as things progressed the data
>>> >> > wrangling pushed the model back towards the Event core. We actually
>>> >> > went through the exercise of mapping multiple test datasets in an
>>> >> > iterative process spanning several months' work. In the end, we found
>>> >> > that using an Event core better matched the typical sample data we
>>> >> > were dealing with, allowing a measurement-or-fact extension to be
>>> >> > included for the efficient expression of environmental information
>>> >> > associated with the event. The choice comes down to an Occurrence core
>>> >> > or an Event core + Occurrence extension. In both cases, the true
>>> >> > observation records are Occurrences. The big difference is what type
>>> >> > the core has and therefore to which kind of records you can attach
>>> >> > further facts and extra information with DwC-A extensions. Many
>>> >> > sampling datasets have very rich information about the site and event,
>>> >> > so it is very natural to hang facts from an Event core. When picking
>>> >> > the Occurrence core those facts would have to be repeated for each and
>>> >> > every occurrence record. Moreover, our approach doesn't stop anyone
>>> >> > from using the Occurrence core if they so wish. This just provides a
>>> >> > different option for datasets that better fit an Event core model.
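
The repetition argument above can be made concrete with a small sketch. All data here are invented for illustration (one hypothetical plot with three placeholder species), and flatten_to_occurrence_core is a hypothetical helper, not an IPT function. eventDate, samplingProtocol and habitat are existing Darwin Core terms; quantity is one of the proposed terms.

```python
# Event core: site- and event-level facts are stored once per event.
events = {
    "plot-1": {
        "eventDate": "2014-08-01",
        "samplingProtocol": "quadrat",
        "habitat": "saltmarsh",
    },
}

# Occurrence extension: one row per species recorded in the event.
occurrences = [
    {"eventID": "plot-1", "scientificName": "Species A", "quantity": 12},
    {"eventID": "plot-1", "scientificName": "Species B", "quantity": 3},
    {"eventID": "plot-1", "scientificName": "Species C", "quantity": 7},
]


def flatten_to_occurrence_core(events, occurrences):
    """Copy every event-level fact onto each occurrence row,
    which is what an Occurrence-core layout requires."""
    return [{**events[occ["eventID"]], **occ} for occ in occurrences]


flat = flatten_to_occurrence_core(events, occurrences)

# Count how many event-level values each layout has to store.
event_core_cells = sum(len(facts) for facts in events.values())
occurrence_core_cells = sum(len(events[row["eventID"]]) for row in flat)

print(event_core_cells, occurrence_core_cells)
```

With an Event core the three event-level values are stored once; flattened onto an Occurrence core they are stored once per species, so the duplication grows with the number of occurrences in each event.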
>>> >> >
>>> >> >
>>> >> >
>>> >> > I want to stress that we are not building a "specific IPT version" to
>>> >> > support an Event core but, rather, we adapted the IPT so that it can
>>> >> > be configured to support any generic "core + extension" format to
>>> >> > enable its use for exploration of more data formats. This is part of
>>> >> > the core codebase and there were no custom forks of the IPT for this
>>> >> > work. Our view at GBIF is that if there are significant numbers of
>>> >> > data publishers who are keen to adopt, promote and use a (any) format,
>>> >> > and the tools can be configured to do so, then we should support it,
>>> >> > and, if necessary, use a custom namespace.
>>> >> >
>>> >> >
>>> >> >
>>> >> > **New terms around abundance**
>>> >> >
>>> >> > Yes, the discussion on TDWG did fade out but it was clear that the
>>> >> > term "abundance" as recommended by the SIGS report (along with
>>> >> > abundanceAsPercent) was confusing many when we were looking for
>>> >> > term(s) that reported quantitative measures of organisms in a sample.
>>> >> > It also became clear we would need to be able to state the type of
>>> >> > quantity being measured. An alternative suggestion for using the
>>> >> > MeasurementOrFact class was immediately shot down.
>>> >> >
>>> >> > As some of our main use cases were coming from the EU BON project,
>>> >> > discussion shifted to that forum and consensus formed about the
>>> >> > currently proposed terms. It was within this group that the additional
>>> >> > terms (samplingGeometry, samplingUnit, eventSeriesID) were proposed
>>> >> > and where we began testing with sample data sets.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Best regards,
>>> >> >
>>> >> > Éamonn
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > *From:* robgur(a)gmail.com [mailto:robgur@gmail.com] *On Behalf Of
>>> >> > *Robert Guralnick
>>> >> > *Sent:* 19 August 2014 16:56
>>> >> > *To:* Éamonn Ó Tuama [GBIF]
>>> >> > *Cc:* TDWG Content Mailing List
>>> >> > *Subject:* Re: [tdwg-content] Darwin Core: proposed news terms for
>>> >> > expressing sample data
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > Hi Éamonn --- I am curious about the outcomes presented in the SIGS
>>> >> > paper, in particular, this portion of the paper:
>>> >> >
>>> >> >
>>> >> >
>>> >> > "Solutions without introducing an event core in Darwin Core Archives:
>>> >> > During the review of the solutions for the uses cases, it became
>>> >> > apparent that either model could be applied to every use case. The
>>> >> > core and extensions bore a complementary relationship and between them
>>> >> > could express all the required information. The core simply provided
>>> >> > the central anchor in the star schema from which to join the
>>> >> > additional information. Therefore, using the Occurrence core, well
>>> >> > established in the GBIF network through uptake of the IPT, seemed more
>>> >> > appropriate than inventing CollectingEvent as an additional core
>>> >> > type."
>>> >> >
>>> >> >
>>> >> >
>>> >> > That SIGS paper lists both John Wieczorek and you as authors, along with
>>> >> > many luminaries across the biodiversity standards spectrum. Given the
>>> >> > above, it's curious to see the EventCore come back again, along with a
>>> >> > specific IPT version to support it.
>>> >> >
>>> >> >
>>> >> >
>>> >> > So I see two issues, conflated, in this post you just made. One is
>>> >> > the need for an EventCore at all, and the nature of relating Event and
>>> >> > Occurrence/Material Sample. The second is the introduction of new terms,
>>> >> > which seemingly have arrived after debate on similar terms - but framed
>>> >> > around abundance - stalled a year ago. To my mind, these both require some
>>> >> > further discussion, because I don't (necessarily) see TDWG community
>>> >> > coherence around either issue.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Best, Rob
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Tue, Aug 19, 2014 at 6:11 AM, Éamonn Ó Tuama [GBIF] <eotuama(a)gbif.org>
>>> >> > wrote:
>>> >> >
>>> >> > Dear All,
>>> >> >
>>> >> >
>>> >> >
>>> >> > GBIF is committed to exploring ways in which the IPT and Darwin Core
>>> >> > Archive format can be extended for publishing sample-based data sets. In
>>> >> > association with the EU BON project [1], a customised version of the IPT
>>> >> > [2] has been deployed to test this using a special type of Darwin Core
>>> >> > Archive in which the core is an "Event" with associated taxon occurrences
>>> >> > in an "Occurrence" extension.
>>> >> >
>>> >> >
>>> >> >
>>> >> > The Darwin Core vocabulary already provides a rich set of terms, many of
>>> >> > them relevant for describing sample-based data. From a synthesis of several
>>> >> > sources of input (GBIF organised workshop on sample data, May 2013 [3];
>>> >> > discussions on the TDWG mailing list in late 2013; internal discussion
>>> >> > among EU BON project partners), five new terms relating to sample data
>>> >> > were identified as essential. The complete model, including these new
>>> >> > terms, is fully described with examples in the online document
>>> >> > "Publishing sample data using the GBIF IPT" [4].
>>> >> >
>>> >> >
>>> >> >
>>> >> > As a first step towards ratification, we would like to register the new
>>> >> > terms in the DwC Google Code tracker [5] if there are no major objections
>>> >> > on this list. The five terms are:
>>> >> >
>>> >> >
>>> >> >
>>> >> > 1. *quantity*: the number or enumeration value of the quantityType
>>> >> > (e.g., individuals, biomass, biovolume, BraunBlanquetScale) per
>>> >> > samplingUnit or a percentage measure recorded for the sample.
>>> >> >
>>> >> >
>>> >> >
>>> >> > 2. *quantityType*: the entity being referred to by quantity,
>>> >> > e.g., individuals, biomass, %species, scale type.
>>> >> >
>>> >> >
>>> >> >
>>> >> > 3. *samplingGeometry*: an indication of what kind of space was
>>> >> > sampled; select from point, line, area or volume.
>>> >> >
>>> >> >
>>> >> >
>>> >> > 4. *samplingUnit*: the unit of measurement used for reporting the
>>> >> > quantity in the sample, e.g., minute, hour, day, metre, metre^2, metre^3.
>>> >> > It is combined with quantity and quantityType to provide the complete
>>> >> > measurement, e.g., 9 individuals per day, 4 biomass-gm per metre^2.
>>> >> >
>>> >> >
>>> >> >
>>> >> > 5. *eventSeriesID*: an identifier for a set of events that are
>>> >> > associated in some way, e.g., a monitoring series; may be a globally
>>> >> > unique identifier or an identifier specific to the series.
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > Best regards,
>>> >> >
>>> >> >
>>> >> >
>>> >> > Éamonn
>>> >> >
>>> >> >
>>> >> >
>>> >> > [1] http://eubon.eu
>>> >> >
>>> >> > [2] http://eubon-ipt.gbif.org
>>> >> >
>>> >> > [3] http://www.standardsingenomics.org/index.php/sigen/article/view/sigs.4898640
>>> >> >
>>> >> > [4] http://links.gbif.org/sample_data_model
>>> >> >
>>> >> > [5] https://code.google.com/p/darwincore/issues/list
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > ____________________________________________________
>>> >> >
>>> >> > *Éamonn Ó Tuama, M.Sc., Ph.D. (eotuama(a)gbif.org),*
>>> >> >
>>> >> > *Senior Programme Officer for Interoperability,*
>>> >> >
>>> >> > *Global Biodiversity Information Facility Secretariat,*
>>> >> >
>>> >> > *Universitetsparken 15, DK-2100, Copenhagen Ø, DENMARK*
>>> >> >
>>> >> > *Phone: +45 3532 1494; Fax: +45 3532 1480*
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > _______________________________________________
>>> >> > tdwg-content mailing list
>>> >> > tdwg-content(a)lists.tdwg.org
>>> >> > http://lists.tdwg.org/mailman/listinfo/tdwg-content
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> >
>>> >> > Anne E. Thessen, Ph.D.
>>> >> >
>>> >> > The Data Detektiv, Owner and Founder
>>> >> >
>>> >> > Ronin Institute, Research Scholar
>>> >> >
>>> >> > 443.225.9185
>>> >> >
>>> >>
Re: [tdwg-content] Darwin Core: proposed news terms for expressing sample data
by Simon.Cox@csiro.au 29 Aug '14
G’day again TDWGers:
Matt passed on links to this thread to me and suggested I comment, as I was the author of the O&M standard (published as ISO 19156:2011 and OGC Abstract Spec Topic 20).
For those who are not aware of this work, there is a short Wikipedia page http://en.wikipedia.org/wiki/Observations_and_Measurements whose main value is that it links to a number of more detailed resources.
Probably the richest of these is another Wiki page at CSIRO https://www.seegrid.csiro.au/wiki/AppSchemas/ObservationsAndSampling which hasn’t been updated much recently, but at least has some diagrams embedded.
As Matt and others have hinted, as a result of a workshop at NCEAS a few years ago, there were some tweaks to allow it to meet some of the requirements identified in OBOE, just in time to beat the ISO deadline!
O&M includes a generic model for ‘Sampling Features’: those artefacts that are created to assist the observation process, but would not otherwise exist or be of much interest.
Examples include specimens, transects, sections, quadrats, scenes and swaths, drillholes, flightlines, trajectories, ships’ tracks, etc.
Because it is a generic standard, you won’t find things with names familiar to any particular discipline, and there are a lot of stub classes for supporting information which need filling out for specific applications.
But the intention is that it provides a framework for a discipline or community to specialize for their purposes, while retaining some topology and perhaps terminology (maybe just as super-classes) that help with information sharing across discipline boundaries.
The main properties of a sampling feature are
- The sampledFeature – being the domain object which it is being used to characterize
- Related sampling features – other features related to the observational strategy
- Related observations – observation events that use this sampling feature (for which another generic model is provided)
We’ve generally found it helpful in teasing apart observational records and protocols in a variety of environmental science applications, and others have applied it in oceans, meteorology, even air-traffic control!
The primary classification of sampling features in O&M is by topological dimension (point, curve, surface, solid), because these are commonly used and afford common processing methods.
‘Specimen’ is the other concrete sampling-feature type.
There is no ‘sample’ class, because it is such an overloaded word (noun, verb, statistical sample vs ex-situ sample, etc).
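As a rough illustration of the structure described above, here is a minimal Python sketch. The class and attribute names are illustrative only and are not taken from the ISO 19156 UML or the OGC schemas; observations are reduced to plain identifier strings, and the `material` attribute on the specimen is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Dimension(Enum):
    """Topological dimension used to classify spatial sampling features."""
    POINT = 0
    CURVE = 1
    SURFACE = 2
    SOLID = 3


@dataclass
class SamplingFeature:
    """Generic sampling feature with the three main properties listed above."""
    identifier: str
    sampled_feature: str  # the domain object being characterized
    related_sampling_features: List["SamplingFeature"] = field(default_factory=list)
    related_observations: List[str] = field(default_factory=list)  # observation IDs (simplified)


@dataclass
class SpatialSamplingFeature(SamplingFeature):
    """Sampling feature classified by topological dimension."""
    dimension: Dimension = Dimension.POINT


@dataclass
class Specimen(SamplingFeature):
    """The other concrete sampling-feature type."""
    material: str = "unspecified"  # hypothetical attribute, for illustration only


# A transect (curve) and a quadrat (surface) that both sample the same domain feature.
transect = SpatialSamplingFeature("transect-1", "Reef A", dimension=Dimension.CURVE)
quadrat = SpatialSamplingFeature("quadrat-1", "Reef A", dimension=Dimension.SURFACE)
transect.related_sampling_features.append(quadrat)
quadrat.related_observations.append("obs-001")
print(transect.dimension.name, [f.identifier for f in transect.related_sampling_features])
```

A community profile would presumably replace the string placeholders with its own domain classes while keeping the same three relationships.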
O&M and its Sampling Feature model were designed in UML.
As Matt notes, the original implementation in the OGC context was in XML, using GML: http://schemas.opengis.net/samplingSpecimen/2.0/specimen.xsd and http://schemas.opengis.net/samplingSpatial/2.0/spatialSamplingFeature.xsd .
However, it has been implemented in other ways: there is an OWL2/RDFS representation at http://def.seegrid.csiro.au/isotc211/iso19156/2011/sampling which is linked in with OWL versions of a bunch of the other ISO standards, and therefore probably makes too many commitments for the faint-hearted; see the paper from ISWC 2013 at http://ceur-ws.org/Vol-1063/paper1.pdf
O&M was also one of the core inputs to the W3C Semantic Sensor Network ontology, reported here: http://www.w3.org/2005/Incubator/ssn/wiki/Incubator_Report though that focussed on the sensors and observations side of the equation, and hardly deals with sampling.
Hope this helps.
>> Date: Thu, 21 Aug 2014 18:52:06 -0800
>> From: Matt Jones <jones(a)nceas.ucsb.edu>
>> Subject: Re: [tdwg-content] Darwin Core: proposed news terms for
>> expressing sample data
>> To: Éamonn Ó Tuama [GBIF] <eotuama(a)gbif.org>
>> Cc: TDWG Content Mailing List <tdwg-content(a)lists.tdwg.org>
>> Message-ID: <CAFSW8xkx7uRP9PC2g3=JT_VJanqujH8nPXoz8GXwh+JwKw5Ccw(a)mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> This proposal is treading on ground that is quite similar to other
>> observations and measurements standards for data exchange that are already
>> mature, in particular:
>>
>> * OGC Observations and Measurements (
>> http://www.opengeospatial.org/standards/om)
>> * Extensible Observation Ontology (OBOE;
>> https://semtools.ecoinformatics.org/oboe)
>>
>> The former is a standard and broadly deployed, whereas the latter is part
>> of a research program in the use of ontologies for measurements. Through
>> collaboration between the two projects, they've been modified to be
>> reasonably isomorphic, but O&M uses an XML serialization while OBOE uses an
>> OWL-DL serialization. They largely express the same measurements and
>> sampling model once one gets beyond the terminology differences.
>>
>> So, I'm wondering if it makes much sense to extend Darwin Core, which is at
>> heart an Occurrence exchange syntax, into this measurements area that is
>> well represented by these other existing specifications? I'm curious to
>> hear why people would even want to do this. And if we do go down this
>> path, won't we just end up with a new syntax that does essentially what O&M
>> and OBOE do now?
>>
>> Matt