tdwg-content
Hi Everyone,
Darwin Core remains poorly documented, occasionally inconsistent, and
frequently misunderstood. (Does anyone disagree with that
characterization?) I believe this is one of the reasons we're seeing a
proliferation of overlapping and sometimes incompatible ontologies
building on Darwin Core terms.
One of the suggestions that came up on the TDWG-RDF mailing list is to
have a clean-up-a-thon/document-a-thon for TDWG namespaces and terms. I
suggest that, until such a clean up of Darwin Core occurs, TDWG accept no
additions to the Darwin Core standard. There are several examples in
support of my claim that we're building on a shaky foundation - an obvious
one is that, as Steve is currently pointing out, there is no consensus on
what constitutes a Darwin Core occurrence. (Can anyone name an instance of
the class "http://rs.tdwg.org/dwc/terms/Occurrence"?)
The clean-up-a-thon proposal was enthusiastically endorsed within the RDF
group, but no one volunteered to organize it. I propose that we
self-organize, and find a way to carve out two days at the coming meeting
to hash out as much as we can, with a follow-on workshop if necessary. But
first, I'd be interested to know - am I the only one who feels this way?
Sincerely,
Joel.
p.s.
I've said this before, but it bears repeating - Darwin Core is almost an
excellent standard, and almost ideally suited to be the foundation for a
semantic web for biodiversity informatics. I have great respect for those
who were involved in its creation and continued curation - for their hard
work, and clear thinking, and patience for people like me struggling to
understand. But all that work, thought, and patience will be for naught,
if the gyre is allowed to widen much further.
NOTE:
Following discussion on the TDWG content list, discussions with OBI
developers, and discussions with MIxS developers, we have modified the
original Material Sample Term Proposal in the following document.
The original document sent to the Darwin Core community for Material Sample
proposed the OBI term Material Sample (
http://purl.obolibrary.org/obo/OBI_0000747), which we are now modifying to
instead reference OBI Specimen (http://purl.obolibrary.org/obo/OBI_0100051).
The reason for this change is that it came to light that the intention of
OBI Material Sample “should be representative of the whole from which it is
sampled”. Since this definition places an unnecessary constraint on its
use we have opted to use OBI Specimen instead, which is more closely
aligned with our original intent. In addition to this change, we have also
modified the proposal based on the feedback from this list.
-John Deck, Rob Guralnick, Ramona Walls
****
New Term Request: Material Sample
This is a proposal for two new terms in Darwin Core, relating to the
addition of the concept, “Material Sample”, described by the identifier
http://purl.obolibrary.org/obo/OBI_0100051. The two terms are:
1) A new BasisOfRecord term MaterialSample with label “Material Sample”
that references http://purl.obolibrary.org/obo/OBI_0100051
2) A new Darwin Core property term, MaterialSampleID.
Submitters: John Deck, Rob Guralnick and Ramona Walls
Justification
The current values in the DwC Type Vocabulary (
http://rs.tdwg.org/dwc/dwctype/) do well in representing some types of
biocollections and observations. However, the more general notion of a
sample is not well represented, because the existing terms are too
specific. For example, the DwC terms “Preserved Specimen”, “Fossil
Specimen”, and “Living Specimen” are appropriate for use in the museum
community but assume particular properties pertaining to museum
collections, which “material samples” may or may not have. Examples of
“material samples” we are considering (beyond the examples above) are
surveys that involve soil and water sampling, bulk sampling of specimens
from, e.g., trawls, microbiological sampling, metagenomics, etc. These
sampling approaches often rely on field sub-sampling processes and
laboratory techniques (e.g., DNA extraction and sequencing) which transform
the physical material and produce distinct information content and thus
represent a type of information that is distinct from what DwC has
typically dealt with. The proposal for adding “Material Sample” as a DwC
class is to maintain consistency with the way Darwin Core terms are managed
and organized. This term comes from the Ontology for Biomedical
Investigations (OBI) class OBI:specimen. We use the class concept
definition directly from OBI but provide the more familiar label “Material
Sample” for use within the biodiversity community and annotate how that
definition applies in the domain of biological collections.
A “material sample” can pertain to general matter in which organisms may
exist, in whole, in part, or in conjunction with many other organisms. The
“material sample” may exist for a brief period, such as a tissue that is
converted to extracted DNA. It may also represent a collection of multiple
taxa, such as a soil or water sample that is used with the intention of
describing the diversity of organisms, whether the actual organisms are
later recovered from such a sample, or whether that sample is processed in
order to generate a set of derivatives from organisms (e.g., 16S sequences
from a metagenomics run). A “material sample” may also yield connections
to other indicators of biodiversity aside from taxa, such as a
transcriptome, indicating which DNA is actively being expressed at a
particular point in time.
For the purposes of biological collections, we can think of “material sample”
as any type of matter that we can use in order to derive further evidence
needed for identification of taxa, whether it is taxonomically homogeneous,
heterogeneous, a single individual, sets of individuals, or populations.
However, the definition of the term does not exclude its use in broader
contexts outside the scope of biological collections.
How is the term “Material Sample” different from “Individual”? The intent
of individualID is fairly clear: since an Occurrence represents an
organism at a place and time, the individualID term allows us to assign an
instance identifier for a particular organism that can be present in
multiple events. MaterialSampleID, on the other hand, is intended to allow
users to say that the basis of an occurrence is a material entity (i.e.
matter) that has been sampled according to some particular method. Whether
or not this material entity is an individual (sensu individualID in DwC)
represents an independent axis of classification. There is no restriction
on specifying that an occurrence is associated with more than one type, so
any occurrence can have both an individualID and a materialSampleID.
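As an illustration of the two identifiers acting as independent axes, here
is a minimal Python sketch. The record layout and every identifier value
are invented for the example; only the term names individualID and
materialSampleID come from the proposal.

```python
# Hypothetical flat occurrence records; all identifier values are invented.
records = [
    {"occurrenceID": "occ-1", "individualID": "ind-42", "materialSampleID": "ms-7"},
    {"occurrenceID": "occ-2", "individualID": "ind-42", "materialSampleID": None},
    {"occurrenceID": "occ-3", "individualID": None, "materialSampleID": "ms-8"},
]


def classify(record):
    """Return the set of axes on which a record carries an identifier."""
    axes = set()
    if record.get("individualID"):
        axes.add("individual")
    if record.get("materialSampleID"):
        axes.add("materialSample")
    return axes


for record in records:
    # A record may fall on either axis, on both, or (in principle) on neither.
    print(record["occurrenceID"], sorted(classify(record)))
```

In this sketch occ-1 and occ-2 share an individualID (the same organism at
two events), while only occ-1 and occ-3 are backed by a material sample.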
Adding this term will help align DwC to two other significant projects: the
Ontology for Biomedical Investigations (OBI), from which we will be
adapting this term, and the MIxS family of checklists.
The MIxS vocabulary is proposing to adopt MaterialSampleID by clarifying
the existing term source_mat_id to read:
“A unique identifier assigned to a material sample (as defined by
http://rs.tdwg.org/dwc/terms/MaterialSampleID, and as opposed to a
particular digital record of a material sample) used for extracting nucleic
acids, and subsequent sequencing. The identifier can refer either to the
original material collected or to any derived sub-samples. The INSDC
qualifiers /specimen_voucher, /bio_material, or /culture_collection provide
additional context and suggested syntax for this identifier for data
submitted to INSDC databases.”
The MIxS source_mat_id term clarification proposal is pending based on the
outcome of this proposal.
Connecting a DwC Record to a MIxS record would have the advantage of
aligning DwC terminology (geospatial, taxonomic) with sequencing
terminology (investigation, environment, nucleic acid sequence source,
sequencing) and with OBI (investigation, roles, processes), using “Material
Sample” as the pivot point between the standards.
Definition:
From OBI (http://purl.obolibrary.org/obo/OBI_0100051): “A material entity
that has the specimen role.”
A specimen role (http://purl.obolibrary.org/obo/OBI_0000112) in OBI is
defined as “a role borne by a material entity that is gained during a
specimen creation process and that can be realized by use of the specimen
in an investigation”. The operative word is “can”. That is, the specimen
is not required to be realized by use in an investigation. However, it is
worth noting that deposition into a museum or biobank can fulfill the
criteria of “use in an investigation”, if necessary (for discussion, see
http://sourceforge.net/p/obi/obi-terms/677/).
We have chosen to use the label “Material Sample” instead of using the OBI
label “Specimen” for this definition. This allows us to distinguish this
term from other types containing the word “Specimen” currently in use in
the Darwin Core vocabulary, which have their own meaning, distinct from the
concept we are proposing. In the natural history community, biological
specimens have a colloquial meaning, typically referring to a voucher held
by a biorepository for research. We intend a more inclusive definition,
and thus, when we refer to “DwC Material Sample” here, we are actually
referring to the class of entities defined by “OBI Specimen”.
In order to clarify how this definition may be considered in a biological
collections context, we wish to include a
http://www.w3.org/2000/01/rdf-schema#comment annotation within the DwC
vocabulary which would read: “In
biological collections, the material sample is typically collected, and
either preserved, transformed by some process, or destructively processed.”
Further clarification on the use of this term, including this document,
would be provided in the supplementary documentation and the Darwin Core
wiki.
Comment: N/A
Refines: N/A
Has Domain: N/A
Has Range: N/A
Replaces: N/A
Summary:
Term Name: MaterialSample
Identifier: http://rs.tdwg.org/dwc/dwctype/MaterialSample
Namespace: http://rs.tdwg.org/dwc/dwctype/
Label: Material Sample
Definition: A resource describing the physical results of a sampling (or
subsampling) event. In biological collections, the material sample is
typically collected, and either preserved or destructively processed.
Comment: For discussion see
http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary (there will be
no further documentation here until the term is ratified)
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines: http://purl.obolibrary.org/obo/OBI_0100051
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-05-25
Has Domain:
Has Range:
Refines:
Version: MaterialSample-2013-06-24
Replaces:
IsReplacedBy:
Class:
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
Term Name: materialSampleID
Identifier: http://rs.tdwg.org/dwc/terms/MaterialSampleID
Namespace: http://rs.tdwg.org/dwc/terms/
Label: Material Sample ID
Definition: An identifier for the MaterialSample (as opposed to a
particular digital record of the material sample). In the absence of a
persistent global unique identifier, construct one from a combination of
identifiers in the record that will most closely make the materialSampleID
globally unique.
Comment: For discussion see
http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not
exist until the term is ratified).
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines: http://purl.org/dc/terms/identifier
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-05-25
Has Domain:
Has Range:
Version: materialSampleID-2013-05-25
Replaces:
IsReplacedBy:
Class: http://rs.tdwg.org/dwc/terms/Occurrence
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
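The materialSampleID definition above asks publishers without a persistent
global identifier to construct one from other identifiers in the record. A
minimal sketch of one way to do that follows. The choice of
institutionCode, collectionCode and catalogNumber as the parts, the colon
separator, and the sample values are all assumptions made for this
illustration; the proposal does not prescribe any of them.

```python
# Assumed fallback fields; the proposal does not say which identifiers to combine.
DEFAULT_PARTS = ("institutionCode", "collectionCode", "catalogNumber")


def build_material_sample_id(record, parts=DEFAULT_PARTS):
    """Return the record's materialSampleID, constructing one if it is absent."""
    existing = record.get("materialSampleID")
    if existing:
        # A supplied identifier always wins over a constructed one.
        return existing
    values = [
        str(record[part]).strip()
        for part in parts
        if record.get(part) not in (None, "")
    ]
    if not values:
        raise ValueError("no identifier fields available to build a materialSampleID")
    return ":".join(values)


# Invented example values.
sample = {"institutionCode": "XYZ", "collectionCode": "Soil", "catalogNumber": 17}
print(build_material_sample_id(sample))
```

A constructed identifier of this kind is only as unique as the fields it is
built from, which is why the definition treats it as a fallback.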
This is an open letter to Steve Baskauf.
<context>
For those who don't know, Steve came to TDWG a few years ago for answers
on how to share data about plants, and data about images of plants. He
was, at the time, supplying three aggregators with data, and had to format his data
differently for each one. Each of the aggregators had, in theory, embraced
the semantic web, as had TDWG. So he started, in 2009, to ask some
innocuous questions on the TDWG-content list regarding how to use TDWG
standards. These questions led to such a cacophony of answers that he
realized that TDWG needed help, and he agreed, in 2011, to co-convene the
TDWG RDF-Task group, under the sponsorship of the TDWG TAG.
He shares the vision, espoused by many, that it would be better to build a
"semantic layer" on top of Darwin Core, than to have to rebuild a
"semantic Darwin Core" from scratch. The reason for this is that everyone
has different semantic web needs, and, if forced to start from scratch,
will build different semantic Darwin Cores. Then BCO and Darwin-SW and
DWC-FP and Pyle/Whitton will have no way of understanding each other, while
if they shared a basic underlying vocabulary [1], there would at least be
a chance for data interoperability.
</context>
<email to Steve>
Steve,
I proposed a moratorium on additions to Darwin Core until some fundamental
things are cleaned up. I'm only aware of two additions pending: the
addition of MaterialSample and MaterialSampleID to the "dwctype:"
namespace; and the Darwin Core RDF guide, still receiving feedback from
the task group. I do think that the MaterialSample proposal would be
better framed atop a clearer semantics of existing terms and usage
patterns, but who knows. I trust the proposers of the new terms to make
good choices.
My current proposal (below) follows Guralnick's advice to narrow scope. It
is partly motivated by the fact that you made a perfectly reasonable plea
for clarification regarding the meaning of "occurrence", and no one is
responding. (Yes, you only made the plea three days ago, but you've made
many versions of it in the past.) In my opinion, it is unreasonable to
expect the RDF Task Group to produce a guide for expressing Darwin Core in
RDF unless some effort is made to clarify the semantics of a few Darwin
Core terms.
So my current, less radical, proposal is this:
<proposal>
That the RDF Task Group not release the draft guide for public comment
until we (i.e. TDWG) improve the documentation around dwc:Occurrence and
dwctype:Occurrence. By "improve", I mean:
i. Configure the rs.tdwg.org server to set an xml mime type for
http://rs.tdwg.org/dwc/rdf/human-dwctype.xsl and
http://rs.tdwg.org/dwc/rdf/human.xsl . (Currently the mime type is set to
"text/plain", which results in users being unable to view either the raw
RDF or the human-readable RDF in a standard browser.)
ii. In a single document, served by tdwg.org, list the valid
usages/interpretations of dwc:Occurrence and dwctype:Occurrence. (I
believe there are six.) Currently, to be aware of the various semantics of
"Occurrence", a person needs to read half a dozen or so different
documents. That's not necessarily bad, but there should at least be a
"Guide to Standard 450", that links to and briefly explains the contents
and purposes of each document.
</proposal>
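For item i, a minimal Python sketch of how one might check what media type
a server reports for the two stylesheets. The set of media types treated
as XML here is an assumption of the sketch, not a statement about what any
particular browser accepts, and fetch_content_type needs network access, so
it is defined but not called.

```python
from urllib.request import Request, urlopen

# Media types this sketch treats as XML (an assumption, not a browser spec).
XML_TYPES = {"text/xml", "application/xml", "application/xslt+xml"}


def is_xml_media_type(content_type):
    """True if a Content-Type header value names an XML media type."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in XML_TYPES or media_type.endswith("+xml")


def fetch_content_type(url, timeout=10):
    """Send a HEAD request and return the Content-Type header value."""
    with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


# The value reported in the message above, and one possible corrected value.
print(is_xml_media_type("text/plain"))
print(is_xml_media_type("application/xml; charset=utf-8"))
```

Calling fetch_content_type on each stylesheet URL and passing the result to
is_xml_media_type would show whether the server configuration has changed.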
Is that too much to ask of ourselves?
Now, you might say, "Fine, we should do that, but I'm not delaying release
of the guide, which has already taken way too long to produce." And if you
say that, I'll support you. But if we go ahead and release the guide
without fixing some basic underlying things, I don't think it will have
the impact that it should, for several reasons. Here's one of them:
<Example of how we're limiting ourselves>
The draft guide is excellent. For many examples of what people want to do,
the guide essentially creates a matrix with the dimensions "technical
approaches" and "conceptual approaches", and proceeds to fill in every
cell of this matrix with a valid usage pattern. It is meticulous. Yet
there is a significant omission. There is no example of representing a
Darwin Core occurrence in RDF! In the guide's defence, there is no way to
do this without making significant assumptions about just what a Darwin
Core occurrence is.
</Example of how we're limiting ourselves>
Anyway, those are my thoughts, and some of my arguments. I will defer to
your judgement on the best way to proceed.
Best,
Joel.
</email to Steve>
1. Notice I said "vocabulary" and not "ontology". The less ontology there
is in the shared Core, the easier it will be for people to build on it to
suit their needs. But a lack of ontology does not imply a lack of
semantics. For proof, look at a dictionary - it has the complete semantics
of the language. Similarly, Darwin Core - a "bag of terms", i.e. a
glossary - is a perfect candidate to be a shared base layer for the
semantic web … once it clarifies a few basic things.
Dear all,
TDWG could see a lot of activity in 2013 in anticipation of the meeting in
Florence in October. Much of the activity is related to enabling
integration across multiple parts of our domain. We have the Audubon Core
under review for biodiversity-related media and an impending RDF Guide to
supplement the already extant Text and XML Guides for Darwin Core.
This message is to bring your attention to another integrative initiative,
to introduce terms into Darwin Core that will form a nexus between
Occurrences and the interesting things that happen with physical materials
that result from them, such as, but not limited to, genetic sequencing. A
series of meetings for a little over the past year have inspired our
colleagues in the Genomics Standards Consortium (GSC) to propose to their
constituency to align their terms with Darwin Core, including adopting some
of the Darwin Core terms in place of their own that have the same meaning.
Out of these discussions has come the realization that neither community
has terms to accommodate the concept of an identifiable (objectively, not
taxonomically), trackable material sample. This message constitutes such a
proposal.
This proposal would have no impact on those publishing purely taxonomic
data. It would also have no impact on those publishing occurrence data
unless they want to increase their capacity to distinguish material samples
from organisms more rigorously than is now possible using only the
dwc:preparations term.
The initial request for new terms can be found in the Darwin Core Issue
tracker as http://code.google.com/p/darwincore/issues/detail?id=167. Below
I have elaborated and formalized the request into the three distinct terms
under consideration, initiating the 30 day minimum public review process to
seek consensus on their inclusion in the Darwin Core standard. Your job,
should you choose to accept it, is to discuss the merits or any perceived
problems in the inclusion of these three terms in Darwin Core.
Below I will give the proposed properties of three terms as they would
appear in the Darwin Core Quick Reference Guide, though these properties
would be included in the RDF of the normative form of the documentation.
A new MaterialSample class: This is for the purpose of organizing
properties, just as the existing classes (Occurrence, Event, Location,
GeologicalContext, Identification, Taxon, etc.) do, without having any
terms declare this class as their domain.
Term Name: MaterialSample
Identifier: http://rs.tdwg.org/dwc/terms/MaterialSample
Namespace: http://rs.tdwg.org/dwc/terms
Label: Material Sample
Definition: The category of information pertaining to the physical results
of a sampling (or subsampling) event. In biological collections, the
material sample is typically collected, and either preserved or
destructively processed, with the intention of being representative of a
greater whole.
Comment: For discussion see
http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not
exist until the term is ratified).
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines:
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-04-08
Has Domain:
Has Range:
Version: MaterialSample-2013-03-28
Replaces:
IsReplacedBy:
Class:
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
A Darwin Core Type Vocabulary value for basisOfRecord is needed to
represent this new class of information. Luckily, a term already exists in
the Ontology for Biomedical Investigations (
http://www.ontobee.org/browser/rdf.php?o=OBI&iri=http://purl.obolibrary.org…).
We and the GSC both propose to reuse this class within Darwin Core as
below, making it the cross-over point between the two domains.
Term Name: MaterialSample
Identifier: http://rs.tdwg.org/dwc/terms/MaterialSample
Namespace: http://purl.obolibrary.org/obo/OBI_0000747
Label: material sample
Definition: A material entity that has the material sample role
Comment: For discussion see
http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary (there will be
no further documentation here until the term is ratified)
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines: http://purl.obolibrary.org/obo/OBI_0100051
Status: recommended
Date Issued: 2013-03-28
Date Modified: 2013-03-28
Has Domain:
Has Range:
Version: MaterialSample-2013-03-28
Replaces:
IsReplacedBy:
Class:
ABCD 2.0.6: not in ABCD
In keeping with all other classes in Darwin Core, the Material Sample class
would have a corresponding identifier property. The Genomics Standards
Consortium (GSC) is in the process of proposing this term. If it is
accepted, we propose to use it, and its properties would be as below,
otherwise, the properties would be the same, but have the Darwin Core
namespace and identifier URI.
Term Name: materialSampleID
Identifier: http://gensc.org/ns/mixs/materialSampleID
Namespace: http://gensc.org/ns/mixs
Label: Material Sample ID
Definition: An identifier for the MaterialSample (as opposed to a
particular digital record of the material sample). In the absence of a
persistent global unique identifier, construct one from a combination of
identifiers in the record that will most closely make the materialSampleID
globally unique.
Comment: For discussion see
http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not
exist until the term is ratified).
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines: http://purl.org/dc/terms/identifier
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-04-08
Has Domain:
Has Range:
Version: materialSampleID-2013-03-28
Replaces:
IsReplacedBy:
Class: http://purl.obolibrary.org/obo/OBI_0000747
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
Consensus on what constitutes an Occurrence? (was Re: New Darwin Core terms proposed relating to material samples)
by Steve Baskauf 24 Jun '13
I have quoted below part of an email which has been sitting in my inbox
for a month. It has been stuck there because there was a statement in it
that (in my mind) needed clarification. In John Deck's email, he says
"...since an Occurrence represents an organism at a place and time...".
What I am wondering is whether there is actually a consensus that an
Occurrence represents an organism at a place and time.
Caveat: I use "individual organism" here in a general way that probably
includes more than individual organisms. But that is a different issue,
so let's not rehash that in this thread.
The history of the discussion of the meaning of Occurrence is
extensive. You can find my attempt to summarize it at:
http://code.google.com/p/darwin-sw/wiki/ClassOccurrence so I won't
repeat that here. In a nutshell, it seems to me that people have used
dwc:Occurrence in three general ways:
- to indicate that we know from aggregate records that a taxon occurs or
ever occurred, in a particular geographic area (the "checklist" meaning
of Occurrence)
- as a broad term that includes both preserved specimens and
observations (the "superclass" meaning of Occurrence)
- as a join between Events and individual organisms [database
description]/as a node connecting Event instances to instances of
individual organisms [RDF description]/as a tuple of (individual
organism,Event) with properties to connect it to the individual organism
and Event [computer science description] (the "node" meaning of
Occurrence).
It has been noted that the "checklist" meaning of Occurrence is related
to Occurrence as a primary unit of data gathering ("superclass" and
"node" meanings; see history reference for details) but the "checklist"
meaning is probably the least likely to be considered a consensus view,
so I'm going to ignore it for the moment. The "node" meaning of
occurrence corresponds to what is described by John Deck (quoting Markus
Döring) in his email below. It is also the view taken by Darwin-SW and
is reflected in Rich Pyle's emails (related since Darwin-SW was
influenced by Rich Pyle's emails!). However, although it isn't
explicitly stated as such, the Darwin Core standard as it currently
stands really reflects the "superclass" meaning. I was involved in a
conversation with John Wieczorek a few months ago which was on the topic
of "fixing" dwc:Occurrence (i.e. getting rid of the ambiguity
surrounding it). In that conversation, I confirmed with John W. that as
things stand currently, Darwin Core effectively considers dwc:Occurrence
to be a superclass of PreservedSpecimen and Observation. So to me it
does not seem that there actually is a consensus about what
dwc:Occurrence means. Is an Occurrence the *thing* that documents the
presence of an organism at a place and time ("superclass" meaning), or
is the Occurrence an *abstract resource* connecting organisms to
place/time with the thing itself as documentation for the abstract
resource ("node" meaning)?
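The difference between the two readings can be sketched as a small data
model. This is an illustration only: the class and field names below are
invented, and the sketch does not claim to reproduce Darwin-SW or any other
existing ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    individual_id: str


@dataclass(frozen=True)
class Event:
    event_id: str
    event_date: str
    locality: str


@dataclass(frozen=True)
class Evidence:
    """A specimen or observation. Under the "superclass" reading, this
    thing itself would be typed as the Occurrence."""
    evidence_id: str
    basis_of_record: str


@dataclass(frozen=True)
class NodeOccurrence:
    """The "node" reading: an abstract link between one organism and one
    event, with specimens or observations attached as documentation."""
    organism: Organism
    event: Event
    evidence: tuple = ()


# Invented example: one tree at one visit, documented two ways.
tree = Organism("tree-001")
visit = Event("event-1", "2013-06-01", "hypothetical forest plot")
occurrence = NodeOccurrence(
    tree,
    visit,
    (Evidence("sheet-9", "PreservedSpecimen"), Evidence("obs-3", "HumanObservation")),
)
print([item.basis_of_record for item in occurrence.evidence])
```

Under the "node" reading there is one occurrence with two pieces of
evidence; under the "superclass" reading the same data would yield two
Occurrence records, one per piece of evidence.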
In order to "fix" Occurrence by clarifying its meaning, it seems to me
that there are two courses of action:
1. Declare clearly that Occurrence is a superclass of PreservedSpecimen
and Observation and create a new term for the more abstract "organism at
a place and time".
2. Declare clearly that Occurrence is an organism at a place and time
and that it is NOT a superclass of PreservedSpecimen and Observation.
The second course of action would be the easiest from the standpoint of
making a change to the standard. However, it might be the worst from an
implementation standpoint because of the thousands (millions?) of
specimen records that are typed as Occurrence.
If we can clarify these two uses of Occurrence, then the terms currently
listed in DwC under the dwc:Occurrence class could be separated among
the two "kinds" of Occurrence. Terms related to the recording of the
presence of an organism at a time and place (dwc:recordedBy,
dwc:behavior, etc.) would be separated from terms related to the
specimens themselves (dwc:preparations, dwc:disposition, etc.). This
may not seem like a big deal for flat specimen records, but it would be
very helpful from the standpoint of advancing the use of DwC in RDF to
clarify the types of resources that these terms can serve as properties
of.
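A minimal sketch of the separation described above, applied to a flat
record. Only the four terms named in this message are assigned to a group;
the group names, the function, and the sample values are invented for the
illustration, and a real assignment of terms would need community
agreement.

```python
# Terms named in the message above, grouped as the message suggests.
PRESENCE_TERMS = {"recordedBy", "behavior"}
SPECIMEN_TERMS = {"preparations", "disposition"}


def split_record(flat_record):
    """Split a flat record into presence, specimen, and unassigned terms."""
    presence, specimen, unassigned = {}, {}, {}
    for term, value in flat_record.items():
        if term in PRESENCE_TERMS:
            presence[term] = value
        elif term in SPECIMEN_TERMS:
            specimen[term] = value
        else:
            unassigned[term] = value
    return presence, specimen, unassigned


# Invented sample values.
flat = {
    "recordedBy": "A. Collector",
    "behavior": "roosting",
    "preparations": "skin; skull",
    "disposition": "in collection",
    "catalogNumber": "12345",
}
presence, specimen, unassigned = split_record(flat)
print(presence)
print(specimen)
print(unassigned)
```

In RDF terms, the first group would become properties of the resource that
records presence at a time and place, and the second group properties of
the specimen resource.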
I would be interested in hearing some discussion about concrete steps
that could be taken to "fix" Occurrence. The "best" solution would
probably be to create a robust consensus ontology that includes
Occurrence. However, that is not likely to happen on the timescale of a
year or less. Given that this issue has dragged on for at least two
years already, in the interest of moving forward it would be good to
take some kind of decisive action in the near term.
Steve
-------- Original Message --------
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Date: Wed, 29 May 2013 16:00:35 +0200
From: John Deck <jdeck(a)berkeley.edu>
To: Richard Pyle <deepreef(a)bishopmuseum.org>
CC: Markus Döring <m.doering(a)mac.com>, Steve Baskauf
<steve.baskauf(a)vanderbilt.edu>, TDWG Content Mailing List
<tdwg-content(a)lists.tdwg.org>, Robert Whitton <whittonr(a)gmail.com>,
"Ramona Walls" <rlwalls2008(a)gmail.com>
References:
Since the original proposal was from a group of folks, we decided to put
our heads together to construct a general response to the various issues
and ideas expressed on this thread.
John Deck for Rob Guralnick, Ramona Walls, and John Wieczorek
...
How is MaterialSample different from Individual? The intent of
individualID is fairly clear: since an Occurrence represents an
organism at a place and time (per Markus’ email), the individualID term
allows us to assign an instance identifier for a particular organism
that can be present in multiple events. MaterialSampleID, on the other
hand, is intended to allow users to say that the basis of an occurrence
is a material entity (i.e. matter) that has been sampled according to
some particular method. Whether or not
...
--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences
postal mail address:
PMB 351634
Nashville, TN 37235-1634, U.S.A.
delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235
office: 2128 Stevenson Center
phone: (615) 343-4582, fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu
Re: [tdwg-content] New Darwin Core terms proposed relating to material samples
by Richard Pyle 01 Jun '13
Hi Ramona,
I apologize for the long emails, but this stuff is complex and unfortunately
requires lots of words (to avoid - or at least minimize - misunderstanding).
I will try to keep my responses to your points short.
> Using the word "individual" to describe collections of organisms -
> whether they are taxonomically homogenous or heterogeneous - makes no
> sense.
> Yes, I know it is just a label, but seriously, just make a better label.
Yes, I agree. But it's what we already have in DWC
(http://rs.tdwg.org/dwc/terms/index.htm#individualID). I have no problem
using a different term, but before we choose terms, we should first define
what the concepts are.
> A single organism and a collection of organisms are fundamentally
> different things.
Actually, not really different things. Many natural history collections
maintain their specimens as "lots", which may have a single individual
specimen, or multiple specimens. Regardless of whether it's a single
specimen or multiple specimens, the basic properties are the same (same
collecting event, same taxonomic identification, and many other identical
properties). This becomes especially true for colonial organisms (like
corals, where the "individual" could be interpreted as a single polyp). It's
also true for other use cases we deal with that are outside the DWC/TDWG
scope.
> If you need a class that can cover both of them under certain
> circumstances, you need to use a logical definition to define the
> circumstances (just like the class material sample does by using the
> criterion of having a material sample role). In order to do this, you
> also need to have separate classes for individual organism and
> collection of organisms.
We have tried to do this by distinguishing instances as "Lot" or "Whole
Organism" -- which could be thought of as distinct subclasses (though again,
they generally share the same properties). The same is true for tissue
samples, and other "parts".
> I agree whole-heartedly with the need to clearly track stakeholders'
> needs for different classes of things, using a logical system to decide
> how these things relate to one another, examining alternative systems
> for creating the classes of things, and testing them against use cases
> (Steve's points 1-4). This is precisely what we are trying to do with
> the bio-collections ontology (BCO). The suggestion to use the term
> material sample came out of just such a process. It is important to
> remember that the stakeholders include more than just the community
> using DwC.
It seems we are all in full agreement on these points. In my case, I am
especially in agreement with the last point, as much of our thinking has
been independent of the TDWG/DWC thinking, but still keeping that set of
use-cases in mind.
Aloha,
Rich
Re: [tdwg-content] New Darwin Core terms proposed relating to material samples
by Richard Pyle 31 May '13
Thanks Ramona;
Actually, the basic elements of our data model precede DwC by quite a bit.
What we've tried to do, however, is mold the data model to be more
compatible with DwC, to make the task of mapping for data export & exchange
that much easier. Of course, DwC is not (and is not intended to be) a data
model in any sense of the word; however, it's impossible to avoid
representing core elements of a bona-fide data model within DwC. This is
especially true when it comes to each of the ID terms (and
doubly-especially true when the ID terms correlate to class terms). The
existence of an ID term implies that some class of object exists to which
an ID value is applied. The ID value itself is never useful
data/metadata; it is just a way to reference a data record that
(presumably) contains properties that can be expressed as data/metadata for
the object represented by the ID value.
This was all well-understood when the original DwC was being drafted; but as
it evolved into its current iteration (with the addition of all the ID
terms), it has been drawn ever more (in some ways subtly, and in some ways
not-so-subtly) in the direction of a data model.
Of course, what we all (desperately!) need is a robust ontology that fits
our world. The task is not easy in part because our data domain is not so
easily modeled, in part because different sections of our broader community
have different priorities, and in part because there is always a delicate
balance between developing a model or ontology that is practical and useful
for the data providers and consumers, with one that is robust and detailed
and flexible, to allow asking questions of the data that were never even
considered at the time the model/ontology were conceived. The parallel
experience in database modeling is normalization (as Paul Kirk likes to say:
"Normalize until it hurts; then de-normalize until it works").
The original DwC was completely flat. The current DwC moved into the
direction of more complex structures by clustering terms into classes and
sprinkling with ID terms. It even tip-toed into RDF-space with
dwc:ResourceRelationship. I think that's definitely an improvement, but it
still must strike a delicate balance between the needs by some to represent
a robust data model, and the needs by others to have a simple/practical
mechanism to exchange biodiversity data in a standard form. It will never
be all things to all people; but at least it is enough things to enough
people that it represents an important flag pole around which our
community has (more or less) successfully rallied.
Hmmmm... Now I've forgotten what my point was. I guess I was just in a
ramblin' mood. Well... sorry about the bandwidth!
Aloha,
Rich
From: Ramona Walls [mailto:rlwalls2008@gmail.com]
Sent: Thursday, May 30, 2013 6:05 PM
To: Richard Pyle
Cc: Jason Holmberg; TDWG Content Mailing List; John Deck; Robert Whitton
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Jason,
Thanks for sharing how you have been using the Darwin Core terms. I am
intrigued by the data structure you have developed. It is quite interesting
how both you and Rich have adapted the DwC to fit your specific needs. While
I am often troubled by the vagueness of DwC, I guess in some ways it is that
vagueness allows it to be used in many different applications. Of course, I
don't think vagueness is necessary for wide application, or a good thing for
data exchange, but it does seem to be working for a lot of different
purposes.
Ramona
On Thu, May 30, 2013 at 3:07 PM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Hi Jason,
Many thanks for this input. If I understand you correctly, then you are
using Encounter as equivalent to what we have been using Occurrence for.
That is, by our definition, an Occurrence is the instance representing an
intersection between an Event (i.e., "where", "when", "who", etc.) and what we
have been calling an Individual (i.e., "what"); and the properties we attach
to the Occurrence are the "how" bits (including things like size, etc.).
In my mind, the essence of an Individual is the collective physical
material of the individual. If I see a fish on a reef, its Occurrence on
that reef and at that time exists (and is worth documenting) regardless of
whether I took an image of it (what we would call Evidence), or whether I
took a tissue sample from it, or whether I collected and killed the whole
damn thing. To me, the essence of the individual - or its occurrence at
an event - is unaffected by what I end up doing to it. By extension,
following a hierarchical model of individual, a sub-sample
(materialSample) extracted from it is just another instance of Individual.
This is why I generally think of materialSample (if it were represented as
a class, which it is currently not proposed for DWC) as a subclass of a
broader concept (e.g., material, but what I have naively been referring to
as Individual).
That part of our model has proven to be very stable and effective for
representing the information as we want it.
Where it gets complicated is instances of taxonomically heterogeneous
objects treated as a single individual - which (in my mind) includes such
things as soil samples.
In that context, I see (and agree with) John and others that really it's a
separate axis of classification from what I have called Individual.
I don't expect that to make a lot of sense (I barely understand it myself).
Aloha,
Rich
From: Jason Holmberg [mailto:holmbergius@gmail.com]
Sent: Thursday, May 30, 2013 11:28 AM
To: Richard Pyle
Cc: Ramona Walls; TDWG Content Mailing List; John Deck; Robert Whitton
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Hi everyone.
List lurker here. DWC has been a great inspiration in my work, so I hope I
can contribute some small amount of insight on the "individual" and
"material sample" threads. I have no grand thoughts on the subject, but I
can tell you how the DWC has inspired my own information architecture for
open source mark-recapture software:
http://www.ecoceanusa.org/shepherd/doku.php?id=manual:2.0.x:1_overview
I felt the very clear need for a distinct Individual Class and to separate
that from the concept of a sample taken from nature. When reviewing DWC, I
interpreted Occurrence.individualCount to be somewhat contradictory to
Occurrence.individualID, so I created a one-individual-at-a-point-in-time
class called Encounter that reuses quite a bit of DWC.Occurrence. Occurrence
I then broadened to include the potential for multiple marked individuals.
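
As a rough illustration of the Encounter/Occurrence split described above (a sketch only; the field names are assumptions and this is not the actual schema of the mark-recapture software), an Encounter can hold exactly one individual at one point in time, while an Occurrence groups encounters and derives its individual count from them:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Encounter:
    """One marked individual at one point in time (hypothetical fields)."""
    individual_id: str
    event_date: str


@dataclass
class Occurrence:
    """Broadened to hold encounters of several marked individuals."""
    occurrence_id: str
    encounters: List[Encounter] = field(default_factory=list)

    @property
    def individual_count(self) -> int:
        # Derived from the distinct individual IDs, not stored separately.
        return len({e.individual_id for e in self.encounters})


# Invented identifiers, for illustration only.
occ = Occurrence(
    "occ-1",
    [Encounter("A-001", "2013-05-30"), Encounter("A-002", "2013-05-30")],
)
count = occ.individual_count
```

Deriving the count from the encounters is one way to avoid the apparent tension between an individual count and a single individual ID on the same record.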
I neither present this as "right" nor "good" (though they have worked very
well for us). I just present it as a practical example from mark-recapture
in which we have tried to adhere to DWC in order to expose data to GBIF,
iOBIS, etc. The concepts of "material sample" and "individual" are very
important to us, and this is how we have defined them.
Cheers,
Jason Holmberg
ECOCEAN Whale Shark Photo-identification Library
http://www.whaleshark.org
Please consider adopting a shark to support our mission:
http://www.whaleshark.org/adoptashark.jsp
On Wed, May 29, 2013 at 4:13 PM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Hi Ramona,
Yes, I agree, and thanks. I've always felt that there has been a trend
towards trying to push too much ontology (or other semantic meaning) onto
DWC terms and classes, when DWC was fundamentally intended to represent a
mechanism for data exchange; not a mechanism to describe the ontological
landscape of biodiversity data. The only reason I brought this up now (and,
I think, why we discussed it in 2010), is that the term individualID in
DWC sort of hinted that something like Individual was the forgotten
class for DWC. I sincerely hope that BCO and DSW gain more traction (and,
ideally, harmony between them) than earlier attempts at developing
ontologies in this space have met - and clearly that is the right path
forward.
My main concern for this thread (and the reason I engaged in it), was to:
1) Find out the status of the discussions that began in 2010; and
2) Clarify where the current materialSample proposal overlaps, or does
not overlap, with that earlier effort.
Steve has very adequately answered the first question, and you, John, and
others have answered the second, and I'm happy with both sets of answers.
I'm sorry for the voluminous exchange, but I felt the discussion was both
important, and very helpful (certainly to me).
Aloha,
Rich
From: Ramona Walls [mailto:rlwalls2008@gmail.com]
Sent: Wednesday, May 29, 2013 1:03 PM
To: Richard Pyle
Cc: John Deck; Markus Döring; Steve Baskauf; TDWG Content Mailing List;
Robert Whitton
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Hi Rich,
Sorry I didn't mention this sooner, but your emails were also helpful to me
in describing an important and generalizable use case.
I don't know whether or not the TDWG community is ready to deal with the
level of abstraction we are talking about, but my assessment is that whether
or not they are ready, the Darwin Core is not constructed to deal with it.
That is why (among other reasons) we started work on the BCO, and perhaps
one reason why Steve and others developed DSW.
Our goal with the material sample proposal was not to overhaul DwC, but to
work within the DwC framework to make it more compatible with other
standards such as MIxS. That is why we tried to keep our proposal fairly
narrowly focused.
Ramona
On Wed, May 29, 2013 at 3:21 PM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Thanks, Ramona - this is an *extremely* helpful email! It helps clear things
up a lot in my mind.
Just to be clear, what I am looking for is the notion of a defined physical
object (what I think you mean by material entity), and I explicitly mean
the material entity itself. Yes, there is information (properties) that
relate to that material entity, but to me that is a separate issue. What I
would like to see clearly defined is the class representing the material
(physical) entity, which seems to me to be a superclass of what
materialSample is intended to represent.
Perhaps our (TDWG/DWC) community is not yet ready to deal with this level of
abstraction (unfortunately, I absolutely have to, because Occurrence is
simply way too overloaded a class for me to use independently of what I have
been calling individual and what I have been calling Evidence). In that
case, I guess the best thing to do is accept materialSample as a
basisOfRecord for Occurrence and move on. But this is more or less the same
thing that happened the last time we engaged in this conversation (2 years
or so ago), and I was hoping this conversation about materialSample could
leverage progress on the larger issue.
As I've said before, the last thing I want to do is confuse or otherwise
slow down the process of incorporating the term materialSample into DWC.
It's just that I saw enough overlap with that other issue, that I was
hoping we could find a reasonable pathway forward on both.
Thanks again for the very helpful comments.
Aloha,
Rich
From: Ramona Walls [mailto:rlwalls2008@gmail.com]
Sent: Wednesday, May 29, 2013 9:14 AM
To: Richard Pyle
Cc: John Deck; Markus Döring; Steve Baskauf; TDWG Content Mailing List;
Robert Whitton
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Rich,
I now understand more fully what you are asking for (a clear definition
goes a long way!). A material sample, as we discussed it at the Kansas and
Oxford workshops, does indeed need to be physically removed from its
environment. This is also the case with the OBI term material sample, which,
as a subclass of OBI:specimen is the output of some collecting process. It
is true that the concept of material sample could be defined to include sampling
in an observational sense, but that is not how it is defined at this point.
Based on this, material sample is NOT the term you are looking for or
defined as:
"The category of information pertaining to the physical basis of a sampling,
subsampling, or observational event. In biological collections, the
[SuperclassTerm] is typically a defined group of organisms, a single whole
organism, or a part of a whole organism that is collected or otherwise
documented in nature, and either preserved, destructively processed, or
documented through some form of Evidence (such as images or reported visual
observations)."
What you have defined is a category of information (whatever that may be)
that pertains to some material entity. Not the material entity itself, but
information about that entity. The "SuperclassTerm" you refer to in the
definition sounds an awful lot like a material entity from the Basic Formal
Ontology, which is used for defining material sample in OBI and the
Bio-collections Ontology.
Ramona
On Wed, May 29, 2013 at 11:51 AM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Many thanks, John. This is extremely helpful!
First of all, in the context of a distinct term for basisOfRecord, I see
absolutely no problem with adding the term MaterialSample. I fully support
your proposal (although if this is simply a basisOfRecord term to be used
alongside Occurrence, PreservedSpecimen, LivingSpecimen, FossilSpecimen,
HumanObservation, MachineObservation; does it need a defined ID term? Do
all the others have defined ID terms?).
However, I'm excited by this conversation because I think we are very close
to solving a bigger problem (which was the focus of the 2010 discussion on
this list around IndividualOrganism).
This bigger problem involves the need for a defined concept (I'm
hesitating to say "class"), and an associated ID, in dwc that refers to
the physical/material basis of an Occurrence. We don't yet have a term for
this concept in dwc (IndividualID hinted at the need for one, but that
term was not well-defined, and the term itself seems to cause confusion).
As Steve Baskauf and I have both been advocating for the establishment of
a new class in dwc for exactly this purpose, I just want to make sure that
we're on the same page about what each concept is. The more I understand
about what you need for materialSample, the more convinced I am that both
of our needs can be met with the same concept.
I am perfectly happy to adopt the term MaterialSample, but I guess it all
boils down to this: In order for something to be a MaterialSample, must it
necessarily be removed from nature?
If the answer is No, then I think we can merge the two concepts into one.
If the answer is Yes, then I think materialSample is best characterized
as a subclass of what Steve and I have been pushing for (setting aside, for
the moment, the additional complexity of taxonomically homogeneous vs.
heterogeneous).
In the latter case, I would define the superclass (whatever term is used for
it), along the lines of:
"The category of information pertaining to the physical basis of a sampling,
subsampling, or observational event. In biological collections, the
[SuperclassTerm] is typically a defined group of organisms, a single whole
organism, or a part of a whole organism that is collected or otherwise
documented in nature, and either preserved, destructively processed, or
documented through some form of Evidence (such as images or reported visual
observations)."
Aloha,
Rich
From: jdeck88(a)gmail.com [mailto:jdeck88@gmail.com] On Behalf Of John Deck
Sent: Wednesday, May 29, 2013 4:01 AM
To: Richard Pyle
Cc: Markus Döring; Steve Baskauf; TDWG Content Mailing List; Robert Whitton;
Ramona Walls
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Since the original proposal was from a group of folks, we decided to put our
heads together to construct a general response to the various issues and
ideas expressed on this thread.
John Deck for Rob Guralnick, Ramona Walls, and John Wieczorek
In the text of the issue submitted for MaterialSample
(https://code.google.com/p/darwincore/issues/detail?id=167) we noted cases
where the current basisOfRecord terms pertaining to the Occurrence class
(Occurrence, PreservedSpecimen, LivingSpecimen, FossilSpecimen,
HumanObservation, MachineObservation) do not adequately cover certain cases,
including: environmental sample (for metagenomic analysis), transcriptomes
(measuring genes but not taxa), and destructive samples (e.g. tissues
destructively sampled in order to generate genomic DNA). The term we
borrowed from OBI (http://purl.obolibrary.org/obo/OBI_0000747) is broad
enough to be utilized across various cases that fulfill our criteria while
still maintaining a consistent, clear and human understandable meaning. For
our purposes, we can think of Material Sample as any type of matter that
we can use in order to derive further evidence needed for identifications, and
taxa, whether it is taxonomically homogenous, heterogenous, a single
individual, sets of individuals, or populations.
How is MaterialSample different from Individual? The intent of individualID
is fairly clear: since an Occurrence represents an organism at a place and
time (per Markus' email), the individualID term allows us to assign an
instance identifier for a particular organism that can be present in
multiple events. MaterialSampleID, on the other hand, is intended to allow
users to say that the basis of an occurrence is a material entity (i.e.
matter) that has been sampled according to some particular method. Whether
or not this material entity is an individual (sensu individualID in DwC) is
an independent axis of classification. As was already pointed out, there is
no restriction on specifying that an occurrence is associated with more than
one type, so any occurrence can have both an individualID and a
materialSampleID.
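
To illustrate the point that the two identifiers are independent axes, here is a small sketch using flat records as Python dicts. The identifier values are invented, and the helper `group_by` is hypothetical; the only thing taken from the message is that one occurrence may carry both an individualID and a materialSampleID.

```python
# Illustrative flat records; all identifier values are made up.
occurrences = [
    {"occurrenceID": "occ-1", "individualID": "tree-42",
     "materialSampleID": "sample-2012"},
    {"occurrenceID": "occ-2", "individualID": "tree-42",
     "materialSampleID": "sample-2013"},
    # Observed but not sampled: no materialSampleID on this record.
    {"occurrenceID": "occ-3", "individualID": "tree-42"},
]


def group_by(records, key):
    """Map each value of `key` to the occurrenceIDs that carry it."""
    groups = {}
    for record in records:
        if key in record:
            groups.setdefault(record[key], []).append(record["occurrenceID"])
    return groups


by_individual = group_by(occurrences, "individualID")
by_sample = group_by(occurrences, "materialSampleID")
```

Grouping by individualID gathers every occurrence of the same organism, while grouping by materialSampleID picks out only the occurrences that are backed by a sampled material entity.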
We maintain our position on the proposal for MaterialSample as a value for
the basisOfRecord, with an associated materialSampleID to identify instances
of them. Per Steve's initial comments, we have already withdrawn the
proposal for a MaterialSample class distinct from that in the Darwin Core
type vocabulary, which should make it easier to evaluate the implications of
what we're discussing.
********************
NOTES, MaterialSample from OBI:
OBI has fairly broad definitions of samples & specimens that are meant to be
utilized across many different scientific activities. Material Sample is
defined as a material entity that has the material sample role, while a
material sample role is defined as a specimen role borne by a material
entity that is the output of a material sampling process, and a material
sampling process is a specimen gathering process with the objective to
obtain a specimen that is representative of the input material entity.
On Mon, May 27, 2013 at 11:59 PM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Hi Markus,
Great question! Particularly because this is exactly the sort of use case
we designed our model around.
> if you take a tissue sample of the same tree every year, would the
> identifier in individualID be the same for all of them or be different?
> With the current dwc:individualID definition it would be the same for
> all samples. If I understand you correctly each sample would have its
> own "individual" identifier in your proposal? I can't see how you can
> collapse these two things into one definition.
No, that is not how we would handle it.
In our model, there would be one IndividualID to represent the tree,
spanning the time period beginning (more or less) when the seed was
germinated, until the time at which the entire physical structure of the
tree was disintegrated. It is an individual tree.
There would be multiple Occurrence instances, for each time that someone
observed or sampled or otherwise wished to document some condition of that
tree. All of these Occurrence instances would refer to the same individualID
value (i.e., the "tree"). In the example above, this means there would be a
different Occurrence instance for each year that a sample is taken --
because in each case, an assertion that the full tree existed at a certain
time and place can be made (I understand that trees tend not to move around
very much, so the Location for each event associated with each Occurrence
would, in this case, remain the same; but the other Event properties -- such
as eventID, samplingProtocol, samplingEffort, eventDate, eventTime,
startDayOfYear, endDayOfYear, year, month, day, verbatimEventDate, habitat,
fieldNumber, fieldNotes, eventRemarks -- would be documented accordingly for
each sampling Occurrence instance).
Suppose that the tree is visited every month, but only sampled once per
year. In that case, there would be an Occurrence record for every monthly
visit. In other words, an Occurrence instance exists regardless of whether
a physical sample was made or not. Any in-situ images made of the tree
would likewise be associated with the specific Occurrence instance, and each
image would represent a separate instance of "Evidence".
Now, let's focus on the annual samplings. Every time a new sample is taken
from the tree, at least one new instance of Individual (with a unique
individualID value) is created to represent the sample. This sample
(individual instance) may be a "gathering" (set of multiple individual
specimens gathered at the same time), or it may be a single specimen, or it
may be simply a tissue sample intended for destructive analysis. In any
case, it's a new individual instance derived from the "parent" individual
instance representing the whole tree. In our implementation, "Individual"
can be hierarchical, such that a whole-organism tree can be sub-sampled with
many "child" instances of "gatherings" (say, one gathering each year), and
each gathering may have multiple child "specimen" individuals (e.g.
individual botanical sheets created from the multiple items of a single
gathering), and each specimen may have further "child" subsamples extracted
for DNA analysis (or whatever), and the hierarchy can continue on down to
whatever derivatives that people feel a need to keep track of (e.g.,
aliquot).
The point is, all Individual instances are well-defined physical objects (or
well-defined sets of physical objects), and they can be arranged in an
n-tiered hierarchy.
Moreover, each Individual that can be characterized as a "sample" (what we
refer to as a "CollectionObject") may also have a property value for
"CollectionOccurrenceID" -- which refers to the specific Occurrence instance
at which the sample was obtained.
So, for example, if the tree is visited on May 27, 2013 and a specimen
(sample) is taken, then:
1) An Event instance is generated to represent the event where the tree was
visited;
2) An Occurrence instance is generated, which refers to the new EventID, and
the existing IndividualID for the whole tree, and includes whatever other
Occurrence properties are relevant for the tree at the time of this
Occurrence;
3) An Individual instance is generated for the specimen, which has a
property value for parentIndividualID that refers to the individualID for
the whole tree, and a property value for collectionOccurrenceID that refers
to the Occurrence instance where the specimen was collected.
So, to summarize the answer to your question:
- There are multiple Occurrence instances that refer to the same Individual
instance representing the whole tree (and, hence can be collapsed to the
same IndividualID value).
- Any Individual can have derivatives that are themselves unique Individual
instances.
- Individuals are arranged hierarchically, and certain properties can be
inherited up or down the hierarchy, depending on the properties and their
associated logical constraints.
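
The three steps and the summary above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and field names are adapted from the terms used in this message (parentIndividualID, collectionOccurrenceID), the identifier values are invented, and the helper `lineage` is hypothetical rather than part of any existing implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    event_id: str
    event_date: str


@dataclass
class Occurrence:
    occurrence_id: str
    event_id: str
    individual_id: str


@dataclass
class Individual:
    individual_id: str
    kind: str  # e.g. "Whole Organism", "Lot", "specimen"
    parent_individual_id: Optional[str] = None
    collection_occurrence_id: Optional[str] = None


def lineage(individuals: Dict[str, Individual], individual_id: str) -> List[str]:
    """Walk parent links from a derivative up to the root individual."""
    chain = []
    current: Optional[str] = individual_id
    while current is not None:
        chain.append(current)
        current = individuals[current].parent_individual_id
    return chain


# The May 27, 2013 visit: one Event, one Occurrence, one new Individual.
tree = Individual("ind-tree", "Whole Organism")
event = Event("evt-2013-05-27", "2013-05-27")
occurrence = Occurrence("occ-2013-05-27", event.event_id, tree.individual_id)
specimen = Individual(
    "ind-specimen-1",
    "specimen",
    parent_individual_id=tree.individual_id,
    collection_occurrence_id=occurrence.occurrence_id,
)
individuals = {i.individual_id: i for i in (tree, specimen)}
path = lineage(individuals, "ind-specimen-1")
```

Further subsamples (for example a DNA extract taken from the specimen) would simply be more Individual instances whose parent is the specimen, extending the chain that `lineage` returns.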
At some point, I will assemble a set of other specific use cases, and how we
manage them through our use of the "Individual" instance (although I will
probably not use the word "Individual", as this seems to cause too much
confusion in these discussions).
Aloha,
Rich
--
John Deck
(541) 321-0689
_______________________________________________
tdwg-content mailing list
tdwg-content(a)lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content
Apologies in advance - this is a little off topic but I think it should
be of interest to many here.
For a second year we are teaching "Ontologies for Evolutionary
Biology", sponsored by NESCent, on July 29th in Durham, NC. Details are
here: http://bit.ly/ZAQ6Dt . Please forward this to any interested
parties. Last year we had a broad spectrum of "students", from
advanced undergrads to Professors nearing retirement.
If you have questions please message me off-list.
Thanks,
Matt Yoder
Biological Informatician
Illinois Natural History Survey
Re: [tdwg-content] New Darwin Core terms proposed relating to material samples
by Richard Pyle 30 May '13
Hi Ramona,
Yes, I agree, and thanks. I've always felt that there has been a trend
towards trying to push too much ontology (or other semantic meaning) onto
DWC terms and classes, when DWC was fundamentally intended to represent a
mechanism for data exchange; not a mechanism to describe the ontological
landscape of biodiversity data. The only reason I brought this up now (and,
I think, why we discussed it in 2010), is that the term individualID in
DWC sort of hinted that something like Individual was the forgotten
class for DWC. I sincerely hope that BCO and DSW gain more traction (and,
ideally, harmony between them) than earlier attempts at developing
ontologies in this space have met - and clearly that is the right path
forward.
My main concern for this thread (and the reason I engaged in it), was to:
1) Find out the status of the discussions that began in 2010; and
2) Clarify where the current materialSample proposal overlaps, or does
not overlap, with that earlier effort.
Steve has very adequately answered the first question, and you, John, and
others have answered the second, and I'm happy with both sets of answers.
I'm sorry for the voluminous exchange, but I felt the discussion was both
important, and very helpful (certainly to me).
Aloha,
Rich
From: Ramona Walls [mailto:rlwalls2008@gmail.com]
Sent: Wednesday, May 29, 2013 1:03 PM
To: Richard Pyle
Cc: John Deck; Markus Döring; Steve Baskauf; TDWG Content Mailing List;
Robert Whitton
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Hi Rich,
Sorry I didn't mention this sooner, but your emails were also helpful to me
in describing an important and generalizable use case.
I don't know whether or not the TDWG community is ready to deal with the
level of abstraction we are talking about, but my assessment is that whether
or not they are ready, the Darwin Core is not constructed to deal with it.
That is why (among other reasons) we started work on the BCO, and perhaps
one reason why Steve and others developed DSW.
Our goal with the material sample proposal was not to overhaul DwC, but to
work within the DwC framework to make it more compatible with other
standards such as MIxS. That is why we tried to keep our proposal fairly
narrowly focused.
Ramona
On Wed, May 29, 2013 at 3:21 PM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Thanks, Ramona - this is an *extremely* helpful email! It helps clear things
up a lot in my mind.
Just to be clear, what I am looking for is the notion of a defined physical
object (what I think you mean by material entity), and I explicitly mean
the material entity itself. Yes, there is information (properties) that
relate to that material entity, but to me that is a separate issue. What I
would like to see clearly defined is the class representing the material
(physical) entity, which seems to me to be a superclass of what
materialSample is intended to represent.
Perhaps our (TDWG/DWC) community is not yet ready to deal with this level of
abstraction (unfortunately, I absolutely have to, because Occurrence is
simply way too overloaded a class for me to use independently of what I have
been calling individual and what I have been calling Evidence). In that
case, I guess the best thing to do is accept materialSample as a
basisOfRecord for Occurrence and move on. But this is more or less the same
thing that happened the last time we engaged in this conversation (2 years
or so ago), and I was hoping this conversation about materialSample could
leverage progress on the larger issue.
As I've said before, the last thing I want to do is confuse or otherwise
slow down the process of incorporating the term materialSample into DWC.
It's just that I saw enough overlap with that other issue that I was
hoping we could find a reasonable pathway forward on both.
Thanks again for the very helpful comments.
Aloha,
Rich
From: Ramona Walls [mailto:rlwalls2008@gmail.com]
Sent: Wednesday, May 29, 2013 9:14 AM
To: Richard Pyle
Cc: John Deck; Markus Döring; Steve Baskauf; TDWG Content Mailing List;
Robert Whitton
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Rich,
I now understand more fully what you are asking for (a clear definition
goes a long way!). A material sample, as we discussed it at the Kansas and
Oxford workshops, does indeed need to be physically removed from its
environment. This is also the case with the OBI term material sample, which,
as a subclass of OBI:specimen, is the output of some collecting process. It
is true that the concept of material sample could be defined to include
sampling in an observational sense, but that is not how it is defined at
this point. Based on this, material sample is NOT the term you are looking
for, i.e. the one you defined as:
"The category of information pertaining to the physical basis of a sampling,
subsampling, or observational event. In biological collections, the
[SuperclassTerm] is typically a defined group of organisms, a single whole
organism, or a part of a whole organism that is collected or otherwise
documented in nature, and either preserved, destructively processed, or
documented through some form of Evidence (such as images or reported visual
observations)."
What you have defined is a category of information (whatever that may be)
that pertains to some material entity. Not the material entity itself, but
information about that entity. The "SuperclassTerm" you refer to in the
definition sounds an awful lot like a material entity from the Basic Formal
Ontology, which is used for defining material sample in OBI and the
Bio-collections Ontology.
Ramona
On Wed, May 29, 2013 at 11:51 AM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Many thanks, John. This is extremely helpful!
First of all, in the context of a distinct term for basisOfRecord, I see
absolutely no problem with adding the term MaterialSample. I fully support
your proposal (although if this is simply a basisOfRecord term to be used
alongside Occurrence, PreservedSpecimen, LivingSpecimen, FossilSpecimen,
HumanObservation, MachineObservation; does it need a defined ID term? Do
all the others have defined ID terms?).
However, I'm excited by this conversation because I think we are very close
to solving a bigger problem (which was the focus of the 2010 discussion on
this list around IndividualOrganism).
This bigger problem involves the need for a defined concept (I'm
hesitating to say "class"), and an associated ID, in dwc that refers to
the physical/material basis of an Occurrence. We don't yet have a term for
this concept in dwc (IndividualID hinted at the need for one, but that
term was not well-defined, and the term itself seems to cause confusion).
As Steve Baskauf and I have both been advocating for the establishment of
a new class in dwc for exactly this purpose, I just want to make sure that
we're on the same page about what each concept is. The more I understand
about what you need for materialSample, the more convinced I am that both
of our needs can be met with the same concept.
I am perfectly happy to adopt the term MaterialSample, but I guess it all
boils down to this: In order for something to be a MaterialSample, must it
necessarily be removed from nature?
If the answer is No, then I think we can merge the two concepts into one.
If the answer is Yes, then I think materialSample is best characterized
as a subclass of what Steve and I have been pushing for (setting aside, for
the moment, the additional complexity of taxonomically homogeneous vs.
heterogeneous).
In the latter case, I would define the superclass (whatever term is used for
it), along the lines of:
"The category of information pertaining to the physical basis of a sampling,
subsampling, or observational event. In biological collections, the
[SuperclassTerm] is typically a defined group of organisms, a single whole
organism, or a part of a whole organism that is collected or otherwise
documented in nature, and either preserved, destructively processed, or
documented through some form of Evidence (such as images or reported visual
observations)."
Aloha,
Rich
From: jdeck88(a)gmail.com [mailto:jdeck88@gmail.com] On Behalf Of John Deck
Sent: Wednesday, May 29, 2013 4:01 AM
To: Richard Pyle
Cc: Markus Döring; Steve Baskauf; TDWG Content Mailing List; Robert Whitton;
Ramona Walls
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to
material samples
Since the original proposal was from a group of folks, we decided to put our
heads together to construct a general response to the various issues and
ideas expressed on this thread.
John Deck for Rob Guralnick, Ramona Walls, and John Wieczorek
In the text of the issue submitted for MaterialSample
(https://code.google.com/p/darwincore/issues/detail?id=167) we noted cases
where the current basisOfRecord terms pertaining to the Occurrence class
(Occurrence, PreservedSpecimen, LivingSpecimen, FossilSpecimen,
HumanObservation, MachineObservation) do not adequately cover certain cases,
including: environmental sample (for metagenomic analysis), transcriptomes
(measuring genes but not taxa), and destructive samples (e.g. tissues
destructively sampled in order to generate genomic DNA). The term we
borrowed from OBI (http://purl.obolibrary.org/obo/OBI_0000747) is broad
enough to be utilized across various cases that fulfill our criteria while
still maintaining a consistent, clear and human-understandable meaning. For
our purposes, we can think of Material Sample as any type of matter that
we can use in order to derive further evidence needed for identifications and
taxa, whether it is taxonomically homogeneous, heterogeneous, a single
individual, sets of individuals, or populations.
How is MaterialSample different from Individual? The intent of individualID
is fairly clear: since an Occurrence represents an organism at a place and
time (per Markus' email), the individualID term allows us to assign an
instance identifier for a particular organism that can be present in
multiple events. MaterialSampleID, on the other hand, is intended to allow
users to say that the basis of an occurrence is a material entity (i.e.
matter) that has been sampled according to some particular method. Whether
or not this material entity is an individual (sensu individualID in DwC) is
an independent axis of classification. As was already pointed out, there is
no restriction on specifying that an occurrence is associated with more than
one type, so any occurrence can have both an individualID and a
materialSampleID.
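As a rough illustration of the "independent axis" point above, a single flat occurrence record could carry both identifiers at once. This is only a sketch: the identifier values and the helper function name are invented for the example and are not part of the proposal.

```python
# Illustrative sketch only: a flat, Darwin Core-style occurrence record.
# All identifier values below are made up for this example.
occurrence = {
    "occurrenceID": "occ-001",
    "basisOfRecord": "MaterialSample",
    "individualID": "tree-42",              # the organism, which may recur across events
    "materialSampleID": "sample-2013-05",   # the sampled matter behind this record
}


def has_both_identifiers(record: dict) -> bool:
    """Return True when a record carries both an individualID and a materialSampleID."""
    return bool(record.get("individualID")) and bool(record.get("materialSampleID"))


print(has_both_identifiers(occurrence))
```

A record with only one of the two identifiers is equally valid under this reading; the check simply shows that neither term excludes the other.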
We maintain our position on the proposal for MaterialSample as a value for
the basisOfRecord, with an associated materialSampleID to identify instances
of them. Per Steve's initial comments, we have already withdrawn the
proposal for a MaterialSample class distinct from that in the Darwin Core
type vocabulary, which should make it easier to evaluate the implications of
what we're discussing.
********************
NOTES, MaterialSample from OBI:
OBI has fairly broad definitions of samples & specimens that are meant to be
utilized across many different scientific activities. Material Sample is
defined as a material entity that has the material sample role, while a
material sample role is defined as a specimen role borne by a material
entity that is the output of a material sampling process, and a material
sampling process is a specimen gathering process with the objective to
obtain a specimen that is representative of the input material entity.
On Mon, May 27, 2013 at 11:59 PM, Richard Pyle <deepreef(a)bishopmuseum.org>
wrote:
Hi Markus,
Great question! Particularly because this is exactly the sort of use case
we designed our model around.
> if you take a tissue sample of the same tree every year, would the identifier
> in individualID be the same for all of them or be different? With the current
> dwc:individualID definition it would be the same for all samples. If I
> understand you correctly, each sample would have its own "individual"
> identifier in your proposal? I can't see how you can collapse these two things
> into one definition.
No, that is not how we would handle it.
In our model, there would be one IndividualID to represent the tree,
spanning the time period beginning (more or less) when the seed was
germinated, until the time at which the entire physical structure of the
tree was disintegrated. It is an individual tree.
There would be multiple Occurrence instances, for each time that someone
observed or sampled or otherwise wished to document some condition of that
tree. All of these Occurrence instances would refer to the same individualID
value (i.e., the "tree"). In the example above, this means there would be a
different Occurrence instance for each year that a sample is taken --
because in each case, an assertion that the full tree existed at a certain
time and place can be made (I understand that trees tend not to move around
very much, so the Location for each event associated with each Occurrence
would, in this case, remain the same; but the other Event properties -- such
as eventID, samplingProtocol, samplingEffort, eventDate, eventTime,
startDayOfYear, endDayOfYear, year, month, day, verbatimEventDate, habitat,
fieldNumber, fieldNotes, eventRemarks -- would be documented accordingly for
each sampling Occurrence instance).
Suppose that the tree is visited every month, but only sampled once per
year. In that case, there would be an Occurrence record for every monthly
visit. In other words, an Occurrence instance exists regardless of whether
a physical sample was made or not. Any in-situ images made of the tree
would likewise be associated with the specific Occurrence instance, and each
image would represent a separate instance of "Evidence".
Now, let's focus on the annual samplings. Every time a new sample is taken
from the tree, at least one new instance of Individual (with a unique
individualID value) is created to represent the sample. This sample
(individual instance) may be a "gathering" (set of multiple individual
specimens gathered at the same time), or it may be a single specimen, or it
may be simply a tissue sample intended for destructive analysis. In any
case, it's a new individual instance derived from the "parent" individual
instance representing the whole tree. In our implementation, "Individual"
can be hierarchical, such that a whole-organism tree can be sub-sampled with
many "child" instances of "gatherings" (say, one gathering each year), and
each gathering may have multiple child "specimen" individuals (e.g.
individual botanical sheets created from the multiple items of a single
gathering), and each specimen may have further "child" subsamples extracted
for DNA analysis (or whatever), and the hierarchy can continue on down to
whatever derivatives that people feel a need to keep track of (e.g.,
aliquot).
The point is, all Individual instances are well-defined physical objects (or
well-defined sets of physical objects), and they can be arranged in an
n-tiered hierarchy.
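The hierarchy described above can be sketched as a simple parent-pointer structure. This is a minimal sketch, not the actual implementation: the class layout, the "kind" labels, and the identifier values are assumptions made for illustration, and only the parentIndividualID idea comes from the description.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Individual:
    individual_id: str
    kind: str  # hypothetical label, e.g. "whole organism", "gathering", "specimen"
    parent_individual_id: Optional[str] = None  # None marks the top of the hierarchy


def lineage(individual_id: str, index: Dict[str, Individual]) -> List[str]:
    """Walk parent links from a derivative up to the top-level Individual."""
    chain = []
    current: Optional[str] = individual_id
    while current is not None:
        chain.append(current)
        current = index[current].parent_individual_id
    return chain


# Made-up records: a tree, one yearly gathering, one sheet, one DNA subsample.
records = [
    Individual("tree-1", "whole organism"),
    Individual("gathering-2013", "gathering", "tree-1"),
    Individual("sheet-A", "specimen", "gathering-2013"),
    Individual("dna-A1", "subsample", "sheet-A"),
]
index = {r.individual_id: r for r in records}

print(lineage("dna-A1", index))
```

Because each record only points at its parent, the hierarchy can be as deep as needed without changing the structure.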
Moreover, each Individual that can be characterized as a "sample" (what we
refer to as a "CollectionObject") may also have a property value for
"CollectionOccurrenceID" -- which refers to the specific Occurrence instance
at which the sample was obtained.
So, for example, if the tree is visited on May 27, 2013 and a specimen
(sample) is taken, then:
1) An Event instance is generated to represent the event where the tree was
visited;
2) An Occurrence instance is generated, which refers to the new EventID, and
the existing IndividualID for the whole tree, and includes whatever other
Occurrence properties are relevant for the tree at the time of this
Occurrence;
3) An Individual instance is generated for the specimen, which has a
property value for parentIndividualID that refers to the individualID for
the whole tree, and a property value for collectionOccurrenceID that refers
to the Occurrence instance where the specimen was collected.
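The three steps above can be sketched in code as follows. This is a hedged illustration only: the class definitions, the function name record_sampling_visit, and the identifier formats are assumptions for the example; only the links between the records (EventID, IndividualID, parentIndividualID, collectionOccurrenceID) follow the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    event_id: str
    event_date: str


@dataclass(frozen=True)
class Occurrence:
    occurrence_id: str
    event_id: str        # the visit during which the tree was documented
    individual_id: str   # the whole tree


@dataclass(frozen=True)
class Individual:
    individual_id: str
    parent_individual_id: Optional[str] = None
    collection_occurrence_id: Optional[str] = None


def record_sampling_visit(tree: Individual, event_date: str) -> Tuple[Event, Occurrence, Individual]:
    """Create the Event, Occurrence, and specimen Individual for one sampling visit."""
    # 1) An Event for the visit (identifier format is made up).
    event = Event(event_id=f"event:{event_date}", event_date=event_date)
    # 2) An Occurrence linking the new Event to the existing whole-tree Individual.
    occurrence = Occurrence(
        occurrence_id=f"occ:{tree.individual_id}:{event_date}",
        event_id=event.event_id,
        individual_id=tree.individual_id,
    )
    # 3) A new Individual for the specimen, pointing back at the tree and the Occurrence.
    specimen = Individual(
        individual_id=f"specimen:{tree.individual_id}:{event_date}",
        parent_individual_id=tree.individual_id,
        collection_occurrence_id=occurrence.occurrence_id,
    )
    return event, occurrence, specimen


tree = Individual(individual_id="tree-1")
event, occurrence, specimen = record_sampling_visit(tree, "2013-05-27")
print(specimen)
```

A visit without a sample would stop after step 2, which matches the point that an Occurrence exists whether or not anything was collected.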
So, to summarize the answer to your question:
- There are multiple Occurrence instances that refer to the same Individual
instance representing the whole tree (and hence can be collapsed to the
same IndividualID value).
- Any Individual can have derivatives that are themselves unique Individual
instances.
- Individuals are arranged hierarchically, and certain properties can be
inherited up or down the hierarchy, depending on the properties and their
associated logical constraints.
At some point, I will assemble a set of other specific use cases, and how we
manage them through our use of the "Individual" instance (although I will
probably not use the word "Individual", as this seems to cause too much
confusion in these discussions).
Aloha,
Rich
--
John Deck
(541) 321-0689
Re: [tdwg-content] New Darwin Core terms proposed relating to material samples
by Richard Pyle 29 May '13
Thanks, Ramona, this is an *extremely* helpful email! It helps clear things
up a lot in my mind.
Just to be clear, what I am looking for is the notion of a defined physical
object (what I think you mean by "material entity"), and I explicitly mean
the material entity itself. Yes, there is information (properties) that
relates to that material entity, but to me that is a separate issue. What I
would like to see clearly defined is the class representing the material
(physical) entity, which seems to me to be a superclass of what
materialSample is intended to represent.
Perhaps our (TDWG/DWC) community is not yet ready to deal with this level of
abstraction (unfortunately, I absolutely have to, because "Occurrence" is
simply way too overloaded a class for me to use independently of what I have
been calling "Individual" and what I have been calling "Evidence"). In that
case, I guess the best thing to do is accept materialSample as a
basisOfRecord for Occurrence and move on. But this is more or less the same
thing that happened the last time we engaged in this conversation (2 years
or so ago), and I was hoping this conversation about materialSample could
leverage progress on the larger issue.
As I've said before, the last thing I want to do is confuse or otherwise
slow down the process of incorporating the term materialSample into DWC.
It's just that I saw enough overlap with that other issue that I was
hoping we could find a reasonable pathway forward on both.
Thanks again for the very helpful comments.
Aloha,
Rich