John,
I would like to suggest that somewhere in your revised proposal you state explicitly that "MaterialSample" be used as the controlled string value when a literal value is given for dwc:basisOfRecord (see http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord). The current convention for other type vocabulary values has been to use the last part of the URI (e.g. "HumanObservation" for http://rs.tdwg.org/dwc/dwctype/HumanObservation and "Event" for http://purl.org/dc/dcmitype/Event). If that pattern were followed, some users might provide "OBI_0100051" as the string and based on what you wrote below, I don't think that is what you intend.
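To illustrate the concern, here is a minimal Python sketch of how a "last part of the URI" rule behaves and how an explicit mapping would avoid the problem. The function name and the override table are hypothetical and are not part of Darwin Core; the URIs are the ones mentioned above.

```python
from urllib.parse import urlparse

# Hypothetical override table: URIs whose last segment is not the intended literal.
LITERAL_OVERRIDES = {
    "http://purl.obolibrary.org/obo/OBI_0100051": "MaterialSample",
}


def basis_of_record_literal(type_uri: str) -> str:
    """Return the controlled string value to use for a type URI."""
    if type_uri in LITERAL_OVERRIDES:
        return LITERAL_OVERRIDES[type_uri]
    # Existing convention: use the last path segment of the URI.
    path = urlparse(type_uri).path
    return path.rstrip("/").rsplit("/", 1)[-1]
```

Without the override entry, the OBI URI would yield "OBI_0100051", which is the string I suspect some users would otherwise supply.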
Steve
John Deck wrote:
NOTE:
Following discussion on the TDWG content list, discussions with OBI developers, and discussions with MIxS developers, we have modified the original Material Sample Term Proposal in the following document.
The original document sent to the Darwin Core community for Material Sample proposed the OBI term Material Sample (http://purl.obolibrary.org/obo/OBI_0000747). We are now modifying the proposal to reference OBI Specimen (http://purl.obolibrary.org/obo/OBI_0100051) instead. We made this change after it came to light that OBI intends a Material Sample to be one that “should be representative of the whole from which it is sampled”. Since this definition places an unnecessary constraint on its use, we have opted to use OBI Specimen instead, which is more closely aligned with our original intent. In addition to this change, we have also modified the proposal based on the feedback from this list.
-John Deck, Rob Guralnick, Ramona Walls
New Term Request: Material Sample
This is a proposal for two new terms in Darwin Core, relating to the addition of the concept, “Material Sample”, described by the identifier http://purl.obolibrary.org/obo/OBI_0100051. The two terms are:
- A new BasisOfRecord term MaterialSample with label “Material Sample” that references http://purl.obolibrary.org/obo/OBI_0100051
- A new Darwin Core property term, MaterialSampleID.
Submitters: John Deck, Rob Guralnick and Ramona Walls
Justification
The current values in the DwC Type Vocabulary (http://rs.tdwg.org/dwc/dwctype/) do well in representing some types of biocollections and observations. However, the more general notion of a sample is not well represented, because the existing terms are too specific. For example, the DwC terms “Preserved Specimen”, “Fossil Specimen”, and “Living Specimen” are appropriate for use in the museum community but assume particular properties pertaining to museum collections, which “material samples” may or may not have. Examples of “material samples” we are considering (beyond the examples above) come from surveys that involve soil and water sampling, bulk sampling of specimens (e.g. from trawls), microbiological sampling, and metagenomics. These sampling approaches often rely on field sub-sampling processes and laboratory techniques (e.g., DNA extraction and sequencing) which transform the physical material and produce distinct information content, and thus represent a type of information that is distinct from what DwC has typically dealt with. We propose adding “Material Sample” as a DwC class in order to maintain consistency with the way Darwin Core terms are managed and organized. This term comes from the Ontology for Biomedical Investigations (OBI) class OBI:specimen. We use the class concept definition directly from OBI but provide the more familiar label “Material Sample” for use within the biodiversity community and annotate how that definition applies in the domain of biological collections.
A “material sample” can pertain to general matter in which organisms may exist, in whole, in part, or in conjunction with many other organisms. The “material sample” may exist for a brief period, such as a tissue that is converted to extracted DNA. It may also represent a collection of multiple taxa, such as a soil or water sample that is used with the intention of describing the diversity of organisms, whether the actual organisms are later recovered from such a sample, or whether that sample is processed in order to generate a set of derivatives from organisms (e.g. 16S sequences from a metagenomics run). A “material sample” may also yield connections to other indicators of biodiversity aside from taxa, such as a transcriptome, indicating which DNA is actively being expressed at a particular point in time.
For the purposes of biological collections, we can think of “material sample” as any type of matter that we can use in order to derive further evidence needed for identification of taxa, whether it is taxonomically homogeneous, heterogeneous, a single individual, sets of individuals, or populations. However, the definition of the term does not exclude its use in broader contexts outside the scope of biological collections.
How is the term “Material Sample” different from “Individual”? The intent of individualID is fairly clear: since an Occurrence represents an organism at a place and time, the individualID term allows us to assign an instance identifier for a particular organism that can be present at multiple events. MaterialSampleID, on the other hand, is intended to allow users to say that the basis of an occurrence is a material entity (i.e. matter) that has been sampled according to some particular method. Whether or not this material entity is an individual (sensu individualID in DwC) represents an independent axis of classification. There is no restriction on specifying that an occurrence is associated with more than one type, so any occurrence can have both an individualID and a materialSampleID.
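As a rough illustration of the two identifiers acting as independent axes, consider the following Python sketch. All records and identifier values are invented for this example, and the helper function is hypothetical; only the term names individualID, materialSampleID, occurrenceID and basisOfRecord come from Darwin Core or this proposal.

```python
from collections import defaultdict

# Hypothetical occurrence records; every identifier value here is invented.
occurrences = [
    # The same organism sampled at two different events.
    {"occurrenceID": "occ-1", "basisOfRecord": "MaterialSample",
     "individualID": "ind-42", "materialSampleID": "blood-001"},
    {"occurrenceID": "occ-2", "basisOfRecord": "MaterialSample",
     "individualID": "ind-42", "materialSampleID": "feather-002"},
    # A bulk soil sample: a material sample with no single individual.
    {"occurrenceID": "occ-3", "basisOfRecord": "MaterialSample",
     "materialSampleID": "soil-007"},
]


def samples_by_individual(records):
    """Group materialSampleID values by individualID (None when absent)."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.get("individualID")].append(record["materialSampleID"])
    return dict(grouped)
```

Under these assumptions, the first two records share an individualID while carrying distinct materialSampleID values, and the third has a materialSampleID but no individualID at all.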
Adding this term will help align DwC to two other significant projects: the Ontology for Biomedical Investigations (OBI), from which we will be adapting this term, and the MIxS family of checklists.
The MIxS vocabulary is proposing to adopt MaterialSampleID by clarifying the existing term source_mat_id to read:
“A unique identifier assigned to a material sample (as defined by http://rs.tdwg.org/dwc/terms/MaterialSampleID, and as opposed to a particular digital record of a material sample) used for extracting nucleic acids, and subsequent sequencing. The identifier can refer either to the original material collected or to any derived sub-samples. The INSDC qualifiers /specimen_voucher, /bio_material, or /culture_collection provide additional context and suggested syntax for this identifier for data submitted to INSDC databases.”
The MIxS source_mat_id term clarification proposal is pending based on the outcome of this proposal.
Connecting a DwC Record to a MIxS record would have the advantage of aligning DwC terminology (geospatial, taxonomic) with sequencing terminology (investigation, environment, nucleic acid sequence source, sequencing) and with OBI (investigation, roles, processes), using “Material Sample” as the pivot point between the standards.
Definition:
From OBI (http://purl.obolibrary.org/obo/OBI_0100051): “A material entity that has the specimen role.”
A specimen role (http://purl.obolibrary.org/obo/OBI_0000112) in OBI is defined as “a role borne by a material entity that is gained during a specimen creation process and that can be realized by use of the specimen in an investigation”. The operative word is “can”. That is, the specimen is not required to be realized by use in an investigation. However, it is worth noting that deposition into a museum or biobank can fulfill the criteria of “use in an investigation”, if necessary (for discussion, see http://sourceforge.net/p/obi/obi-terms/677/).
We have chosen to use the label “Material Sample” instead of using the OBI label “Specimen” for this definition. This allows us to distinguish this term from other types containing the word “Specimen” currently in use in the Darwin Core vocabulary, which have their own meaning, distinct from the concept we are proposing. In the natural history community, biological specimens have a colloquial meaning, typically referring to a voucher held by a biorepository for research. We intend a more inclusive definition, and thus, when we refer to “DwC Material Sample” here, we are actually referring to the class of entities defined by “OBI Specimen”.
In order to clarify how this definition may be considered in a biological collections context, we wish to include a http://www.w3.org/2000/01/rdf-schema#comment annotation within the DwC vocabulary which would read: “In biological collections, the material sample is typically collected, and either preserved, transformed by some process, or destructively processed”. Further clarification on the use of this term, including this document, would be provided in the supplementary documentation and the Darwin Core wiki.
Comment: N/A
Refines: N/A
Has Domain: N/A
Has Range: N/A
Replaces: N/A
Summary:
Term Name: MaterialSample
Identifier: http://rs.tdwg.org/dwc/dwctype/MaterialSample
Namespace: http://rs.tdwg.org/dwc/dwctype/
Label: Material Sample
Definition: A resource describing the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed.
Comment: For discussion see http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary (there will be no further documentation here until the term is ratified)
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines: http://purl.obolibrary.org/obo/OBI_0100051
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-05-25
Has Domain:
Has Range:
Refines:
Version: MaterialSample-2013-06-24
Replaces:
IsReplacedBy:
Class:
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
Term Name: materialSampleID
Identifier: http://rs.tdwg.org/dwc/terms/MaterialSampleID
Namespace: http://rs.tdwg.org/dwc/terms/
Label: Material Sample ID
Definition: An identifier for the MaterialSample (as opposed to a particular digital record of the material sample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the materialSampleID globally unique.
Comment: For discussion see http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not exist until the term is ratified).
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines: http://purl.org/dc/terms/identifier
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-05-25
Has Domain:
Has Range:
Version: materialSampleID-2013-05-25
Replaces:
IsReplacedBy:
Class: http://rs.tdwg.org/dwc/terms/Occurrence
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
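The materialSampleID definition above suggests constructing an identifier from other identifiers in the record when no persistent globally unique identifier exists. A minimal Python sketch of one way to do this follows. The choice of fallback fields (institutionCode, collectionCode, catalogNumber), the ":" separator, and the function name are assumptions made for illustration; the proposal does not prescribe them.

```python
# Assumed fallback fields; the proposal does not say which identifiers to combine.
FALLBACK_FIELDS = ("institutionCode", "collectionCode", "catalogNumber")


def build_material_sample_id(record: dict) -> str:
    """Return the record's materialSampleID, constructing one if it is absent."""
    existing = (record.get("materialSampleID") or "").strip()
    if existing:
        # Prefer an identifier that was already assigned to the sample.
        return existing
    parts = [(record.get(field) or "").strip() for field in FALLBACK_FIELDS]
    if not all(parts):
        raise ValueError("record lacks the fields needed to construct an identifier")
    # Join the parts with a separator chosen for this sketch only.
    return ":".join(parts)
```

For example, a record with institutionCode "MVZ", collectionCode "Herp" and catalogNumber "12345" (invented values) would receive the identifier "MVZ:Herp:12345", while a record that already carries a materialSampleID keeps it unchanged.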