[tdwg-content] New Darwin Core terms proposed relating to material samples

Richard Pyle deepreef at bishopmuseum.org
Sun May 26 03:31:31 CEST 2013


Hi All,

I'm on my way home from Berlin, and catching up on this thread.

I wanted to take this opportunity to describe what Rob Whitton and I have been working on over the past year or so, along these lines.

Basically, we've been running with the idea of an "Individual" class - as originally proposed by Steve and discussed at some length on this list a while ago.  This has been documented for DSW:
https://code.google.com/p/darwin-sw/wiki/ClassIndividual

We have dropped (or intend to drop) the "Organism" part, so it's just an "Individual" - because in our data model it doesn't need to be limited to an organism.

Anyway, the reason I mention this here is that we have found it to be a very powerful tool for tracking our data much more closely to "reality" than is achieved when force-fitting everything into the Occurrence class.  I also mention it because we are using it for the same function that the proposed MaterialSample/ID seems to be used for.

We define an "Individual" as the physical "something" that underpins an Occurrence.  In the case of organisms, this can be a group (herd, school, flock, etc.), a specimen (either a single specimen, or a lot of multiple specimens), or any sort of derivative of a specimen (part, tissue sample, DNA extraction, etc.).  It corresponds to the intended meaning of dwc:individualID (http://rs.tdwg.org/dwc/terms/index.htm#individualID), and also to the meaning of "CollectionObject" from the old ASC/MVZ data model (see also: https://www.idigbio.org/wiki/images/c/c9/Phase_I_Report.pdf).

So far, it seems to be a very stable and functional unit for tracking myriad kinds of biodiversity information.  It is linked from the Occurrence table, and is the thing to which taxon determinations are applied.  It's also the thing that represents museum collections objects.  The key is that it is hierarchical.  For example, there may be one instance of an "Individual" that is a school of fish observed on a reef, representing an occurrence.  That Individual instance may have a child "Lot" of, say, 5 specimens that were speared and preserved for a museum. Each of the five specimens might then be assigned its own Individual instance, as children of the "Lot".  Then, when one or more tissue samples are extracted from one or more of the specimens, those are represented as additional "Individual" instances that are children of the respective specimen Individual.

What's nice about the way we manage this is that there is inheritance up and down the hierarchy chain.  For example, if the "School" individual is associated with an Occurrence (observation/collection event), then all of its children can inherit this information.  Likewise, a taxonomic determination can be applied to any individual in the hierarchy, and that identification can be inherited up or down the hierarchy as appropriate (the meaning of "appropriate" is a bit involved, but I'd be happy to elaborate).
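
As a rough illustration of the hierarchy and of inheritance down the chain, here is a minimal Python sketch. The class name, field names, the `effective_occurrence` helper and the `occ-001` identifier are all assumptions made for this example; they are not the actual schema described in this message. Only Occurrence inheritance is shown, because the rules for inheriting taxonomic determinations are not spelled out here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class Individual:
    """Hypothetical node in an Individual hierarchy (illustrative names only)."""

    label: str
    kind: str  # e.g. "school", "lot", "specimen", "tissue"
    occurrence_id: Optional[str] = None  # set only where the link is recorded directly
    parent: Optional[Individual] = field(default=None, repr=False)
    children: list[Individual] = field(default_factory=list, repr=False)

    def add_child(self, label: str, kind: str) -> Individual:
        """Create a child Individual and attach it to this one."""
        child = Individual(label, kind, parent=self)
        self.children.append(child)
        return child

    def ancestors(self) -> Iterator[Individual]:
        """Yield the parent, grandparent, and so on up to the root."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def effective_occurrence(self) -> Optional[str]:
        """Return this node's own Occurrence link, else the nearest ancestor's."""
        if self.occurrence_id is not None:
            return self.occurrence_id
        for node in self.ancestors():
            if node.occurrence_id is not None:
                return node.occurrence_id
        return None


# The school -> lot -> specimens -> tissue example from the text above.
school = Individual("school of fish on a reef", "school", occurrence_id="occ-001")
lot = school.add_child("lot of 5 speared specimens", "lot")
specimens = [lot.add_child(f"specimen {i}", "specimen") for i in range(1, 6)]
tissue = specimens[0].add_child("tissue sample from specimen 1", "tissue")

print(tissue.effective_occurrence())  # prints "occ-001"
```

Only the school carries an Occurrence link in this sketch; the lot, the specimens and the tissue sample resolve it by walking up to the nearest ancestor that has one.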

I've been out of this discussion for a while, but I guess my main question/point is to ask whether there has been any progress towards incorporating the Darwin-SW concept of "Individual" (https://code.google.com/p/darwin-sw/wiki/ClassIndividual) into DwC, and if so, whether this might be a better way of managing materialSample/ID.

Aloha,
Rich

From: tdwg-content-bounces at lists.tdwg.org [mailto:tdwg-content-bounces at lists.tdwg.org] On Behalf Of Steve Baskauf
Sent: Saturday, May 25, 2013 1:08 AM
To: John Deck
Cc: TDWG Content Mailing List
Subject: Re: [tdwg-content] New Darwin Core terms proposed relating to material samples

John and John,

OK, great!  I think this all makes sense to me now.  I think that in the wiki discussion you might eventually make the comment that although materialSampleID is conveniently grouped with other terms under the Occurrence class, a material sample does not have to be a sample from a living organism which is documented in an Occurrence.  It could be something that may or may not be known to contain one or more living organisms (e.g. water samples), part of one or more organisms (e.g. tissue samples), or even no known living organisms (e.g. rock samples).  Well anyway, you should say that if it is true.  I think that was the intention when the term was being discussed.  You can confirm whether I have this right or not.

Steve

John Deck wrote:

Steve,

Thanks for your comments.  Responding to both of your emails here.


We've removed the class and now have just the MaterialSample dwctype and a materialSampleID property.  dwctype:MaterialSample refines http://purl.obolibrary.org/obo/OBI_0000747.  Also, we've updated materialSampleID to be a new term in the dwc/terms namespace instead of referencing the MIxS namespace.  In our original proposal, we suggested using the MIxS RDF namespace for this property; however, the GSC did not make MIxS-as-RDF a standard, as decided recently at GSC15, so we've chosen not to use that term (by convention) and instead propose creating our own materialSampleID property in the dwc/terms namespace.  (A side note: the GSC is still very much interested in MIxS as RDF, and we'll continue to maintain and implement https://code.google.com/p/mixs-as-rdf/ in conjunction with the MIxS developers.)


Modification to proposed terms:


Term Name: MaterialSample

Identifier: http://rs.tdwg.org/dwc/dwctype/MaterialSample

Namespace: http://rs.tdwg.org/dwc/dwctype/

Label: Material Sample

Definition: The category of information pertaining to the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed.

Comment: For discussion see http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary (there will be no further documentation here until the term is ratified)

Type of Term: http://www.w3.org/2000/01/rdf-schema#Class

Refines: http://purl.obolibrary.org/obo/OBI_0000747

Status: proposed

Date Issued: 2013-03-28

Date Modified: 2013-05-25

Has Domain:

Has Range:

Refines:

Version: MaterialSample-2013-05-25

Replaces:

IsReplaceBy:

Class:

ABCD 2.0.6: not in ABCD (someone please confirm or deny this)


Term Name: materialSampleID

Identifier: http://rs.tdwg.org/dwc/terms/materialSampleID

Namespace: http://rs.tdwg.org/dwc/terms/

Label: Material Sample ID

Definition: An identifier for the MaterialSample (as opposed to a particular digital record of the material sample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the materialSampleID globally unique.

Comment: For discussion see http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not exist until the term is ratified).

Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property

Refines: http://purl.org/dc/terms/identifier

Status: proposed

Date Issued: 2013-03-28

Date Modified: 2013-05-25

Has Domain:

Has Range:

Version: materialSampleID-2013-05-25

Replaces:

IsReplaceBy:

Class: http://rs.tdwg.org/dwc/terms/Occurrence

ABCD 2.0.6: not in ABCD (someone please confirm or deny this)
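
The materialSampleID definition above says that, absent a persistent globally unique identifier, one should be constructed from a combination of identifiers in the record. A minimal Python sketch of that fallback follows. The choice of institutionCode, collectionCode and catalogNumber as the fields to combine, the colon separator, the function name and the sample values are assumptions for illustration; the proposal does not prescribe any of them.

```python
from typing import Mapping, Optional

# Assumed fallback fields; the proposal does not say which identifiers to combine.
FALLBACK_FIELDS = ("institutionCode", "collectionCode", "catalogNumber")


def material_sample_id(record: Mapping[str, str]) -> Optional[str]:
    """Return an existing materialSampleID, or build one from fallback fields.

    Returns None when neither an existing identifier nor a complete set of
    fallback fields is available, rather than emitting a partial identifier.
    """
    existing = record.get("materialSampleID", "").strip()
    if existing:
        return existing

    parts = [record.get(name, "").strip() for name in FALLBACK_FIELDS]
    if not all(parts):
        return None
    return ":".join(parts)


# Made-up record values, for illustration only.
record = {
    "institutionCode": "XYZ",
    "collectionCode": "Fish",
    "catalogNumber": "12345",
}
print(material_sample_id(record))  # prints "XYZ:Fish:12345"
```

An identifier built this way is only as unique as the codes it is built from, which is why the definition treats it as a substitute for a persistent global identifier and not as an equivalent.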

John D. and John W.

On Mon, May 20, 2013 at 7:39 PM, Steve Baskauf <steve.baskauf at vanderbilt.edu> wrote:
Here is the second question.  The proposal proposes a new MaterialSample class
http://rs.tdwg.org/dwc/terms/MaterialSample
which is in the "main" (dwc: = http://rs.tdwg.org/dwc/terms/) namespace.  I guess my question is why we need this class.  I can definitely see a rationale for a new class defined as part of the DwC type vocabulary (i.e. dwctype:MaterialSample).  It would be used to type resources that are material samples.  But the class terms in the main (dwc:) namespace are used as a convenient way to group DwC property terms that might reasonably be used with instances of that class.  However, there really aren't any such terms.

We already have a convention in which not every type vocabulary class term has a corresponding class term in the main (dwc:) namespace.  There are dwctype:PreservedSpecimen, dwctype:LivingSpecimen, dwctype:FossilSpecimen, dwctype:HumanObservation, dwctype:MachineObservation, and dwctype:NomenclaturalChecklist, none of which have dwc: namespace analogues.  So why does MaterialSample need a dwc: namespace analog?

Steve


John Wieczorek wrote:
Dear all,

TDWG could see a lot of activity in 2013 in anticipation of the meeting in Florence in October. Much of the activity is related to enabling integration across multiple parts of our domain. We have the Audubon Core under review for biodiversity-related media and an impending RDF Guide to supplement the already extant Text and XML Guides for Darwin Core.

This message is to bring your attention to another integrative initiative, to introduce terms into Darwin Core that will form a nexus between Occurrences and the interesting things that happen with physical materials that result from them, such as, but not limited to, genetic sequencing. A series of meetings over a little more than the past year has inspired our colleagues in the Genomic Standards Consortium (GSC) to propose to their constituency to align their terms with Darwin Core, including adopting some of the Darwin Core terms in place of their own that have the same meaning. Out of these discussions has come the realization that neither community has terms to accommodate the concept of an identifiable (objectively, not taxonomically), trackable material sample.  This message constitutes such a proposal.

This proposal would have no impact on those publishing purely taxonomic data. It would also have no impact on those publishing occurrence data unless they want to increase their capacity to distinguish material samples from organisms more rigorously than is now possible using only the dwc:preparations term.

The initial request for new terms can be found in the Darwin Core Issue tracker as http://code.google.com/p/darwincore/issues/detail?id=167. Below I have elaborated and formalized the request into the three distinct terms under consideration, initiating the 30-day minimum public review process to seek consensus on their inclusion in the Darwin Core standard. Your job, should you choose to accept it, is to discuss the merits or any perceived problems in the inclusion of these three terms in Darwin Core.

Below I will give the proposed properties of three terms as they would appear in the Darwin Core Quick Reference Guide, though these properties would be included in the RDF of the normative form of the documentation.

A new MaterialSample class: This is for the purpose of organizing properties, just as the existing classes (Occurrence, Event, Location, GeologicalContext, Identification, Taxon, etc.) do, without having any terms declare this class as their domain.

Term Name: MaterialSample
Identifier: http://rs.tdwg.org/dwc/terms/MaterialSample
Namespace: http://rs.tdwg.org/dwc/terms
Label: Material Sample
Definition: The category of information pertaining to the physical results of a sampling (or subsampling) event. In biological collections, the material sample is typically collected, and either preserved or destructively processed, with the intention of being representative of a greater whole.
Comment: For discussion see http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not exist until the term is ratified).
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines:
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-04-08
Has Domain:
Has Range:
Version: MaterialSample-2013-03-28
Replaces:
IsReplaceBy:
Class:
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)

A Darwin Core Type Vocabulary value for basisOfRecord is needed to represent this new class of information. Luckily, a term already exists in the Ontology for Biomedical Investigations (http://www.ontobee.org/browser/rdf.php?o=OBI&iri=http://purl.obolibrary.org/obo/OBI_0000747). We and the GSC both propose to reuse this class within Darwin Core as below, making it the cross-over point between the two domains.

Term Name: MaterialSample
Identifier: http://rs.tdwg.org/dwc/terms/MaterialSample
Namespace: http://purl.obolibrary.org/obo/OBI_0000747
Label: material sample
Definition: A material entity that has the material sample role
Comment: For discussion see http://code.google.com/p/darwincore/wiki/DwCTypeVocabulary (there will be no further documentation here until the term is ratified)
Type of Term: http://www.w3.org/2000/01/rdf-schema#Class
Refines: http://purl.obolibrary.org/obo/OBI_0100051
Status: recommended
Date Issued: 2013-03-28
Date Modified: 2013-03-28
Has Domain:
Has Range:
Version: MaterialSample-2013-03-28
Replaces:
IsReplaceBy:
Class:
ABCD 2.0.6: not in ABCD

In keeping with all other classes in Darwin Core, the Material Sample class would have a corresponding identifier property. The Genomic Standards Consortium (GSC) is in the process of proposing this term. If it is accepted, we propose to use it, and its properties would be as below; otherwise, the properties would be the same, but have the Darwin Core namespace and identifier URI.

Term Name: materialSampleID
Identifier: http://gensc.org/ns/mixs/materialSampleID
Namespace: http://gensc.org/ns/mixs
Label: Material Sample ID
Definition: An identifier for the MaterialSample (as opposed to a particular digital record of the material sample). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the materialSampleID globally unique.
Comment: For discussion see http://code.google.com/p/darwincore/wiki/MaterialSample (this page will not exist until the term is ratified).
Type of Term: http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
Refines: http://purl.org/dc/terms/identifier
Status: proposed
Date Issued: 2013-03-28
Date Modified: 2013-04-08
Has Domain:
Has Range:
Version: materialSampleID-2013-03-28
Replaces:
IsReplaceBy:
Class: http://purl.obolibrary.org/obo/OBI_0000747
ABCD 2.0.6: not in ABCD (someone please confirm or deny this)


--

Steven J. Baskauf, Ph.D., Senior Lecturer

Vanderbilt University Dept. of Biological Sciences



postal mail address:

PMB 351634

Nashville, TN  37235-1634,  U.S.A.



delivery address:

2125 Stevenson Center

1161 21st Ave., S.

Nashville, TN 37235



office: 2128 Stevenson Center

phone: (615) 343-4582,  fax: (615) 322-4942

If you fax, please phone or email so that I will know to look for it.

http://bioimages.vanderbilt.edu



_______________________________________________
tdwg-content mailing list
tdwg-content at lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content



--
John Deck
(541) 321-0689



