[tdwg-content] New Darwin Core terms proposed relating to material samples

John Deck jdeck at berkeley.edu
Fri May 31 21:16:54 CEST 2013


Since it was a gut sample, we'll be seeing lots of stuff in there, maybe
even 1000 different taxa.  Each one can be a distinct occurrence:
JohnDeckOccurrence124 and JohnDeckOccurrence125 are separate occurrences
because they are different taxa.
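
As a rough sketch of how that plays out with the three example records
quoted further down (Python; the dictionary keys simply mirror the field
names in that example, and the group_by helper is a hypothetical
illustration, not part of any DwC tooling):

```python
from collections import defaultdict

# The three example records from the quoted message, as flat rows.
records = [
    {"IndividualID": "JohnDeck", "MaterialSampleID": "JohnDeckTissueSample1",
     "OccurrenceID": "JohnDeckOccurrence123", "Taxon": "Homo sapiens"},
    {"IndividualID": "JohnDeck", "MaterialSampleID": "JohnDeckGutSample1",
     "OccurrenceID": "JohnDeckOccurrence124", "Taxon": "Bacteria500"},
    {"IndividualID": "JohnDeck", "MaterialSampleID": "JohnDeckGutSample1",
     "OccurrenceID": "JohnDeckOccurrence125", "Taxon": "Bacteria501"},
]

def group_by(rows, key):
    """Collect OccurrenceIDs under each distinct value of `key` (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["OccurrenceID"])
    return dict(groups)

# One gut sample yields one occurrence per taxon found in it.
by_sample = group_by(records, "MaterialSampleID")
# All three occurrences still aggregate around the same individual.
by_individual = group_by(records, "IndividualID")
```

A gut sample with 1000 taxa would simply add 1000 rows sharing the same
MaterialSampleID and IndividualID.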

Now, in the sense of Event/location, I'm taking the location details to be
those of when it was isolated from nature, and thus the location would be
the same as the whole-organism location.  In our recent workshop on this
same issue in Copenhagen we went around about this for a while but decided
that we should take "location" to mean the location at which whatever
parent organism was isolated in nature.  Certainly the location where the
gut contents were extracted (e.g. in the lab) is important too, but that is
something different and not represented by dc:location or in dwc in
general.  This is something of an interpretation of the actual term, but
since our implementation model of DwCA is still using "occurrence" at the
core we probably don't have much choice, since GBIF does not use a
graph-based parser.  The other option is to represent this all using
event/location at the core of a DwCA, but we felt this introduced yet more
complexities and breaks other cases where we would want to hang data off of
occurrence (e.g. Identification), which we would not be able to do if
event/location lived at the core.
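
To make the occurrence-at-the-core reading concrete, here is a minimal
sketch (Python).  It assumes the location of every occurrence row is copied
from the parent organism's isolation from nature; the coordinates are
placeholders, not real data, and with_parent_location is a hypothetical
helper rather than anything in DwC or GBIF tooling:

```python
# Where each parent organism was isolated from nature.
# Placeholder coordinates only, purely for illustration.
organism_location = {
    "JohnDeck": {"decimalLatitude": 0.0, "decimalLongitude": 0.0},
}

# Occurrence rows as they would sit at the core of the archive.
occurrences = [
    {"OccurrenceID": "JohnDeckOccurrence123", "IndividualID": "JohnDeck"},
    {"OccurrenceID": "JohnDeckOccurrence124", "IndividualID": "JohnDeck"},
]

def with_parent_location(occurrence, locations):
    """Return a copy of the row carrying the parent organism's location."""
    row = dict(occurrence)
    row.update(locations[occurrence["IndividualID"]])
    return row

core_rows = [with_parent_location(o, organism_location) for o in occurrences]
```

The tissue-sample and gut-sample occurrences end up with identical
coordinates; where the gut contents were extracted in the lab would have to
be recorded somewhere else.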

John


On Fri, May 31, 2013 at 12:02 PM, Richard Pyle <deepreef at bishopmuseum.org> wrote:

> Thanks, John – this is REALLY helpful!
>
> A couple questions – can you expand a bit on the differences between
> JohnDeckOccurrence123, 124, and 125?  I’m assuming that
> JohnDeckOccurrence123 is associated with the Event representing the time &
> place when JohnDeckTissueSample1 was removed from JohnDeck.  I’m guessing
> that JohnDeckOccurrence124 is associated with the Event representing the
> time & place when JohnDeckGutSample1 was removed from JohnDeck.  What I
> don’t understand is why there needs to be a JohnDeckOccurrence125.  What
> Occurrence does that represent?  Later you suggest that JohnDeck
> (WholeOrganism) was extracted from nature. Is the extraction-from-nature
> Occurrence one of these three Occurrences?
>
> What you describe below is consistent with our approach to treating
> materialSample as a subclass of Individual (assuming a hierarchical
> Individual, which means that ParentIndividualID of both
> JohnDeckTissueSample1 and JohnDeckGutSample1 is IndividualID=JohnDeck).
> The nice thing about the hierarchical approach is that it deals with the
> problem you describe in the last paragraph.
>
> Rich
>
> *From:* jdeck88 at gmail.com [mailto:jdeck88 at gmail.com] *On Behalf Of *John
> Deck
> *Sent:* Friday, May 31, 2013 7:54 AM
> *To:* Richard Pyle
> *Cc:* Markus Döring; Jason Holmberg; TDWG Content Mailing List; Robert
> Whitton; Ramona Walls
> *Subject:* Re: [tdwg-content] New Darwin Core terms proposed relating to
> material samples
>
> Yep--- that reference point for aggregation can be really powerful:  To
> provide a working example of how these identifiers would work, and how they
> can act to aggregate data elements, consider the following:
>
> IndividualID = JohnDeck
> MaterialSampleID = JohnDeckTissueSample1
> OccurrenceID = JohnDeckOccurrence123
> Taxon = "Homo sapiens"
>
> IndividualID = JohnDeck
> MaterialSampleID = JohnDeckGutSample1
> OccurrenceID = JohnDeckOccurrence124
> Taxon = "Bacteria500"
>
> IndividualID = JohnDeck
> MaterialSampleID = JohnDeckGutSample1
> OccurrenceID = JohnDeckOccurrence125
> Taxon = "Bacteria501"
>
> JohnDeckTissueSample1 is representative of the Individual itself, while
> JohnDeckGutSample1 is still associated with the same Individual but notice
> the taxon has changed and it is a new Occurrence as well.  This approach
> allows for some sense to be constructed using a flat file approach if
> desired.  Providing a Material Sample BoR for OccurrenceID's 124 and 125
> provides further context.  Meanwhile, we can consider the implications of,
> for example, habitat descriptions (... for JohnDeckOccurrence123 maybe I'd
> put http://purl.obolibrary.org/obo/ENVO_01000193, "temperate grassland
> biome"),  but the distinct occurrence records for the gut samples could be
> listed as (http://purl.obolibrary.org/obo/ENVO_01000162, "organ").
>
> Another use for the identifier MaterialSampleID -- let's assume we've
> expressed an equivalent identifier for a genbank sample using
> MIxS:source_mat_id, a term which references the same OBI:MaterialSample
> we're referencing.  If they're URIs we can model this in RDF
> using the MaterialSampleID's as either subjects or objects... this gets us
> a step closer to representing contextual information in genbank and DwC
> without duplicating metadata across systems (genbank for sequencing
> metadata; DwC for environmental context).
>
> There are some issues with this approach of course, for example, if we
> provide a lat/lng for an occurrence that is a gut sample, are we taking the
> lat/lng where the gut sample was removed from the organism (may be
> different than where a parent organism was isolated from nature)?  In this
> case, we need to assume that we're referring to where the parent organism
> was isolated from nature to be consistent with DwC and implementations in
> use.  However, the notion of habitat should vary with the occurrence of the
> actual organism (e.g. "organ" vs. "temperate grassland biome").  Thus, we
> can still aggregate properties around MaterialSample BoR's that are useful
> but we need to think carefully about what exactly the properties mean that
> we assign to these things.... but this is no different than issues we've
> encountered between other BoR's (Fossil, PreservedSpecimen, or
> Human/MachineObservation).
>
> John
>
> On Thu, May 30, 2013 at 11:48 PM, Richard Pyle <deepreef at bishopmuseum.org>
> wrote:
>
> Yes, that’s a fair point!  In a sense, the ID has intrinsic value on its
> own if for no other reason than to represent a reference point for
> aggregation.
>
> Nevertheless, I still maintain that if it fulfills that purpose, then it
> implies a “thing” (around which other “things” are aggregated), and I can’t
> imagine such a “thing” that we would care about for aggregating purposes,
> about which we would not associate other property values.
>
> I say all this quite deliberately in reference to “dwc:individualID”, of
> course…. :-)
>
> Aloha,
> Rich
>
> *From:* Markus Döring [mailto:m.doering at mac.com]
> *Sent:* Thursday, May 30, 2013 7:56 PM
> *To:* Jason Holmberg
> *Cc:* Richard Pyle; TDWG Content Mailing List; Robert Whitton; John Deck;
> Ramona Walls
> *Subject:* Re: [tdwg-content] New Darwin Core terms proposed relating to
> material samples
>
> The id value is actually very useful and the only trustworthy way of
> grouping records, e.g. all occurrences of the same whale.
>
> Markus
>
>
> _______________________________________________
> tdwg-content mailing list
> tdwg-content at lists.tdwg.org
> http://lists.tdwg.org/mailman/listinfo/tdwg-content
>
> --
> John Deck
> (541) 321-0689


-- 
John Deck
(541) 321-0689