Hi again Douglas,

I have spent some time going through the fern-specimen-dwc-dsw.ttl file that you attached to your message.  I made some modifications to it so that use of the Darwin-SW object properties would be consistent with the Darwin-SW model and avoid the nasty entailments that render graphs inconsistent.  The modified file is attached as fern-specimen-dwc-dsw-sjb.ttl.  I also posted it as a Gist in case the attachment doesn't survive the ListServ.  It's at https://gist.github.com/baskaufs/fa6dd16c53a66895d881034f98447ad3

The major change in the graph structure is adding the blank node for the dwc:Organism instance.  You will notice right away that this instance has no properties other than an rdf:type.  From the standpoint of "flat" records, that might seem useless.  But in the context of a graph, the organism instance is a node to which zero to many identifications (a.k.a. determinations), zero to many occurrences, and zero to many derived resources can be linked.  In your example there's only one identification and one occurrence, but as I said in my previous email, there might be other providers for which it would be critical to be able to link more than one occurrence to the same organism, or to assign several determinations to the organism.  In the case of your metadata, there are two resources that are derived from the organism: the preserved specimen (<http://collections-api.boh.tepapa.govt.nz/object/759415>) and the still image (<http://collections-api.boh.tepapa.govt.nz/media/208762>).  In your original metadata, the image was linked to the specimen record via dwc:associatedMedia.  There is nothing wrong with letting people know that they are associated, but how would one know whether the image was an image of the preserved specimen, or an image of the organism itself?  I edited the file to say this:

<http://collections-api.boh.tepapa.govt.nz/object/759415> dsw:derivedFrom _:autos4.
<http://collections-api.boh.tepapa.govt.nz/media/208762> dsw:derivedFrom _:autos4.


i.e. both the specimen and image were derived from the organism (blank node _:autos4) and the image is a picture of the organism.  If the image were a picture of the specimen instead of the live organism, the RDF/Turtle could be:

<http://collections-api.boh.tepapa.govt.nz/object/759415> dsw:derivedFrom _:autos4.
<http://collections-api.boh.tepapa.govt.nz/media/208762> dsw:derivedFrom
<http://collections-api.boh.tepapa.govt.nz/object/759415>.

The other major thing I did was to move the dwc: taxon-related convenience properties from the dwc:Taxon instance to the dwc:Identification instance.  I'll say more about that in the technical details below, since if I put it here, readers skimming this email would zone out before I got to the fun part. 

I should also note that in the same way that I dodged minting an identifier for the Organism instance by using a blank node, one could use blank nodes for the Event and Occurrence instances as well, rather than using http://collections-api.boh.tepapa.govt.nz/event/yyyy and http://collections-api.boh.tepapa.govt.nz/event/xxx .  The main reason why one might need to have some kind of identifier for those two resources would be if someone wanted to link to them from the outside.  If no one is going to do that, then it doesn't really matter if the nodes are anonymous or if they have assigned URIs.  In the case of the Occurrence instance, if the record has gone to GBIF, one could use the GBIF-minted occurrence ID rather than minting one's own.  Re-using another's identifier would require a degree of coordination, but in theory that's what we should be doing in Linked Data anyway, and having a link to a GBIF resource wouldn't hurt.
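
For what it's worth, a rough sketch of the all-blank-node variant might look like the following.  The labels _:occ1 and _:ev1 are made up here purely for illustration (they are not in the attached file); _:autos4 is the Organism node mentioned above.  To re-use an externally minted identifier, one would simply replace _:occ1 with that IRI.

@prefix dwc: <http://rs.tdwg.org/dwc/terms/> .
@prefix dsw: <http://purl.org/dsw/> .

# Anonymous Occurrence node joining the Organism to the Event
_:occ1 a dwc:Occurrence ;
    dsw:occurrenceOf _:autos4 ;
    dsw:atEvent _:ev1 .

# Anonymous Event node; nobody outside this graph can link to it
_:ev1 a dwc:Event .

# The specimen still has a real IRI and serves as evidence for the Occurrence
<http://collections-api.boh.tepapa.govt.nz/object/759415> dsw:evidenceFor _:occ1 .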

As I said in my previous email, if providers use the Darwin-SW object properties to link in accordance with the Darwin-SW model, one can then basically throw the triples in a pot and query away.  I actually did that by uploading the modified Turtle file fern-specimen-dwc-dsw-sjb.ttl into the Vanderbilt library triplestore along with the Bioimages triples.  You can now query the merged graph at the SPARQL endpoint:

http://rdf.library.vanderbilt.edu/sparql?view

For example, if you paste this query into the box:

PREFIX dsw: <http://purl.org/dsw/>
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>
PREFIX dwciri: <http://rs.tdwg.org/dwc/iri/>

SELECT DISTINCT ?sciName ?country ?stateProvince WHERE {
  ?id dwc:genus "Polystichum".
  ?id dwc:scientificName ?sciName.
  ?id dsw:identifies ?organism.
  ?occurrence dsw:occurrenceOf ?organism.
  ?occurrence dsw:atEvent ?event.
  ?event dsw:locatedAt ?location.
  ?location dwc:countryCode ?country.
  ?location dwc:stateProvince ?stateProvince.
}


you can discover the species of Polystichum in the merged dataset and location information about where they occurred.  The one from Tennessee was documented in the Bioimages data and the one from New Zealand was documented in your data.  I think it's exciting to do some actual testing of RDF data integration using real data from the wild. (I may not leave these data in the triplestore indefinitely, since the store is the home of real Bioimages data and not just a test.)

Those on the list who aren't interested in the details can stop reading here.  Anyone who is interested in the issues that I addressed when I modified the original document, read on.

1. I changed all of the dc:type (i.e. http://purl.org/dc/terms/type) properties to rdf:type.  This is the recommendation of Section 2.3.1.4 of the DwC RDF guide, which comes from the recommendations for expressing Dublin Core as RDF.  In some cases that resulted in multiple rdf:type's for a resource, but that's totally fine.

2. The DwC RDF Guide did not mint the term dwciri:associatedMedia as an IRI-object analogue of dwc:associatedMedia.  Rather, it recommended using dc:relation (i.e. http://purl.org/dc/terms/relation) as the object property to link to an IRI.  There's more about it in Section 2.8 of the RDF Guide [1].  So I used dc:relation, but also left a dwc:associatedMedia property with a literal value.

3. I moved the dwciri:recordedBy property from the Event instance to the Occurrence instance.  I think that's implied by listing dwc:recordedBy in the Occurrence section of the DwC quick reference guide.

4. Similarly, I moved the dwc:georeferenceSources property from the Event instance to the Location instance.

5. I corrected the typo of "stateProvence" to "stateProvince".

6. There are several technical details about the dc:Location instance property values.  I would say that it would be a best practice to datatype the values of dwc:decimalLatitude and dwc:decimalLongitude as xsd:decimal.  I think that in general, people are pretty sloppy about this kind of datatyping, but from a technical RDF standpoint, leaving off the datatype would mean that the values denote untyped literals (i.e. just strings), whereas providing the datatypes indicates that the values denote decimal numbers.  It would also help consuming applications know how to interpret the values.  The same could be said for the geo:lat and geo:long values, although I think that people are pretty sloppy with their use of them.  The other issue relates to the geodetic datum.  My understanding of the best practice for providing literal values for dwc:geodeticDatum is to use a controlled string value based on the EPSG: code for that geodetic datum, e.g. "EPSG:4326" for WGS84.  I don't know what the value would be for NZGD1949, but you could probably look it up somewhere.  Also, including " (assumed)" as part of the literal value might interfere with interpreting the string - perhaps it could go as part of a dwc:georeferenceRemarks value or something.  The other piece of this would be the relationship between the values given for dwc:decimalLatitude and geo:lat, and dwc:decimalLongitude and geo:long.  The definitions of the two geo: terms imply that the datum is WGS84, yet the values are the same for the dwc: terms with NZGD1949.  Maybe you did a mapping between NZGD1949 and WGS84 and they didn't differ in the second decimal place.

7. I added some Audubon Core properties to the Image instance.  In particular, I added the required http://purl.org/dc/elements/1.1/type value.  I also linked to two Service Access point instances that represented different sizes of the image.  You can see how I marked it up.  There isn't really any guide for using Audubon Core properties as RDF - I just sort of made this up based on what made sense to me.

8. As I mentioned in the first part of this email, I moved most of the properties from your Taxon instance to the Identification instance.  At first glance, this doesn't seem to make sense.  However, if one envisions how one might represent taxa as RDF, it makes a little more sense.  In the other attached document taxon-sjb.ttl [2], I've written some rudimentary RDF/Turtle showing one way that taxa could be represented as RDF.  In order to save Rich Pyle from having to write a long response, I'll say from the start that it is likely that what I've done is not "right".  I put this together based on a document that I wrote for the RDF Task Group several years ago [3], and that document came about as part of an extended discussion [4] that included Rich and several other people.  In the RDF that I created in the taxon-sjb.ttl document, I typed some resources as dwc:Taxon (which could be considered taxon concepts or taxon name usages) and other resources as tc:TaxonName (which could be considered some kind of name object or perhaps a protonym).  I used some terms from the defunct TaxonConcept ontology of the TDWG Ontologies [5], since we don't really have any better vocabulary at the moment (hopefully that will change if there is ever a TCS 2.0).  The way that the TaxonConcept ontology describes the hierarchical relationships between taxa is in my opinion overly complicated, so I just used dc:isPartOf to indicate that a lower level taxon is part of a higher level taxon.  There are many more properties I could have assigned to the dwc:Taxon and tc:TaxonName instances (such as links to references, author names, etc.), but as I said, this is just an illustration and really only a skeleton.  As you can see in the taxon-sjb.ttl document, there is no resource that is assigned the entire hierarchy of convenience properties from subspecies up to kingdom.  Rather, each taxon is a separate resource whose higher level taxonomy is indicated by the links to taxon levels above it.
Creating a hierarchy such as this would not be the job of occurrence instance data providers, but rather the job of some community group who has taken on the job of generating and maintaining identifiers and metadata about taxon concepts.  The occurrence data providers would just link to the lowest level of the hierarchy that is applicable to their identification, and data users could traverse the levels of the hierarchy to discover the upper levels.  For convenience of searching and in lieu of a functioning organization that does the job of creating and maintaining this hierarchy, an occurrence data provider can provide literal values for the dwc: convenience properties of the Identification instance to make searching convenient.  But having the occurrence data provider assert these convenience properties isn't really saying anything authoritative about the hierarchy - that's the job of the people who would create and maintain the taxon concept identifiers and metadata.  This is analogous to what happens with Locations and "described places".  For convenience of searching, Location instance providers can provide literal values for convenience properties of the Location instance (dwc:country, dwc:stateProvince, etc.), but the heavy lifting of minting and maintaining the identifiers, metadata, and connections between described places is done by consensus providers, such as GeoNames or the Getty Thesaurus of Geographic Names.  The same would presumably be the case for a community repository of taxon concepts - it would be unrealistic to expect each occurrence provider to re-invent the wheel and create their own taxon concept instances.
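
To make points 1, 2, 6, and 8 a bit more concrete, here is a rough Turtle sketch of the patterns described above.  It is not copied from the attached files: the coordinate values, the blank node labels _:loc1 and _:id1, the particular rdf:type values, and all of the example.org taxon IRIs are invented placeholders, and the dc: prefix is bound to http://purl.org/dc/terms/ to match the way I've used it in this email.

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix dcmitype: <http://purl.org/dc/dcmitype/> .
@prefix dwc: <http://rs.tdwg.org/dwc/terms/> .
@prefix dwciri: <http://rs.tdwg.org/dwc/iri/> .
@prefix dsw: <http://purl.org/dsw/> .

# Points 1 and 2: rdf:type (written "a") rather than dc:type, and
# dc:relation for the IRI link alongside the literal dwc:associatedMedia.
# The two types shown are illustrative only.
<http://collections-api.boh.tepapa.govt.nz/object/759415>
    a dwc:PreservedSpecimen, dcmitype:PhysicalObject ;
    dc:relation <http://collections-api.boh.tepapa.govt.nz/media/208762> ;
    dwc:associatedMedia "http://collections-api.boh.tepapa.govt.nz/media/208762" .

# Point 6: coordinates datatyped as xsd:decimal and the datum given as a
# controlled EPSG string.  The numbers are invented, not from your record.
_:loc1 a dc:Location ;
    dwc:decimalLatitude "-41.29"^^xsd:decimal ;
    dwc:decimalLongitude "174.78"^^xsd:decimal ;
    dwc:geodeticDatum "EPSG:4326" .

# Point 8: convenience properties hang off the Identification, which also
# links to the lowest applicable taxon (hypothetical IRI).
_:id1 a dwc:Identification ;
    dsw:identifies _:autos4 ;
    dwc:family "Dryopteridaceae" ;
    dwc:genus "Polystichum" ;
    dwciri:toTaxon <http://example.org/taxon/species-1> .

# Each taxon is its own resource; higher levels are reached by traversal.
<http://example.org/taxon/species-1> a dwc:Taxon ;
    dc:isPartOf <http://example.org/taxon/polystichum> .

<http://example.org/taxon/polystichum> a dwc:Taxon ;
    dc:isPartOf <http://example.org/taxon/dryopteridaceae> .

In a setup like this the taxon resources would be maintained by whoever runs the taxon concept repository, and the occurrence provider would assert only the Identification triples.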

Steve

[1] http://rs.tdwg.org/dwc/terms/guides/rdf/index.htm#2.8_Darwin_Core_association_terms
[2] Gisted at https://gist.github.com/baskaufs/a4e4d66f5002b97ba14e3e9c07c86f7c
[3] https://github.com/tdwg/rdf/blob/master/TCSBasedTaxon.md
[4] https://github.com/tdwg/rdf/blob/master/TCSthread.md
[5] https://github.com/tdwg/ontology

Douglas Campbell wrote:

Thank you all for your feedback.

 

I am hopeful there will eventually be a ‘standard’ vocabulary for the DwC associations.  In the meantime it looks like DSW will suit us the best.  dwcFP is quite loose (and having version numbers in the IRI means the namespace has broken once already).  BCO is quite bewildering – I like the solidity of underlying OBO framework, but BCO has a steep learning curve and doesn’t seem to tie into well-known LD ontologies.

 

The downside of DSW for us is the extra abstract Occurrence and Organism entities.  We basically hold specimens from field collection events that we identify – that is the data in our collection management system, so the other entities would just be fabricated for data exchange (not because we specifically model or store them).

 

I have mocked up a herbaria record starting from the specimen point of view (the core things that we hold in our collections).  This is attached as turtle and also our original JSON-LD (converted at http://json-ld.org/playground/ ).

 

* I had to go to Occurrence directly from the Specimen/Token using dsw:evidenceFor (rather than via Organism and dsw:hasOccurrence) as it may not have been identified - meaning it would be impossible to create an Organism entity. And in fact we will not create Organism entities as we don’t need them in our context.

* The Occurrence is just a placeholder (since we don’t have data for it), which means I have associated the Agent to the Event rather than the Occurrence – it so happens that this is valid as the domain isn’t specified in dwciri:recordedBy.

* I have included properties of linked-to entities because we need them populated as part of the web API for searching, etc.  This is similar to the DwC concept of convenience properties.

* There are a few other messy parts like what IRIs we will create for the Occurrence and Identification entities.

 

I’d be interested in any comments about whether this is on the right track and/or would be considered conformant DwC RDF :)

 

Cheers,

Douglas

 

 

From: tdwg-content [mailto:tdwg-content-bounces@lists.tdwg.org] On Behalf Of John Deck
Sent: Thursday, 18 August 2016 4:02 a.m.
To: tdwg-content@lists.tdwg.org
Subject: Re: [tdwg-content] Implementing Darwin Core in RDF

 

Another angle to consider here is using the Biological Collections Ontology (https://github.com/BiodiversityOntologies/bco), which is part of the OBO Foundry framework (http://www.obofoundry.org/), and for which development and work has been done in integrating with Darwin Core and Audubon Core (e.g. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4511409/).  The OBO community has a longer history with the biomedical community but is gaining traction in biodiversity.  Since these are all ontologies, there should be less confusion on the meaning of classes. For example, the notion of Observation in DwC is broken down into at least some of the following classes: "Observing process", "Material target of observation",  "Material target of observation role".  Relations are well specified as well and drawn from the Relations Ontology (http://www.obofoundry.org/ontology/ro.html).  BTW, along with the PPO (plant phenology ontology -- https://github.com/PlantPhenoOntology/PPO) team, I'm working on annotating instance data with BCO and PPO now, an ongoing project, and we will have more information on this effort at the upcoming TDWG meeting.   Happy to answer additional questions you have.

 

John Deck

 

On Wed, Aug 17, 2016 at 7:59 AM, Paul J. Morris <mole@morris.net> wrote:

While commenting on the RDF guide, I wrote implementations of delivery
of flat Darwin Core in RDF into Symbiota and the Harvard University
Herbaria web search using content negotiation.  Symbiota, when an
occurrence (or agent record) is requested with an accept header of
text/turtle, will deliver the occurrence record in a turtle
serialization (an example is below); when requested with an accept
header of application/rdf+xml, it will return flat Darwin Core in
RDF/XML.
For structured relations beyond flat Darwin Core, there is another
alternative to darwin-sw (which is also in use in the wild), dwcFP:
http://filteredpush.org/ontologies/FP/2.0/dwcFP.owl  Bob Morris can
probably comment further, but one of the design goals of dwcFP was
fewer added inferences than darwin-sw (object properties have ranges
specified, but not domains).

dwcFP defines a set of object properties for making relations between
core Darwin Core classes that we've been using in annotations:

    owl:ObjectProperty rdf:about="dwcFP:hasIdentification"
    owl:ObjectProperty rdf:about="dwcFP:usesTaxon"
    owl:ObjectProperty rdf:about="dwcFP:hasCollectingEvent"
    owl:ObjectProperty rdf:about="dwcFP:hasLocality"
    owl:ObjectProperty rdf:about="dwcFP:hasGeologicalContext"
    owl:ObjectProperty rdf:about="dwcFP:hasGeoreference"
    owl:ObjectProperty rdf:about="dwcFP:hasAssociatedMedia"

One object property we added specifically to simplify SPARQL queries on
taxonomic trees:

    owl:ObjectProperty rdf:about="dwcFP:descendantTaxonOf"

And additional object properties that cover more possible relations:

    owl:ObjectProperty rdf:about="dwcFP:relationProperty"
    owl:ObjectProperty rdf:about="dwcFP:hasAcceptedNameUsage"
    owl:ObjectProperty rdf:about="dwcFP:namePublishedIn"
    owl:ObjectProperty rdf:about="dwcFP:hasOriginalNameUsage"
    owl:ObjectProperty rdf:about="dwcFP:hasParentNameUsage"
    owl:ObjectProperty rdf:about="dwcFP:hasScientificName"
    owl:ObjectProperty rdf:about="dwcFP:hasTaxonConcept"
    owl:ObjectProperty rdf:about="dwcFP:taxonomicAuthorityFor"
    owl:ObjectProperty rdf:about="dwcFP:nomenclaturalCode"
    owl:ObjectProperty rdf:about="dwcFP:nomenclaturalStatus"

Attached is an example RDF document (which is a commented extract from
the body of an annotation in a new occurrence annotation document)
that describes an occurrence.  There were about 350,000 such occurrence
records wrapped in annotations generated in the NEVP project.  This
representation preceded the Darwin Core RDF guide, and I've added a
couple of comments about things we would do differently now.

In using dwcFP, we encountered the same kind of questions you are
raising about where to place datatype properties that might go in more
than one place.  We ended up hanging dwc:scientificName,
dwc:scientificNameAuthorship, etc, off of the Identification, and then
a Taxon (through dwcFP:usesTaxon) off the Identification, with only a
guid hung off the Taxon.  Bob can probably comment further, but I think
our motivation for this was that the new occurrence annotation
documents are intended to describe transcription of data off of
specimens (and container (herbarium sheet and folder)), and that
conceptually these scientific names were properties of transcriptions of
identifications.  If I recall correctly, we are doing the opposite in
new identification annotations (hanging dwc:scientificName off of a
Taxon instance).

-Paul

--

Example flat Darwin Core occurrence record as RDF produced by Symbiota:

http://symbiota4.acis.ufl.edu/scan/portal/collections/individual/index.php?occid=16950169&clid=0
with
HTTP Accept (through, for example, the Modify Headers firefox plugin):
text/turtle;q=1.0,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5

Delivers:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dwc: <http://rs.tdwg.org/dwc/terms/> .
@prefix dwciri: <http://rs.tdwg.org/dwc/iri/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dcmitype: <http://purl.org/dc/dcmitype/> .
<urn:uuid:00032bdf-8862-4ca1-b8a6-ba35ee59f56a>
    a dwc:Occurrence  ;
    dwc:institutionCode "ASNP" ;
    dwc:collectionCode "ENT" ;
    dwciri:inCollection <urn:uuid:af7140c3-4aa2-41ac-b3e9-4c7415b3ce90> ;
    a dcmitype:PhysicalObject ;
    dwc:basisOfRecord  "PreservedSpecimen" ;
    dwc:catalogNumber "735" ;
    dwc:kingdom "Animalia" ;
    dwc:phylum "Arthropoda" ;
    dwc:class "Insecta" ;
    dwc:order "Hemiptera" ;
    dwc:family "Miridae" ;
    dwc:scientificName "Phytocoris" ;
    dwc:scientificNameAuthorship "Fallén, 1814" ;
    dwc:genus "Phytocoris" ;
    dwc:identifiedBy "G. W. Cowper 2008" ;
    dwc:recordedBy "G. W. Cowper" ;
    dwc:eventDate  "2008-10-18" ;
    dwc:year  "2008" ;
    dwc:month  "10" ;
    dwc:day  "18" ;
    dwc:startDayOfYear "292" ;
    dwc:fieldNumber "-" ;
    dwc:individualCount "1" ;
    dwc:preparations "Academy Drawer" ;
    dwc:country "United States" ;
    dwc:stateProvince "New Jersey" ;
    dwc:county "Burlington" ;
    dwc:municipality "Woodland" ;
    dwc:locality "SW of Chatsworth Beat Pinus along sandy road in upland pine & oak woodland.; 1 individual(s) collected; Way Point [-]" ;
    dwc:decimalLatitude "39.806067" ;
    dwc:decimalLongitude "-74.545333" ;
    dwc:verbatimCoordinates "N+39º48.364' W-074º32.720'" ;
    dwc:disposition "ANSP" ;
    dcterms:modified "2015-04-25 17:13:22" ;
    dcterms:license <http://creativecommons.org/licenses/by-nc/3.0/> ;
    dcterms:rightsHolder <Academy of Natural Sciences> ;
    dcterms:accessRights "CC BY-NC (Attribution-Non-Commercial)" ;
    dcterms:references
"http://symbiota4.acis.ufl.edu/scan/portal/collections/individual/index.php?occid=16950169".

On Wed, 17 Aug 2016 06:02:01 +0000
Douglas Campbell <Douglas.Campbell@tepapa.govt.nz> wrote:

> Thanks for taking the time to bring me up to speed, Steve.
>
> I'm familiar with the complexities of preparing specifications and
> realise I've come in mid-way.  I'll spend some time reading up on the
> containers and occurrences history.  But I'm unclear whether it is
> better to use DSW terms in anticipation of them having longevity, or
> just to mint our own in the meantime?
>
> For the taxon convenience fields...
>
> I thought I read in the DwC RDF schema that properties like taxonRank
> were in the Taxon class (so using them in Identification conflicts
> with their definition), but looking again I see that the spec uses
> "dwcattributes:organizedInClass" which specifically does not imply
> domain or range.  So now I'm at peace with that. :)
>
> However, I am pointing to our own RDF versions of taxon
> classification terms that we use, and using DwC properties to define
> these taxon terms, plus I am combining these all together in a single
> JSON-LD API result.  So at this point I don't think I need to repeat
> the convenience properties in Identification (as they are available
> directly in the Taxon object).  While this seems to fit our purpose I
> can see that this may be sub-optimal for others who download the data
> and use it separately - I'll need to contemplate that scenario some
> more.
>
> Cheers,
> Douglas
>
>
> From: Steve Baskauf [mailto:steve.baskauf@Vanderbilt.Edu]
> Sent: Wednesday, 17 August 2016 6:32 a.m.
> To: Douglas Campbell
> Cc: tdwg-content@lists.tdwg.org
> Subject: Re: [tdwg-content] Implementing Darwin Core in RDF
>
> Douglas,
> I was the lead author on the DwC RDF Guide, so I can try to answer
> your questions about it.  The TDWG RDF Task Group is still in
> operation, although it hasn't been very active for the past several
> years.  The RDF TG has an online "home" at the TDWG Github site.[1]
> However, the content didn't survive the migration from Google Code
> very well, so it takes some effort at this point to sort through it.
> The TG also has an email list [2] but there has been little traffic
> on it recently.
>
> *Dereferencing of the DwC IRI namespace* - Unfortunately, the dwciri:
> namespace terms don't dereference at the present time.  This needs to
> be corrected.  I've created a Turtle serialization [3] of how I think
> the RDF should be written for the dwciri: terms, but it isn't served
> when one attempts to dereference the terms and hasn't been
> incorporated into the official DwC repository.  Part of the problem
> here is that the guidelines for documenting terms in machine-readable
> form are still going through the adoption process.[4] I'm hopeful
> that when the Documentation Specification is ratified, we can make
> sure that all existing DwC terms dereference in a consistent manner.
>
> *Best practice for connecting containers together* - By this, I'm
> assuming you mean linking instances of the various Darwin Core
> classes, or in RDF terms, linking nodes.  The RDF Guide is silent on
> how to do this.  That's not great from the standpoint of actually
> turning Darwin Core records into RDF, but it was a way to complete
> the guide in a finite amount of time.  What is missing is a consensus
> domain model that would lay out how instances of the Darwin Core
> classes would be linked.  Such a model should be developed, but that
> has not yet happened.  Again, there is a draft standard submitted for
> review [5], which if adopted will specify (in Section 4) a process
> for developing such a model.  When we wrote the RDF Guide, we
> provided ancillary documents [6], which included examples that
> followed the RDF Guide and linked instances using various proposed
> models.  There are links to web pages containing examples using
> TaxonConcept, BiSciCol, and Darwin-SW object properties to link class
> instances.  I am not sure whether there is any RDF "in the wild" for
> the first two examples.  I'm more familiar with Darwin-SW, as I was
> involved in its development [7].  There is a Semantic Web Journal
> article about Darwin-SW [8], so I won't go into detail about it here,
> except to say that its data model was developed following an
> extensive discussion on the tdwg-content email list [9] about how
> members of the community understood the Darwin Core classes.  The
> relationship between Darwin-SW model and the historical 1993 ACS
> Model can be viewed at [10].  There are a bit over a million triples
> of data "in the wild" modeled on Darwin-SW in accordance with the DwC
> RDF guide, accessible at a SPARQL end point. [11]  Some examples
> showing how to play around with SPARQL queries of these data are at
> [12].
>
> *The overlapping scope of Occurrence and Specimen types* - There is a
> long history behind the meaning of "Occurrence".  There is an
> out-of-date summary of some of the discussion around this topic in
> the Darwin-SW documentation [13].  I think that at the time when
> Darwin Core was originally adopted, an Occurrence was considered a
> sort of superclass of the Specimen and Observation classes.  However,
> after a lot of discussion, the meaning of dwc:Occurrence was
> clarified by changing it to its current definition: "An existence of
> an Organism (sensu http://rs.tdwg.org/dwc/terms/Organism) at a
> particular place at a particular time."  In this view, an Occurrence
> isn't a concrete thing like a Specimen - it's more like a database
> join between an Event instance (time and place) and an Organism,
> which allows for a one-to-many relationship between an Organism and
> Occurrences, and a one-to-many relationship between an Event and
> Occurrences.  It also allows for a single occurrence of an organism
> at a time and place to be documented by one-to-many forms of
> evidence, which could include PreservedSpecimens, HumanObservation
> data, or images of various sorts.  In RDF terms, an Occurrence could
> be thought of as a node that is linked to Event, Organism, and
> evidence instance nodes.  You can see this represented graphically
> at [7], where "dsw:Token" refers to a generic class for evidence.  In
> any case, separating Occurrence (as a node linking Events to
> Organisms) from Specimen allows an Occurrence to be documented by one
> to many instances of any kind of evidence, or even multiple kinds of
> evidence.  For example, an Occurrence could be documented by a
> PhysicalSpecimen as well as several images.  Here is an example of an
> organism with two Occurrences:
> http://bioimages.vanderbilt.edu/org-jorgem/rec13_0004 The first
> occurrence on 2013-07-24 was documented by 42 camera trap images, and
> the second occurrence on 2013-07-25 was documented by 21 camera trap
> images.  You can see how this was represented in RDF at [14].  In
> most cases, specimen records will be much simpler than this, with one
> organism, documented at one occurrence, with evidence of one
> PreservedSpecimen.  Such a simpler case could be represented with a
> simpler model.  But the more complex model allows specimen-derived
> occurrence records to be merged with other kinds of occurrence
> records, such as the camera trap example I gave, mark-recapture bird
> banding observations, iNaturalist occurrences documented by photos of
> the organism, etc.
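>
> A minimal Turtle sketch of that pattern, one Organism with two
> Occurrences, each documented by more than one piece of evidence, might
> look like this.  All of the example.org IRIs are hypothetical
> placeholders (the real record is at [14]), and only two images per
> Occurrence are shown:
>
> @prefix dwc: <http://rs.tdwg.org/dwc/terms/> .
> @prefix dsw: <http://purl.org/dsw/> .
>
> # One organism, linked to two separate occurrences
> <http://example.org/organism/1> a dwc:Organism ;
>     dsw:hasOccurrence <http://example.org/occurrence/1> ,
>                       <http://example.org/occurrence/2> .
>
> # Each Occurrence joins the Organism to an Event and to its evidence
> <http://example.org/occurrence/1> a dwc:Occurrence ;
>     dsw:atEvent <http://example.org/event/1> ;
>     dsw:hasEvidence <http://example.org/image/1> ,
>                     <http://example.org/image/2> .
>
> <http://example.org/occurrence/2> a dwc:Occurrence ;
>     dsw:atEvent <http://example.org/event/2> ;
>     dsw:hasEvidence <http://example.org/image/3> ,
>                     <http://example.org/image/4> .
>
> # The Events carry the time (and would link on to a Location)
> <http://example.org/event/1> a dwc:Event ;
>     dwc:eventDate "2013-07-24" .
>
> <http://example.org/event/2> a dwc:Event ;
>     dwc:eventDate "2013-07-25" .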
>
> *Conflicting usage of Taxon fields in the Identification object* - In
> order to explain the rationale behind why what seem to be
> taxon-related properties are assigned to Identification instances, I
> must refer to the idea of "convenience terms" as expressed in Section
> 2.7 of the RDF Guide.[15]   In a perfect world, we would have the
> following:
>
> a collection item linked by dwciri:inCollection to an IRI-identified
>   collection
> an identification instance linked by dwciri:toTaxon to an
>   IRI-identified taxon (a.k.a. taxon concept)
> a location instance linked by dwciri:inDescribedPlace to an
>   IRI-identified geographic place (a.k.a. "feature")
>
> If the linked IRI-identified object resources were described by RDF,
> it would not be necessary to include any of the Darwin Core
> "convenience" properties included in Table 3.5 [16].  The information
> contained in the values of those properties could be discovered by
> dereferencing the object IRIs and traversing subsequent links from
> that RDF.  However, if those IRIs don't exist, then the convenience
> properties provide a string-based mechanism to relate the subject
> resource to other resources that should be linked to the same
> (unidentified) object resource.  So for example, if we say a specimen
> has the convenience properties and values
>
> dwc:collectionCode="Mamm"
> dwc:institutionCode ="MVZ"
>
> we are not saying that "Mamm" is the collection code of the specimen
> and that "MVZ" is the institution code of the specimen.  Rather, we
> mean that the specimen should be linked to a collection (with unknown
> IRI) whose code is "Mamm" and whose owning institution has the code
> "MVZ".  Similarly, if we say that an identification has the
> convenience properties and values
>
> dwc:genus="Hersiliiadae"
> dwc:specificEpithet="yaeyamaensis"
>
> we are not saying that "yaeyamaensis" is the specific epithet of the
> identification and that "Hersiliiadae" is the genus of the
> identification.  Rather, we mean that the identification should be
> linked to a taxon (with unknown IRI) for which the specificEpithet
> part of its name string is "yaeyamaensis", which is included in the
> genus "Hersiliiadae".  This may seem odd, particularly if you are
> used to thinking of genus and specific epithet as properties of a
> taxon.  But the sets of DwC convenience properties are intended to be
> a temporary, string-based way to describe an unidentified resource to
> which the subject resource should be linked.  At some future time, if
> IRIs can be discovered, those sets of convenience properties might be
> dropped if dereferencing the IRIs provides the same information.  In
> these examples, one might replace the convenience properties with:
>
> a collection item linked by dwciri:inCollection to
> http://grbio.org/cool/0rht-pj95
>
> an identification instance linked to
> http://zoobank.org/75C9EA16-72B1-44C9-AD40-3C3D41323AB9
>
> although I don't think either of these IRIs currently dereferences to
> meaningful machine-readable RDF (they do have human-readable
> web pages).
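>
> As a rough sketch of the two forms described above, the Turtle might
> look something like the following.  The ex: subject IRIs are made up
> for illustration, and the use of dwciri:toTaxon for the
> identification-to-taxon link is an assumption, since the linking
> property is not named above.
>
> ```turtle
> @prefix dwc:    <http://rs.tdwg.org/dwc/terms/> .
> @prefix dwciri: <http://rs.tdwg.org/dwc/iri/> .
> @prefix ex:     <http://example.org/> .   # hypothetical namespace
>
> # String-based form: the convenience properties describe the
> # (unidentified) collection and taxon that the subjects should link to.
> ex:specimenWithStrings a dwc:PreservedSpecimen ;
>     dwc:collectionCode "Mamm" ;
>     dwc:institutionCode "MVZ" .
>
> ex:identificationWithStrings a dwc:Identification ;
>     dwc:genus "Hersiliiadae" ;
>     dwc:specificEpithet "yaeyamaensis" .
>
> # IRI-based form: the same subjects linked directly to identified
> # resources, so the strings could be dropped once the IRIs dereference.
> ex:specimenWithIri a dwc:PreservedSpecimen ;
>     dwciri:inCollection <http://grbio.org/cool/0rht-pj95> .
>
> ex:identificationWithIri a dwc:Identification ;
>     # dwciri:toTaxon is assumed here; the text above does not name it
>     dwciri:toTaxon <http://zoobank.org/75C9EA16-72B1-44C9-AD40-3C3D41323AB9> .
> ```
>
> Nothing prevents a provider from supplying both forms for the same
> subject during a transition period.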
>
> I hope that this has provided you with some answers, or at least a
> starting point for additional exploration or questions.  Please feel
> free to reply if there were parts of what I wrote that weren't clear.
>
> Steve Baskauf
>
> [1] https://github.com/tdwg/rdf
> [2] http://groups.google.com/group/tdwg-rdf
> [3] https://github.com/tdwg/vocab/blob/master/code-examples/darwin-core/dwciri.ttl
> [4] https://github.com/tdwg/vocab/blob/master/documentation-specification.md
> [5] https://github.com/tdwg/vocab/blob/master/maintenance-specification.md
> [6] https://github.com/tdwg/rdf/blob/master/DwCAncillary.md
> [7] https://github.com/darwin-sw/dsw
> [8] http://www.semantic-web-journal.net/content/darwin-sw-darwin-core-based-terms-expressing-biodiversity-data-rdf-1
> [9] https://github.com/darwin-sw/dsw/wiki/TdwgContentEmailSummary
> [10] https://github.com/darwin-sw/dsw/blob/master/img/acs-dsw-poster.pptx
> [11] http://rdf.library.vanderbilt.edu/sparql?view
> [12] https://github.com/HeardLibrary/semantic-web/blob/master/learning-sparql/learning-sparql-ch3-part2-answers.md
> [13] https://github.com/darwin-sw/dsw/wiki/ClassOccurrence
> [14] http://bioimages.vanderbilt.edu/org-jorgem/rec13_0004.rdf
> [15] http://rs.tdwg.org/dwc/terms/guides/rdf/index.htm#2.7_Darwin_Core_convenience_terms
> [16] http://rs.tdwg.org/dwc/terms/guides/rdf/index.htm#3.5_Darwin_Core_convenience_terms_that_are_expected_to_be_used_o
>
>
>
> Douglas Campbell wrote:
> Hi all,
>
> I am implementing Darwin Core in RDF as part of our API at Te Papa
> (Museum of New Zealand).  My aim is to map our specimen metadata to
> rich Darwin Core RDF using JSON-LD, then 'dumb down' to Simple Darwin
> Core to contribute to virtual herbariums.  I have mocked-up some
> records, however there are some areas where I'm not quite sure how to
> interpret the Darwin Core RDF Guide.
>
> The areas of confusion I have include:
> * Best practice for connecting containers together
> * Dereferencing of the DwC IRI namespace
> * The overlapping scope of the Occurrence and Specimen types
> * Conflicting usage of Taxon fields in the Identification object.
>
> I'm hoping for suggestions:
> 1. Are there any implementations of DwC RDF data online that I could
> look at as examples to follow?
> 2. What/to whom is the best way to ask specific questions about DwC RDF?
>
> At this stage our API prototype is only available internally but
> there is some documentation available publicly at:
> https://github.com/te-papa/collections-api/wiki
>
> Thanks in advance,
> Douglas
>
> Douglas Campbell
> Business Analyst
> Collections Information Services
> Museum of New Zealand Te Papa Tongarewa
>
> ________________________________
>
> Visit the Te Papa website http://www.tepapa.govt.nz
> The email message together with the accompanying attachments may be
> CONFIDENTIAL. If you have received this message in error, please
> notify https://www.tepapa.govt.nz/about/contact-us/general-enquiries
> immediately and delete the original message. The views expressed in
> this message are those of the individual sender, except where the
> sender specifically states them to be views of Te Papa.  Te Papa
> employs strict virus checking measures and accepts no liability for
> any loss caused either directly or indirectly by a virus arising from
> the use of this message or any attached file.
>
> ________________________________
> This email has been filtered by SMX. For more information visit
> smxemail.com <http://smxemail.com/>
>
>
>
> --
> Steven J. Baskauf, Ph.D., Senior Lecturer
> Vanderbilt University Dept. of Biological Sciences
>
> postal mail address:
> PMB 351634
> Nashville, TN  37235-1634,  U.S.A.
>
> delivery address:
> 2125 Stevenson Center
> 1161 21st Ave., S.
> Nashville, TN 37235
>
> office: 2128 Stevenson Center
> phone: (615) 343-4582,  fax: (615) 322-4942
> If you fax, please phone or email so that I will know to look for it.
> http://bioimages.vanderbilt.edu
> http://vanderbilt.edu/trees
>
>
>


--
Paul J. Morris
Biodiversity Informatics Manager
Museum of Comparative Zoölogy, Harvard University
mole@morris.net  AA3SD  PGP public key available

_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content



 

--

John Deck
(541) 914-4739



-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu
http://vanderbilt.edu/trees