It's taken me a while to finish this project, but I have successfully turned Darwin Core Archives from GBIF and GGBN with Event, Taxon, and Occurrence cores into RDF and merged them with RDF generated from an EOL traitbank archive and the pre-existing Bioimages RDF.  The resulting graph describes over 13 thousand distinct organisms and contains 8 million triples.  I've compressed the entire non-Bioimages RDF graph into a downloadable zip file (the Bioimages graph is available elsewhere) so that anyone can play with it.  It's also available for querying at Vanderbilt's Heard Library SPARQL endpoint.

You can read the gory details of the experiment at http://baskauf.blogspot.com/2016/11/guid-o-matic-meets-dwc-rdf-octopus.html and there should be enough details that anyone who is serious about this could replicate the exercise.  I doubt that many people will, but you are also welcome to read the blog post the same way I read National Geographic: skim the high points and look at the pictures.  You might read the descriptions of each of the five datasets and look at their octopus diagrams, then skip to the end and paste the SPARQL queries into the endpoint to try them out for yourself (or hack them to find out more interesting things).

Thanks Quentin Groom, Willem Coetzer, Gabi Dröge, and Anne Thessen for sharing their cool datasets.  I'm excited about this and am looking forward to playing more with the big merged graph when I have more time.

Steve

Steve Baskauf wrote:
Markus and all,

Thanks for the responses.  I have downloaded several great datasets from GBIF, so I have a lot to play with.  I've got occurrence, taxon, and event core archives, which is great.  I'd still be interested in an example of a material sample core archive if anyone has a good one, particularly if it has rich extension files linked to it.  I'll probably report on my experimentation in a future blog post.
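
For anyone sorting through downloaded archives, here is a minimal sketch of one way to check which core and extensions an archive declares.  It assumes the archive is a zip file with meta.xml at its top level, as described in the Darwin Core text guide; the function names describe_archive and make_example_archive are made up for illustration, and the tiny in-memory archive stands in for a real download.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive descriptor files (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def describe_archive(source):
    """Return (core_row_type, [extension_row_types]) read from meta.xml.

    `source` may be a path to a zip file or a file-like object.
    Assumes meta.xml sits at the top level of the zip (hypothetical helper).
    """
    with zipfile.ZipFile(source) as archive:
        with archive.open("meta.xml") as handle:
            root = ET.parse(handle).getroot()
    core = root.find(DWC_TEXT_NS + "core")
    extensions = root.findall(DWC_TEXT_NS + "extension")
    core_type = core.get("rowType") if core is not None else None
    return core_type, [ext.get("rowType") for ext in extensions]


def make_example_archive():
    """Build a tiny in-memory archive with an Event core (illustration only)."""
    meta = (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">'
        '<core rowType="http://rs.tdwg.org/dwc/terms/Event">'
        "<files><location>event.txt</location></files>"
        "</core>"
        '<extension rowType="http://rs.tdwg.org/dwc/terms/Occurrence">'
        "<files><location>occurrence.txt</location></files>"
        "</extension>"
        "</archive>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", meta)
    buffer.seek(0)
    return buffer
```

Calling describe_archive(make_example_archive()) returns the Event term IRI as the core type and a one-item list holding the Occurrence term IRI; passing the path of a downloaded zip instead should report that archive's own core and extensions, provided its meta.xml is at the top level.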

Steve

Markus Döring wrote:
Hi Steve,
you can find thousands of openly published dwc archives at the GBIF registry: http://www.gbif.org/dataset

Unless it is a BioCASe or DiGIR resource, each details page contains the link to the DwC-A on the right side. Taxon and Event core datasets are always DwC-As.

best,
markus

Sent from my iPhone

On 25.10.2016 at 06:35, Quentin Groom <quentin.groom@plantentuinmeise.be> wrote:

Hi Steve,
I have a sampling event dataset on GBIF, which is quite richly populated.
Regards
Quentin



Dr. Quentin Groom
(Botany and Information Technology)

Botanic Garden Meise
Domein van Bouchout
B-1860 Meise
Belgium


Landline: +32 (0) 226 009 20 ext. 364
FAX:      +32 (0) 226 009 45

Skype name: qgroom
Website:    www.botanicgarden.be


On 25 October 2016 at 02:06, Steve Baskauf <steve.baskauf@vanderbilt.edu> wrote:
I have been playing around with turning Darwin Core Archives into RDF [1] and would like to extend my experiments to attempting to integrate data from archives that have very different core files.  I have played with the dwcaMolluscsAndorra.zip Occurrence archive mentioned in Annex 3 of the DwC-A How-To Guide [2], but the Whales-DWC-A.zip file isn't available any more (I guess because of the demise of Google Code).  A little bit of Google searching failed to turn up additional obvious example files.

If anyone would be interested in making DwC-A archives available to me, I'd appreciate it.  I'm particularly interested in archives that have cores other than Occurrence (although an Occurrence archive that has more complex data, including extensions, than the mollusc example would be welcome).  If there are any MaterialSample or Event core archives, that would be particularly interesting.
Preferably, I'd like to have access to archives that contain publicly available data, since I'm likely to blog about the outcome and potentially use the data in examples.  If there are publicly available archives downloadable via a URL, you can reply to the list - otherwise let me know how I could access your example.

Thanks in advance for your help!
Steve Baskauf

[1] http://baskauf.blogspot.com/2016/10/guid-o-matic-meets-darwin-core-archives.html
[2] http://www.gbif.org/resource/80636

--
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu
http://vanderbilt.edu/trees


_______________________________________________
tdwg-content mailing list
tdwg-content@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-content