[tdwg-tag] provenance chains and DwC:recordedBy

joel sachs jsachs at csee.umbc.edu
Fri Aug 6 17:16:21 CEST 2010


Exactly - moving from our observation ontology to Darwin Core is lossy. In 
terms of the TDWG bioblitz, it would be nice to use TDWG standards. But 
it would also be nice to maintain the provenance of observations. I guess 
what I'm getting at is: What does it mean to be a Darwin Core 
record on the semantic web, where a single document often mixes terms 
from multiple vocabularies?

I'd like the answer to be: "Don't worry about it. Use DwC terms when 
necessary, but don't necessarily use DwC terms."

I think the obvious approach is to capture both the observer and 
reporter of the data where appropriate, and maintain this distinction in 
the data store. When we have to generate "pure" DwC records (for example, 
for harvesting by GBIF), we map to DwC (sometimes in 
a lossy manner). But is that obvious to everyone, or just to me?
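
To make that concrete, here is a minimal sketch in Python of the export step. 
Only "hasObserver", "hasReporter" and recordedBy come from this thread; the 
dict layout, the to_dwc helper, the sample names and the " | " separator are 
assumptions for illustration, not how Spotter actually works.

```python
# Sketch only: the record layout, helper name and separator are assumptions.
LOCAL_PEOPLE_TERMS = ("hasObserver", "hasReporter")  # observer listed first


def to_dwc(record):
    """Flatten an internal observation record into a 'pure' DwC-style dict.

    The mapping is lossy: observer and reporter both end up in recordedBy,
    so the output no longer says who played which role.
    """
    people = []
    for term in LOCAL_PEOPLE_TERMS:
        name = record.get(term)
        if name and name not in people:  # skip missing values and duplicates
            people.append(name)

    # Copy everything except the local (non-DwC) terms.
    dwc = {k: v for k, v in record.items() if k not in LOCAL_PEOPLE_TERMS}
    dwc["recordedBy"] = " | ".join(people)
    return dwc


# The data store keeps the distinction; only the export collapses it.
internal = {
    "scientificName": "Danaus plexippus",
    "hasObserver": "A. Student",   # made-up example names
    "hasReporter": "B. Teacher",
}
print(to_dwc(internal)["recordedBy"])  # A. Student | B. Teacher
```

The internal record is left untouched, so the observer/reporter distinction 
survives in the store even though it cannot be recovered from the export.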

Thanks again -

On Thu, 5 Aug 2010, Javier de la Torre wrote:

> Hi,
> If I understand correctly from http://rs.tdwg.org/dwc/terms/#recordedBy you would concatenate the observer and then the reporter in one single string that will be transferred in recordedBy. Which should go first, the observer or the reporter, I don't know; in the case of Bioblitz I would say the observer (it is all about giving credit to people in the field, no?). Of course this is a lossy transformation, as the concatenated list will not tell you which is which.
> I feel I am missing something...
> Javier de la Torre
> www.vizzuality.com
> On Aug 4, 2010, at 4:12 PM, joel sachs wrote:
>> All,
>> In preparation for the tdwg bioblitz, I'd like to configure our Spotter tool (http://spire.umbc.edu/spotter) to compose DwC records. Currently, it uses an observation ontology that we whipped up a few years ago. (Here's an illustrative record - http://spire.umbc.edu/spotter/observation/data.php?record=1534)
>> For the most part, the mapping is straightforward. However, I'm wondering about two terms: "hasObserver" and "hasReporter". We distinguished between these two terms to accommodate situations where a student makes an observation, but her teacher reports it. Similarly, in a bioblitz event, one model is that a survey team leader will fill out and submit a spreadsheet comprised of the observations made by members of the survey team.
>> Both these terms seem to map to DwC:recordedBy. According to http://rs.tdwg.org/dwc/terms/#recordedBy, "The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first." So if we simply listed observer followed by reporter, we would comply with the spec. Of course, the ordering would be lost in typical rdf representations, since triples are considered unordered. And whether in rdf or in text, the distinction between observer and reporter would be pretty much lost.
>> Since one of the goals of the bioblitz is figuring out good ways to use DwC in citizen science, I'm interested in opinions on whether we should preserve the observer/reporter distinction, and include these non-DwC terms in the bioblitz data profile.
>> Many thanks -
>> Joel.
