[tdwg-content] clarification of the use of dwc:recordedBy

Steve Baskauf steve.baskauf at vanderbilt.edu
Tue Dec 8 19:09:07 CET 2009


Roger,
Thanks for the comments.  At this point, I'm thinking about this from 
the standpoint of GUID (HTTP URI/LSID) resolution rather than flat 
databases.  I am, of course, hampered by my lack of understanding of 
exactly how the LSID resolution process works.  According to the draft 
LSID applicability statement, where LSIDs are used there would be a 
single identifier (GUID) for something like a digital image of a 
specimen.  The metadata associated with the image would be the RDF 
returned in response to a getMetadata() call, and the actual digital 
image would be the data returned in response to a getData() call.  If 
I'm understanding this correctly, the image and its metadata would not 
be assigned separate persistent identifiers.  It's not so clear to me 
how this would work in the case of a generic HTTP URI identifier that 
was not a proxy for an LSID.  The GUID applicability statement makes the 
distinction between data and metadata in section 4 ("Machine and human 
clients that retrieve the metadata associated with a GUID will use the 
associated typing information to decide how to process the metadata and 
any associated data. ") and specifies that the default format for 
metadata should be RDF.  However, as far as I can tell, the document is 
silent on the mechanism by which access information for the actual data 
(the information resource itself, as opposed to its metadata) would be 
specified to a semantic-web-enabled client.  The working examples I've 
seen so far where HTTP URIs are used as persistent identifiers have all 
been for non-information resources (i.e. no data, only metadata). 

I suppose the actual answer from an RDF perspective would be to do 
something similar to http://biocol.org/collection/rdf/id/35115 and 
http://lod.geospecies.org/ses/4XSQO.rdf where metadata for the 
identified resource and the metadata for the metadata are contained by 
two separate XML wrappers with separate rdf:about attributes.  Then 
there could be separate dcterms:creator elements within each of the two 
containers.  The value of the rdf:about attribute for the resource 
would be the LSID or persistent HTTP URI of the object itself, and the 
value of the rdf:about attribute for the metadata could be the URI of 
the RDF file.  The URI for the RDF file would effectively be a separate 
(non-GUID) identifier for the metadata, as you suggested must exist.  

The question of the meaning of dwc:recordedBy is still open, I guess.

How about this?  Contents of file 
http://myherbarium.org/image/rdf/12345.rdf containing the metadata for 
an image identified as urn:lsid:myherbarium.org:image:12345 of a 
specimen identified as urn:lsid:myherbarium.org:vascular:12345

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
    <!-- Metadata about the image itself -->
    <dwc:occurrenceID rdf:about="urn:lsid:myherbarium.org:image:12345">
       <dcterms:creator
          rdf:resource="http://myherbarium.org/people/fred_photographer/foaf.rdf"/>
       <dcterms:created>2008-09-08T10:22:15-06:00</dcterms:created>
       <dcterms:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
       <dwc:recordedBy
          rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
       <dcterms:description>Image of herbarium specimen urn:lsid:myherbarium.org:vascular:12345</dcterms:description>
       <!-- ... more metadata about the image ... -->
    </dwc:occurrenceID>
    <!-- Metadata about the RDF file (the metadata record) -->
    <rdf:Description rdf:about="http://myherbarium.org/image/rdf/12345.rdf">
       <dcterms:creator>Joe Shmoe Memorial Herbarium</dcterms:creator>
       <dcterms:created>2008-09-08T12:01:33-06:00</dcterms:created>
       <dcterms:modified>2009-12-08T13:32:33-06:00</dcterms:modified>
       <dcterms:description>RDF formatted description of the image urn:lsid:myherbarium.org:image:12345</dcterms:description>
       <dcterms:language>en</dcterms:language>
       <!-- ... more metadata about the metadata ... -->
    </rdf:Description>
</rdf:RDF>
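
As a rough sketch of how a client could tell the two creators apart, the Python below parses a trimmed copy of the image RDF above and groups the dcterms:creator values by the rdf:about subject they are stated about. It uses only the standard library; the function name creators_by_subject is made up for illustration, and the namespace declarations are included so the parser can resolve the prefixes. This is not a real RDF parser, just enough to show that the two rdf:about containers keep the statements separate.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"

# Trimmed copy of the image RDF, with namespace declarations so it parses.
IMAGE_RDF = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <dwc:occurrenceID rdf:about="urn:lsid:myherbarium.org:image:12345">
    <dcterms:creator
       rdf:resource="http://myherbarium.org/people/fred_photographer/foaf.rdf"/>
  </dwc:occurrenceID>
  <rdf:Description rdf:about="http://myherbarium.org/image/rdf/12345.rdf">
    <dcterms:creator>Joe Shmoe Memorial Herbarium</dcterms:creator>
  </rdf:Description>
</rdf:RDF>
"""


def creators_by_subject(rdf_xml):
    """Map each rdf:about subject to the dcterms:creator values stated about it.

    Hypothetical helper: it only looks at top-level node elements.
    """
    root = ET.fromstring(rdf_xml)
    result = {}
    for node in root:  # each top-level child describes one subject
        subject = node.get(f"{{{RDF}}}about")
        if subject is None:
            continue
        values = []
        for creator in node.findall(f"{{{DCTERMS}}}creator"):
            # A creator is either a resource reference or a plain literal.
            resource = creator.get(f"{{{RDF}}}resource")
            if resource is not None:
                values.append(resource)
            else:
                values.append((creator.text or "").strip())
        result[subject] = values
    return result


creators = creators_by_subject(IMAGE_RDF)
for subject, values in creators.items():
    print(subject, "->", values)
```

The image LSID maps to the photographer and the URI of the RDF file maps to the herbarium, so a client never has to guess which "creator" is meant.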

Contents of file http://myherbarium.org/vascular/rdf/12345.rdf 
containing the metadata for a specimen identified as 
urn:lsid:myherbarium.org:vascular:12345

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
    <!-- Metadata about the specimen itself -->
    <dwc:occurrenceID rdf:about="urn:lsid:myherbarium.org:vascular:12345">
       <dcterms:creator
          rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
       <dcterms:created>2008-07-02</dcterms:created>
       <dcterms:type
          rdf:resource="http://purl.org/dc/dcmitype/PhysicalObject"/>
       <dwc:recordedBy
          rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
       <dcterms:description>Herbarium specimen of Toxicodendron radicans</dcterms:description>
       <dwc:associatedMedia
          rdf:resource="http://resolver.org/urn:lsid:myherbarium.org:image:12345"/>
       <!-- ... more metadata about the specimen ... -->
    </dwc:occurrenceID>
    <!-- Metadata about the RDF file (the metadata record) -->
    <rdf:Description
       rdf:about="http://myherbarium.org/vascular/rdf/12345.rdf">
       <dcterms:creator>Joe Shmoe Memorial Herbarium</dcterms:creator>
       <dcterms:created>2008-09-08T12:01:30-06:00</dcterms:created>
       <dcterms:modified>2009-10-07T09:14:08-06:00</dcterms:modified>
       <dcterms:description>RDF formatted description of the specimen urn:lsid:myherbarium.org:vascular:12345</dcterms:description>
       <dcterms:language>en</dcterms:language>
       <!-- ... more metadata about the metadata ... -->
    </rdf:Description>
</rdf:RDF>
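
The creator/recordedBy comparison from my original message (quoted below) could then be applied to these two records. Here is a minimal sketch in Python. The dictionary simply restates the values from the two RDF files above, the helper name recorded_directly is made up for illustration, and the rule itself is only my proposed reading of the definitions, not something Darwin Core specifies.

```python
# Values restated from the two RDF examples (subject -> selected properties).
RECORDS = {
    "urn:lsid:myherbarium.org:image:12345": {
        "creator": "http://myherbarium.org/people/fred_photographer/foaf.rdf",
        "recordedBy": "http://myherbarium.org/people/joe_taxonomist/foaf.rdf",
    },
    "urn:lsid:myherbarium.org:vascular:12345": {
        "creator": "http://myherbarium.org/people/joe_taxonomist/foaf.rdf",
        "recordedBy": "http://myherbarium.org/people/joe_taxonomist/foaf.rdf",
    },
}


def recorded_directly(record):
    """Hypothetical helper: True when creator and recordedBy name the same agent.

    Under the proposed reading, equality suggests the resource was collected
    directly from the organism; inequality suggests it may represent some
    other resource (for example, an image of a specimen).
    """
    return record["creator"] == record["recordedBy"]


for subject, record in RECORDS.items():
    if recorded_directly(record):
        kind = "collected directly from an organism"
    else:
        kind = "may represent some other resource"
    print(subject, "->", kind)
```

With these values the specimen comes out as collected directly and the specimen image does not, which is the distinction I was hoping the two terms could carry.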

Steve

Roger Hyam wrote:
> Steve,
>
> It sounds like you have a lack of IDs. 
>
> You always need at least two IDs to express anything. One is the ID of the thing (the Occurrence); the other is the ID of the metadata for the occurrence (the occurrence record?). You can then say whether the dcterms:creator is describing the metadata (record) or the thing itself (event).
>
> If you have a photo of the thing then you need at least one more ID to show that the dcterms:creator is the person who took the photo not the person who created the specimen or the person who created the metadata.
>
> In a flat record you can't express all this without getting in a real twist over which object each field applies to. I guess (and this is a guess) that is why dwc:recordedBy exists. It is really the creator of the record (in the sense of the biologist with a pencil) and not the creator of the digital record (as in the system that is publishing this thing on the web).
>
> As the semantics get more complex one has to give up and go to regular RDF or risk adding more and more fields.
>
> Just my two pence worth. John may have a more practical explanation.
>
> All the best,
>
> Roger
>
>
>
> On 5 Dec 2009, at 13:08, Steve Baskauf wrote:
>
>   
>> John et al.,
>>
>> I have been pondering the difference in the use of the terms 
>> dwc:recordedBy and dcterms:creator (http://purl.org/dc/terms/creator). 
>> dcterms:creator is defined as:
>> "An entity primarily responsible for making the resource."
>> dwc:recordedBy is defined as:
>> "A list (concatenated and separated) of names of people, groups, or 
>> organizations responsible for recording the original Occurrence. The 
>> primary collector or observer, especially one who applies a personal 
>> identifier (recordNumber), should be listed first."
>>
>> Except for the one word "original" in dwc:recordedBy, I would believe 
>> that these two terms would mean the same thing in the case of Occurrence 
>> resources.  In some cases, such as the collection of a physical specimen 
>> or photographing a live organism, I think that they are the same thing.  
>> The entity that creates the resource (specimen or image) is the same 
>> entity that has recorded the Occurrence.  However, in the situation 
>> where a specimen is imaged, the resulting image resource would have a 
>> dcterms:creator that was the person or institution that did the specimen 
>> imaging, while according to the way that I read the definition, 
>> dwc:recordedBy for the specimen image would have a value that specified 
>> the collector of the specimen (not the photographer). 
>>
>> If I am correct in this interpretation, this distinction would be useful 
>> in the case of images because it would allow for a simple mechanism to 
>> distinguish between images that directly record the appearance of 
>> individual organisms and images that are simply digital representations 
>> of some other thing that records the appearance of an individual 
>> organism, i.e. if dcterms:creator == dwc:recordedBy then the resource was 
>> collected directly from an organism, and if dcterms:creator != 
>> dwc:recordedBy then the resource might represent some other resource 
>> that was collected directly from an organism. 
>>
>> I am not sure how this would apply in situations other than images.  For 
>> example, if a spider were collected and assigned a persistent 
>> identifier, then later for a character documentation project the body 
>> parts were separated and considered separate specimens with their own 
>> identifiers, would the metadata for a leg specimen resource have 
>> dcterms:creator as the person who made the leg prep and dwc:recordedBy 
>> be the person who collected the spider?
>>
>> Basically, I would like to know the intention of the use of the word 
>> "original" in the definition of dwc:recordedBy.
>>
>> Steve Baskauf
>>
>> -- 
>> Steven J. Baskauf, Ph.D., Senior Lecturer
>> Vanderbilt University Dept. of Biological Sciences
>>
>> postal mail address:
>> VU Station B 351634
>> Nashville, TN  37235-1634,  U.S.A.
>>
>> delivery address:
>> 2125 Stevenson Center
>> 1161 21st Ave., S.
>> Nashville, TN 37235
>>
>> office: 2128 Stevenson Center
>> phone: (615) 343-4582,  fax: (615) 343-6707
>> http://bioimages.vanderbilt.edu
>>

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
VU Station B 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 343-6707
http://bioimages.vanderbilt.edu
