Roger,

Thanks for the comments. At this point, I'm thinking about this from the standpoint of GUID (HTTP URI/LSID) resolution rather than flat databases. I am, of course, hampered by my lack of understanding of exactly how the LSID resolution process works.

According to the draft LSID applicability statement, where LSIDs are used there would be a single identifier (GUID) for something like a digital image of a specimen. The metadata associated with the image would be the metadata in the RDF file returned in response to a getmetadata() call. The actual digital image would be the data returned in response to a getdata() call. If I'm understanding this correctly, the image and its metadata are not things that would be assigned separate persistent identifiers.

It's not so clear to me how this would work in the case of a generic HTTP URI identifier that was not a proxy for an LSID. The GUID applicability statement makes the distinction between data and metadata in section 4 ("Machine and human clients that retrieve the metadata associated with a GUID will use the associated typing information to decide how to process the metadata and any associated data.") and specifies that the default format for metadata should be RDF. However, as far as I can tell, the document is silent on the mechanism by which access information for the actual data (the information resource itself, as opposed to its metadata) would be specified to a semantic-web-enabled client. The working examples I've seen so far where HTTP URIs are used as persistent identifiers have all been for non-information resources (i.e. no data, only metadata).
I suppose the actual answer from an RDF perspective would be to do something similar to http://biocol.org/collection/rdf/id/35115 and http://lod.geospecies.org/ses/4XSQO.rdf, where the metadata for the identified resource and the metadata for the metadata are contained in two separate XML wrappers with separate rdf:about attributes. Then there could be separate dcterms:creator elements within each of the two containers. The value of the rdf:about attribute for the resource would be the LSID or persistent HTTP URI of the object itself, and the value of the rdf:about attribute for the metadata could be the URI of the RDF file. The URI for the RDF file would effectively be a separate (non-GUID) identifier for the metadata, as you suggested must exist.
The question of the meaning of dwc:recordedBy is still open, I guess.
How about this? Contents of file http://myherbarium.org/image/rdf/12345.rdf containing the metadata for an image identified as urn:lsid:myherbarium.org:image:12345 of a specimen identified as urn:lsid:myherbarium.org:vascular:12345
<rdf:RDF>
  <dwc:occurrenceID rdf:about="urn:lsid:myherbarium.org:image:12345">
    <dcterms:creator rdf:resource="http://myherbarium.org/people/fred_photographer/foaf.rdf"/>
    <dcterms:created>2008-09-08T10:22:15-0600</dcterms:created>
    <dcterms:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
    <dwc:recordedBy rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
    <dcterms:description>Image of herbarium specimen urn:lsid:myherbarium.org:vascular:12345</dcterms:description>
    ... more metadata about the image...
  </dwc:occurrenceID>
  <rdf:Description rdf:about="http://myherbarium.org/image/rdf/12345.rdf">
    <dcterms:creator>Joe Shmoe Memorial Herbarium</dcterms:creator>
    <dcterms:created>2008-09-08T12:01:33-0600</dcterms:created>
    <dcterms:modified>2009-12-08T13:32:33-0600</dcterms:modified>
    <dcterms:description>RDF formatted description of the image urn:lsid:myherbarium.org:image:12345</dcterms:description>
    <dcterms:language>en</dcterms:language>
    ... more metadata about the metadata ...
  </rdf:Description>
</rdf:RDF>
Contents of file http://myherbarium.org/vascular/rdf/12345.rdf containing the metadata for a specimen identified as urn:lsid:myherbarium.org:vascular:12345
<rdf:RDF>
  <dwc:occurrenceID rdf:about="urn:lsid:myherbarium.org:vascular:12345">
    <dcterms:creator rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
    <dcterms:created>2008-07-02</dcterms:created>
    <dcterms:type rdf:resource="http://purl.org/dc/dcmitype/PhysicalObject"/>
    <dwc:recordedBy rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
    <dcterms:description>Herbarium specimen of Toxicodendron radicans</dcterms:description>
    <dwc:associatedMedia rdf:resource="http://resolver.org/urn:lsid:myherbarium.org:image:12345"/>
    ... more metadata about the specimen...
  </dwc:occurrenceID>
  <rdf:Description rdf:about="http://myherbarium.org/vascular/rdf/12345.rdf">
    <dcterms:creator>Joe Shmoe Memorial Herbarium</dcterms:creator>
    <dcterms:created>2008-09-08T12:01:30-0600</dcterms:created>
    <dcterms:modified>2009-10-07T09:14:08-0600</dcterms:modified>
    <dcterms:description>RDF formatted description of the specimen urn:lsid:myherbarium.org:vascular:12345</dcterms:description>
    <dcterms:language>en</dcterms:language>
    ... more metadata about the metadata ...
  </rdf:Description>
</rdf:RDF>
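To make the two-container idea concrete, here is a rough Python sketch of how a client might tell the two dcterms:creator values apart by the rdf:about of the wrapper each one sits in. It is only an illustration under some assumptions: the xmlns declarations are added here because the example files omit them, the sample is a trimmed copy of the image file, and the function name creators_by_subject is made up for this sketch. It uses plain XML parsing from the standard library, not a real RDF parser.

```python
import xml.etree.ElementTree as ET

# Namespace URIs (assumed here; the example files do not declare them).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS = "http://purl.org/dc/terms/"
DWC = "http://rs.tdwg.org/dwc/terms/"

# Trimmed copy of the image metadata file, with namespace declarations added.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dcterms="{DCTERMS}" xmlns:dwc="{DWC}">
  <dwc:occurrenceID rdf:about="urn:lsid:myherbarium.org:image:12345">
    <dcterms:creator rdf:resource="http://myherbarium.org/people/fred_photographer/foaf.rdf"/>
    <dwc:recordedBy rdf:resource="http://myherbarium.org/people/joe_taxonomist/foaf.rdf"/>
  </dwc:occurrenceID>
  <rdf:Description rdf:about="http://myherbarium.org/image/rdf/12345.rdf">
    <dcterms:creator>Joe Shmoe Memorial Herbarium</dcterms:creator>
  </rdf:Description>
</rdf:RDF>"""


def creators_by_subject(rdf_xml):
    """Map each rdf:about value to the dcterms:creator stated inside that wrapper."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for container in root:
        subject = container.get(f"{{{RDF}}}about")
        creator = container.find(f"{{{DCTERMS}}}creator")
        if subject is None or creator is None:
            continue
        # The creator is either a URI reference (rdf:resource) or a literal string.
        result[subject] = creator.get(f"{{{RDF}}}resource") or (creator.text or "").strip()
    return result


if __name__ == "__main__":
    for subject, creator in creators_by_subject(SAMPLE).items():
        print(subject, "was created by", creator)
```

The point of the sketch is just that the creator of the image and the creator of the RDF file come out under different keys, because each statement is attached to a different identifier.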
Steve
Roger Hyam wrote:
Steve,
It sounds like you have a lack of IDs.
You always need at least two IDs to express anything. One is the ID of the thing (the Occurrence); the other is the ID of the metadata for the occurrence (the occurrence record?). You can then say whether the dcterms:creator is describing the metadata (record) or the thing itself (event).
If you have a photo of the thing then you need at least one more ID to show that the dcterms:creator is the person who took the photo not the person who created the specimen or the person who created the metadata.
In a flat record you can't express all this without getting in a real twist over which field applies to which object. I guess (and this is a guess) that is why dwc:recordedBy exists. It is really the creator of the record (in the sense of the biologist with a pencil) and not the creator of the digital record (as in the system that is publishing this thing on the web).
As the semantics get more complex one has to give up and go to regular RDF or risk adding more and more fields.
Just my two pence worth. John may have a more practical explanation.
All the best,
Roger
On 5 Dec 2009, at 13:08, Steve Baskauf wrote:
John et al.,
I have been pondering the difference in the use of the terms dwc:recordedBy and dcterms:creator (http://purl.org/dc/terms/creator). dcterms:creator is defined as: "An entity primarily responsible for making the resource." dwc:recordedBy is defined as: "A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first."
Except for the one word "original" in dwc:recordedBy, I would believe that these two terms would mean the same thing in the case of Occurrence resources. In some cases, such as the collection of a physical specimen or photographing a live organism, I think that they are the same thing. The entity that creates the resource (specimen or image) is the same entity that has recorded the Occurrence. However, in the situation where a specimen is imaged, the resulting image resource would have a dcterms:creator that was the person or institution that did the specimen imaging, while according to the way that I read the definition, dwc:recordedBy for the specimen image would have a value that specified the collector of the specimen (not the photographer).
If I am correct in this interpretation, this distinction would be useful in the case of images because it would allow for a simple mechanism to distinguish between images that directly record the appearance of individual organisms and images that are simply digital representations of some other thing that records the appearance of an individual organism, i.e. if dcterms:creator == dwc:recordedBy then the resource was collected directly from an organism, and if dcterms:creator != dwc:recordedBy then the resource might represent some other resource that was collected directly from an organism.
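As a rough illustration of that comparison, a client could apply the rule along the following lines. This is a sketch only: the function name classify_image and the two return labels are made up for the example, and it assumes both terms carry a single comparable value (such as a URI) rather than a concatenated list of names.

```python
def classify_image(creator, recorded_by):
    """Apply the proposed rule to one image record.

    creator     -- value of dcterms:creator for the image
    recorded_by -- value of dwc:recordedBy for the image
    """
    if creator == recorded_by:
        # Same agent made the image and recorded the Occurrence.
        return "direct"
    # Different agents: the image may depict another resource (e.g. a specimen).
    return "possibly derived"


if __name__ == "__main__":
    fred = "http://myherbarium.org/people/fred_photographer/foaf.rdf"
    joe = "http://myherbarium.org/people/joe_taxonomist/foaf.rdf"
    print(classify_image(fred, joe))  # image of a specimen collected by someone else
    print(classify_image(joe, joe))   # photo of a live organism taken by its recorder
```

Note that the second outcome is only "possibly" derived, in keeping with the "might represent" wording above: a mismatch suggests, but does not prove, that the image depicts another resource.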
I am not sure how this would apply in situations other than images. For example, if a spider were collected and assigned a persistent identifier, then later for a character documentation project the body parts were separated and considered separate specimens with their own identifiers, would the metadata for a leg specimen resource have dcterms:creator as the person who made the leg prep and dwc:recordedBy be the person who collected the spider?
Basically, I would like to know the intention of the use of the word "original" in the definition of dwc:recordedBy.
Steve Baskauf
-- Steven J. Baskauf, Ph.D., Senior Lecturer Vanderbilt University Dept. of Biological Sciences
postal mail address: VU Station B 351634 Nashville, TN 37235-1634, U.S.A.
delivery address: 2125 Stevenson Center 1161 21st Ave., S. Nashville, TN 37235
office: 2128 Stevenson Center phone: (615) 343-4582, fax: (615) 343-6707 http://bioimages.vanderbilt.edu