[tdwg-guid] Need for citation information in GUID metadata

Thu Nov 8 19:27:36 CET 2007

Disclaimer: I don't fully understand all of the issues involved here,  
as I've only been looking at the biology standards for a few months.  
I may be misinterpreting some of the points being made. However, I  
have a good understanding of related standards in the library world,  
so I hope my comments may be of use.

In my opinion, if you layer too many external semantics onto DC
data, you're going to run into problems later, when you exchange
records with groups that have "regular" DC data. It is possible to
solve these issues by making the metadata relationships explicit,
using existing metadata standards. Here is the metadata for an image
object in a repository I built recently:

     http://fedora.dlib.indiana.edu:8080/fedora/get/iudl:20008/METADATA

At first glance it looks long and complex, but it's relatively easy  
to pick apart. The outer layer of metadata follows the METS schema,  
which is a wrapper format for collecting together different types of  
metadata.

The first inner object is a MODS record. MODS holds essentially the  
same type of information as DC, but it allows for more detailed  
descriptions. This record always describes the artifact in the image,  
not the image itself. And a big advantage of MODS is that it allows  
specification of the thumbnail URL that Greg was originally asking  
about (in the <mods:url access="preview"> element). Note: It is possible to
include a DC representation as well as a MODS representation within a
single METS document.
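
To illustrate the thumbnail point, here is a minimal Python sketch
that pulls the preview URL out of a MODS record. The sample record,
its title, and the example.org URLs are hypothetical placeholders,
not taken from the repository above. Only the
<mods:url access="preview"> element comes from the real record; the
surrounding mods:location wrapper and the second url entry are
assumptions about a typical MODS layout.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace.
MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical record: title and URLs are placeholders for illustration.
SAMPLE_MODS = """\
<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo>
    <mods:title>Flea biting a dog</mods:title>
  </mods:titleInfo>
  <mods:location>
    <mods:url access="preview">http://example.org/images/1234/thumbnail.jpg</mods:url>
    <mods:url access="raw object">http://example.org/images/1234/full.jpg</mods:url>
  </mods:location>
</mods:mods>
"""


def preview_url(mods_xml):
    """Return the first mods:url marked access="preview", or None if absent."""
    root = ET.fromstring(mods_xml)
    node = root.find(".//mods:location/mods:url[@access='preview']", MODS_NS)
    if node is None or not node.text:
        return None
    return node.text.strip()


print(preview_url(SAMPLE_MODS))
```

A rendering client could use a lookup like this to find the
thumbnail without having to guess which dc:identifier (if any) points
at it.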

After the MODS record are MIX records containing detailed technical  
information about each of the image files.

Finally, the mets:fileSec and mets:structMap sections specify  
relationships between the metadata sections and the actual files. In  
this case, the hrefs are relative URLs, but they could easily be full  
URLs or LSIDs.
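
As a rough sketch of how those relationships can be followed, the
Python below walks a heavily trimmed METS document and reports, for
each file, which section describes the artifact and which describes
the file itself. The sample document is hypothetical: every ID and
href is a placeholder, the MODS and MIX payloads are left empty, and
the real record linked above is considerably richer. It assumes the
usual METS linking attributes (DMDID on a structMap div, ADMID on a
file, FILEID on an fptr).

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical, trimmed METS document. IDs and hrefs are placeholders.
SAMPLE_METS = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="MODS"><mets:xmlData/></mets:mdWrap>
  </mets:dmdSec>
  <mets:amdSec>
    <mets:techMD ID="TECH_FULL">
      <mets:mdWrap MDTYPE="NISOIMG"><mets:xmlData/></mets:mdWrap>
    </mets:techMD>
    <mets:techMD ID="TECH_THUMB">
      <mets:mdWrap MDTYPE="NISOIMG"><mets:xmlData/></mets:mdWrap>
    </mets:techMD>
  </mets:amdSec>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FILE_FULL" MIMETYPE="image/jpeg" ADMID="TECH_FULL">
        <mets:FLocat LOCTYPE="URL" xlink:href="full.jpg"/>
      </mets:file>
      <mets:file ID="FILE_THUMB" MIMETYPE="image/jpeg" ADMID="TECH_THUMB">
        <mets:FLocat LOCTYPE="URL" xlink:href="thumbnail.jpg"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap>
    <mets:div DMDID="DMD1">
      <mets:fptr FILEID="FILE_FULL"/>
      <mets:fptr FILEID="FILE_THUMB"/>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def link_files(mets_xml):
    """List each file with the IDs of its artifact-level and file-level metadata."""
    root = ET.fromstring(mets_xml)
    # Index the fileSec entries by ID so structMap pointers can be resolved.
    files = {f.get("ID"): f for f in root.iterfind(".//mets:fileSec//mets:file", NS)}
    links = []
    for div in root.iterfind(".//mets:structMap//mets:div", NS):
        for fptr in div.iterfind("mets:fptr", NS):
            f = files[fptr.get("FILEID")]
            links.append({
                "href": f.find("mets:FLocat", NS).get(XLINK_HREF),
                "describes_artifact": div.get("DMDID"),  # points at the MODS section
                "describes_file": f.get("ADMID"),        # points at the MIX section
            })
    return links


for link in link_files(SAMPLE_METS):
    print(link)
```

The point of the separation is visible in the result: both files
share one artifact-level description, while each carries its own
file-level technical metadata.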

Now, I'm not advocating that you dump RDF in favor of METS. My main  
point is that explicitly separating the different types of metadata  
may be useful. If you would like more information about the  
specifics, let me know.

--- Ryan Scherle
--- Digital Data Repository Architect
--- NESCent

On Nov 6, 2007, at 8:48 PM, Ricardo Scachetti Pereira wrote:

>    Please see my comments inline below.
>
> Bob Morris wrote:
>> The problem that nobody will take a position on is this:
>>
>> Is the metadata on an image file, or on an image?
>>
> I don't want to dismiss this as a simple problem. We've been trying
> to knock it down for a long time now. However, I keep wondering why
> we can't just include information about both (image and image file)
> in the metadata by using different predicates in each case. See an
> example below.
>> Even---or especially if---you stick to DC, you have a problem about
>> what things are part of a description.  If the metadata is about the
>> file, then it is reasonable to express, e.g. that it has 1200x800
>> pixels, encoded as jpeg but perhaps not that it is a picture of a
>> flea biting a dog.  If the image is being described, the reverse
>> might hold.
>>
> Couldn't we say the following about an image?
>
> <rdf:RDF>
>    <tdwg:Image rdf:about="urn:lsid:example.com:image:1234">
>       <dc:title>Picture of my dog Scratchy</dc:title>
>       <dc:subject>A picture of a flea biting my dog.</dc:subject>
>       <dc:description>A description of a flea biting my dog. You
> get the idea, but an image is worth a thousand words...</dc:description>
>       <dc:identifier>urn:lsid:example.com:image:1234</dc:identifier>
>       <dc:format>image/jpeg</dc:format>
>       <tdwg:imageDimensions>1200x800</tdwg:imageDimensions>
>    </tdwg:Image>
> </rdf:RDF>
>
> Even though I bet the RDF isn't valid, I hope you get the point  
> that each predicate refers to either the file or the image, but not  
> both.
>
> If some of these predicates aren't suitable, we can always use some  
> other vocabularies (EXIF?). If you want to refer to what's in the  
> picture, we can somehow point to our familiar biodiversity  
> information objects: taxon name, observation, specimen, etc.
>
> Is there a case where this can't be done?
>
>> ... rendering clients probably
>> desperately need the pixel size and also information about where to
>> find other sizes of the "same" image.
>>
> That's a different problem. We had agreed that LSIDs can't be used  
> if the number of representations of an image is infinite or just  
> very large. Should we be looking at OpenURL or just Web services  
> (and WSDL)? But that's a little advanced for our simple discussion
> thread, isn't it?
>
>
> So, is this a feasible solution, or is there a class of counter  
> examples that I'm missing completely?
>
> Cheers,
>
> Ricardo