[tdwg-tag] string literals vs. uris for dwc:recordedBy, dwc:identifiedBy, and dwc:georeferencedBy in RDF

John Wieczorek tuco at berkeley.edu
Sat May 29 18:53:39 CEST 2010


While making the case on tdwg-content for dropping
dwc:accordingTo a couple of things occurred to me. One is that
dwc:accordingTo fills a role that isn't found in Dublin Core and was
the reason I came up with it in the first place. The term is meant to
signify a source for the information, and that the source might be a
dcterms:Agent or it might be a publication. relationshipAccordingTo
might be either one of these. nameAccordingTo really is supposed to be
a publication. georeferencedBy was intended to be an Agent, but I can
see how the agent might be lost to knowledge while the publication that
was the source isn't. None of the dcterms (creator, contributor,
provenance, source) covers this more generic idea of an authority.
Well, that was the idea behind dwc:accordingTo. The big question is,
"Is the role useful?"

The second thing that occurred to me is that getting rid of the term
is a lot of work compared to not getting rid of it, so if it is doing
no harm, shouldn't it just be left alone? That's what I intend to do
now rather than go through the excision process.

However, one of the most recent terms, dwc:locationAccordingTo, is not
a refinement of dwc:accordingTo. For consistency, if dwc:accordingTo
is retained, dwc:locationAccordingTo should refine it. I will make
that change, treating the omission as an oversight, unless there is
any opposition.
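
As a rough sketch only, the refinement could be written in RDF/XML as
below. It assumes that "refines" is expressed with rdfs:subPropertyOf;
that mapping is an assumption made for illustration and is not taken
from the published DwC RDF.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <!-- Assumption: a DwC refinement is modelled as rdfs:subPropertyOf -->
  <rdf:Property rdf:about="http://rs.tdwg.org/dwc/terms/locationAccordingTo">
    <rdfs:subPropertyOf rdf:resource="http://rs.tdwg.org/dwc/terms/accordingTo"/>
  </rdf:Property>
</rdf:RDF>
```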


On Sat, May 29, 2010 at 8:45 AM, John Wieczorek <tuco at berkeley.edu> wrote:
> Dear all,
>
> Sorry to have to jump in to the discussion late. I'm just back out of
> the field in Jujuy where the winds took out the little connectivity
> that once existed.
>
> I'm stepping back to the beginning of the conversation, out of the RDF
> representation issue, which seems to be reasonably resolved under the
> thread "Darwin Core generic XML with attributes."
>
> The Darwin Core terms Steve has brought to attention are among a
> larger set (relationshipAccordingTo, georeferencedBy,
> measurementDeterminedBy, recordedBy, identifiedBy) that all are
> intended to cover the real world situation of data coming out of
> database fields for which there was no sense of persistent global
> unique identification. They were meant for literal strings for
> dcterms:Agents. There is nothing to proscribe the use of URIs in these
> fields. However, though there has been a lot of discussion about how
> one might represent the URI and string literal information about the
> recordedBy Agent, there has been no statement of consensus in the TAG
> from discussions in this thread and the related thread "Darwin Core
> generic XML with attributes" about whether actionable GUIDs for these
> same concepts should have their own terms.
>
> As I see it, we got this far:
>
> In the thread "Darwin Core generic XML with attributes" Rod Page said:
>
> "If you want a literal for <dwc:recordedBy> (say for ease of display)
> then I think you want a different tag that is expressly defined to do
> just that. For example,
> http://rs.tdwg.org/ontology/voc/TaxonOccurrence
>  has <identifiedTo>  to point to a URI for a taxon, and
> <identifiedToString> if you want the literal. I don't know if dwc has
> anything equivalent for recordedBy (and can somebody please tell me
> why we now have so many vocabularies for the same things?)"
>
> I think what Rod proposes is cleaner than using the same term for two
> purposes, as much as I too would like to have my cake and eat it. I
> agree with Rod, except that dwc:recordedBy is the string literal
> version of the concept, and no URI version (something like
> dwc:recordedByID refines dcterms:identifier) currently exists. Up to
> now there has not been a need expressed, hence, no such term was
> included in DwC.
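>
> A minimal sketch of that separation, for illustration only:
> dwc:recordedBy carries the string literal, and a hypothetical
> dwc:recordedByID carries the URI. dwc:recordedByID is not a ratified
> DwC term; it is assumed here purely to show the shape of the data.
>
> ```xml
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
>   <dwc:Occurrence rdf:about="http://herbarium.org/hb123456">
>     <!-- Existing term: string literal for the Agent -->
>     <dwc:recordedBy>Steve Baskauf</dwc:recordedBy>
>     <!-- Hypothetical term (not in DwC): URI for the same Agent -->
>     <dwc:recordedByID
>       rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"/>
>   </dwc:Occurrence>
> </rdf:RDF>
> ```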
>
> In answer to Rod's last question, we have many vocabularies
> (unfinished attempts), but only one that is a ratified standard.
> That standard isn't an ontology, nor is it complete in describing best
> practices for expressing information in all representations. It's
> reasonably clear for how to share information in text files, and about
> the recommendations for XML, but in terms of RDF it is silent,
> awaiting the badly needed ontology work.
>
> Reviewing DwC in light of these discussions, I noted the following:
>
> 1) The set of terms (relationshipAccordingTo, georeferencedBy,
> measurementDeterminedBy, recordedBy, identifiedBy) all currently
> refine an abstract term dwc:accordingTo, defined as "Abstract term to
> attribute information to a source." This refinement doesn't really do
> anything useful in my opinion. I think the term dwc:accordingTo should
> be dropped, and the terms listed above should instead refine
> dcterms:contributor.
>
> 2) For one of these concepts (nameAccordingTo) it was anticipated that
> there would be actionable GUIDs and the term nameAccordingToID was
> added to clarify the distinction. nameAccordingTo currently also
> refines dwc:accordingTo, but I think this is erroneous, as it is meant
> to refer to a publication, not to a dcterms:Agent. This problem will
> be resolved if dwc:accordingTo is dropped and nameAccordingTo is
> changed to have no refines value.
>
> 3) In cases where actionable GUIDs were anticipated to be commonly
> used, terms with names ending with "ID" were added specifically for
> this purpose (relatedResourceID, resourceID, resourceRelationshipID,
> higherGeographyID, parentNameUsageID, nameAccordingToID,
> originalNameUsageID, eventID, individualID, occurrenceID,
> geologicalContextID, namePublishedInID, measurementID,
> acceptedNameUsageID, scientificNameID, locationID, identificationID,
> datasetID, taxonConceptID, taxonID). All of these terms refine
> dcterms:identifier and many of them are examples illustrating Rod's
> recommendation to separate identifiers from string literal
> representations of the concepts. If it is deemed that further "ID"
> terms are needed, then a public discussion must take place on the
> tdwg-content list.
>
> Dropping dwc:accordingTo and changing the refinements requires
> procedure under section 3.3 of the Darwin Core Namespace Policy
> (http://rs.tdwg.org/dwc/terms/namespace/index.htm#classesofchanges).
> I'll follow the protocol for changes for items 1 and 2 above, and
> await further discussion on item 3.
>
> Cheers,
>
> John
>
> On Wed, May 19, 2010 at 10:00 PM, Bob Morris <morris.bob at gmail.com> wrote:
>> As Peter Ansell points out, another solution seems to be to use two
>> dwc:recordedBy elements:
>>
>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>                 xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
>>                 >
>> <dwc:Occurrence rdf:about="http://herbarium.org/hb123456">
>> <dwc:recordedBy
>> rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"/>
>> <dwc:recordedBy>Steve Baskauf</dwc:recordedBy>
>> </dwc:Occurrence>
>> </rdf:RDF>
>>
>>
>>
>> I don't think the parse error is about the use of rdf:resource per se.
>>  I must confess that I'm having trouble understanding how to conclude
>> from the RDF spec why the original is a parse error, but these things
>> usually can be understood by trying to show what the triples are, not
>> by staring at the spec for some rdf tag.  I'm finding RDF/XML
>> increasingly irksome and unpleasant in this kind of task. If one
>> unwinds it here,  I bet it is that your failing example has subject
>> "http://herbarium.org/hb123456", predicate dwc:recordedBy but is
>> trying to have  two objects.
>>
>> The fact that you want to associate the URI for you with the string
>> for you is probably a separate problem, and possibly not expressible
>> within dwc itself.
>>
>> Bob
>>
>>
>> On Wed, May 19, 2010 at 8:59 PM, Steve Baskauf
>> <steve.baskauf at vanderbilt.edu> wrote:
>>> In the specific case of RDF, having your cake and eating it doesn't work.
>>> Paste this:
>>>
>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>                  xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
>>>                  >
>>> <dwc:Occurrence rdf:about="http://herbarium.org/hb123456">
>>> <dwc:recordedBy
>>> rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me">Steve
>>> Baskauf</dwc:recordedBy>
>>> </dwc:Occurrence>
>>> </rdf:RDF>
>>>
>>> into the W3C RDF validator at:
>>> http://www.w3.org/RDF/Validator/
>>> and it will tell you "The attributes on this property element, are not
>>> permitted with any content; expecting end element tag.".  So in RDF, elements
>>> having the rdf:resource attribute have to be empty elements.  I tried
>>> validating an example where the recordedBy property was included twice, once
>>> with a URI object and once with a string literal object.  It validated as
>>> "good" RDF, but I think it would be confusing to a linked data client that
>>> would really have no clue that both objects represented the same thing and
>>> would probably "assume" that the occurrence was recorded by two entities
>>> rather than one.
>>>
>>> A possible solution would be to use dcterms:description as another
>>> attribute.  dcterms:description is defined as "An account of the resource.
>>> Description may include ...a free-text account of the resource."  I couldn't
>>> find a more appropriate Dublin Core term to use as an attribute.  So running this
>>> example:
>>>
>>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>                 xmlns:dcterms="http://purl.org/dc/terms/"
>>>                  xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
>>>                  >
>>> <dwc:Occurrence rdf:about="http://herbarium.org/hb123456">
>>> <dwc:recordedBy
>>> rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"
>>> dcterms:description="Steve Baskauf" />
>>> </dwc:Occurrence>
>>> </rdf:RDF>
>>>
>>> through the validator shows that this RDF asserts the following triples:
>>> http://herbarium.org/hb123456  dwc:recordedBy
>>> http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me
>>> and that
>>> http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me
>>> dcterms:description  "Steve Baskauf"
>>>
>>> In other words, the occurrence was recorded by me (identified by my URI) and
>>> that the description of the thing represented by my URI is "Steve Baskauf".
>>> That is pretty much a correct representation of the situation, although the
>>> whole point of using a URI as the object of a property is for a client to
>>> dereference the URI to find out more about the object.  The FOAF file
>>> (pointed to by the URI) would provide that information without the
>>> dcterms:description attribute.
>>>
>>> Steve
>>>
>>>
>>> Jim Croft wrote:
>>>
>>> wondering if
>>> <dwc:recordedBy
>>> rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me">Steve
>>> Baskauf</dwc:recordedBy>
>>> is legit?
>>>
>>> just a have your cake and eat it kinda guy...
>>>
>>> jim
>>>
>>> On Thu, May 20, 2010 at 7:41 AM, Kevin Richards
>>> <RichardsK at landcareresearch.co.nz> wrote:
>>>
>>>
>>> From my understanding (and after reading the example Bob referred to), the
>>> difference is:
>>>
>>> [referring to external id]
>>> <dwc:recordedBy
>>> rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me" />
>>>
>>> [inline text]
>>> <dwc:recordedBy>Steve Baskauf</dwc:recordedBy>
>>>
>>> Look right?
>>>
>>> Kevin
>>>
>>> -----Original Message-----
>>> From: tdwg-tag-bounces at lists.tdwg.org
>>> [mailto:tdwg-tag-bounces at lists.tdwg.org] On Behalf Of Jim Croft
>>> Sent: Thursday, 20 May 2010 9:37 a.m.
>>> To: Bob Morris
>>> Cc: tdwg-tag at lists.tdwg.org
>>> Subject: Re: [tdwg-tag] string literals vs. uris for dwc:recordedBy,
>>> dwc:identifiedBy, and dwc:georeferencedBy in RDF
>>>
>>> Hi Bob - should the same term allow both types of content, or should
>>> there be a different term for each?  Does it matter?  Should
>>> applications be smart enough to tell the difference and know what to
>>> do with it?
>>>
>>> Not really asking what the specification says, but about purity and
>>> wholesomeness of design... :)
>>>
>>> jim
>>>
>>> On Thu, May 20, 2010 at 4:26 AM, Bob Morris <morris.bob at gmail.com> wrote:
>>>
>>>
>>> Exactly this example is given in
>>> http://web4.w3.org/TR/REC-rdf-syntax/#section-Syntax-property-attributes
>>> so I would find it regrettable if DwC does something somewhere that
>>> makes this substitution impossible or discouraged,  or encourages tool
>>> construction that does so, or encourages documentation be interpreted in
>>> a way that does so.
>>>
>>> Indeed http://rs.tdwg.org/dwc/rdf/dwcterms.rdf defines its type to be
>>> rdf:Property and is silent on any semantics  but that. My own
>>> conclusion is that neither the intent nor the outcome of the rdf
>>> version of dwcterms discourages what you want, though I suppose the
>>> intent part would be clearer if the documentation also said that a URI
>>> can always be used, but applications are responsible for interpreting
>>> it.
>>>
>>>
>>> On Wed, May 19, 2010 at 11:09 AM, Steve Baskauf
>>> <steve.baskauf at vanderbilt.edu> wrote:
>>>
>>>
>>> The definition for the Darwin Core term recordedBy
>>> http://rs.tdwg.org/dwc/terms/index.htm#recordedBy
>>> says "A list (concatenated and separated) of names ...".  The examples
>>> given are string literals.  However, when using this term as a predicate
>>> in RDF, it would seem preferable to use a URI to an RDF representation
>>> of the entity (if one exists) rather than a string literal.  For
>>> example, can I use:
>>> <dwc:recordedBy
>>> rdf:resource="http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"/>
>>> rather than
>>> <dwc:recordedBy>Steven J. Baskauf</dwc:recordedBy>
>>> ?
>>>
>>> Steve Baskauf
>>> --
>>>
>>> Steven J. Baskauf, Ph.D., Senior Lecturer
>>> Vanderbilt University Dept. of Biological Sciences
>>>
>>> postal mail address:
>>> VU Station B 351634
>>> Nashville, TN  37235-1634,  U.S.A.
>>>
>>> delivery address:
>>> 2125 Stevenson Center
>>> 1161 21st Ave., S.
>>> Nashville, TN 37235
>>>
>>> office: 2128 Stevenson Center
>>> phone: (615) 343-4582,  fax: (615) 343-6707
>>> http://bioimages.vanderbilt.edu
>>>
>>> _______________________________________________
>>> tdwg-tag mailing list
>>> tdwg-tag at lists.tdwg.org
>>> http://lists.tdwg.org/mailman/listinfo/tdwg-tag
>>>
>>>
>>>
>>>
>>> --
>>> Robert A. Morris
>>> Emeritus Professor  of Computer Science
>>> UMASS-Boston
>>> 100 Morrissey Blvd
>>> Boston, MA 02125-3390
>>> Associate, Harvard University Herbaria
>>> email: ram at cs.umb.edu
>>> web: http://bdei.cs.umb.edu/
>>> web: http://etaxonomy.org/FilteredPush
>>> http://www.cs.umb.edu/~ram
>>> phone (+1)617 287 6466
>>>
>>> --
>>> _________________
>>> Jim Croft ~ jim.croft at gmail.com ~ +61-2-62509499 ~
>>> http://www.google.com/profiles/jim.croft
>>> 'A civilized society is one which tolerates eccentricity to the point
>>> of doubtful sanity.'
>>>  - Robert Frost, poet (1874-1963)
>>
>


