[tdwg-content] RFC 2119: Re: Public comment on the Darwin Core RDF Guide

Steve Baskauf steve.baskauf at vanderbilt.edu
Sun Dec 14 22:51:45 CET 2014


I had intended to make a comment on this topic earlier, but got too 
busy.  I wanted to provide an example where use of this kind of language 
might be good, but for which it is not clear to me which of the key 
words would be appropriate.

The section of the guide that discusses typed literals [1] notes that a 
literal has a different meaning depending on whether it has a datatype 
attribute or not.  The literal:

"35.85"^^xsd:decimal

is a decimal number.  The literal:

"35.85"

has the implied datatype attribute xsd:string.  Thus a client that is 
capable of semantic processing of ingested literals would not interpret 
these two literals as being the same thing.  One is an abstract 
mathematical entity and the other is a string of characters.
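
To make the distinction concrete, here is a minimal Python sketch.  It is 
not taken from the guide: the Literal class and its value() method are 
hypothetical names used only for illustration, and it assumes that a 
literal without a datatype is treated as xsd:string, as described above.

```python
from dataclasses import dataclass
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal: lexical form plus datatype."""
    lexical: str
    datatype: str = XSD + "string"  # assumption: untyped means xsd:string

    def value(self):
        # Only xsd:decimal is mapped to a number in this sketch;
        # everything else is left as a string of characters.
        if self.datatype == XSD + "decimal":
            return Decimal(self.lexical)
        return self.lexical


typed = Literal("35.85", XSD + "decimal")
typed_padded = Literal("35.850", XSD + "decimal")
plain = Literal("35.85")
plain_padded = Literal("35.850")

# As numbers, 35.85 and 35.850 are the same value.
same_number = typed.value() == typed_padded.value()
# As strings, "35.85" and "35.850" are different.
same_string = plain.value() == plain_padded.value()
# A number and a string of characters are never the same thing.
typed_equals_plain = typed.value() == plain.value()
```

Here same_number is True while same_string and typed_equals_plain are 
both False, which is the sense in which a semantically aware client would 
not treat the two literals as interchangeable.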

So should the guidelines say something like "literals MUST be typed so 
that clients can properly interpret them"?  That might be desirable if 
we assume that most clients will be doing serious semantic processing 
and will depend on receiving accurate information from providers.  
However, many providers probably won't (at least initially) bother to 
figure out what kind of datatype to assign to their literals; they will 
probably just spew out whatever strings are in their existing database 
and we have no way to enforce that they MUST do it.  So should we just 
say "literals SHOULD be typed so that clients can properly interpret 
them"?  On the other hand, we may be keen to get triples exposed as soon 
as possible and figure that it is the consuming client's job to figure 
out what literals "mean", rather than placing that burden on the 
provider.  The clients will probably have to do that anyway since we 
have no way to enforce that providers include datatypes.  So should the 
spec say "literals MAY be typed so that clients can properly interpret 
them"?  Or maybe we could assume that some kind of cleanup would be done 
on triples from the wild before they are placed into some kind of 
community triplestore?  In that case, we don't really care that much and 
could say that datatypes are OPTIONAL.
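
If the burden does fall on the consuming client, the "figuring out" might 
look something like the following sketch.  This is only a guess at one 
possible heuristic, not anything the guide specifies; guess_datatype and 
the patterns it uses are hypothetical.

```python
import re

# Hypothetical patterns; a real client would need many more cases
# (dates, booleans, scientific notation, and so on).
_INTEGER = re.compile(r"[+-]?\d+")
_DECIMAL = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")


def guess_datatype(lexical: str) -> str:
    """Guess an XSD datatype for an untyped literal from its lexical form."""
    text = lexical.strip()
    if _INTEGER.fullmatch(text):
        return "xsd:integer"
    if _DECIMAL.fullmatch(text):
        return "xsd:decimal"
    # Anything unrecognized is left as a plain string.
    return "xsd:string"


guesses = {s: guess_datatype(s) for s in ["35.85", "1987", "Quercus alba"]}
```

Such guessing is inherently risky: a value like "1987" would be guessed 
to be an integer even if the provider meant it as a catalog number, which 
is one reason a client might prefer that providers supply datatypes.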

My point is that until we start seeing what the use cases are and how 
RDF ends up getting implemented in real life, it would be difficult to 
know which of the keywords should be used in the guide.  I certainly 
would not know which keywords I should use in many cases.

Steve

[1] 
https://code.google.com/p/tdwg-rdf/wiki/DwcRdfGuideProposalRevised#2.4.1.1_Typed_literals

Paul J. Morris wrote:
> Bob Morris noted that it may be appropriate for the DarwinCore RDF
> Guide to follow RFC 2119 (something for which at least the TDWG GUID
> applicability statement provides precedent).  
>
> The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
> "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
> document are to be interpreted as described in RFC 2119.
>
> Internet Engineering Task Force (IETF). RFC 2119. Key words for use in
> RFCs to Indicate Requirement Levels. http://www.ietf.org/rfc/rfc2119.txt
>
> This RFC notes: "These words are often capitalized" and "Imperatives of
> the type defined in this memo must be used with care and sparingly.  In
> particular, they MUST only be used where it is actually required for
> interoperation or to limit behavior which has potential for causing
> harm".   There are 69 occurrences of "should" in the guide; many of
> these instances appear to have the intent of colloquial English, and
> probably other words should be substituted in those cases.  
>
> I've taken a stab here at a set of cases where it feels like it might
> be appropriate to use the RFC 2119 imperatives.  
>
> A question is whether it is the role of the guide to use MUST/MUST
> NOT/REQUIRED anywhere (except where that is forced by inheritance from
> elsewhere)?  A potential place for asserting MUST/MUST NOT is the
> distinction between dwc and dwciri namespaces.  I've put more
> discussion under the headings 2.4.1.2 and 2.5 below.
>
> -Paul
>
> ----
>
> 1.3.2.1 Persistent Identifiers
>
> s/the provider should take care to ensure/the provider SHOULD take care
> to ensure/
>
> s/it must be converted to an IRI/it MUST be converted to an IRI/
>
> ----
>
> 1.3.2.2 HTTP IRIs as self-resolving GUIDs
>
> s/through RDF should plan to implement GUIDs/through RDF MUST plan to
> implement GUIDs/
>
> For consistency with "Must" in the GUID applicability statement.
>
> ----
>
> 1.4.1 Well-known vocabularies
>
> s/the provider should assign the term an IRI,/the provider SHOULD
> assign the term an IRI,/
>
> ----
>
> 1.4.3 Use of Darwin Core terms in RDF
>
> s/each value should be referenced/each value MUST be referenced/
>
> By the nature of object references.
>
> ----
>
> 1.4.4 Limitations of this guide
>
> s/Darwin Core property terms should be used as RDF predicates and
> specifies that Darwin Core class terms should be used in
> rdf:type/Darwin Core property terms SHOULD be used as RDF predicates
> and specifies that Darwin Core class terms SHOULD be used in rdf:type/
>
> ----
>
> 1.5.5 Implications for expressing Darwin Core string values as RDF
>
> s/in which RDF should be structured/in which RDF SHOULD be structured/
>
> ----
>
> Example 1:
>
> s/Predicates must be identified by IRIs/Predicates MUST be identified
> by IRIs/
>
> ----
>
> 2.2 Subject resources
>
> s/it must be referenced by an IRI/it MUST be referenced by an IRI/
>
> ----
>
> 2.2.2 Associating a string identifier with a subject resource
>
> s/dcterms:identifier should be used to/dcterms:identifier SHOULD be
> used to/
>
> s/it is acceptable to present it as a string literal value for
> dcterms:identifier/it MAY be presented as a string literal value for
> dcterms:identifier/
>
> ----
>
> 2.3.1.1 rdf:type statement
>
> s/The class should be identified by an IRI reference/The class MUST be
> identified by an IRI reference/
>
> ----
>
> 2.3.1.3 Explicit vs. inferred type declarations
>
> s/data providers should exercise caution in using any such term in a
> non-standard way/data providers SHOULD NOT use any such term in a
> non-standard way/
>
> s/the provider should type the resource/the provider MAY type the
> resource/ ?? Or ??
> s/the provider should type the resource/the provider SHOULD type the
> resource/
>
> ----
>
> 2.3.1.4 Other predicates used to indicate type
>
> s/in an RDF description should be considered optional, while including
> rdf:type should be considered highly recommended/in an RDF description
> is OPTIONAL, while rdf:type SHOULD be included/
>
> s/A dwciri: analogue (Section 2.5) of dwc:basisOfRecord should not be
> used/A dwciri: analogue (Section 2.5) of dwc:basisOfRecord MUST NOT be
> used/
>
> ----
>
> 2.3.1.5 Classes to be used for type declarations of resources described
> using Darwin Core
>
> s/that should also be used/that SHOULD also be used/
>
> ----
>
> Example 9:
>
> s/a provider should include an xml:lang/a provider SHOULD include an
> xml:lang/
>
> ----
>
> Example 10:
>
> I'm not sure about this one: 
>
> /language tags should be interpreted by clients
>
> s/may initially choose to expose literals without datatype attributes,
> they should/MAY initially choose to expose literals without datatype
> attributes, they SHOULD/
>
> ----
>
> 2.4.1.2 Terms intended for use with literal objects
>
> I'll put a stake in the ground for discussion here:  Is the guide
> sufficiently normative to assert MUST (other than where that property
> is inherited from elsewhere)?  If so, then the distinction between dwc
> and dwciri is the place to make that assertion: 
>
> s/that terms in the dwc: namespace should be restricted to use with
> literal objects/that terms in the dwc: namespace MUST be restricted to
> use with literal objects/
>
> ----
>
> 2.4.2.1.1 Objects identified by LSIDs
>
> This one needs to be checked for consistency with the LSID
> Applicability Statements:
>
> s/version of the LSID should be used instead/version of the LSID SHOULD
> be used instead/
>
> ----
>
> Example 17:
>
> s/particular terms should be used with literal objects, or with IRI
> reference objects/particular terms SHOULD be used with literal objects,
> or SHOULD be used with IRI reference objects/
>
>
> ----
>
> 2.4.3.2 Literal values for non-literal resources in Darwin Core
>
> s/existing Darwin Core term in the dwc: namespace should have the same
> structure/existing Darwin Core term in the dwc: namespace SHOULD have
> the same structure/
>
> ----
>
> 2.5 Terms in the dwciri: namespace
>
> This is the companion case to 2.4.1.2: is the distinction between
> dwciri and dwc to be asserted as SHOULD or MUST?  Using them
> incorrectly does have the "potential for causing harm" in the language
> of RFC 2119, but does the guide rise to the level of specifying an
> "absolute requirement of the specification"?
>
> s/IRI reference objects and should NOT be used with literal/ IRI
> reference objects and MUST NOT be used with literal/
>
> ----
>
> 2.5.1 Definition of dwciri: terms
>
> s/resource described by a dwciri: property should be the subject of a
> triple for each value on the list/resource described by a dwciri:
> property SHOULD be the subject of a triple for each value on the list/
>
> A counterexample here comes from the Harvard List of Botanists, where
> 	http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/305dcac3-a748-4f47-8a18-3aaa9503c471
> references the collector team "J. D. Hooker & A. Gray", that is,
> http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/cc3b7080-5fbb-4ea5-9655-49302d3e6a1b
> and
> http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/3f8c70aa-1862-4784-8a53-f69ffe810fa0
>
> ----
>
> 2.5.3 Expectation of clients encountering RDF containing dwc: and
> dwciri: terms
>
> Instances of "should" here again cut to the heart of how strong the
> guidance is concerning dwc and dwciri.
>
> ----
>
> Table 3
>
> s/they should be used rather than using the Darwin Core ID/they SHOULD
> be used rather than using the Darwin Core ID/
>
> ----
>
> Example 22:
>
> s/The following points about the Example 22 should be noted:/Note the
> following points about Example 22:/
>
> s/related non-literal resources should use/related non-literal
> resources SHOULD use/
>
> /LSIDs are the objects of triples, they should be
>
> As with 2.4.2.1.1, this guidance needs to be checked against the GUID
> and LSID applicability statements.  I think I'm reading the GUID
> applicability statement as s/should/MUST/ here.
>
> ----
>
> 2.7.1 What purpose do convenience terms serve?
>
> s/In general, it should not be necessary for a data provider/In
> general, a data provider does not need/
>
> ----
>
> 2.7.3 Ownership of a collection item
>
> s/of the collection item should be indicated/of the collection item
> SHOULD be indicated/
>
> ----
>
> 2.7.4 Description of a taxonomic entity
>
> s/It is considered to be out of the scope of this document to specify
> how taxon concepts should be rendered as RDF/It is out of the scope of
> this document to specify how to render taxon concepts as RDF/
>
>
> s/terms for taxonomic entities should be properties of
> dwc:Identification/terms for taxonomic entities MAY be properties of
> dwc:Identification/
>
> s/The task of describing taxonomic entities using RDF must be an effort
> outside of Darwin Core/The task of describing taxonomic entities using
> RDF is out of scope of this document/
>
> ----
>
> 2.8.3 Expressing Darwin Core association terms as RDF with URI
> references
>
> s/it should be used to declare the type/rdf:type SHOULD be used to
> declare the type/
>
> ----
>
> Tables 3.4 to 3.7.
>
> s/should/SHOULD/ 
>
> --------
>
>   

-- 
Steven J. Baskauf, Ph.D., Senior Lecturer
Vanderbilt University Dept. of Biological Sciences

postal mail address:
PMB 351634
Nashville, TN  37235-1634,  U.S.A.

delivery address:
2125 Stevenson Center
1161 21st Ave., S.
Nashville, TN 37235

office: 2128 Stevenson Center
phone: (615) 343-4582,  fax: (615) 322-4942
If you fax, please phone or email so that I will know to look for it.
http://bioimages.vanderbilt.edu
http://vanderbilt.edu/trees

