I had intended to make a comment on this topic earlier, but got too busy. I wanted to provide an example where use of this kind of language might be good, but for which it is not clear to me which of the key words would be appropriate.
The section of the guide that discusses typed literals [1] notes that a literal has a different meaning depending on whether it has a datatype attribute or not. The literal:
"35.85"^^xsd:decimal
is a decimal number. The literal:
"35.85"
has the implied datatype xsd:string. Thus a client that is capable of semantic processing of ingested literals would not interpret these two literals as being the same thing: one is an abstract mathematical entity and the other is a string of characters.
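To make the difference concrete, here is a minimal Python sketch of how a datatype-aware client might compare the two literals. The `Literal` class and its `value()` method are hypothetical stand-ins for whatever an RDF library would provide, not part of any real API; the only assumption carried over from the guide is that an untyped literal is treated as xsd:string.

```python
from dataclasses import dataclass
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal: a lexical form plus a datatype IRI."""
    lexical: str
    datatype: str = XSD + "string"  # assumption: untyped means xsd:string

    def value(self):
        # Map the lexical form into the value space of the datatype.
        if self.datatype == XSD + "decimal":
            return Decimal(self.lexical)
        return self.lexical


typed = Literal("35.85", XSD + "decimal")
plain = Literal("35.85")
same_number = Literal("35.850", XSD + "decimal")

print(typed == plain)                        # False: the datatypes differ
print(typed.value() == same_number.value())  # True: the same decimal number
print(plain.value() == "35.850")             # False: different character strings
```

The typed literals compare equal as numbers even though their lexical forms differ, while the untyped literal only ever matches an identical string.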
So should the guidelines say something like "literals MUST be typed so that clients can properly interpret them"? That might be desirable if we assume that most clients will be doing serious semantic processing and will depend on receiving accurate information from providers. However, many providers probably won't (at least initially) bother to figure out what kind of datatype to assign to their literals; they will probably just spew out whatever strings are in their existing database, and we have no way to enforce a MUST. So should we just say "literals SHOULD be typed so that clients can properly interpret them"?
On the other hand, we may be keen to get triples exposed as soon as possible and figure that it is the consuming client's job to figure out what literals "mean", rather than placing that burden on the provider. The clients will probably have to do that anyway, since we have no way to enforce that providers include datatypes. So should the spec say "literals MAY be typed so that clients can properly interpret them"?
Or maybe we could assume that some kind of cleanup would be done on triples from the wild before they are placed into some kind of community triplestore? In that case, we don't really care that much and could say that datatypes are OPTIONAL.
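As a rough illustration of what "leaving it to the client" would involve, here is a sketch of a client-side heuristic that guesses a value for an untyped literal. The function name `guess_value` and the order of the guesses are my own assumptions, not anything specified in the guide.

```python
from decimal import Decimal, InvalidOperation


def guess_value(lexical):
    """Hypothetical heuristic: guess a typed value for an untyped literal."""
    text = lexical.strip()
    try:
        return int(text)          # try integer first
    except ValueError:
        pass
    try:
        number = Decimal(text)    # then decimal
    except InvalidOperation:
        return lexical            # otherwise keep the original string
    # Treat NaN and Infinity as strings rather than numbers.
    return number if number.is_finite() else lexical


print(guess_value("35.85"))         # 35.85 (as a Decimal)
print(guess_value("Quercus alba"))  # Quercus alba
print(guess_value("00123"))         # 123
```

The last case shows the cost of guessing: a string such as "00123" that was really a catalog number loses its leading zeros, which is exactly the kind of ambiguity a provider-supplied datatype would remove.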
My point is that until we start seeing what the use cases are and how RDF ends up getting implemented in real life, it would be difficult to know which of the keywords should be used in the guide. I certainly would not know which keywords I should use in many cases.
Steve
[1] https://code.google.com/p/tdwg-rdf/wiki/DwcRdfGuideProposalRevised#2.4.1.1_T...
Paul J. Morris wrote:
Bob Morris noted that it may be appropriate for the DarwinCore RDF Guide to follow RFC 2119 (something for which at least the TDWG GUID applicability statement provides precedent).
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Internet Engineering Task Force (IETF). RFC 2119. Key words for use in RFCs to Indicate Requirement Levels. http://www.ietf.org/rfc/rfc2119.txt
This RFC notes: "These words are often capitalized" and "Imperatives of the type defined in this memo must be used with care and sparingly. In particular, they MUST only be used where it is actually required for interoperation or to limit behavior which has potential for causing harm". There are 69 occurrences of "should" in the guide; many of these appear to be intended as colloquial English, and other words should probably be substituted in those cases.
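A quick way to separate the two kinds of usage is to count capitalized and lowercase occurrences of the RFC 2119 key words. The sketch below does that in Python; treating "all capitals" as normative and anything else as colloquial is an assumption of the sketch, and `count_keywords` is a hypothetical helper name.

```python
import re

# RFC 2119 key words, longest phrases first so "MUST NOT" wins over "MUST".
RFC2119 = ["MUST NOT", "SHALL NOT", "SHOULD NOT", "MUST", "REQUIRED",
           "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL"]

PATTERN = re.compile(
    r"\b(" + "|".join(k.replace(" ", r"\s+") for k in RFC2119) + r")\b",
    re.IGNORECASE,
)


def count_keywords(text):
    """Count key words written in capitals versus in ordinary case."""
    counts = {"normative": 0, "colloquial": 0}
    for match in PATTERN.finditer(text):
        kind = "normative" if match.group(0).isupper() else "colloquial"
        counts[kind] += 1
    return counts


sample = ("Providers should take care. Literals MUST NOT be bare. "
          "It may rain. IRIs SHOULD be used.")
print(count_keywords(sample))  # {'normative': 2, 'colloquial': 2}
```

Each lowercase hit would still need to be read in context to decide whether it should be capitalized or reworded.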
I've taken a stab here at a set of cases where it feels like it might be appropriate to use the RFC 2119 imperatives.
A question is whether it is the role of the guide to use MUST/MUST NOT/REQUIRED anywhere (except where that is forced by inheritance from elsewhere). A potential place for asserting MUST/MUST NOT is the distinction between dwc and dwciri namespaces. I've put more discussion under the headings 2.4.1.2 and 2.5 below.
-Paul
1.3.2.1 Persistent Identifiers
s/the provider should take care to ensure/the provider SHOULD take care to ensure/
s/it must be converted to an IRI/it MUST be converted to an IRI/
1.3.2.2 HTTP IRIs as self-resolving GUIDs
s/through RDF should plan to implement GUIDs/through RDF MUST plan to implement GUIDs/
For consistency with "Must" in the GUID applicability statement.
1.4.1 Well-known vocabularies
s/the provider should assign the term an IRI,/the provider SHOULD assign the term an IRI,/
1.4.3 Use of Darwin Core terms in RDF
s/each value should be referenced/each value MUST be referenced/
By the nature of object references.
1.4.4 Limitations of this guide
s/Darwin Core property terms should be used as RDF predicates and specifies that Darwin Core class terms should be used in rdf:type/Darwin Core property terms SHOULD be used as RDF predicates and specifies that Darwin Core class terms SHOULD be used in rdf:type/
1.5.5 Implications for expressing Darwin Core string values as RDF
s/in which RDF should be structured/in which RDF SHOULD be structured/
Example 1:
s/Predicates must be identified by IRIs/Predicates MUST be identified by IRIs/
2.2 Subject resources
s/it must be referenced by an IRI/it MUST be referenced by an IRI/
2.2.2 Associating a string identifier with a subject resource
s/dcterms:identifier should be used to/dcterms:identifier SHOULD be used to/
s/it is acceptable to present it as a string literal value for dcterms:identifier/it MAY be presented as a string literal value for dcterms:identifier/
2.3.1.1 rdf:type statement
s/The class should be identified by an IRI reference/The class MUST be identified by an IRI reference/
2.3.1.3 Explicit vs. inferred type declarations
s/data providers should exercise caution in using any such term in a non-standard way/data providers SHOULD NOT use any such term in a non-standard way/
s/the provider should type the resource/the provider MAY type the resource/ ?? Or ?? s/the provider should type the resource/the provider SHOULD type the resource/
2.3.1.4 Other predicates used to indicate type
s/in an RDF description should be considered optional, while including rdf:type should be considered highly recommended/in an RDF description is OPTIONAL, while rdf:type SHOULD be included/
s/A dwciri: analogue (Section 2.5) of dwc:basisOfRecord should not be used/A dwciri: analogue (Section 2.5) of dwc:basisOfRecord MUST NOT be used/
2.3.1.5 Classes to be used for type declarations of resources described using Darwin Core
s/that should also be used/that SHOULD also be used/
Example 9:
s/a provider should include an xml:lang/a provider SHOULD include an xml:lang/
Example 10:
I'm not sure about this one:
/language tags should be interpreted by clients
s/may initially choose to expose literals without datatype attributes, they should/MAY initially choose to expose literals without datatype attributes, they SHOULD/
2.4.1.2 Terms intended for use with literal objects
I'll put a stake in the ground for discussion here: Is the guide sufficiently normative to assert MUST (other than where that property is inherited from elsewhere)? If so, then the distinction between dwc and dwciri is the place to make that assertion:
s/that terms in the dwc: namespace should be restricted to use with literal objects/that terms in the dwc: namespace MUST be restricted to use with literal objects/
2.4.2.1.1 Objects identified by LSIDs
This one needs to be checked for consistency with the LSID Applicability Statements:
s/version of the LSID should be used instead/version of the LSID SHOULD be used instead/
Example 17:
s/particular terms should be used with literal objects, or with IRI reference objects/particular terms SHOULD be used with literal objects, or SHOULD be used with IRI reference objects/
2.4.3.2 Literal values for non-literal resources in Darwin Core
s/existing Darwin Core term in the dwc: namespace should have the same structure/existing Darwin Core term in the dwc: namespace SHOULD have the same structure/
2.5 Terms in the dwciri: namespace
This is the companion case to 2.4.1.2: is the distinction between dwciri and dwc to be asserted as SHOULD or MUST? Using them incorrectly does have the "potential for causing harm" in the language of RFC 2119, but does the guide rise to the level of specifying an "absolute requirement of the specification"?
s/IRI reference objects and should NOT be used with literal/IRI reference objects and MUST NOT be used with literal/
2.5.1 Definition of dwciri: terms
s/resource described by a dwciri: property should be the subject of a triple for each value on the list/resource described by a dwciri: property SHOULD be the subject of a triple for each value on the list/
A counter example here comes from the Harvard List of Botanists, where http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/305dcac3-a748-4f47-8a18-3... references the collector team "J. D. Hooker & A. Gray", that is, http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/cc3b7080-5fbb-4ea5-9655-4... and http://purl.oclc.org/net/edu.harvard.huh/guid/uuid/3f8c70aa-1862-4784-8a53-f...
2.5.3 Expectation of clients encountering RDF containing dwc: and dwciri: terms
Instances of "should" here again cut to the heart of how strong the guidance is concerning dwc and dwciri.
Table 3
s/they should be used rather than using the Darwin Core ID/they SHOULD be used rather than using the Darwin Core ID/
Example 22:
s/The following points about the Example 22 should be noted:/Note the following points about Example 22:/
s/related non-literal resources should use/related non-literal resources SHOULD use/
/LSIDs are the objects of triples, they should be
As with 2.4.2.1.1, this guidance needs to be checked against the GUID and LSID applicability statements. I think I'm reading the GUID applicability statement as s/should/MUST/ here.
2.7.1 What purpose do convenience terms serve?
s/In general, it should not be necessary for a data provider/In general, a data provider does not need/
2.7.3 Ownership of a collection item
s/of the collection item should be indicated/of the collection item SHOULD be indicated/
2.7.4 Description of a taxonomic entity
s/It is considered to be out of the scope of this document to specify how taxon concepts should be rendered as RDF/It is out of the scope of this document to specify how to render taxon concepts as RDF/
s/terms for taxonomic entities should be properties of dwc:Identification/terms for taxonomic entities MAY be properties of dwc:Identification/
s/The task of describing taxonomic entities using RDF must be an effort outside of Darwin Core/The task of describing taxonomic entities using RDF is out of scope of this document/
2.8.3 Expressing Darwin Core association terms as RDF with URI references
s/it should be used to declare the type/rdf:type SHOULD be used to declare the type/
Tables 3.4 to 3.7.
s/should/SHOULD/
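The substitutions above are written in sed-style s/old/new/ form. As a sketch of how they could be tried against the guide text, the Python below applies each one as a literal (not regular-expression) replacement and reports any whose "old" phrase is not found. It assumes that neither side of a substitution contains a "/" character; `parse_substitution` and `apply_substitutions` are hypothetical helper names.

```python
def parse_substitution(command):
    """Split 's/old/new/' into (old, new); assumes no '/' inside either part."""
    parts = command.split("/")
    if len(parts) != 4 or parts[0] != "s" or parts[3] != "":
        raise ValueError("not a simple s/old/new/ command: " + command)
    return parts[1], parts[2]


def apply_substitutions(text, commands):
    """Apply literal replacements in order; collect commands that match nothing."""
    unmatched = []
    for command in commands:
        old, new = parse_substitution(command)
        if old in text:
            text = text.replace(old, new)
        else:
            unmatched.append(command)
    return text, unmatched


guide = "the provider should take care to ensure that it must be converted to an IRI"
commands = [
    "s/the provider should take care to ensure/the provider SHOULD take care to ensure/",
    "s/it must be converted to an IRI/it MUST be converted to an IRI/",
    "s/this phrase is absent/this phrase is ABSENT/",
]
revised, unmatched = apply_substitutions(guide, commands)
print(revised)
print(unmatched)
```

Reporting the unmatched commands is useful here because the guide text may have changed since the phrases were copied, and a substitution that silently matches nothing would otherwise go unnoticed.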