Dear all,
I recently finished a new TAPIR provider implementation, and along the way I took notes about issues with the protocol schema and the specification. Most are minor points that are currently unspecified, or where the schema, the specification and the wiki notes contradict each other. Some are real errors that need to be fixed, and others are suggestions to improve the way certain things are defined.
I'm listing all the issues below and suggesting an approach for each one. I have already discussed most of them with Markus, and we have similar opinions. Please feel free to give your ideas or suggest different approaches. If there are no objections, I will probably change the XML Schema and the specification by the end of this week.
Best Regards, -- Renato
=========================
Issues and proposed changes to TAPIR
1) The schema currently declares indexingElement as unbounded. However, there must be one and only one indexingElement for each output model. This is an error and must be fixed in the schema.
2) Some time ago there was a proposal to allow renaming <value> elements in inventory responses to ease parsing (http://lists.tdwg.org/pipermail/tdwg-tapir/2005-October/000009.html). Apparently there was agreement about this, but for some reason the schema was never changed. Suggested approach: change the schema and the specification accordingly (in inventory requests, concept elements will have an optional attribute "tagName", and inventory responses will need to have xsd:any inside <record>).
3) Currently the "basicSchemaLanguage" capability for response structures includes "...ComplexType, simpleType, complexContent and simpleContent only when related to local definitions". On the other hand, "xsd:extension" and "xsd:restriction" are optional capabilities. Since complexContent and simpleContent have to be used with either "xsd:extension" or "xsd:restriction", this is a contradiction. Suggested approach: remove complexContent and simpleContent from basicSchemaLanguage, which also keeps it as "basic" as possible.
4) header/source@accesspoint is a mandatory attribute that can be either an IP address or a name. For TAPIR responses this is fine, since @accesspoint should be the service accesspoint. But for requests it is inconvenient to force clients to put an accesspoint there (and the value can potentially contradict the REMOTE_ADDR environment variable). Suggested approach: make @accesspoint an optional attribute and change the specification to say that @accesspoint must be present in all responses, indicating the service accesspoint. On requests, @accesspoint must be present in all <source> elements except the last <source>. In these cases, when present it must contain the IP address. The IP of the last source must always be taken from REMOTE_ADDR.
5) There's no way to track the original source using KVP when there are intermediate services. Suggested approach: include an optional KVP parameter "source_ip" (the same approach used by DiGIR).
6) Currently, local variables are defined with their own elements, like <dateLastUpdated/>. This makes it complicated for a service to define new variables (need to extend the schema!) and also to identify variables when parsing a request. Suggested approach: use the same strategy adopted by parameters: <variable name="var_name"/> with the difference that @name will be an extensible controlled vocabulary.
7) <mappedConcept> in capabilities responses has an optional @alias. Suggestion: add an optional @alias attribute to <schema> as well.
8) There's no indication whether response structures must contain a targetNamespace or not. If there's no targetNamespace and the response includes the TAPIR envelope, then custom search elements will be forced to inherit from the TAPIR namespace. Suggested approach: change the specification to say that response structures (schema) should always indicate a targetNamespace.
9) The specification says that the default value for "start" (search and inventory operations) is "1", but the schema says the default is "0". Suggested approach: use "0" (= index of the first record) and change the specification.
10) The specification says that the default value for "limit" (search and inventory operations) is "null" (unlimited), but the schema says the default is "1". Suggested approach: when not specified ("null"), consider it unlimited. Change the schema to remove "default=1". Otherwise it would be necessary to define a specific value to indicate "unlimited".
11) Currently, case sensitivity can be specified separately for the "equals" and "like" operators, while nothing is said about the "in" operator. Suggested approach: have a single case sensitivity setting that applies to all three operators.
12) There's no default behaviour for "like" comparisons when there's no wildcard in the term. Suggested approach: treat the term as if it had * at both the beginning and the end.
13) There's nothing specified about how to escape * in like terms. Suggestion: use _* to escape (avoiding reserved and unsafe URL characters).
14) The specification is not clear about what content should be included in responses when one or more partial parameters are specified. Suggested approach: response content should be minimal (non-greedy) when partial is specified. This means "include only the partial nodes and all other nodes (below and above) that are mandatory". When partial is not specified, responses should be greedy (include all possible content).
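To make issues 11 to 13 more concrete, here is a rough Python sketch of how a provider might interpret a "like" term under the suggested rules. It is only an illustration: the function name like_to_regex and the case_sensitive flag are hypothetical and not part of TAPIR, and the proposal does not yet say how a literal underscore immediately before a wildcard would be written, so the sketch always reads "_*" as an escaped asterisk.

```python
import re


def like_to_regex(term: str, case_sensitive: bool = False) -> re.Pattern:
    """Translate a 'like' term into a compiled regex (illustrative only).

    Assumed rules, taken from issues 12 and 13:
      *   matches any sequence of characters
      _*  is a literal asterisk
      a term with no unescaped * is treated as *term*
    """
    parts = []
    has_wildcard = False
    i = 0
    while i < len(term):
        if term.startswith("_*", i):
            # Escaped asterisk: match a literal '*'.
            parts.append(re.escape("*"))
            i += 2
        elif term[i] == "*":
            # Unescaped asterisk: wildcard.
            parts.append(".*")
            has_wildcard = True
            i += 1
        else:
            parts.append(re.escape(term[i]))
            i += 1

    body = "".join(parts)
    if not has_wildcard:
        # Issue 12: no wildcard means "contains".
        body = ".*" + body + ".*"

    # Issue 11: one case sensitivity setting (hypothetical flag here).
    flags = re.DOTALL
    if not case_sensitive:
        flags |= re.IGNORECASE
    return re.compile(body, flags)


if __name__ == "__main__":
    # A term without wildcards behaves like *Puma*.
    print(bool(like_to_regex("Puma").fullmatch("Felis Puma concolor")))
    # "_*" only matches a real asterisk in the data.
    print(bool(like_to_regex("a_*b").fullmatch("xaby")))
```

The term is compared against the whole value with fullmatch, so a trailing wildcard such as "Puma*" means "starts with" rather than "contains".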
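For issues 9 and 10, the suggested defaults ("start" = 0, "limit" unspecified meaning unlimited) could be sketched as below. Again this is just a hypothetical Python illustration; apply_paging is an invented name, and representing an unspecified limit as None is my own assumption.

```python
from typing import List, Optional, Sequence


def apply_paging(records: Sequence[str],
                 start: int = 0,
                 limit: Optional[int] = None) -> List[str]:
    """Return the slice of records selected by start and limit.

    Assumptions (from issues 9 and 10):
      start defaults to 0, the index of the first record
      limit=None means "not specified", i.e. unlimited
    """
    if start < 0:
        raise ValueError("start must be >= 0")
    if limit is not None and limit < 0:
        raise ValueError("limit must be >= 0 when specified")

    # With no limit, slice to the end of the sequence.
    end = None if limit is None else start + limit
    return list(records[start:end])


if __name__ == "__main__":
    names = ["a", "b", "c", "d", "e"]
    print(apply_paging(names))                    # all five records
    print(apply_paging(names, start=1, limit=2))  # second and third record
```

With this reading there is no need for a special value to indicate "unlimited", which is the reason for dropping "default=1" from the schema.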
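The rule suggested in issue 4 for requests could look roughly like this on the provider side. This is a sketch under assumptions: resolve_source_chain is a hypothetical helper, the <source> elements are assumed to be already parsed into a list of @accesspoint values (None when the attribute is absent), and remote_addr stands for whatever the web server reports as REMOTE_ADDR.

```python
import ipaddress
from typing import List, Optional


def resolve_source_chain(accesspoints: List[Optional[str]],
                         remote_addr: str) -> List[str]:
    """Return the IP of every <source> in a request, in document order.

    Assumed rules (from issue 4):
      every <source> except the last must carry @accesspoint,
      and the value must be an IP address;
      the IP of the last <source> always comes from REMOTE_ADDR,
      even if the client supplied an @accesspoint there.
    """
    if not accesspoints:
        raise ValueError("a request needs at least one <source>")

    chain = []
    for value in accesspoints[:-1]:
        if value is None:
            raise ValueError("intermediate <source> lacks @accesspoint")
        # Raises ValueError when the value is not an IP address.
        ipaddress.ip_address(value)
        chain.append(value)

    # Never trust the client for the last hop.
    chain.append(remote_addr)
    return chain


if __name__ == "__main__":
    # Documentation-only addresses (RFC 5737), used purely as examples.
    print(resolve_source_chain(["192.0.2.10", None], "198.51.100.7"))
```

Ignoring any client-supplied value on the last <source> avoids the possible contradiction with REMOTE_ADDR mentioned in issue 4.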
=================================
Renato De Giovanni