Good spotting, Dave.

This is now fixed in the TapirDotNET implementation of TAPIR. I have updated the HerbIMI TapirDotNET installation at http://lsid.herbimi.info/TapirDotNET/tapir.aspx/herbIMI

If you would like to check this provider, I would be interested in your results.

Kevin
"Dave Vieglais" <vieglais@ku.edu> 9/11/2007 7:59 a.m. >>>
Hi Everyone,

I've come across a minor issue with some existing TAPIR installations that should be easily fixed and will likely save some frustration down the road.

The TAPIR spec (http://www.tdwg.org/dav/subgroups/tapir/1.0/docs/TAPIRSpecification_2007-07-...) indicates a response Content-type of "text/xml". RFC 3023 (http://www.ietf.org/rfc/rfc3023.txt) indicates that in this case, when no "charset" parameter is specified in the HTTP response header, the implied character encoding of the response document is "us-ascii" (see section 8.5).

So, for example:

Good:
  response header = Content-type: text/xml; charset="utf-8"
  response document signature = <?xml version="1.0" encoding="utf-8"?>
  result = document is assumed to be UTF-8

Not so good:
  response header = Content-type: text/xml
  response document signature = <?xml version="1.0" encoding="utf-8"?>
  result = document is assumed to be us-ascii

None of the TAPIR installations that I've examined so far set a charset value, so the consumer application assumes a character encoding of "us-ascii", which is likely to cause problems. This was also a significant issue for DiGIR provider installations.

The solution is likely to be quite simple, and there seem to be two basic options:

1. Configure the web server / application to insert a charset value of "UTF-8", so that the consumer does not fall back to the default of us-ascii.

or

2. Return a Content-type of "application/xml" or one of its subtypes. In this case RFC 3023 indicates the default character encoding should be assumed to be UTF-8.

Note that simply specifying the content type does not automatically make the response properly encoded. It is still up to the web application (TAPIR in this case) to ensure that the output stream is actually UTF-8 encoded.

regards,
Dave V.

_______________________________________________
tdwg-tapir mailing list
tdwg-tapir@lists.tdwg.org
http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
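
For anyone who wants to check a provider, the fallback rules described in the message above can be sketched in a few lines of Python. This is only an illustration, not part of any TAPIR software: the function name effective_charset is made up here, and the "application/xml" branch simply follows the UTF-8 default stated in the message (a full consumer would also look at the byte order mark and the XML declaration).

```python
from email.message import Message


def effective_charset(content_type_header):
    """Return the charset a consumer would assume for an XML response.

    Hypothetical helper following the RFC 3023 rules as summarised in
    the message above; it only looks at the Content-type header.
    """
    # Let the standard library parse the header value and its parameters.
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()  # lower-cased, e.g. "text/xml"
    charset = msg.get_param("charset")   # None if no charset parameter

    # An explicit charset parameter always wins.
    if isinstance(charset, str) and charset:
        return charset.lower()

    # text/xml without a charset: us-ascii is implied (RFC 3023, s8.5).
    if media_type == "text/xml":
        return "us-ascii"

    # application/xml and application/*+xml: UTF-8 is taken as the
    # default here, as in the message above (simplifying assumption).
    if media_type == "application/xml" or (
        media_type.startswith("application/") and media_type.endswith("+xml")
    ):
        return "utf-8"

    # Not an XML media type covered by this sketch.
    return None


if __name__ == "__main__":
    for header in ('text/xml; charset="utf-8"', "text/xml", "Application/xml"):
        print(header, "->", effective_charset(header))
```

Feeding it the Content-type value returned by a provider shows which encoding a strict consumer would assume, regardless of what the XML declaration inside the document says.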