Yes, you are quite right that XML should take care of it, and also right that using Java Readers and Writers addresses the well-formedness issue. Generally, XML management is quite easy in Java 6. However, my experience has been that some browsers do not render some characters (or encodings?) perfectly, and so I wondered whether there are recommendations.

On Tue, Dec 22, 2009 at 5:08 AM, Gregor Hagedorn <g.m.hagedorn@gmail.com> wrote:
Hi Bob
Is that necessary with XML? XML defines a set of allowable character encodings, and any XML processor "should" be able to read them.
The problems we experienced in Key to Nature really go back to Java programmers treating the writing of XML as the production of strings in handwritten code, rather than using XML readers or writers. The result was generally (sooner or later...) non-well-formed XML.
Gregor
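
[Editorial sketch, not part of the original message.] The point about letting an XML writer produce the markup, rather than concatenating strings by hand, might look roughly like this with the StAX API that ships with Java 6. The class name, method name, and the "Label" element are made up for illustration; only the javax.xml.stream calls are standard. The writer does the escaping of "<" and "&" and records the encoding in the XML declaration, which is the part that handwritten string code tends to get wrong sooner or later.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: writes one text value as a tiny UTF-8 XML document.
class StateLabelWriter {

    static byte[] writeLabel(String label) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // Tell the writer which encoding the byte stream uses ...
            XMLStreamWriter w =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            // ... and declare the same encoding in the XML declaration.
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("Label");   // illustrative element name
            w.writeCharacters(label);       // escapes '<' and '&' for us
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toByteArray();
    }
}
```

Calling StateLabelWriter.writeLabel("a < b & 45\u00B0") should give a well-formed document in which the "<" and "&" are escaped, without the calling code ever touching angle brackets itself.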
2009/12/21 Bob Morris <morris.bob@gmail.com>:
Does SDD (resp. TDWG) have any best practice about character encodings and character sets in XML? The sad fact of life is that many printed and word-processed systematics and ecology documents have odd symbols even in characters and states, e.g. the degree sign, the +- sign, the circled 'x', em-dashes, ..., which are not always rendered by browsers.
--
Robert A. Morris
Professor of Computer Science (nominally retired)
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: ram@cs.umb.edu
web: http://bdei.cs.umb.edu/
web: http://etaxonomy.org/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1)617 287 6466
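
[Editorial sketch, not part of the original thread.] For the browser concern raised in the original question, one possible workaround (an assumption here, not an SDD or TDWG recommendation) is to serialize with a US-ASCII output encoding, so that symbols such as the degree sign leave the serializer as numeric character references instead of raw bytes. That removes the dependence on the browser guessing the encoding correctly; it would not help if the browser simply lacks a font glyph for the symbol. The class and method names and the "Label" element below are hypothetical; the JAXP calls are the standard ones available in Java 6.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: serializes one text value as ASCII-only XML.
class AsciiSafeXml {

    static String serialize(String label) {
        try {
            // Build a one-element DOM tree holding the text.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element el = doc.createElement("Label");   // illustrative name
            el.setTextContent(label);
            doc.appendChild(el);

            // Identity transform; the output encoding is limited to US-ASCII,
            // so characters outside ASCII cannot be written as raw bytes.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return new String(out.toByteArray(), Charset.forName("US-ASCII"));
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }
}
```

Any XML processor reading the result resolves the character references back to the original characters, so the data is unchanged; only the byte-level representation is more conservative.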