[tdwg-content] character encodings

Bob Morris morris.bob at gmail.com
Tue Dec 22 14:57:53 CET 2009


Yes, you are quite right that XML should take care of it.  And also
right that using Java Readers and Writers addresses the
well-formedness issue.  Generally, XML management is quite easy in
Java 6.  However, my experience has been that some browsers do not
render some characters (or encodings?) perfectly and so I wondered
whether there are recommendations.
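
To make the "let the XML library handle it" point concrete, here is a minimal sketch, assuming the standard StAX and DOM APIs that ship with Java 6. The class name EncodingSketch, the helpers writeState and readState, and the <state> element are made up for illustration and are not part of SDD. The writer declares the encoding and escapes markup characters itself, so the degree sign, the plus-minus sign and the circled 'x' survive a round trip through a parser. It says nothing about whether a given browser font can render them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;
import org.w3c.dom.Document;

public class EncodingSketch {

    // Hypothetical helper: serialize one state label as a UTF-8 XML document.
    // The writer emits the XML declaration and escapes '<' and '&' for us.
    static byte[] writeState(String label) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            XMLStreamWriter w = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(out, "UTF-8");
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("state"); // element name is an assumption
            w.writeCharacters(label);
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: parse the bytes back and return the element text.
    // The parser takes the encoding from the XML declaration.
    static String readState(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding:
        // U+00B0 degree sign, U+00B1 plus-minus sign, U+2297 circled times.
        String label = "angle < 45\u00B0 \u00B1 5\u00B0 & margin \u2297";
        byte[] xml = writeState(label);
        System.out.println(label.equals(readState(xml)));
    }
}
```

Building the same document by string concatenation would need hand-written escaping of '<' and '&' and a matching encoding declaration, which is where the non-well-formed output described below tends to come from.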

On Tue, Dec 22, 2009 at 5:08 AM, Gregor Hagedorn <g.m.hagedorn at gmail.com> wrote:
> Hi Bob
>
> is that necessary with xml? XML defines a set of allowable character
> encodings and any xml-processor "should" be able to read them.
>
> Problems we experienced in Key to Nature really go back to Java
> programmers treating the writing of XML as the production of strings
> in handwritten code, rather than using XML readers or writers. The
> result was generally (sooner or later...) XML that was not well-formed.
>
> Gregor
>
> 2009/12/21 Bob Morris <morris.bob at gmail.com>:
>> Does SDD (resp. TDWG) have any best practice about character encodings
>> and character sets in XML?  The sad fact of life is that many printed,
>> and word processing, systematics and ecology documents have odd
>> symbols even in characters and states, e.g. the degree sign, +- sign,
>> the circled 'x', em-dashes, ..., which are not always rendered by
>> browsers.
>



-- 
Robert A. Morris
Professor of Computer Science (nominally retired)
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
Associate, Harvard University Herbaria
email: ram at cs.umb.edu
web: http://bdei.cs.umb.edu/
web: http://etaxonomy.org/FilteredPush
http://www.cs.umb.edu/~ram
phone (+1)617 287 6466


