[XML-SIG] Troubles with latin 1 characters
Lars Marius Garshol
29 Sep 1999 17:29:08 +0200
* J. R. van Ossenbruggen
| Declaring the encoding didn't help; Fredrik explained to me that expat
| always returns UTF-8.
These are separate issues. The encoding you declare is the one the
input document uses (if you declare nothing, the parser will assume
UTF-8 or UTF-16). The encoding of the output you get from the parser
is whatever the parser writer decided you should get.
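
To make the distinction concrete, here is a minimal sketch using the
xml.parsers.expat module from a current Python 3 standard library. It
assumes a made-up one-element document declared as ISO-8859-1; the
document and variable names are illustrative only. Note that Python 3's
expat binding hands back Unicode strings rather than UTF-8 bytes, so
the UTF-8 form is produced by an explicit encode at the end.

```python
import xml.parsers.expat

# Hypothetical input: the bytes on disk are Latin-1, and the document says so.
doc = '<?xml version="1.0" encoding="iso-8859-1"?><name>Bj\u00f8rn</name>'.encode(
    "iso-8859-1"
)

chunks = []
parser = xml.parsers.expat.ParserCreate()
# Character data may arrive in several pieces, so collect and join them.
parser.CharacterDataHandler = chunks.append
parser.Parse(doc, True)

text = "".join(chunks)       # what the parser hands back (a Unicode str)
utf8 = text.encode("utf-8")  # the same text as UTF-8 bytes

print(repr(doc))   # input contains the single Latin-1 byte 0xF8
print(repr(text))
print(repr(utf8))  # output form contains the two-byte sequence 0xC3 0xB8
```

The declaration only tells the parser how to read the input bytes; what
comes out the other side is in the parser's own choice of form,
whatever was declared.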
| But switching to another parser did help.
Which parser did you use initially? expat? Which parser did you switch to?
| Still, a proper error message (detecting characters which are not in
| the declared/default encoding) and/or a warning message (detecting
| characters in the output that are not in UTF-8) would have been
Agreed. If the parser assumes UTF-8 and gets incorrect byte sequences
(your input was illegal as UTF-8) it should complain very loudly.
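
As a rough illustration of what "complaining loudly" can look like,
here is a sketch using a current Python 3 standard library. The helper
names first_utf8_error and parses_cleanly are made up for this example,
and the sample documents are invented. The first helper is a plain
pre-check with bytes.decode; the second simply asks expat to parse the
data and reports whether it raised an error.

```python
import xml.parsers.expat


def first_utf8_error(data):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def parses_cleanly(data):
    """Return True if expat accepts the document, False if it raises an error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hypothetical inputs: same text, no encoding declaration in either case.
bad = "<name>Bj\u00f8rn</name>".encode("iso-8859-1")  # Latin-1 bytes, undeclared
good = "<name>Bj\u00f8rn</name>".encode("utf-8")      # genuine UTF-8

for label, data in (("latin-1, undeclared", bad), ("utf-8", good)):
    print(label, first_utf8_error(data), parses_cleanly(data))
```

With no declaration the parser has to assume UTF-8, so the Latin-1
bytes are rejected rather than silently passed through.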