[XML-SIG] Troubles with latin 1 characters

Lars Marius Garshol lmariusg@ifi.uio.no
29 Sep 1999 17:29:08 +0200


* J. R. van Ossenbruggen
| 
| Declaring the encoding didn't help; Fredrik explained to me that
| expat always returns UTF-8.  

These are separate issues. The encoding you declare is the one that
the input document uses (if you declare nothing and there is no byte
order mark, the parser will assume UTF-8; UTF-16 input is recognised
by its byte order mark). The encoding the parser delivers its output
in is whatever the parser writer decided you should get.
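
To make the distinction concrete, here is a minimal sketch using the
xml.parsers.expat module from a current Python 3 standard library
(an assumption on my part; it is not the setup discussed in this
thread, and there the character data arrives as Unicode str objects
rather than UTF-8 bytes). The helper name parse_text is hypothetical.
It feeds the same text to the parser once as declared ISO-8859-1 and
once as undeclared UTF-8:

```python
import xml.parsers.expat


def parse_text(data: bytes) -> str:
    """Parse an XML document and return its concatenated character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver character data in several pieces; collect them all.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return "".join(chunks)


# Input encoded as Latin-1, with a matching encoding declaration.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><p>caf\u00e9</p>'
).encode("latin-1")

# The same text encoded as UTF-8, with no declaration at all.
utf8_doc = "<p>caf\u00e9</p>".encode("utf-8")

# The bytes on the way in differ, but what the parser hands back is the same.
latin1_text = parse_text(latin1_doc)
utf8_text = parse_text(utf8_doc)
```

The declaration only tells the parser how to read the bytes; it has no
influence on the form in which the text is handed to the application.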

| But switching to another parser did help.  

Which parser did you use initially? expat? Which parser did you switch
to?
 
| Still, a proper error message (detecting characters which are not in
| the declared/default encoding) and/or a warning message (detecting
| characters in the output that are not in UTF-8) would have been
| helpful...

Agreed. If the parser assumes UTF-8 and gets invalid byte sequences
(your input was illegal as UTF-8) it should complain very loudly.
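
As an illustration of what "complaining loudly" can look like, here is
a small sketch, again assuming the xml.parsers.expat module of a
current Python 3 (not necessarily the parser or version used in this
thread). The helper name check_well_formed is hypothetical. It parses
Latin-1 bytes that carry no encoding declaration, so the parser treats
them as UTF-8, and reports the resulting error instead of passing the
bad bytes through:

```python
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return None if data parses cleanly, else a readable error string."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        # ExpatError carries the position at which parsing stopped.
        return f"line {err.lineno}, column {err.offset}: {err}"
    return None


# Latin-1 bytes without a declaration: 0xE9 starts a multi-byte UTF-8
# sequence, but the bytes that follow are not valid continuation bytes.
undeclared_latin1 = b"<p>caf\xe9</p>"

# The same text, correctly encoded as UTF-8.
valid_utf8 = "<p>caf\u00e9</p>".encode("utf-8")

bad_report = check_well_formed(undeclared_latin1)
good_report = check_well_formed(valid_utf8)
```

Here bad_report holds an error message with a line and column, while
good_report is None.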
 
--Lars M.