[XML-SIG] unicode

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Sat, 11 Aug 2001 21:30:24 +0200


> Not really. In this case the parser should give a warning and just
> continue, I think, since this is something that could quite reasonably
> happen if your own code handles the conversion. This case is really no
> different from passing a SAX InputSource with a character stream.
> (BTW, does that work? It should.)

That depends on the driver supporting it properly, and I doubt any
driver currently does. I found the feature sufficiently dubious that
I never cared about it; to me, a resource to parse is always a byte
sequence.
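
For reference, this is roughly what passing a character stream through
a SAX InputSource looks like. It is only a sketch, and it assumes a
current Python 3 standard library (3.5 or later), where the expat
driver accepts character streams; it says nothing about the drivers
discussed in this thread. The TextCollector class is a hypothetical
helper introduced for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # Character data may arrive in several pieces.
        self.chunks.append(content)


# The document is already decoded text, not bytes.
source = InputSource()
source.setCharacterStream(io.StringIO("<a>caf\u00e9</a>"))

handler = TextCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
parser.parse(source)

text = "".join(handler.chunks)
```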

pyexpat cannot, in principle, accept a Unicode stream, since it only
supports byte streams. You could encode the Unicode stream before
giving it to pyexpat, but then you'd have to tell pyexpat to ignore
any encoding declaration that it sees; this is currently not done in
the driver.

drv_xmlproc does not implement parse() itself, but inherits it from
xmlreader.IncrementalParser, which only looks at the byte stream.

As I said, I'm not terribly interested in character streams in SAX,
but I won't stop anybody from fixing the code. That won't fix the
original problem, of course.

Regards,
Martin