[XML-SIG] xmlproc bug ?
Juergen Hermann <jh@web.de>
Wed, 05 Sep 2001 20:15:01 +0200
On Wed, 5 Sep 2001 18:53:10 +0200 (CEST), Alexandre Fayolle wrote:
>Is the following behaviour a well known feature or a bug (or me deeply
>misunderstanding SAX)? It looks like xmlproc's Sax2 driver won't produce
>UTF-8 encoded text when parsing a iso-8859-1 encoded file.
>
>The attached file demonstrates this. I tested it on python 1.5.2 and
>2.1.1, using PyXML 0.6.6.
>
>I'll register this in the bugtracker if it turns out to be a bug.
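It's not related to xmlproc at all, but to "print", which calls "str", which in
turn uses the default encoding (US-ASCII). The handler receives Unicode strings,
so encode them explicitly before writing them out. BTW, you also need to set the
namespaces feature, since startElementNS is only called when it is on.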
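
For illustration, here is a minimal sketch of the encoding step described above. It is written for Python 3, not the Python 1.5.2/2.1 used in this thread, and the helper name try_encode is hypothetical. It only shows that text containing non-ASCII characters cannot be represented in US-ASCII, while an explicit ISO-8859-1 or UTF-8 encode succeeds.

```python
# Hypothetical helper, not part of the original script: try one codec and
# report failure as None instead of raising.
def try_encode(value, encoding):
    """Return the encoded bytes, or None if the codec cannot represent the text."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


# The character data of the test document, as text (e-acute, 'p', a-grave).
text = "\u00e9p\u00e0"

ascii_bytes = try_encode(text, "ascii")        # the US-ASCII default fails here
latin1_bytes = try_encode(text, "iso-8859-1")  # one byte per character
utf8_bytes = try_encode(text, "utf-8")         # two bytes per accented character
```

The text coming out of the parser is the same in every case; only the explicit choice of output codec decides which bytes end up on stdout.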
--- xmlproctest.py Wed Sep 5 18:10:39 2001
+++ xmlproctest2.py Wed Sep 5 18:13:21 2001
@@ -1,23 +1,30 @@
 from xml import __version__
 print 'pyxml version',__version__
-from xml.sax import make_parser
+import sys
+from xml.sax import make_parser, handler
 from xml.sax.handler import ContentHandler
 from StringIO import StringIO
+def write(x):
+    sys.stdout.write((x or "").encode("ISO-8859-1") + "\n")
+
 class my_handler(ContentHandler):
     def startElementNS(self,element,qname,attr):
-        print element,qname
+        write(element[0])
+        write(element[1])
+        write(qname)
         print attr.items()
-
     def characters(self,buff):
-        print buff
+        write(buff)
 str = "<?xml version='1.0' encoding='iso-8859-1'?><élément attr='àèïôù'>épà</élément>"
 source = StringIO(str)
 p = make_parser("xml.sax.drivers2.drv_xmlproc")
+#p = make_parser("pirxx")
 print p
+p.setFeature(handler.feature_namespaces, 1)
 p.setContentHandler(my_handler())
 p.parse(source)
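
For readers without PyXML, a rough Python 3 equivalent of the patched script might look like the sketch below. It makes several assumptions: it uses the standard library's default expat-based SAX driver instead of xmlproc, and the names CollectingHandler and parse_sample are hypothetical. It collects the SAX events in lists instead of printing them, so the results can be checked independently of the terminal encoding.

```python
import io
import xml.sax
from xml.sax import handler


class CollectingHandler(handler.ContentHandler):
    """Records element names, attributes and character data as text."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.attrs = []
        self.text = []

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) pair; the URI is None
        # because the test document declares no namespace.
        self.names.append(name)
        self.attrs.append(dict(attrs.items()))

    def characters(self, content):
        # May be called more than once per text node, so collect the pieces.
        self.text.append(content)


# The test document from the thread, as ISO-8859-1 encoded bytes.
document = (
    "<?xml version='1.0' encoding='iso-8859-1'?>"
    "<\u00e9l\u00e9ment attr='\u00e0\u00e8\u00ef\u00f4\u00f9'>"
    "\u00e9p\u00e0</\u00e9l\u00e9ment>"
).encode("iso-8859-1")


def parse_sample(data):
    """Parse the bytes with namespace processing on and return the handler."""
    collector = CollectingHandler()
    parser = xml.sax.make_parser()  # stdlib expat driver, not xmlproc
    parser.setFeature(handler.feature_namespaces, True)
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(data))
    return collector


result = parse_sample(document)
```

As in the patch, the namespaces feature has to be switched on for startElementNS to be called, and any conversion to ISO-8859-1 or UTF-8 bytes remains a separate, explicit step on the collected text.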