[XML-SIG] xmlproc bug ?

Juergen Hermann <jh@web.de>
Wed, 05 Sep 2001 20:15:01 +0200


On Wed, 5 Sep 2001 18:53:10 +0200 (CEST), Alexandre Fayolle wrote:

>Is the following behaviour a well known feature or a bug (or me deeply
>misunderstanding SAX)? It looks like xmlproc's Sax2 driver won't produce
>UTF-8 encoded text when parsing a iso-8859-1 encoded file.
>
>The attached file demonstrates this. I tested it on python 1.5.2 and
>2.1.1, using PyXML 0.6.6. 
>
>I'll register this in the bugtracker if it turns out to be a bug. 

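It's not related to xmlproc at all, but to "print", which uses "str", which
in turn uses the default encoding, US-ASCII. Encode the Unicode strings
explicitly before writing them out. BTW, you also need to set the namespace
feature; otherwise startElementNS is never called.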
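
The encoding point can be seen in isolation. What follows is only a sketch, written for a current Python 3 interpreter rather than the 1.5.2/2.1.1 used in the report; encode_for_output and ascii_fails are hypothetical helper names, not part of PyXML. It shows that the parsed text is ordinary Unicode, that encoding it as ASCII fails, and that an explicit target encoding works:

```python
# Sketch (Python 3). The sample text is the character data from the test
# document, written with escapes so the source file encoding does not matter.
text = "\xe9p\xe0"


def encode_for_output(s, encoding="iso-8859-1"):
    """Mirror the patch's write(): treat None as empty, encode explicitly."""
    return (s or "").encode(encoding) + b"\n"


def ascii_fails(s):
    """Return True if s cannot be encoded with the ASCII codec."""
    try:
        s.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False
```

The same Unicode string can be written as ISO-8859-1 or as UTF-8; the choice belongs to the code doing the output, not to the parser.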


--- xmlproctest.py      Wed Sep  5 18:10:39 2001
+++ xmlproctest2.py     Wed Sep  5 18:13:21 2001
@@ -1,23 +1,30 @@
 from xml import __version__
 print 'pyxml version',__version__
-from xml.sax import make_parser
+import sys
+from xml.sax import make_parser, handler
 from xml.sax.handler import ContentHandler
 from StringIO import StringIO

+def write(x):
+    sys.stdout.write((x or "").encode("ISO-8859-1") + "\n")
+
 class my_handler(ContentHandler):
     def startElementNS(self,element,qname,attr):
-        print element,qname
+        write(element[0])
+        write(element[1])
+        write(qname)
         print attr.items()

-
     def characters(self,buff):
-        print buff
+        write(buff)

 str = "<?xml version='1.0' encoding='iso-8859-1'?><élément attr='àèïôù'>épà</élément>"
 source = StringIO(str)

 p = make_parser("xml.sax.drivers2.drv_xmlproc")
+#p = make_parser("pirxx")
 print p
+p.setFeature(handler.feature_namespaces, 1)
 p.setContentHandler(my_handler())
 p.parse(source)
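
The namespace part of the patch can also be tried with the standard library alone. This is a rough sketch under stated assumptions: it targets Python 3 and the stdlib expat driver returned by make_parser() with no arguments, not xmlproc, and Collector and parse_latin1_sample are names made up for the illustration. With feature_namespaces switched on, the handler receives startElementNS calls whose name argument is a (namespace URI, local name) pair, and all strings arrive as Unicode regardless of the ISO-8859-1 input:

```python
import io
import xml.sax
from xml.sax import handler


class Collector(handler.ContentHandler):
    """Records namespace-aware element names, attributes and text."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.attributes = {}
        self.text = []

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) pair; the URI is None
        # because the sample document declares no namespace.
        self.elements.append(name)
        self.attributes.update(attrs.items())

    def characters(self, content):
        # May be called several times for one text node, so collect pieces.
        self.text.append(content)


def parse_latin1_sample():
    # Same document as in the test script, as ISO-8859-1 encoded bytes.
    doc = (
        "<?xml version='1.0' encoding='iso-8859-1'?>"
        "<\xe9l\xe9ment attr='\xe0\xe8\xef\xf4\xf9'>\xe9p\xe0</\xe9l\xe9ment>"
    ).encode("iso-8859-1")
    collector = Collector()
    parser = xml.sax.make_parser()  # stdlib expat driver (assumption)
    parser.setFeature(handler.feature_namespaces, True)
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(doc))
    return collector
```

Without the setFeature call the driver reports elements through startElement instead, so a handler that only defines startElementNS sees no element events at all.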