[issue14246] Accelerated ETree XMLParser cannot handle io.StringIO

Eli Bendersky report at bugs.python.org
Tue Mar 13 09:24:20 CET 2012


Eli Bendersky <eliben at gmail.com> added the comment:

Stefan,

Thanks a lot for taking the time to review the patch. As you correctly say, the current patch's goal is just to align with the existing behavior of the Python implementation of ET.

I understand the problem you are describing, but at least it's not a regression vs. previous behavior, while the original problem this issue complains about *is* a regression.

I propose to commit this to fix the regression and open a separate issue with the insight you provided. One easy solution could be to just require the encoding to be UTF-8 when passing unicode to the module, and to document it explicitly. Another solution would be to actually fix it in the module itself.
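A minimal sketch of the first option, only to make it concrete. The helper name require_utf8_declaration and the regular expression are hypothetical, not part of the patch or of ET; the sketch assumes it is enough to inspect the XML declaration at the very start of the unicode input and to reject any declared encoding other than UTF-8.

```python
import re

# Hypothetical helper, not part of ElementTree or pyexpat.
# Matches an XML declaration at the start of the text and captures
# the value of its encoding pseudo-attribute, if there is one.
_DECL_ENCODING = re.compile(r'<\?xml\s[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')


def require_utf8_declaration(text):
    """Return text unchanged, or raise ValueError if it declares a non-UTF-8 encoding."""
    match = _DECL_ENCODING.match(text)
    if match is None:
        # No declaration, or no encoding in it: nothing to contradict.
        return text
    declared = match.group(1)
    if declared.lower() not in ("utf-8", "utf8"):
        raise ValueError(
            "unicode input declares encoding %r; only UTF-8 is accepted" % declared
        )
    return text


def accepts(text):
    """Small wrapper for demonstration: True if the check passes."""
    try:
        require_utf8_declaration(text)
    except ValueError:
        return False
    return True


print(accepts('<?xml version="1.0" encoding="utf-8"?><a/>'))
print(accepts('<?xml version="1.0"?><a/>'))
print(accepts('<?xml version="1.0" encoding="iso-8859-1"?><a/>'))
```

A check like this would have to run before the string reaches the parser, and the restriction would need to be documented explicitly.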

If there is a decision to fix it, the fix should cover both the C and Python implementations, in all affected places. Every function that reads XML from a string suffers from the same problem, since the string is passed to xmlparse_Parse in pyexpat, which just uses PyArg_ParseTuple with the "s#" format. That format encodes unicode as UTF-8 without looking at the encoding declared in the XML itself.
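To illustrate the mismatch, here is a small sketch that simulates the "s#" conversion by hand. It does not call the affected code path directly: it encodes a unicode document to bytes itself and hands the bytes to ET, on the assumption that for bytes input the parser honors the declared encoding. The sample document is made up for the illustration.

```python
import xml.etree.ElementTree as ET

# A unicode document that declares ISO-8859-1 and contains U+00E9.
doc = '<?xml version="1.0" encoding="iso-8859-1"?><a>\u00e9</a>'

# Simulate the "s#" conversion: the unicode string is encoded as UTF-8
# regardless of what the XML declaration says.
as_utf8 = doc.encode("utf-8")
mismatched = ET.fromstring(as_utf8).text

# Encoding with the declared encoding keeps bytes and declaration in sync.
as_declared = doc.encode("iso-8859-1")
matched = ET.fromstring(as_declared).text

print(ascii(mismatched))
print(ascii(matched))
```

In the first case the two UTF-8 bytes of U+00E9 are each decoded as a separate ISO-8859-1 character, so the element text no longer matches the input.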

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14246>
_______________________________________

