[issue9167] argv double encoding on OSX
Ronald Oussoren
report at bugs.python.org
Fri Jul 23 16:17:24 CEST 2010
Ronald Oussoren <ronaldoussoren at mac.com> added the comment:
Daniele: which version of OSX do you use? And if you use OSX 10.5 or 10.6, which is your system language according to System Preferences (the topmost entry in the list in the "Language and Text" preference pane, whose icon looks a little like a UN flag)?
I can only reproduce this by explicitly setting LANG=C before running the test on OSX 10.6 (with English as the main language).
This may be very hard to fix. What happens is that subprocess.Popen converts the argument array into the filesystem encoding (which on OSX is always UTF-8). The argv decoder then decodes the arguments using the encoding specified in LANG, which on your system is different from UTF-8. This results in a string where each byte in the UTF-8 encoding of snowman is represented as a single character. Those characters are then encoded as UTF-8 by the test, and that results in the error you're seeing.
That is, the output looks like the output of this code:
>>> snowman = '\u2603'
>>> snowman.encode('utf-8').decode('latin1').encode('utf-8')
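To make the round trip easier to follow, here is a minimal sketch of the same mix-up as a standalone script. The helper names double_encode and undo_double_encode are hypothetical, and latin-1 only stands in for whatever encoding LANG selects; the actual decoder used for argv may behave differently.

```python
def double_encode(text, wrong_encoding="latin-1"):
    """Simulate the mix-up: UTF-8 bytes misread with another codec, then re-encoded.

    wrong_encoding is an assumption standing in for the locale encoding.
    """
    raw = text.encode("utf-8")            # what subprocess.Popen passes along
    misread = raw.decode(wrong_encoding)  # one character per UTF-8 byte
    return misread.encode("utf-8")        # what the test ends up comparing


def undo_double_encode(data, wrong_encoding="latin-1"):
    """Reverse the mix-up, assuming wrong_encoding maps every byte to one character."""
    return data.decode("utf-8").encode(wrong_encoding).decode("utf-8")


snowman = "\u2603"
mangled = double_encode(snowman)
print(mangled)                                 # b'\xc3\xa2\xc2\x98\xc2\x83'
print(undo_double_encode(mangled) == snowman)  # True
```

The three UTF-8 bytes of the snowman (E2 98 83) each become a separate character, so the re-encoded result is six bytes long instead of three.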
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9167>
_______________________________________