[issue19020] Regression: Windows-tkinter-idle, unicode, and 0xxx filename

Serhiy Storchaka report at bugs.python.org
Sun Nov 10 19:02:19 CET 2013


Serhiy Storchaka added the comment:

> I am also not clear on the relation between the UnicodeDecodeError and tuple splitting. Does '_flatten((self._w, cmd)))' call split or splitlist on the tuple arg? If so, do you know why a problem with that would lead to the UDError? Does your patch fix the leading '0' regression?

The traceback is misleading. The full statement is:

            for x in self.tk.split(
                    self.tk.call(_flatten((self._w, cmd)))):

Here cmd is ('entryconfigure', index). The UnicodeDecodeError was raised neither by _flatten() nor by call(), but by split().

When you run `./python -m idlelib.idle \\0.py`, call() returns and split() receives a tuple of tuples:

            (('-activebackground', '', '', '', ''), ('-activeforeground', '', '', '', ''), ('-accelerator', '', '', '', ''), ('-background', '', '', '', ''), ('-bitmap', '', '', '', ''), ('-columnbreak', '', '', 0, 0), ('-command', '', '', '', '3067328620open_recent_file'), ('-compound', 'compound', 'Compound', <index object: 'none'>, 'none'), ('-font', '', '', '', ''), ('-foreground', '', '', '', ''), ('-hidemargin', '', '', 0, 0), ('-image', '', '', '', ''), ('-label', '', '', '', '1 /home/serhiy/py/cpython/\\0.py'), ('-state', '', '', <index object: 'normal'>, 'normal'), ('-underline', '', '', -1, 0))

If you set wantobjects to 0 in Lib/tkinter/__init__.py, split() gets a string instead:

            r"{-activebackground {} {} {} {}} {-activeforeground {} {} {} {}} {-accelerator {} {} {} {}} {-background {} {} {} {}} {-bitmap {} {} {} {}} {-columnbreak {} {} 0 0} {-command {} {} {} 3067013228open_recent_file} {-compound compound Compound none none} {-font {} {} {} {}} {-foreground {} {} {} {}} {-hidemargin {} {} 0 0} {-image {} {} {} {}} {-label {} {} {} {1 /home/serhiy/py/cpython/\0.py}} {-state {} {} normal normal} {-underline {} {} -1 0}"

split() then tries to split its argument recursively. When it splits '1 /home/serhiy/py/cpython/\\0.py', it interprets '\\0' as a backslash substitution with octal code 0, i.e. the character with code 0. Tcl uses a modified UTF-8 encoding in which the null character is encoded as b'\xC0\x80'. This byte sequence is invalid UTF-8, which is why UnicodeDecodeError was raised (the patch for issue13153 handles b'\xC0\x80' more correctly). If you try '\101.py', split() will translate it to 'A.py'.
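
The chain of events can be reproduced without Tk. What follows is only a rough pure-Python sketch, not the actual Tcl or _tkinter code: tcl_octal_subst() and to_modified_utf8() are hypothetical helpers written for this illustration, and they emulate just the octal form of backslash substitution and the special encoding of the null character.

```python
import re


def tcl_octal_subst(s):
    """Approximate Tcl's backslash substitution for octal escapes only.

    Hypothetical helper: replaces a backslash followed by one to three
    octal digits with the character that has that code.
    """
    return re.sub(r'\\([0-7]{1,3})', lambda m: chr(int(m.group(1), 8)), s)


def to_modified_utf8(s):
    """Encode like UTF-8, but write the null character as C0 80.

    Hypothetical helper: only the NUL special case of modified UTF-8
    is emulated here.
    """
    return s.encode('utf-8').replace(b'\x00', b'\xc0\x80')


# The file name from the menu label, with a literal backslash and '0'.
label = r'/home/serhiy/py/cpython/\0.py'

# Step 1: the splitting code treats '\0' as an octal escape for NUL.
substituted = tcl_octal_subst(label)

# Step 2: the NUL is stored as the two-byte sequence C0 80.
raw = to_modified_utf8(substituted)

# Step 3: a strict UTF-8 decoder rejects that sequence.
try:
    raw.decode('utf-8')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(repr(substituted))
print(raw)
print(decode_failed)
print(tcl_octal_subst(r'\101.py'))
```

The last line shows the '\101.py' case: octal 101 is 65, the code of 'A', so the name silently becomes 'A.py' instead of raising an error.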

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19020>
_______________________________________