[issue20329] zipfile.extractall fails in Posix shell with utf-8 filename

Laurent Mazuel report at bugs.python.org
Wed Jan 22 08:39:33 CET 2014


Laurent Mazuel added the comment:

Thanks for your answer.

I don't think you can transcode the internal zip filenames to the FS encoding. On Unix, the FS only stores bytes for a filename; there is no "FS encoding". So if you change your locale, the filename printed in your console changes too. If you transcode the filename using the current locale, unzipping the same file twice under two different locales will produce two different files, which is not (I think) what you intend.
The problem does not arise on Windows (NTFS is UTF-16) or Mac OS X (UTF-8).
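
To make the point concrete, here is a small sketch of what transcoding would do. The file name and the two encodings are made up for illustration; they only stand in for "the same archive member extracted under two different locales".

```python
# Hypothetical member name, as decoded from the archive.
name = "caf\u00e9.txt"

# Bytes that would end up on disk if the name were transcoded
# to the encoding of whichever locale is active at extraction time.
as_utf8 = name.encode("utf-8")
as_latin1 = name.encode("iso-8859-1")

print(as_utf8)    # b'caf\xc3\xa9.txt'
print(as_latin1)  # b'caf\xe9.txt'

# A Unix FS compares names byte by byte, so these are two distinct files.
print(as_utf8 == as_latin1)  # False
```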

Moreover, a simple "unzip" works like a charm. It doesn't care about the encoding or the current locale, and extracts the file using the original bytes stored in the zip. Unzipping twice under two different locales creates only one file.
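
A rough sketch of the same idea in Python, assuming Python 3 on a POSIX system: recover the stored name bytes and use them directly as the path, so the locale never enters the picture. The helpers `raw_name` and `extract_raw` are hypothetical names for this illustration, not part of zipfile, and this is not a claim about how "unzip" is implemented. The sketch relies on zipfile decoding names as UTF-8 when flag bit 0x800 is set and as cp437 otherwise, and it flattens member paths to their base name to keep the example short.

```python
import io
import os
import tempfile
import zipfile


def raw_name(info):
    """Re-encode a member name with the codec zipfile used to decode it."""
    codec = "utf-8" if info.flag_bits & 0x800 else "cp437"
    return info.filename.encode(codec)


def extract_raw(zf, dest):
    """Write each member into dest (a bytes path) under its stored name bytes."""
    written = []
    for info in zf.infolist():
        if info.is_dir():
            continue
        # Keep only the last path component; no directory handling here.
        name = os.path.basename(raw_name(info))
        with zf.open(info) as src, open(os.path.join(dest, name), "wb") as out:
            out.write(src.read())
        written.append(name)
    return written


# Build a small archive in memory with a non-ASCII member name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("caf\u00e9.txt", b"hello")

# Extract it using bytes paths only.
dest = os.fsencode(tempfile.mkdtemp())
with zipfile.ZipFile(buf) as zf:
    written = extract_raw(zf, dest)

print(written)  # [b'caf\xc3\xa9.txt']
```

Because the target path is built from bytes, running this under any locale gives the same file name on disk.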

An interesting link (even if it is not an official reference):
http://unix.stackexchange.com/questions/2089/what-charset-encoding-is-used-for-filenames-and-paths-on-linux

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20329>
_______________________________________

