[Distutils] The Wheel specification and Unicode filenames
Daniel Holth
dholth at gmail.com
Thu Feb 21 16:27:12 CET 2013
On Thu, Feb 21, 2013 at 10:22 AM, Daniel Holth <dholth at gmail.com> wrote:
> On Thu, Feb 21, 2013 at 10:13 AM, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
>
>> The Wheel specification talks about supporting Unicode in the filename of
>> wheel files, but is silent on the subject of the names of the entries in
>> the archive.
>>
>> It would be good to have clarity on this point. The Python docs for 2.x
>> and 3.x tell us:
>>
>>     There is no official file name encoding for ZIP files. If you have
>>     unicode file names, you must convert them to byte strings in your
>>     desired encoding before passing them to write(). WinZip interprets
>>     all file names as encoded in CP437, also known as DOS Latin.
>>
>> The "your desired encoding" is, I think, too loose for wheel files, as we
>> want interoperability between implementations. We should mandate CP437
>> encoding if we want the files to be examinable on Windows in e.g. WinZip
>> or 7-Zip. On Linux, file-roller seems to be unable to display Unicode,
>> whether you use CP437 for the filenames or whether you use utf-8.
>>
>
> I feign ignorance of any encoding that is not utf-8.
> http://hg.python.org/cpython/file/d49685548a7a/Lib/zipfile.py#l404
>
> http://hg.python.org/cpython/file/d49685548a7a/Lib/zipfile.py#l1000
>
I will clarify the spec to name utf-8 as the encoding for the file names
inside the archive. The zip format allows it (set general purpose bit 11 to
mark a name as utf-8), but a lot of programs do not understand that flag.
Python's zipfile supports utf-8 names in zip files.
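
For anyone who wants to check the bit 11 behaviour, here is a minimal sketch,
assuming Python 3's zipfile. The helper name utf8_flags and the sample entry
names are made up for illustration and are not part of the spec. It writes a
few empty entries to an in-memory archive, reads it back, and reports which
entries have general purpose bit 11 (0x800) set.

```python
import io
import zipfile

# General purpose bit 11: the entry's file name is encoded as utf-8.
UTF8_FLAG = 0x800


def utf8_flags(names):
    """Write empty entries with the given names to an in-memory zip,
    then report for each name whether bit 11 is set in the archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    buf.seek(0)
    with zipfile.ZipFile(buf) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_FLAG)
            for info in zf.infolist()
        }


if __name__ == "__main__":
    # Hypothetical entry names, one pure ASCII and one not.
    for name, flagged in utf8_flags(["pkg/ascii.py", "pkg/caf\u00e9.py"]).items():
        print(ascii(name), "utf-8 flag set:", flagged)
```

As far as I can tell, zipfile only sets the flag when a name cannot be encoded
as ASCII, so pure ASCII names are written without it.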