[Python-Dev] Import and unicode: part two

Glenn Linderman v+python at g.nevcal.com
Thu Jan 20 21:59:15 CET 2011


On 1/20/2011 12:27 PM, Glyph Lefkowitz wrote:
> To support the latter, could we just make sure that zipimport has a
> consistent, non-locale-or-operating-system-dependent interpretation of
> encoding?  That way a distributed egg would be importable from a zipfile
> regardless of how screwed up the distribution target machine's
> filesystem is.  (And this is yet more motivation for distributors to set
> zip_safe=True.)

I guess zip_safe is a distutils thing, and I haven't (yet) used distutils.

But regarding zip files, I was trying to figure out whether the zipfile
module supports the CP437/UTF-8 flag, but its documentation seems to
predate that concept and just talks about unencoded byte streams.  Yet I
think I have Python 3 code that passes str for the filenames, and that
works, so some amount of encoding and decoding must be happening behind
the documentation's back?

It does seem that if a zip file is created with the UTF-8 flag turned
on, Python should respect that, independent of the file system encoding
configured on the local machine on which the zip file is used (as long
as the name of the zip file itself is usable).
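
A minimal sketch of how one might check this, assuming the behaviour of
the zipfile module in current CPython 3 releases, where a str name that
is not pure ASCII appears to be stored as UTF-8 with general purpose
bit 11 (0x800) set.  The helper name_flags and the constant UTF8_FLAG
are made up for illustration and are not part of zipfile.

```python
import io
import zipfile

# General purpose bit 11 of the ZIP format: "names are UTF-8".
# UTF8_FLAG is a name chosen for this sketch, not a zipfile constant.
UTF8_FLAG = 0x800


def name_flags(names):
    """Write empty members with the given names to an in-memory zip,
    read the archive back, and report for each member name whether
    the UTF-8 flag is set on it."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    with zipfile.ZipFile(buf) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


# One pure-ASCII name and one name containing U+00E9.
flags = name_flags(["plain.txt", "caf\u00e9.txt"])
for name, is_utf8 in flags.items():
    print(name, is_utf8)
```

Because the archive lives in a BytesIO buffer, the local file system
encoding plays no part in this round trip.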

I do know that listing filenames from a zip file created without the
UTF-8 flag, using ZipFile to access it and placing the names inside a
web page that specifies its encoding to be UTF-8, produces illegal
characters.  So I've recently become tuned in to the fact that zip files
do have such a flag, and have been learning the right options to turn it
on for the command line tools I use to create zip files... but was
surprised when investigating the same for ZipFile.


