zipfile with umlaut in filename

Martin v. Löwis martin at v.loewis.de
Mon Apr 28 16:34:26 EDT 2003


Patrick Useldinger <uselpa at myrealbox.com> writes:

> Well, doing the same thing with WinZip works ok and keeps the name as
> it should.
> Also, reading *that* archive back in Python yields, as a result of
> namelist():
> ['Der h\x94fische Charakter der Liebe.doc']
> for a file called
> 'Der höfische Character der Liebe'.

And why would you think that \x94 is "o-umlaut"?

It appears that Python puts the non-ASCII characters into the zipfile
as the system reports them (i.e. in the ANSI code page of your
system); WinZip apparently converts them to some other encoding first,
probably code page 437 or code page 850.
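
The byte values can be checked directly. The following is a small
sketch in Python 3 syntax (which postdates this thread), using only the
standard codecs. It assumes cp1252 is the ANSI code page in question;
the variable names are just for illustration.

```python
# Interpret the single byte 0x94 under the code pages discussed above.
raw = b"\x94"

oem_437 = raw.decode("cp437")      # o-umlaut in code page 437
oem_850 = raw.decode("cp850")      # o-umlaut in code page 850 as well
ansi_1252 = raw.decode("cp1252")   # a right double quotation mark, not o-umlaut

# Going the other way: o-umlaut (U+00F6) is a different byte in cp1252.
ansi_byte = "\u00f6".encode("cp1252")

print(repr(oem_437), repr(oem_850), repr(ansi_1252), repr(ansi_byte))
```

So \x94 is o-umlaut only if the name is read as cp437 or cp850; read as
cp1252, the same byte is a quotation mark.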

Neither approach is more correct than the other - they are just
incompatible, and the zip format specification does not specify a
right approach. Non-ASCII file names are simply not covered by the
format (WinZip and zipfile.py both accept them, but each stores them
in a different encoding).

To get the effect of WinZip in Python when adding a file with name N
(assuming your ANSI code page is cp1252), do

  zfile.write(N, unicode(N, "cp1252").encode("cp437"))

and then try to open the zipfile with WinZip.
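
To see what that conversion does to the bytes of the name, here is a
minimal sketch in Python 3 syntax. The helper name ansi_to_oem is made
up for illustration, and the cp1252/cp437 defaults are the same
assumption as in the one-liner above; it only transcodes the name and
does not touch a zipfile.

```python
def ansi_to_oem(raw_name, ansi="cp1252", oem="cp437"):
    """Transcode a file name from the ANSI code page to the OEM code page.

    raw_name is the name as bytes in the ANSI code page (assumed cp1252).
    Raises UnicodeEncodeError if a character has no equivalent in the
    OEM code page.
    """
    return raw_name.decode(ansi).encode(oem)


# The file name from the original report, as cp1252 bytes (o-umlaut is 0xF6).
ansi_name = b"Der h\xf6fische Charakter der Liebe.doc"
oem_name = ansi_to_oem(ansi_name)
print(oem_name)
```

The result has \x94 where the cp1252 name had \xf6, which matches what
namelist() returned for the archive written by WinZip.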

HTH,
Martin




