zipfile module: problems with filename having non ascii characters

vincent_delft at yahoo.com
Sun Aug 22 16:36:04 CEST 2004


"Martin v. Löwis" wrote:

> vincent_delft at yahoo.com wrote:
>> Does that limitation apply only to zip files?
> 
> It appears that WinZip and other tools interpret the file names in a
> zipfile in CP437. So to properly put non-ASCII file names into a
> zipfile, you need to convert them into CP437. If the file name
> contains a character which is not available in CP437, you cannot
> save the file in a zipfile (without renaming it).
> 

Thanks, with cp437 it rocks!!!!
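
For the archives, here is a minimal sketch of the check described above: try to encode the file name as CP437 and treat a failure as "cannot be stored without renaming". The helper name to_cp437 is my own invention, not part of the zipfile module, and the sketch only covers the encoding step, not the writing of the archive itself.

```python
def to_cp437(name):
    """Return the file name as CP437 bytes, or None if it cannot be represented.

    Hypothetical helper for illustration only; it is not part of zipfile.
    """
    try:
        return name.encode("cp437")
    except UnicodeEncodeError:
        # At least one character has no CP437 equivalent,
        # so the file would have to be renamed first.
        return None


if __name__ == "__main__":
    for candidate in ["résumé.txt", "Löwis.txt", "price_€.txt"]:
        encoded = to_cp437(candidate)
        if encoded is None:
            print(candidate, "-> not representable in CP437")
        else:
            print(candidate, "->", encoded)
```

Names with accented Western European letters usually pass; something like the euro sign does not, because CP437 has no such character.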


> Not really a Unicode problem, but rather a problem that Unicode
> tries to solve.
> 
>> Is there another "compression tool" that doesn't have such a limitation
>> (tgz? bz2? ???)
> 
> tar, traditionally, is also unaware of character sets. Single Unix 3
> (and I believe also earlier) ended the tar wars with the introduction
> of the pax utility, which does allow for specification of a character
> set in a pax file; among the supported character sets are ISO-8859-n,
> and UTF-8.

Thanks for the info.
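
A rough sketch of the pax route, assuming a Python whose standard tarfile module offers the PAX_FORMAT constant (nothing in this thread says that is what anyone here is running). It writes one member with a non-ASCII name into an in-memory pax archive and reads the names back. The function name roundtrip_name is hypothetical.

```python
import io
import tarfile


def roundtrip_name(name, payload=b"hello"):
    """Store one member under `name` in a pax-format tar and list the names.

    Hypothetical helper for illustration; the archive lives in memory only.
    """
    buf = io.BytesIO()
    # The pax format records names that do not fit the classic header
    # in an extended header encoded as UTF-8.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return tar.getnames()


if __name__ == "__main__":
    print(roundtrip_name("Jörg_€.txt"))
```

Unlike the CP437 case, a name containing the euro sign should survive here, since the extended header is not limited to a single 8-bit code page.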

> 
> Jörg Schilling's star(1) also uses UTF-8 for file names.
> 
> On the non-tar side of the world, WinRAR supports Unicode in archives.
> For compatibility, they also put a non-Unicode name into the archive,
> but the Unicode name, if present, is meant to take precedence.
> 

So that looks like the most "portable" compression tool for such file names.

Thanks for those valuable remarks.

Vincent


