[New-bugs-announce] [issue13664] UnicodeEncodeError in gzip when filename contains non-ascii

Jason R. Coombs report at bugs.python.org
Mon Dec 26 16:55:32 CET 2011

New submission from Jason R. Coombs <jaraco at jaraco.com>:

While investigating #11638, I encountered another encoding issue related to tarballs. Consider this command:

python -c "import gzip; gzip.GzipFile(u'\xe5rchive', 'w', fileobj=open(u'\xe5rchive', 'wb'))"

When run, it triggers the following traceback:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\python\lib\gzip.py", line 127, in __init__
  File "c:\python\lib\gzip.py", line 172, in _write_gzip_header
    self.fileobj.write(fname + '\000')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe5' in position 0: ordinal not in range(128)
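
The error appears to come from the implicit ASCII encoding that Python 2 applies when a unicode string is written to a binary file object. The snippet below is a minimal sketch that makes that encoding step explicit; it does not call gzip, and the variable names are illustrative only.

```python
# Illustrative only: make the implicit ASCII encoding explicit.
# The header write concatenates the filename with a NUL terminator.
fname = u'\xe5rchive'
error = None
try:
    (fname + u'\000').encode('ascii')
except UnicodeEncodeError as exc:
    # Same failure as in the traceback: position 0 is outside range(128).
    error = exc
```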

Based on the resolution of #13639, I believe the recommended fix is to handle unicode here much like Python 3 does: detect unicode, encode it to 'latin-1' if possible, and leave the filename blank if not.
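
A rough sketch of that approach follows. The helper name `gzip_header_fname` is hypothetical and not part of gzip.py. It assumes the header field should hold the base name with any trailing `.gz` removed, and it is not an actual patch.

```python
import os


def gzip_header_fname(name):
    """Return the bytes to store as the filename in the gzip header.

    Hypothetical helper: unicode names are encoded to latin-1 when
    possible; otherwise the filename field is left blank.
    """
    fname = os.path.basename(name)
    if not isinstance(fname, bytes):
        try:
            fname = fname.encode('latin-1')
        except UnicodeEncodeError:
            # Not representable in latin-1: write an empty filename.
            return b''
    if fname.endswith(b'.gz'):
        fname = fname[:-3]
    return fname
```

With this sketch, `u'\xe5rchive'` would be stored as the single latin-1 byte `\xe5` followed by `rchive`, while a name containing characters outside latin-1 would yield an empty filename field instead of raising.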

messages: 150265
nosy: jason.coombs
priority: low
severity: normal
status: open
title: UnicodeEncodeError in gzip when filename contains non-ascii
versions: Python 2.7

Python tracker <report at bugs.python.org>
