[New-bugs-announce] [issue17153] tarfile extract fails when Unicode in pathname
Vinay Sajip
report at bugs.python.org
Thu Feb 7 17:43:21 CET 2013
New submission from Vinay Sajip:
The attached file failing.tar.gz contains a path with UTF-8-encoded Unicode. This causes extractall() to fail, but only when the destination path is Unicode. That's because joining the Unicode destination with the byte-string member name leads to an implicit str->unicode conversion using ASCII.
Test script:
import shutil, tarfile, tempfile
tf = tarfile.open('failing.tar.gz', 'r:gz')
workdir = tempfile.mkdtemp()
try:
    # N.B. ensure dest path is Unicode to trigger the failure
    tf.extractall(unicode(workdir))
finally:
    shutil.rmtree(workdir)
Result:
$ python untar.py
Traceback (most recent call last):
  File "untar.py", line 8, in <module>
    tf.extractall(unicode(workdir))
  File "/usr/lib/python2.7/tarfile.py", line 2046, in extractall
    self.extract(tarinfo, path)
  File "/usr/lib/python2.7/tarfile.py", line 2083, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "/usr/lib/python2.7/posixpath.py", line 71, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 44: ordinal not in range(128)
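The failing join in the traceback above can be illustrated in isolation. The following is only a rough sketch, not taken from the report: the member name and the destination path are made-up values, and the ASCII decode is written out explicitly to mimic what Python 2 does implicitly when a unicode object is concatenated with a byte string. Written that way, it should run the same under Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical member name: UTF-8 bytes, as a tar archive might store them.
member_name = b'caf\xc3\xa9.txt'


def decode_like_implicit_conversion(raw):
    """Mimic Python 2's implicit str->unicode coercion, which uses ASCII."""
    return raw.decode('ascii')


try:
    decode_like_implicit_conversion(member_name)
    ascii_failed = False
    failing_byte_position = None
except UnicodeDecodeError as exc:
    # Same error class as in the traceback; exc.start is the offending offset.
    ascii_failed = True
    failing_byte_position = exc.start

# Decoding explicitly with UTF-8 (the encoding the report says the path uses)
# gives a unicode name that joins cleanly with a unicode destination.
decoded_name = member_name.decode('utf-8')
joined = u'/tmp/workdir' + u'/' + decoded_name  # hypothetical destination
```

The sketch only shows why the ASCII step fails on byte 0xc3 and that an explicit UTF-8 decode avoids it. It does not say where such a decode would belong in tarfile itself.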
----------
components: Library (Lib), Unicode
messages: 181631
nosy: ezio.melotti, vinay.sajip
priority: normal
severity: normal
status: open
title: tarfile extract fails when Unicode in pathname
type: behavior
versions: Python 2.7
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17153>
_______________________________________