[New-bugs-announce] [issue8633] tarfile doesn't support undecodable filename in PAX format
report at bugs.python.org
Thu May 6 00:14:51 CEST 2010
New submission from STINNER Victor <victor.stinner at haypocalc.com>:
tarfile is unable to open a TAR archive in PAX format that contains invalid filenames (filenames that are not encoded in UTF-8 and are therefore undecodable). The attached file is an example: it contains the file b'z/\xff', which cannot be decoded as UTF-8.
The PAX specification has an "invalid" option with four values: bypass (default), rename, UTF-8, write.
As was done for the other formats in issue #8390, PAX can use Python's surrogateescape error handler to store undecodable bytes as Unicode surrogates.
I think that PAX should be strict by default, but have an option to enable surrogateescape mode.
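To illustrate the proposal, here is a minimal sketch of the two modes applied to the filename from the attached archive. The helper names decode_pax_filename and encode_pax_filename are hypothetical and are not part of tarfile; the sketch only assumes that the error handler would be passed through to bytes.decode() and str.encode().

```python
def decode_pax_filename(raw, errors="strict"):
    """Decode a PAX filename stored as UTF-8 (hypothetical helper).

    With errors="strict" an undecodable name raises UnicodeDecodeError.
    With errors="surrogateescape" each undecodable byte 0xNN is mapped
    to the lone surrogate U+DCNN, so no information is lost.
    """
    return raw.decode("utf-8", errors)


def encode_pax_filename(name, errors="strict"):
    """Encode a filename back to bytes (hypothetical helper)."""
    return name.encode("utf-8", errors)


raw = b"z/\xff"  # the filename from the attached z-pax.tar

# Strict mode (the proposed default): the name is rejected.
try:
    decode_pax_filename(raw)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Optional surrogateescape mode: the name is kept and round-trips.
name = decode_pax_filename(raw, errors="surrogateescape")
restored = encode_pax_filename(name, errors="surrogateescape")
```

In this sketch, name is 'z/\udcff' and restored equals the original bytes, so the archive member could be extracted under its original name on a filesystem that accepts arbitrary bytes.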
components: Library (Lib)
nosy: haypo, lars.gustaebel, loewis
title: tarfile doesn't support undecodable filename in PAX format
versions: Python 3.2
Added file: http://bugs.python.org/file17230/z-pax.tar
Python tracker <report at bugs.python.org>