[New-bugs-announce] [issue8633] tarfile doesn't support undecodable filename in PAX format
STINNER Victor
report at bugs.python.org
Thu May 6 00:14:51 CEST 2010
New submission from STINNER Victor <victor.stinner at haypocalc.com>:
tarfile is unable to open a TAR archive in PAX format that contains an invalid filename (a filename that is not encoded in UTF-8 and is therefore undecodable). The attached file is an example: it contains the file b'z/\xff', which is not decodable from UTF-8.
The PAX specification has an "invalid" option with 4 values: bypass (default), rename, UTF-8, write.
http://www.opengroup.org/onlinepubs/009695399/utilities/pax.html
As was done for the other formats in issue #8390, PAX can use Python's surrogateescape error handler to store undecodable bytes as Unicode surrogates.
I think that PAX should be strict by default, but have an option to enable surrogateescape mode.
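To illustrate the difference between the two modes, here is a minimal sketch using only the bytes/str codec machinery, not tarfile itself. The helper decode_name() is hypothetical and exists only for this example; it is not a tarfile function or a proposed API.

```python
# File name stored in the attached archive; the byte 0xff is not valid UTF-8.
raw = b"z/\xff"


def decode_name(data: bytes, errors: str = "strict") -> str:
    """Hypothetical helper (not part of tarfile): decode a stored file name."""
    return data.decode("utf-8", errors)


# Strict mode: the undecodable byte raises UnicodeDecodeError.
try:
    decode_name(raw)
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape mode: the byte 0xff is stored as the lone surrogate U+DCFF.
name = decode_name(raw, "surrogateescape")

# Encoding with the same error handler restores the original bytes.
roundtrip = name.encode("utf-8", "surrogateescape")

print(strict_ok, ascii(name), roundtrip)
```

With strict decoding the name is rejected, while surrogateescape keeps the original bytes recoverable, which is what would let such a filename be read and written back unchanged.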
----------
components: Library (Lib)
files: z-pax.tar
messages: 105094
nosy: haypo, lars.gustaebel, loewis
priority: normal
severity: normal
status: open
title: tarfile doesn't support undecodable filename in PAX format
versions: Python 3.2
Added file: http://bugs.python.org/file17230/z-pax.tar
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8633>
_______________________________________