[Python-Dev] tarfile and unicode filenames in windows
Facundo Batista
facundobatista at gmail.com
Thu Jun 8 21:11:06 CEST 2006
I'm working on Windows 2K SP4. I have a directory containing files
with non-ASCII names (e.g. "camión.txt").
I'm trying to tar and bzip2 it:
  import sys
  import tarfile

  nomdir = sys.argv[1]  # directory to archive, as a byte string
  tar = tarfile.open("prueba.tar.bz2", "w:bz2")
  tar.add(nomdir)
  tar.close()
This works fine, even though the "ó" in the filename is not 7-bit
ASCII.
But then I put a file with a stranger name in that directory (one
containing an "o" with a macron above it): Myō-ō.txt
Here, tarfile can't find the file. This is the same limitation as
with listdir(), where I have to pass the directory name as unicode
for the system to be able to find it. So:
  import sys
  import tarfile

  nomdir = unicode(sys.argv[1])  # same script, but with a unicode path
  tar = tarfile.open("prueba.tar.bz2", "w:bz2")
  tar.add(nomdir)
  tar.close()
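A minimal sketch of the listdir() behaviour mentioned above. It is
written for a current Python 3 interpreter (an assumption; the post
uses Python 2.4), where str corresponds to 2.4's unicode type and
bytes to its plain str type. The helper name list_names and the
temporary directory are hypothetical, used only for illustration. The
point it shows is that the type of the path passed in decides the type
of the names that come back:

```python
import os
import tempfile


def list_names(directory):
    """Return (text_names, byte_names) for the same directory."""
    # A text path yields text names.
    text_names = os.listdir(directory)
    # A bytes path yields bytes names, encoded with the filesystem encoding.
    byte_names = os.listdir(os.fsencode(directory))
    return text_names, byte_names


with tempfile.TemporaryDirectory() as tmp:
    # Create an empty file named "Myō-ō.txt" (ō is U+014D).
    open(os.path.join(tmp, "My\u014d-\u014d.txt"), "w").close()
    text_names, byte_names = list_names(tmp)
```

With a text path the macron survives intact in the returned name; with
a bytes path the result depends on how the platform encodes filenames.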
The problem is that when tarfile finds that name, it crashes:
Traceback (most recent call last):
  File "comprim.py", line 8, in ?
    tar.add(nomdir)
  File "C:\python24\lib\tarfile.py", line 1239, in add
    self.add(os.path.join(name, f), os.path.join(arcname, f))
  File "C:\python24\lib\tarfile.py", line 1232, in add
    self.addfile(tarinfo, f)
  File "C:\python24\lib\tarfile.py", line 1297, in addfile
    self.fileobj.write(tarinfo.tobuf())
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 8: ordinal not in range(128)
This is because tarinfo.tobuf() returns a unicode object (because it
contains the filename), and file.write() needs a standard byte string,
so the unicode object gets encoded with the default 'ascii' codec.
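The mechanism can be sketched as follows, again assuming a current
Python 3 interpreter rather than the 2.4 used in the post. The header
value is a hypothetical stand-in for the text that tobuf() returned,
and "utf-8" is an assumed choice of encoding, not something tarfile is
known to use here. The sketch shows that the ASCII encoding which
Python 2 attempted implicitly fails on "ó", while encoding explicitly
before writing to a binary file object succeeds:

```python
import io

# Hypothetical stand-in for a header that contains a non-ASCII filename.
header = "dir/cami\u00f3n.txt"

# Python 2 tried this conversion implicitly inside file.write().
try:
    header.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    bad_char = exc.object[exc.start]  # the character that could not be encoded

# Encoding explicitly yields bytes that a binary file object accepts.
buf = io.BytesIO()
buf.write(header.encode("utf-8"))  # "utf-8" is an assumption
written = buf.getvalue()
```

Which encoding is appropriate for names stored in the archive is a
separate question that this sketch does not settle.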
Is this a known problem? Should I file a bug? I couldn't find any
report about this, and Google didn't help here.
Thank you very much!
--
. Facundo
Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/