[Tutor] Anyone using tarfile?
Terry Carroll
carroll at tjc.com
Tue Jul 15 09:21:23 CEST 2008
I'm trying to use tarfile with no luck. Anyone on this list used it
successfully?
Here's a sample program pared down to illustrate the error. I'm
arbitrarily trying to extract the member at index 4 of the tarball (a file
that I know from other debugging efforts is data/c410951c, and that I can
see by inspection does exist in the tarball).
My code (file playtar.py):
import tarfile, os
TARFILENAME = "freedb-update-20080601-20080708.tar.bz2"
assert os.path.exists(TARFILENAME)
assert tarfile.is_tarfile(TARFILENAME)
tf = tarfile.open(TARFILENAME, "r:bz2")
tf.debug = 3
tf.errorlevel = 2
tmembers = tf.getmembers()
sample = tmembers[4]
RC = tf.extract(sample)
The result:
C:\test\freedb>playtar.py
Traceback (most recent call last):
  File "C:\test\freedb\playtar.py", line 10, in <module>
    RC = tf.extract(sample)
  File "C:\Python25\lib\tarfile.py", line 1495, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "C:\Python25\lib\tarfile.py", line 1562, in _extract_member
    if upperdirs and not os.path.exists(upperdirs):
  File "C:\Python25\lib\ntpath.py", line 255, in exists
    st = os.stat(path)
TypeError: stat() argument 1 must be (encoded string without NULL bytes), not str
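The wording of that TypeError hints that the path reaching os.stat()
contains a NUL byte. Here is a minimal sketch for checking that behavior in
isolation. The helper name stat_rejects_nul and the sample paths are made up
for illustration. Newer Python versions report the problem as ValueError
rather than TypeError, so the sketch catches both.

```python
import os


def stat_rejects_nul(path):
    """Return True if os.stat() refuses `path` outright (as opposed to
    simply not finding it). `path` here is an illustrative string only."""
    try:
        os.stat(path)
    except (TypeError, ValueError):
        # Python 2.5 raises TypeError for an embedded NUL byte;
        # Python 3 raises ValueError for the same condition.
        return True
    except OSError:
        # An ordinary "no such file" style failure: the path was accepted.
        return False
    return False


if __name__ == "__main__":
    print(stat_rejects_nul("data\0c410951c"))
    print(stat_rejects_nul("no-such-file-for-this-demo"))
```

If the first call reports True while the second reports False, the failure
is about the contents of the path string rather than about a missing file.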
The file comes from here: http://ftp.freedb.org/pub/freedb/
The bzip2 compression is unrelated to this. If I manually bunzip the .bz2
file to a plain old tar file (and open it with mode "r" instead of
"r:bz2"), I get an identical error.
During some earlier poking around, I saw something interesting: instead
of sample.name being "data/c410951c" (13 characters) or "/data/c410951c"
(14 characters) as I would expect, it's a 169-character string: "
11034624707 11032232071 /data/c410951c". I think the gaps in it are NUL
(zero) bytes, not blanks.
Curiously, there is also another attribute named "prefix" that is not
documented in the tarfile documentation. "prefix" is a 155-character
string that is equal to the first 155 characters of this oddly-too-long
name. In fact, if you cut off this "prefix" from the name, you're left
with "/data/c410951c", which is kind of what I was expecting name to be in
the first place.
The deeper I look into this, the more mystified I become. Any ideas?