[Tutor] Anyone using tarfile?

Terry Carroll carroll at tjc.com
Tue Jul 15 09:21:23 CEST 2008


I'm trying to use tarfile with no luck.  Anyone on this list used it 
successfully?

Here's a sample program pared down to illustrate the error.  I'm 
arbitrarily trying to extract the 4th TARred file in the tarball (a file 
that I know from other debugging efforts is data/c410951c, and that I can 
see by inspection does exist in the tarball). 

My code (file playtar.py):

import tarfile, os
TARFILENAME = "freedb-update-20080601-20080708.tar.bz2"
assert os.path.exists(TARFILENAME)
assert tarfile.is_tarfile(TARFILENAME)

tf = tarfile.open(TARFILENAME, "r:bz2")
tf.debug=3 ; tf.errorlevel=2
tmembers = tf.getmembers()
sample = tmembers[4]
RC = tf.extract(sample)

The result:

C:\test\freedb>playtar.py
Traceback (most recent call last):
  File "C:\test\freedb\playtar.py", line 10, in <module>
    RC = tf.extract(sample)
  File "C:\Python25\lib\tarfile.py", line 1495, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  File "C:\Python25\lib\tarfile.py", line 1562, in _extract_member
    if upperdirs and not os.path.exists(upperdirs):
  File "C:\Python25\lib\ntpath.py", line 255, in exists
    st = os.stat(path)
TypeError: stat() argument 1 must be (encoded string without NULL bytes), not str

The file comes from here: http://ftp.freedb.org/pub/freedb/

The bzip2 compression is unrelated to this.  If I manually bunzip the .bz2 
file to a plain old tar file (and open it with mode "r" instead of 
"r:bz2"), I get an identical error.

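During some earlier poking around, I saw some interesting things: instead 
of sample.name being "data/c410951c" (13 characters) or "/data/c410951c" 
(14 characters) as I would expect, it's a 169-character string that starts 
with a long run of what look like blanks and ends with 
"11034624707 11032232071 /data/c410951c".  I think those leading 
characters are zero (NUL) bytes, not blanks, which would fit the "NULL 
bytes" complaint in the traceback.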
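
One way to tell NUL characters from blanks is to look at repr() of the
name rather than printing it.  The following is only a minimal sketch: it
is written for a current Python 3 rather than the Python 2.5 shown in the
traceback, describe_name is a hypothetical helper, and it builds a tiny
in-memory tarball as a stand-in so that it runs without the freedb file.

```python
import io
import tarfile


def describe_name(name):
    """Return (length, count of NUL characters, repr) for a member name."""
    return len(name), name.count("\0"), repr(name)


# Build a tiny in-memory tarball (a stand-in for the real freedb tarball).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    payload = b"hello"
    info = tarfile.TarInfo("data/c410951c")
    info.size = len(payload)
    tf.addfile(info, io.BytesIO(payload))
buf.seek(0)

# Read it back and show each member name unambiguously.
with tarfile.open(fileobj=buf, mode="r") as tf:
    for member in tf.getmembers():
        print(describe_name(member.name))
```

Pointing the same loop at the real tarball should show whether the
padding in sample.name is made of '\x00' characters or of spaces.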

Curiously, there is also another attribute named "prefix" that is not
documented in the tarfile.py documentation.  "prefix" is a 155-character
string that is equal to the first 155 characters of this oddly-too-long
name.  In fact, if you cut off this "prefix" from the name, you're left
with "/data/c410951c", which is kind of what I was expecting name to be in
the first place.
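
The relationship between "prefix" and the name can be sketched as below.
This is an illustration only, not a fix for the extraction error: it is
written for Python 3, strip_prefix is a hypothetical helper, and the
prefix value is a made-up reconstruction (131 NUL characters followed by
the two numbers), since I have not confirmed the exact layout of the real
155 characters.

```python
def strip_prefix(name, prefix):
    """Drop a leading `prefix` from `name`, then remove any NUL characters."""
    if prefix and name.startswith(prefix):
        name = name[len(prefix):]
    return name.replace("\0", "")


# Hypothetical stand-ins for the values seen while debugging.
prefix = "\0" * 131 + "11034624707 11032232071 "  # 155 characters
name = prefix + "/data/c410951c"                  # 169 characters

print(len(prefix), len(name), strip_prefix(name, prefix))
```

With these stand-in values, the stripped result is the 14-character
"/data/c410951c" that I expected the name to be.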

The deeper I look into this, the more mystified I become.  Any ideas?


