[New-bugs-announce] [issue24514] tarfile fails to extract archive (handled fine by gnu tar and bsdtar)

Philippe report at bugs.python.org
Fri Jun 26 11:18:57 CEST 2015


New submission from Philippe:

The extraction fails when calling tarfile.open using this archive: http://archive.apache.org/dist/commons/logging/source/commons-logging-1.1.2-src.tar.gz

After some investigation: the file extracts fine with GNU tar and bsdtar, and the gzip compression is not the issue. If I gunzip the tar.gz to a plain tar and call tarfile on that, the problem is the same.
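The gzip-versus-plain comparison described above can be sketched as follows. This is only an illustration for Python 3 (it relies on `gzip.decompress`, so it does not run on 2.7); the helper names `open_result` and `compare_gz_and_plain` are made up for this example and are not part of tarfile.

```python
import gzip
import io
import tarfile


def open_result(fileobj, mode):
    """Try to open a tar stream and summarise the outcome as a string."""
    try:
        with tarfile.open(fileobj=fileobj, mode=mode) as tf:
            return "ok: %d members" % len(tf.getnames())
    except tarfile.TarError as exc:
        # ReadError and InvalidHeaderError both derive from TarError.
        return "%s: %s" % (type(exc).__name__, exc)


def compare_gz_and_plain(path):
    """Open the same archive once as .tar.gz and once as a plain .tar."""
    with open(path, "rb") as fh:
        compressed = fh.read()
    plain = gzip.decompress(compressed)
    return {
        "tar.gz": open_result(io.BytesIO(compressed), "r:gz"),
        "tar": open_result(io.BytesIO(plain), "r:"),
    }
```

If both entries report the same error, the compression layer can be ruled out and the problem lies in the tar headers themselves.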

This archive was most likely created on Windows (based on the `file` command output) using some Java tools, per http://commons.apache.org/proper/commons-logging/building.html, from these original files: http://svn.apache.org/repos/asf/commons/proper/logging/tags/LOGGING_1_1_2/ That's all I could find out.


The error traces on 2.7 and 3.4 differ slightly but are similar.
The problem has been verified on 64-bit Linux with Python 2.7 and 3.4, and on Windows with Python 2.7.

On 2.7:

>>> TarFile.taropen(name)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/tarfile.py", line 1705, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.7/tarfile.py", line 1574, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python2.7/tarfile.py", line 2335, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header


On 3.4:

>>> TarFile.taropen(name)
Traceback (most recent call last):
  File "/usr/lib/python3.4/tarfile.py", line 180, in nti
    n = int(nts(s, "ascii", "strict") or "0", 8)
ValueError: invalid literal for int() with base 8: '       '

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/tarfile.py", line 2248, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.4/tarfile.py", line 1083, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/usr/lib/python3.4/tarfile.py", line 1032, in frombuf
    obj.uid = nti(buf[108:116])
  File "/usr/lib/python3.4/tarfile.py", line 182, in nti
    raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/tarfile.py", line 1595, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.4/tarfile.py", line 1469, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python3.4/tarfile.py", line 2260, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header
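
The 3.4 traceback suggests that the uid field (`buf[108:116]`) contains only spaces, which the octal conversion rejects. The sketch below imitates that conversion on a hand-built header block. It is an approximation, not the tarfile source: `field_text`, `parse_octal_strict` and `parse_octal_lenient` are hypothetical names, and the blank-padded field is an assumption derived from the `ValueError` message above, not from inspecting the archive.

```python
# Position of the uid field in a 512-byte tar header, as in the traceback.
UID_FIELD = slice(108, 116)


def field_text(field):
    """Cut the field at the first NUL byte and decode it as ASCII."""
    return field.split(b"\0", 1)[0].decode("ascii", "strict")


def parse_octal_strict(field):
    """Imitate the expression in the traceback: blank-only text is rejected."""
    return int(field_text(field) or "0", 8)


def parse_octal_lenient(field):
    """Hypothetical tolerant variant: treat a whitespace-only field as 0."""
    return int(field_text(field).strip() or "0", 8)


# Assumed header: all NULs except a uid field of seven spaces plus a NUL.
header = bytearray(512)
header[UID_FIELD] = b"       \0"
uid_field = bytes(header[UID_FIELD])

try:
    parse_octal_strict(uid_field)
except ValueError as exc:
    print("strict:", exc)
print("lenient:", parse_octal_lenient(uid_field))
```

The strict variant fails because `int()` accepts surrounding whitespace only when at least one digit is present. Whether treating a blank field as 0 is the right behaviour for tarfile is a separate question; the sketch only shows where the two readings diverge.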

----------
components: Library (Lib)
files: commons-logging-1.1.2-src.tar.gz
messages: 245839
nosy: lars.gustaebel, pombreda
priority: normal
severity: normal
status: open
title: tarfile fails to extract archive (handled fine by gnu tar and bsdtar)
versions: Python 2.7, Python 3.4
Added file: http://bugs.python.org/file39814/commons-logging-1.1.2-src.tar.gz

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24514>
_______________________________________

