[Python-Dev] bug in tarfile module?
R. David Murray
rdmurray at bitdance.com
Thu Aug 23 14:40:21 CEST 2012
On Thu, 23 Aug 2012 12:25:26 +0300, Petri Lehtinen <petri at digip.org> wrote:
> Chris Withers wrote:
> > Hi All,
> >
> > This feels like a bug, but just wanted to check here before filing a
> > report if I've missed something:
> >
> > buzzkill$ python2.7
> > Enthought Python Distribution -- www.enthought.com
> > Version: 7.2-2 (32-bit)
> >
> > Python 2.7.2 |EPD 7.2-2 (32-bit)| (default, Sep 7 2011, 09:16:50)
> > [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
> > Type "packages", "demo" or "enthought" for more information.
> > >>> import tarfile
> > >>> source = open('/src/Python-2.6.7.tgz', 'rb')
> > >>> tar = tarfile.open(fileobj=source, mode='r|*')
> > >>> member = tar.extractfile('Python-2.6.7/Lib/genericpath.py')
> > >>> data = member.read()
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/tarfile.py", line 815, in read
> >     buf += self.fileobj.read()
> >   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/tarfile.py", line 735, in read
> >     return self.readnormal(size)
> >   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/tarfile.py", line 742, in readnormal
> >     self.fileobj.seek(self.offset + self.position)
> >   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/tarfile.py", line 554, in seek
> >     raise StreamError("seeking backwards is not allowed")
> > tarfile.StreamError: seeking backwards is not allowed
> >
> > The key is the "mode='r|*'", which I understood to be specifically
> > for reading blocks from a stream, without the seeking that would
> > cause problems.
>
> When discussing "filemode|[compression]" modes, the docs say:
>
> However, such a TarFile object is limited in that it does not
> allow to be accessed randomly
>
> I'm not a tarfile expert, but extracting a single file sounds like
> random access to me. If it was the first file in the archive (or there
> was only one file) it probably wouldn't count as random access.
There is an open doc bug for this:
http://bugs.python.org/issue10436
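
For what it's worth, stream mode does work as long as the archive is
read strictly front to back. Below is a minimal sketch, assuming the
usual pattern of iterating over the TarFile and reading each member as
it comes up. The helper names read_member_from_stream and
build_sample_archive are hypothetical, and the sample archive is made
up for illustration:

```python
import io
import tarfile


def read_member_from_stream(fileobj, wanted):
    """Return the bytes of member `wanted`, reading the stream forward only."""
    tar = tarfile.open(fileobj=fileobj, mode='r|*')
    try:
        for info in tar:
            if info.name == wanted:
                # Read the data now, while the stream is positioned at this
                # member, before the iterator advances to the next header.
                return tar.extractfile(info).read()
    finally:
        tar.close()
    raise KeyError(wanted)


def build_sample_archive():
    """Build a small in-memory .tar.gz (illustrative data only)."""
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode='w:gz')
    for name, payload in [('pkg/a.txt', b'first'), ('pkg/b.txt', b'second')]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    tar.close()
    buf.seek(0)
    return buf


data = read_member_from_stream(build_sample_archive(), 'pkg/b.txt')
```

Calling extractfile() with a name, as in the session above, first has
to look the member up, which as far as I can tell scans ahead through
the stream; the later read() then needs to seek backwards and fails.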
--David