[ python-Bugs-1543303 ] tarfile in mode w|gz adds padding that annoys gunzip

SourceForge.net noreply at sourceforge.net
Mon Aug 21 20:49:24 CEST 2006


Bugs item #1543303, was opened at 2006-08-19 18:48
Message generated for change (Comment added) made by nnorwitz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1543303&group_id=5470

Category: Python Library
Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 8
Submitted By: alexis (asak)
Assigned to: Neal Norwitz (nnorwitz)
Summary: tarfile in mode w|gz adds padding that annoys gunzip

Initial Comment:
In mode w|gz tarfile pads the final block with NULs,
until its size reaches the bufsize value passed to
tarfile.open.  This makes gunzip complain about
"invalid compressed data" because of CRC and length errors.

To reproduce it, put this fragment in a file named archive.py:


import sys
import tarfile

tar = tarfile.open(mode='w|gz', fileobj=sys.stdout)
tar.close()


and then:
$ python2.5 archive.py | gunzip -c

gunzip: stdin: invalid compressed data--crc error

gunzip: stdin: invalid compressed data--length error

Everything works fine with python 2.3.5 and 2.4.1 on
Debian sarge.

The padding is added by the following lines in
_Stream.close:

blocks, remainder = divmod(len(self.buf), self.bufsize)
if remainder > 0:
    self.buf += NUL * (self.bufsize - remainder)

They were added in revision 38581, but I'm not sure why
- at first sight, "Add tarfile open mode r|*" shouldn't
have to change this write path.

Removing them makes gunzip happy again, but I have no
idea if it breaks something else (test_tarfile doesn't
complain).
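
The effect of stray NULs on a gzip stream can be sketched with the
standard library alone. This is only an illustration, not the tarfile
code: it assumes the padding lands between the compressed data and the
CRC/length trailer, which would be consistent with the two errors gunzip
reports above. The helper names make_gzip_member and is_valid_gzip are
made up for this example.

```python
import gzip
import struct
import zlib


def make_gzip_member(payload, padding=b""):
    """Build one gzip member by hand (hypothetical helper).

    `padding` is inserted between the deflate data and the trailer,
    which is the assumed failure mode, not a confirmed one.
    """
    # 10-byte header: magic, deflate, no flags, mtime 0, XFL 0, OS unknown.
    header = b"\x1f\x8b\x08\x00" + struct.pack("<L", 0) + b"\x00\xff"
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = compressor.compress(payload) + compressor.flush()
    # Trailer: CRC32 and length of the uncompressed data, little-endian.
    trailer = struct.pack("<LL",
                          zlib.crc32(payload) & 0xFFFFFFFF,
                          len(payload) & 0xFFFFFFFF)
    return header + body + padding + trailer


def is_valid_gzip(data):
    """Return True if `data` decompresses without an integrity error."""
    try:
        gzip.decompress(data)
    except (OSError, EOFError, zlib.error):
        return False
    return True


payload = b"tar data " * 100  # non-empty, so its CRC32 is not zero
clean_ok = is_valid_gzip(make_gzip_member(payload))
padded_ok = is_valid_gzip(make_gzip_member(payload, padding=b"\0" * 64))
print(clean_ok, padded_ok)
```

With the padding in place the decoder reads eight NUL bytes where it
expects the CRC and length, so the check fails even though the deflate
data itself is intact.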

A similar problem happens if you use mode w|bz2 and
feed the output to bunzip2: it complains about
"trailing garbage after EOF ignored".

I found these problems while running the test suite of the
Mercurial SCM.

----------------------------------------------------------------------

>Comment By: Neal Norwitz (nnorwitz)
Date: 2006-08-21 11:49

Message:

Committed revision 51432 (2.6) and revision 51436 (2.5).

----------------------------------------------------------------------

Comment By: Lars Gustäbel (gustaebel)
Date: 2006-08-21 05:08

Message:

I just created patch #1543897 that removes the 3 lines of
code and adds a test to test_tarfile.py. Thank you for the
detailed report.
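
A round-trip check along these lines can confirm the behaviour on a
current Python 3 interpreter. It is a sketch only and is not the test
from patch #1543897; the helper name stream_archive is invented for the
example, and an in-memory io.BytesIO stands in for sys.stdout.

```python
import bz2
import gzip
import io
import tarfile


def stream_archive(mode):
    """Write an empty tar archive in a stream mode and return the bytes."""
    sink = io.BytesIO()
    tar = tarfile.open(mode=mode, fileobj=sink)
    tar.close()  # the caller-supplied file object is left open
    return sink.getvalue()


# gzip.decompress raises if the CRC or length trailer does not match.
raw_gz = gzip.decompress(stream_archive("w|gz"))
# bz2.decompress raises if the data is not a valid bzip2 stream.
raw_bz2 = bz2.decompress(stream_archive("w|bz2"))

# An empty archive consists of NUL-filled 512-byte tar blocks only.
print(len(raw_gz) % 512, raw_gz == raw_bz2)
```

Any NUL padding that belongs to the tar format should appear inside the
decompressed data, never in the compressed container around it.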


----------------------------------------------------------------------

Comment By: Georg Brandl (gbrandl)
Date: 2006-08-20 07:17

Message:

This should be resolved before 2.5 final.

----------------------------------------------------------------------
