[Patches] [ python-Patches-918101 ] tarfile.py enhancements

SourceForge.net noreply at sourceforge.net
Wed Jul 21 09:54:28 CEST 2004


Patches item #918101, was opened at 2004-03-17 16:59
Message generated for change (Comment added) made by gustaebel
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=918101&group_id=5470

Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Lars Gustäbel (gustaebel)
Assigned to: Nobody/Anonymous (nobody)
Summary: tarfile.py enhancements

Initial Comment:
I still develop tarfile.py sporadically on a separate
branch (http://www.gustaebel.de/lars/tarfile/), and so
there are two features from this branch that I'd like
to propose for inclusion in Python's tarfile.py:

1. Overcoming the 8GB file size limit (8GB-limit.patch)

At the moment it is not possible to add files to a tar
archive that exceed 8GB in size. Although this is POSIX
compliant, GNU tar offers an extension header for
large files that encodes file sizes in an 88-bit number
instead of the common 11-digit octal number. Like all
other GNU extensions in tarfile.py, this feature is
turned on and off using the TarFile.posix attribute. 

2. Automatic compression detection for the stream
interface (stream-detect-compr.patch)

tarfile.py's stream interface (which can be used to
access tape devices or simply read a tar from stdin) is
a bit difficult to use because it's not able to detect
whether an archive is compressed or not. Compression
has to be explicitly specified using mode ("r|",
"r|gz", "r|bz2"). The patch introduces a fourth mode
"r|*" that makes automatic detection possible.


Neither patch is vitally important, but the 8GB patch
in particular is useful IMO.
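
To illustrate where the 8GB limit in point 1 comes from, here is a
minimal sketch of the 12-byte size field (modern Python 3, not the
code from 8GB-limit.patch). The helper names encode_size and
decode_size are hypothetical, and the posix argument only stands in
for the TarFile.posix attribute mentioned above. Eleven octal digits
hold at most 8**11 - 1 bytes; the GNU form is assumed to be a 0x80
marker byte followed by the size as an 88-bit big-endian integer.

```python
# Sketch only: hypothetical helpers, not tarfile.py's actual functions.
OCTAL_LIMIT = 8 ** 11 - 1  # largest size that fits in 11 octal digits


def encode_size(size, posix=True):
    """Return a 12-byte tar size field for a non-negative size."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size <= OCTAL_LIMIT:
        # POSIX form: 11 octal digits followed by a NUL byte.
        return b"%011o\x00" % size
    if posix:
        raise ValueError("file is too large for a POSIX archive (>= 8GB)")
    # GNU form (assumed layout): marker byte 0x80, then the size as an
    # 88-bit big-endian integer in the remaining 11 bytes.
    return b"\x80" + size.to_bytes(11, "big")


def decode_size(field):
    """Inverse of encode_size for both field layouts."""
    if field[0] == 0x80:
        return int.from_bytes(field[1:], "big")
    return int(field.rstrip(b"\x00 "), 8)
```

With posix=True a size of 8**11 bytes is rejected, while posix=False
yields a 12-byte field that decode_size turns back into the same
number.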

----------------------------------------------------------------------

>Comment By: Lars Gustäbel (gustaebel)
Date: 2004-07-21 09:54

Message:
Logged In: YES 
user_id=642936

tarfile.py's stream interface must be used if the user wants
to read an archive that is not a seekable file, e.g. stdin
or a tape device. ATM, it is the user's job to find out
whether the stream is compressed (mode="r|gz" or "r|bz2") or
uncompressed (mode="r|"), which makes the stream interface
awkward, and for many users practically unusable. The patch
introduces an additional mode "r|*" which does this job. I
admit it's just a convenience thing, but I think the stream
interface is somewhat too complicated without it.
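
As a rough illustration of the convenience described above, the
following sketch reads a gzip-compressed archive through the stream
interface without naming the compression. It is written for a
present-day Python 3 tarfile module, where mode "r|*" is available,
and is not taken from stream-detect-compr.patch. The in-memory buffer
is only a stand-in for stdin or a tape device, and the file name
hello.txt is made up for the example.

```python
import io
import tarfile

# Build a small gzip-compressed archive in memory for the demonstration.
payload = b"hello tar\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Stand-in for a non-seekable source such as sys.stdin.buffer.
stream = io.BytesIO(buf.getvalue())

names = []
contents = []
# "r|*" asks tarfile to work out the compression itself, so the caller
# does not have to choose between "r|", "r|gz" and "r|bz2".
with tarfile.open(fileobj=stream, mode="r|*") as tar:
    for member in tar:
        names.append(member.name)
        # In stream mode, members must be read in archive order.
        contents.append(tar.extractfile(member).read())
```

The same call with mode="r|" would fail on this input, because the
data is compressed and the caller would have had to know that in
advance.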

The reason why I changed the "type" argument to "comptype"
was just that the TarFile class uses "comptype" and the
_Stream class uses "type" for the same thing. It doesn't
need to be changed.

You're absolutely right about the testcase. I had enough time
to write one ;-)

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2004-07-21 00:31

Message:
Logged In: YES 
user_id=33168

Lars, could you look at bug 949052 and provide any guidance?
 Thanks.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2004-07-21 00:28

Message:
Logged In: YES 
user_id=33168

I checked in the 8GB limit patch. Lib/tarfile.py 1.14.

I didn't check in the stream patch for 2 reasons:
1) I don't know the need.  Is this common?  I've never heard
of it.
2) The type parameter name was changed to comptype.  I wasn't
sure if this was necessary.  It potentially (albeit
unlikely) could break a program.  I'm not concerned about
changing the name of the attribute.

Lars, can you provide a good reason to add this part of the
patch?  If it's not likely to be used, I don't think it
should be added.  If it is added, there should also be a test.

Thanks.

----------------------------------------------------------------------
