[Python-Dev] Copying zlib compression objects

Guido van Rossum guido at python.org
Fri Feb 17 21:11:29 CET 2006


Please submit your patch to SourceForge.

On 2/17/06, Chris AtLee <chris at atlee.ca> wrote:
> I'm writing a program in Python that creates tar files of a certain
>  maximum size (to fit onto CD/DVD).  One of the problems I'm running
>  into is that when using compression, it's pretty much impossible to
>  determine if a file, once added to an archive, will cause the archive
>  size to exceed the maximum size.
>
>
> I believe that to do this properly, you need to copy the state of the
>  tar file (basically the current file offset as well as the state of the
>  compression object), then add the file.  If the new size of the archive
>  exceeds the maximum, you need to restore the original state.
>
>
> The critical part is being able to copy the compression object.
>  Without compression it is trivial to determine if a given file will
>  "fit" inside the archive.  When using compression, the compression
>  ratio of a file depends partially on all the data that has been
>  compressed prior to it.
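
The dependence on earlier data can be seen with a small experiment. This is only an illustrative sketch: the helper name `incremental_size` and the random payload are made up for the example, and using `Z_SYNC_FLUSH` to force output at a boundary is itself an assumption that slightly changes the compressed stream.

```python
import os
import zlib


def incremental_size(compressor, data):
    """Return how many compressed bytes `data` adds to the stream.

    Z_SYNC_FLUSH forces all pending output out so the cost of this
    chunk alone can be measured (hypothetical helper for illustration).
    """
    return len(compressor.compress(data) + compressor.flush(zlib.Z_SYNC_FLUSH))


payload = os.urandom(4096)  # random bytes: essentially incompressible alone

compressor = zlib.compressobj()
first = incremental_size(compressor, payload)   # nothing to refer back to
second = incremental_size(compressor, payload)  # can refer back to the first copy

print("first occurrence: ", first, "bytes")
print("second occurrence:", second, "bytes")
```

The second occurrence of the same payload costs far fewer bytes than the first, because the compressor can refer back to data it has already seen. The size a file adds to the archive therefore cannot be computed from the file alone.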
>
>
> The current implementation in the standard library does not allow you
>  to copy these compression objects in a useful way, so I've made some
>  minor modifications (patch attached) to the standard 2.4.2 library:
>  - Add copy() method to zlib compression object.  This returns a new
>  compression object with the same internal state.  I named it copy() to
>  keep it consistent with things like sha.copy().
>  - Add snapshot() / restore() methods to GzipFile and TarFile.  These
>  work only in write mode.  snapshot() returns a state object.  Passing
>  in this state object to restore() will restore the state of the
>  GzipFile / TarFile to the state represented by the object.
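
The copy-then-test idea can be sketched with plain zlib, assuming a Python version in which zlib compression objects expose `copy()`. The helper `fits`, the 10,000-byte limit, and the sample data are hypothetical and are not taken from the attached patch; the `snapshot()` / `restore()` methods proposed for GzipFile and TarFile are not used here.

```python
import os
import zlib


def fits(compressor, bytes_written, data, max_size):
    """Check whether `data` would fit, leaving `compressor` untouched.

    The trial runs on a copy, so a rejected file needs no rollback.
    """
    trial = compressor.copy()  # independent copy of the internal state
    # Finish the trial stream to get the exact remaining output.
    pending = trial.compress(data) + trial.flush()
    return bytes_written + len(pending) <= max_size


max_size = 10_000  # hypothetical size limit in bytes
files = [
    ("a", os.urandom(6000)),  # incompressible, fits on its own
    ("b", os.urandom(6000)),  # incompressible, would exceed the limit
    ("c", b"x" * 6000),       # highly compressible, still fits
]

compressor = zlib.compressobj()
out = bytearray()
accepted = []
for name, data in files:
    if fits(compressor, len(out), data, max_size):
        out += compressor.compress(data)
        accepted.append(name)
out += compressor.flush()

print(accepted, len(out))
```

Because the trial copy carries any output still buffered inside the compressor, adding its finished length to the bytes already written gives the final stream size if the archive were closed right after that file.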
>
>
> Future work:
>  - Decompression objects could use a copy() method too
>  - Add support for copying bzip2 compression objects
>
>
> Although this patch isn't complete, does this seem like a good approach?
>
>
> Cheers,
>  Chris
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)

