[Patches] [ python-Patches-813436 ] Scalable zipfile extension

SourceForge.net noreply at sourceforge.net
Tue Feb 13 09:59:41 CET 2007


Patches item #813436, was opened at 2003-09-27 10:09
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=813436&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Performance
Group: Python 2.5
>Status: Closed
>Resolution: Rejected
Priority: 5
Private: No
Submitted By: Marc De Falco (deufeufeu)
Assigned to: Nobody/Anonymous (nobody)
Summary: Scalable zipfile extension

Initial Comment:
While working with large zipfiles (> 10000 files),
I ran into long loading times, even when I use only
30 of the files after loading the archive.
So I've introduced a 'differed' parameter to
ZipFile.__init__ in order to load headers on demand.
As it's not really a good idea to activate it for all
zipfiles, it defaults to False.
I've updated the documentation too.

Thx and keep up the good work ;)

P.S.: I don't know whether it can be added to 2.3 or has to be
included in 2.4, so I've chosen the 2.4 group.


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2007-02-13 09:59

Message:
Logged In: YES 
user_id=21627
Originator: NO

I'm rejecting the patch, for the following reasons:
- I agree with ronaldoussoren that this deferred loading already happens
in the 2.5 version (specifically, it happens inside read)
- I also agree that making it an optional parameter is unnecessary; it
just complicates the interface. I also think the proposed parameter name
('differed') is misspelled, and should have been 'deferred' (unless I'm
missing a meaning of 'differed')
- the implementation of the patch is unacceptable because it duplicates
code.

As for why this patch is "put on hold": it was not because a rewrite was
planned. The patch was contributed in 2003, and jafo's (first) comment was
in 2006. The patch was "on hold" because nobody found the time to review it
(just like these other 400 or so patches).
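
For readers unfamiliar with the deferred loading named in the first reason
above, here is a minimal sketch of the general idea. It is not the rejected
patch and not the 2.5 implementation. The class name LazyLocalHeaders and
its loads counter are hypothetical; the only zipfile features relied on are
ZipFile.getinfo and the documented ZipInfo.header_offset attribute.

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a ZIP local file header:
# signature, 5 two-byte fields, 3 four-byte fields, 2 two-byte length fields.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


class LazyLocalHeaders:
    """Hypothetical helper: parse a member's local header on first request only."""

    def __init__(self, fileobj):
        self._fp = fileobj
        self._zf = zipfile.ZipFile(fileobj)  # parses the central directory
        self._cache = {}
        self.loads = 0  # number of local headers actually read so far

    def header(self, name):
        if name not in self._cache:
            info = self._zf.getinfo(name)
            self._fp.seek(info.header_offset)
            fields = LOCAL_HEADER.unpack(self._fp.read(LOCAL_HEADER.size))
            if fields[0] != b"PK\x03\x04":
                raise zipfile.BadZipFile("bad local header for %r" % name)
            self._cache[name] = fields
            self.loads += 1
        return self._cache[name]


def make_archive(names):
    """Build a small in-memory archive for experimenting with the helper."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "contents of " + name)
    return buf
```

With this shape, an archive of many members costs one header read per member
actually requested, and repeated requests are served from the cache.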

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-22 20:00

Message:
Logged In: YES 
user_id=11375
Originator: NO

Patch #1446489, mentioned in Ronald's 2006-05-27 14:22 comment, was
committed and is in Python 2.5.  Is this patch still relevant?


----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2006-12-22 19:58

Message:
Logged In: YES 
user_id=11375
Originator: NO

According to
http://mail.python.org/pipermail/python-dev/2006-November/069969.html, the
author of the zipfile rewrite isn't quite happy with it.  It doesn't look
like the new module will be API-compatible with zipfile, so I think this
patch should still be considered for inclusion.


----------------------------------------------------------------------

Comment By: Ronald Oussoren (ronaldoussoren)
Date: 2006-05-27 20:22

Message:
Logged In: YES 
user_id=580910

Patch [ 1446489 ] zipfile: support for ZIP64 also addresses this as a
side effect of adding ZIP64 support (for very big zipfiles).

BTW, I don't quite understand why this patch is put on hold just because a
rewrite of the zipfile module is planned.

W.r.t. this patch: why is the on-demand loading optional? Loading the
per-file headers when the zipfile is opened is not necessary for normal
operation; the current zipfile module is basically doing a full verify of
the zipfile on all occasions. This isn't necessary for normal operation,
and I don't think the infozip tools do this (probably because verification
is very expensive).
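
The use case from the initial comment (a large archive of which only a few
members are read) can be tried out with a short sketch like the one below.
It uses only the standard zipfile API; the helper names build_archive and
read_some and the member naming scheme are made up for illustration. As far
as I understand current CPython, opening the archive parses the central
directory, and a member's local header is only looked at when that member is
opened or read.

```python
import io
import zipfile


def build_archive(n_members):
    """Create an in-memory archive with n_members small text files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for i in range(n_members):
            zf.writestr("file%05d.txt" % i, "payload %d" % i)
    return buf.getvalue()


def read_some(data, names):
    """Open the archive once and read only the requested members."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {name: zf.read(name) for name in names}


archive = build_archive(1000)
subset = read_some(archive, ["file00003.txt", "file00999.txt"])
```

Timing ZipFile(...) separately from the read calls on a much larger archive
would show where the loading cost goes on a given Python version.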

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2006-05-25 16:38

Message:
Logged In: YES 
user_id=81797

Actually, we'll leave it open until the Summer of Code
implementation is completed and accepted.

Sean

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2006-05-25 16:36

Message:
Logged In: YES 
user_id=81797

There is a summer of code project to re-write the zipfile
module, so this patch is moot.

Sean

----------------------------------------------------------------------
