Inconsistencies between zipfile and tarfile APIs

rantingrick rantingrick at gmail.com
Fri Jul 22 02:40:51 EDT 2011


On Jul 22, 12:45 am, Terry Reedy <tjre... at udel.edu> wrote:
> On 7/22/2011 12:48 AM, rantingrick wrote:
> > On Jul 21, 11:13 pm, Corey Richardson<kb1... at aim.com>  wrote:

> Hmm. Archives are more like directories than files. Windows, at least,
> seems to treat zipfiles more or less as such.

Yes, but a zipfile is just a file, not a directory. This is not the
first time Microsoft has "misled" people, you know. ;-)

> Certainly, 7zip
> presents a directory interface. So opening a zipfile/tarfile would be
> like opening a directory, which we normally do not do. On the other
> hand, I am not sure I like python's interface to directories that much.

I don't think we should make comparisons between applications and
APIs.

> It would be more sensible to open files within the archives. Certainly,
> it would be nice to have the result act like file objects as much as
> possible.

Well you still need to start at the treetop (which is the zip/tar
file) because lots of important information is exposed at that level:

 * compressed file listing
 * created, modified times
 * adding / deleting
 * etc.

I'll admit you could think of it as a directory, but I would not want
to do that. People need to realize that tar and zip files are FILES
and NOT folders.
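
To make the archive-level information listed above concrete, here is a
minimal sketch of how the two stdlib modules expose the same things
(member listing, modification time, contents) under different names and
types. The member name "hello.txt" and the in-memory buffers are only
illustrative assumptions; nothing here is a proposed API.

```python
import io
import tarfile
import zipfile

# Build a tiny zip archive in memory (illustrative data only).
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("hello.txt", "hello")

# Build the equivalent tar archive; note the extra TarInfo bookkeeping.
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w") as tf:
    data = b"hello"
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(data)
    tf.addfile(info, io.BytesIO(data))

zip_buf.seek(0)
tar_buf.seek(0)

# Reading from the zip: namelist(), infolist(), read().
with zipfile.ZipFile(zip_buf) as zf:
    zip_names = zf.namelist()
    zip_mtime = zf.infolist()[0].date_time   # a (Y, M, D, h, m, s) tuple
    zip_data = zf.read("hello.txt")

# Reading from the tar: getnames(), getmembers(), extractfile().
with tarfile.open(fileobj=tar_buf, mode="r") as tf:
    tar_names = tf.getnames()
    tar_mtime = tf.getmembers()[0].mtime     # a number of seconds since the epoch
    tar_data = tf.extractfile("hello.txt").read()

print(zip_names, tar_names)
print(type(zip_mtime).__name__, type(tar_mtime).__name__)
```

The same three questions get three differently spelled answers in each
module, and the modification time even comes back in a different type.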

> Searching open issues for 'tarfile' or 'zipfile' returns about 40 issues
> each. So I think some people would care more about fixing bugs than
> adjusting the interfaces. Of course, some of the issues may be about the
> interface and increasing consistency where it can be done without
> compatibility issues.

Yes, I agree! If we can at least do something as meager as this it
would be a step forward. However, I still believe the current API is
broken beyond repair, so we must introduce a new "archive" module.
That's my opinion anyway.

> However, I do not think there are any active
> developers focused on those two modules.

We need some fresh blood infused into Python-dev. I have been trying
to get involved for a long time. We as a community need to realize
that this community is NOT a homogeneous block. We need to be a little
more accepting of new folks and new ideas. I know this language would
evolve much quicker if we did.

> > Unfortunately I know what the "powers that be" are going to say about
> > fixing this wart.
>
> > PTB: "Sorry we cannot break backwards compatibility"
>
> Do you propose we break compatibility more than we do? You are not the
> only Python ranter. People at Google march into Guido's office to
> complain instead of posting here.

Well, I do feel for Guido because I know he's taking holy hell over
this whole Python 3000 thing. If you guys don't remember, I was a
strong opponent of almost all the changes a few years ago (search the
archives). However, soon after taking a "serious" look at the changes
and considering the benefits, I was convinced. I believe we are moving
in the correct direction with the language; HOWEVER, the library is
growing stale by the second. I want to breathe new life into this
library and I believe many more people like myself exist but they
don't know how to get involved. I can tell everyone who is listening
the easiest first step is simply to speak up and make a voice for
yourself. Don't be afraid to state your opinions. You can start right
now by chiming in on this thread. Anybody is welcome to offer opinions
no matter what experience level.

> > Rick: But what about Python 3000?
> > PTB: "Oh, well, umm, let's see. Well that was then and this is now!"
>
> The changes made for 3.0 were more than enough for some people to
> discourage migration to Py3. And we *have* made additional changes
> since. So the resistance to incompatible feature changes has increased.

Yes, I do understand these changes have been very painful for some
folks (me included). However, there is only one constant in this
universe and that constant is change. I believe we can improve many of
these APIs, starting with the zip/tar modules. By the time Python 4000
gets here (and it will be much sooner than you guys realize!) we need
to have this stdlib in pristine condition. That means:

 * Removing style guide violations.
 * Removing inconsistencies in existing APIs.
 * Making sure docstrings and comments are everywhere.
 * Cleaning up the IDLE library (needs a complete re-write!)
 * Cleaning up Tkinter.
 * And more

Baby steps are the key to winning this battle. We hit all the easy
stuff first (docstrings and style guide) and save the painful stuff
for Python 4000. Meanwhile we introduce new modules and deprecate the
old stuff. However, we need to start the Python 4000 migration now. We
cannot keep putting off what should have already been done in Python
3000.

> > Maybe I can offer a solution. A NEW module called "archive.py" (could
> > even be a package!) which exports both the zip and tar file classes.
> > However, unlike the current situation, this archive module will be
> > consistent in its API.
>
> Not a bad idea. Put it on PyPI and see how much support you can get.

Thanks, I might just do that!
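
As a rough idea of what such a module might look like, here is a minimal
sketch. Everything in it is hypothetical: the names ZipArchive,
TarArchive, open_archive, names() and read() are assumptions made for
illustration, not an existing or agreed-upon API, and it only covers
reading.

```python
import io
import tarfile
import zipfile


class ZipArchive:
    """Hypothetical wrapper giving zip files a small, uniform read API."""

    def __init__(self, fileobj):
        self._zf = zipfile.ZipFile(fileobj)

    def names(self):
        return self._zf.namelist()

    def read(self, name):
        return self._zf.read(name)

    def close(self):
        self._zf.close()


class TarArchive:
    """Hypothetical wrapper giving tar files the same method names."""

    def __init__(self, fileobj):
        self._tf = tarfile.open(fileobj=fileobj, mode="r")

    def names(self):
        return self._tf.getnames()

    def read(self, name):
        member = self._tf.extractfile(name)
        if member is None:
            # extractfile() returns None for non-file members (e.g. directories).
            raise ValueError("%r is not a regular file" % name)
        return member.read()

    def close(self):
        self._tf.close()


def open_archive(fileobj):
    """Return the matching wrapper for a seekable binary file object."""
    is_zip = zipfile.is_zipfile(fileobj)
    fileobj.seek(0)  # the format check may have moved the file position
    return ZipArchive(fileobj) if is_zip else TarArchive(fileobj)
```

With something along these lines, calling code could use
open_archive(f).names() and .read(name) without caring which format it
was handed.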


