Re: [Python-Dev] Zip format (was: Questions about distutils strategy )

Apparently I mis-phrased my question, I'll try again. When people suggested to use zip format as the standard Python archive format I was a bit worried, because I've had it happen to me various times that I was unable to create a ZIP archive with two files with the same name but different paths (i.e. create an archive of a directory that contains both a foo/bar.py and a foo/spam/bar.py). So, my question was: has this happened to me because the winzip I used was braindead, or is there possibly a problem with the ZIP file format that disallows two files with the same name in one archive? Most zip programs I've seen also seem to present filenames as the primary metaphor, with full pathnames somewhat "tacked on". If the latter is the case I wonder whether zip is the right format to use...
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm

Again, the zip format does not have this problem. Some zip tools may -- then we don't use those. --Guido van Rossum (home page: http://www.python.org/~guido/)
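
The point that the format itself stores full member paths can be checked with a short sketch. This assumes the `zipfile` module from the present-day Python standard library (not something available to the posters at the time), and the member names are simply the ones from Jack's example.

```python
import io
import zipfile


def build_archive():
    """Return the bytes of a ZIP whose two members share a base name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Each member is stored under its full path, so the shared
        # base name "bar.py" does not cause a clash.
        zf.writestr("foo/bar.py", "print('top-level bar')\n")
        zf.writestr("foo/spam/bar.py", "print('nested bar')\n")
    return buf.getvalue()


def member_names(data):
    """List the member names recorded in the archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


names = member_names(build_archive())
print(names)
```

Both members come back under distinct names, which is consistent with the claim that any clash is a limitation of a particular tool rather than of the format.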

Jack Jansen wrote:
Hmm, I've been doing the above for years now... never had a problem with it (I use Info-ZIP's tools, BTW), e.g.

    /home/lemburg> unzip -l projects/distribution/mxODBC-1.1.1.zip
    Archive:  projects/distribution/mxODBC-1.1.1.zip
      Length    Date    Time   Name
    --------    ----    ----   ----
      131316  06-09-99  14:10  ODBC/EasySoft/mxODBC.c
      131316  06-09-99  14:10  ODBC/Informix/mxODBC.c
    ...

Would be cool if I could use my packages as ZIP files :-) So here's another vote for using the ZIP format. BTW, wouldn't it make sense to include the zlib code in the core distribution much like the pcre stuff is now ? AFAIK, it is public domain and including it would remedy many of the compatibility issues with the different zlib versions around. Ok, that was my wish for Xmas :-) The rest is up to Mr. Clause... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

What compatibility issues? Note that the Win32 distri already comes with zlib statically linked into zlib.pyd. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
There were issues with zlib 1.0.4 and later ones. Also, many Linux distributions don't have the zlib header files installed. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

There were issues with zlib 1.0.4 and later ones. Also, many Linux distributions don't have the zlib header files installed.
Hm. I don't recall having any problems reported to me. I'd rather not include the entire zlib distri in the Python distri -- zlib is rather big. Adding only the Unix source would be cheating. --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
How about only adding those parts which would be needed to at least deflate the ZIP archive contents ? If the ZIP archive format becomes the standard for Python, we'd have to ensure that all Python users can read them. Well, at least that's what I would expect from a standard format :-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

How about only adding those parts which would be needed to at least deflate the ZIP archive contents ?
Ditto -- still lots of portability issues I bet.
There's a simple solution: don't use compression. With current disk prices it's really not worth it. Let the installer do the decompression (installers travel across networks where compression *is* worth it). --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Hmm, not sure: zlib is pretty portable. It's the interface changes that can break code, not so much the zlib portability.
That's a possibility, right. It would still let us use the many ZIP tools while not adding complexity to the core. -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 21 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

"M.-A. Lemburg" wrote:
I think that for now we will need to create archives with compression method zero: no compression. That is a valid compression method all ZIP utilities support. The point is that zlib just isn't part of Python. Jim
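
As a rough illustration of "compression method zero", the sketch below writes an archive with every member stored uncompressed. It assumes the modern standard-library `zipfile` module; the helper name `make_stored_archive` and the package file names are made up for the example.

```python
import io
import zipfile


def make_stored_archive(files):
    """Build a ZIP in memory with every member stored uncompressed.

    `files` maps member names to text content (hypothetical example data).
    """
    buf = io.BytesIO()
    # ZIP_STORED is method 0: no compression, so zlib is not needed
    # to read the members back.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return buf.getvalue()


data = make_stored_archive({"pkg/__init__.py": "", "pkg/mod.py": "x = 1\n"})

with zipfile.ZipFile(io.BytesIO(data)) as zf:
    # Record the compression method actually written for each member.
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    sizes_match = all(i.file_size == i.compress_size for i in zf.infolist())

print(methods)
```

Because stored members are byte-for-byte copies, the recorded compressed and uncompressed sizes are equal, and ordinary ZIP tools can still list and extract the archive.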

On Fri, 10 Dec 1999 11:19:47 -0500, you wrote:
Minor data point on the importance of zlib. I spent a long time figuring out what Adobe PDF's "flate filter" was before I discovered it was the inverse of "deflate" (yes, there were loud sounds of head-slapping when I clicked) and discovered that zlib.compress() was EXACTLY what you need to create compressed streams in PDF documents. Being a Windows person, I naively assumed zlib was in the standard distribution everywhere, and subsequently discovered Mac and Unix users were not so happy. So if you want to make PDFs, having zlib around is very useful indeed... - Andy
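
A minimal sketch of what Andy describes: `zlib.compress()` produces data that a PDF stream can carry with the `FlateDecode` filter. The content bytes below are made-up sample data, and the result is only a stream fragment, not a complete or valid PDF file.

```python
import zlib

# Hypothetical page content; repeated so that compression has something to do.
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 20

# zlib.compress() emits a zlib-wrapped deflate stream, which is what
# the Flate filter expects.
compressed = zlib.compress(content)

# Wrap the compressed bytes in a stream dictionary naming the filter.
stream_object = (
    b"<< /Length %d /Filter /FlateDecode >>\nstream\n" % len(compressed)
    + compressed
    + b"\nendstream\n"
)

# A reader reverses the step with zlib.decompress().
restored = zlib.decompress(compressed)
print(len(content), len(compressed), restored == content)
```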

Andy Robinson writes:
... So if you want to make PDFs, having zlib around is very useful indeed...
This raises a good point, though I still dislike the idea of including the zlib library. It would be nice if Setup.in would be autogenerated to compile all the modules it can -- bsddb if it finds libdb, zlib if it finds libz.a. I vaguely recall once working on a Python script that would generate a customized Setup.in file, though I can't find it at the moment. Given that someone has already suggested automatically enabling threads on those platforms that support it, why not go all the way? (But a Python script that generates a Setup.in isn't going to work, unless we compile a minipython first and then create a more complete Setup file.) -- A.M. Kuchling http://starship.python.net/crew/amk/ The most merciful thing in the world... is the inability of the human mind to correlate all its contents. -- H.P. Lovecraft
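
The detection idea could look roughly like the following. This is only a sketch under stated assumptions: the module-to-library mapping and the function name `detect_modules` are invented for illustration, it only checks for library files by name in the directories it is given, and it does not attempt to write actual Setup.in syntax.

```python
import os
import tempfile

# Hypothetical mapping from optional extension module to library base name.
OPTIONAL_MODULES = {"zlib": "z", "bsddb": "db"}


def detect_modules(lib_dirs, optional=OPTIONAL_MODULES):
    """Return the optional modules whose library is present in lib_dirs."""
    found = []
    for module, lib in sorted(optional.items()):
        # Accept either a static or a shared library file.
        candidates = ("lib%s.a" % lib, "lib%s.so" % lib)
        if any(
            os.path.exists(os.path.join(d, c))
            for d in lib_dirs
            for c in candidates
        ):
            found.append(module)
    return found


# Demonstration against a scratch directory that only "has" libz.a.
with tempfile.TemporaryDirectory() as scratch:
    open(os.path.join(scratch, "libz.a"), "w").close()
    demo_result = detect_modules([scratch])

print(demo_result)
```

As the message notes, a script like this presupposes a working interpreter, so in a real build it would have to run after a minimal Python has been compiled. It also says nothing about which version of a library it found, which matters for cases like the incompatible libdb releases.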

Andrew M. Kuchling [akuchlin@mems-exchange.org] wrote:
Well, one warning about BSDdb is that it comes in 3 incarnations that all might be -ldb :-): 1.85, 2.x, and 3.x, and they are NOT compatible with each other. 1.85 has serious brain damage, and honestly I'd love to see Robin Dunn's 2.x work rolled in to replace it, but not sure how viable that is---people might actually want the 1.85 breakage. Chris -- | Christopher Petrilli | petrilli@amber.org

M.-A. Lemburg writes:
There were issues with zlib 1.0.4 and later ones. Also, many Linux distributions don't have the zlib header files installed.
For example, on RH6.0, zlib.h and zlib.a are in zlib-devel.XXX.rpm, and zlib.XXX.rpm only contains libz.so. On the other hand, anyone who's compiling Python should really have the various -devel RPMs installed. I'd argue against including it, because it might cause odd versioning problems. For example, what if I have PIL compiled against zlib1.1.2 (zlib is used for writing PNGs) and the Python binary includes zlib1.1.3? There might be hard-to-debug problems caused by calling the wrong symbol. PCRE is a special case, because we've actually hacked the code a lot; it's not the PCRE code as Philip Hazel distributes it. Just received Guido's email suggesting skipping compression in archives; not a bad idea. You'd use less CPU, but might do more I/O because you're reading more sectors off disk. There probably isn't much need for compression when the archive is on-disk; Java needed it because of applets. -- A.M. Kuchling http://starship.python.net/crew/amk/ The NSA response was, "Well, that was interesting, but there aren't any ciphers like that." -- Gus Simmons, "The History of Subliminal Channels"

On Fri, 10 Dec 1999, Andrew M. Kuchling wrote:
Exactly. The distros *have* the headers -- it all depends on what you installed. I happen to have the headers on my system (because I installed zlib-devel, as AMK mentions).
I totally agree.
There are all kinds of things that we can do here. Consider mmap'ing the archive into a shared memory segment, used by all the Python processes on the system... woo! :-) IMO, the standard distro can use zip files, and just bail if they are compressed, but Python cannot load zlib. Obvious failure with an obvious remedy. No big deal. As Guido also mentions, an installer can just bring along zlib if they want to use a compressed archive. i.e. their choice. Cheers, -g -- Greg Stein, http://www.lyra.org/
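
The "bail if they are compressed, but Python cannot load zlib" behaviour might be sketched like this. It assumes the modern `zipfile` module; the function name `check_readable` and the error text are hypothetical, and the `have_zlib` parameter exists only so the failure path can be exercised on a system where zlib is in fact installed.

```python
import io
import zipfile

try:
    import zlib  # noqa: F401  (only its availability matters here)
    HAVE_ZLIB = True
except ImportError:
    HAVE_ZLIB = False


def check_readable(zf, have_zlib=HAVE_ZLIB):
    """Raise ImportError if any member needs a decompressor we lack."""
    for info in zf.infolist():
        if info.compress_type == zipfile.ZIP_STORED:
            continue  # stored members never need zlib
        if info.compress_type == zipfile.ZIP_DEFLATED and have_zlib:
            continue  # deflated members are fine when zlib is loadable
        raise ImportError(
            "%s is compressed (method %d) and cannot be read; "
            "install zlib or rebuild the archive uncompressed"
            % (info.filename, info.compress_type)
        )


# An uncompressed archive passes even when zlib is treated as missing.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("pkg/mod.py", "x = 1\n")
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    check_readable(zf, have_zlib=False)
    stored_ok = True
print(stored_ok)
```

The check runs before any member is read, so the failure is immediate and names the remedy, in the spirit of "obvious failure with an obvious remedy".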

Greg Stein <gstein@lyra.org> wrote:
it doesn't really look like this, but I hope we're defining interfaces here, and not just "one true solution". I'd be very annoyed if it turned out that we couldn't use works' archives with the new standard importer...
As Guido also mentions, an installer can just bring along zlib if they want to use a compressed archive. i.e. their choice.
in the pythonworks universe, the installer and the application is the same thing... </F>

On Sat, 11 Dec 1999, Fredrik Lundh wrote:
Oh, I was just having fun there :-). I don't see "one true solution" at all. Just some standards.
very annoyed if it turned out that we couldn't use works' archives with the new standard importer...
get_code() and its processing is not going anywhere. Some stuff will change under the covers, and we'll be using sys.path (typically) rather than chaining (although chaining will still exist!). I would think that your Importer subclass would be directly usable, but the installation could/would be a bit different. Heck, worst case, nothing is going to invalidate your archive format -- feel free to berate me if I ever break that! Cheers, -g -- Greg Stein, http://www.lyra.org/

Jack Jansen wrote:
No problem. But most zip tools will create an archive with either no path (file name is "bar.py") or full path (file name is "foo/bar.py"). If the paths are different it is OK; I am not sure about duplicate bare names. The difference is an option and has nothing to do with how the file name is specified to the utility. JimA

participants (10): Andrew M. Kuchling, andy@robanal.demon.co.uk, Christopher Petrilli, Fredrik Lundh, Greg Stein, Guido van Rossum, Jack Jansen, James C. Ahlstrom, Jean-Claude Wippler, M.-A. Lemburg