
After a brief discussion on the Doc-SIG, it looks like I can reasonably drop the .tar.gz packaging for the documentation, leaving only the .zip and .tar.bz2 formats. Are there any strong objections to this change? (If this flies here, I'll ask the larger community, but I wanted to get an idea of the reaction here first.) -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation

Hi, Fred L. Drake, Jr. wrote:
What is the reason to do so? Can it do any harm to leave it in? Just curious... Gerrit Holl. -- 6. If any one steal the property of a temple or of the court, he shall be put to death, and also the one who receives the stolen thing from him shall be put to death. -- 1780 BC, Hammurabi, Code of Law -- Asperger Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/ These are times to get involved in politics yourself: http://www.sp.nl/

On Wed, Oct 01, 2003, Gerrit Holl wrote:
Two points:
* It's another step in the release process
* It takes up extra space on the servers
Following Fred's suggestion saves time and space.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan

Aahz writes:
* It's another step in the release process
The way I wrote up the documentation release in PEP 101, generating the files isn't even a step. There are a couple of make commands that cause these to be generated; these would not change; just the definitions for those targets would change.
* It takes up extra space on the servers
There is this; not a huge deal, but considering we're running on hardware owned by XS4ALL, and we're dependent on their goodwill, we shouldn't waste the space if we don't need to.
Following Fred's suggestion saves time and space.
I think more important is that it reduces the number of options that get presented to someone who's looking to download something. The plethora of documentation packages is almost embarrassing when compared to the number of packages for the interpreter itself: the sources as a .tar.gz package (no ZIP!), and the Windows installer. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation

My $0.02 (Canadian), for what it's worth: While Windows users may have trouble with *.bz2, and be unfamiliar enough with the extension *.tgz to not even try (even if it does work), I've never known a *nix box to have trouble with *.zip or known a Unix user who had trouble with *.zip. So I'd suggest keeping the various flavors of documentation, but standardizing on zip compression. That will at least remove one variable. I agree that the main point of all of this is to reduce confusion for the newbie coming to the site to download it. But 90% of those are going to be Windows users, and the rest of us have gotten used to living in a Windows-dominated world. Using bz2 may get you better compression and save bandwidth, but it wasn't standard the last time I installed RedHat or Debian. Zip has its faults, but everybody is familiar with it. --Dethe

Dethe Elza writes: [...]
What Unix boxen do you use? I often run into Solaris, IRIX, and HP-UX boxen that lack unzip.
I wouldn't switch to bz2. Even tgz can be confusing. Having .zip files for Windows users and .tar.gz files for Unix users is a happy medium that should work most everywhere. Of course for maximum Unix portability I suppose you could use .tar.Z ;-) -tree -- Tom Emerson Basis Technology Corp. Software Architect http://www.basistech.com "Beware the lollipop of mediocrity: lick it once and you suck forever"

[Dethe Elza]
A difficulty is that the HTML doc set compresses *much* better under bz2 than under zip format, and many people download over slow and expensive dialup lines. bz2 is preferred for that reason (smaller file == faster and cheaper download).
I don't believe that, because the Windows installer for Python includes the full doc set in a Windows-friendly format. So there's simply no reason for the vast majority of Windows Python users to download the doc distribution at all. Fred, do we have stats on how often each of the files got downloaded for previous releases?
No argument there.

Tim Peters writes:
Fred, do we have stats on how often each of the files got downloaded for previous releases?
No, but we should be able to pull those from the server logs. Maybe this weekend I'll get time to write a script to pull that data out.
Tom Emerson writes:
Interesting. bzip2 saves half a MB over gzip for the HTML and PostScript formats. What reason do you have for not using bzip2? It was very heavily requested for the file-size advantage.
Of course for maximum Unix portability I suppose you could use .tar.Z ;-)
Except nobody remembers what to do with those anymore. ;-) I haven't used compress/uncompress in *many* years. -Fred -- Fred L. Drake, Jr. <fred at zope.com> PythonLabs at Zope Corporation

On Wed, 1 Oct 2003, Fred L. Drake, Jr. wrote:
Interesting. bzip2 saves half a MB over gzip for the HTML and PostScript formats.
If you're producing PDF, why produce Postscript? AFAIK, Ghostscript digests PDF and can generate Postscript for those that have/want to use a Postscript printer. Around here, print shops seem to actually _prefer_ PDF. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia

Andrew MacIntyre writes:
I recall a number of people wanting to use the PostScript to drive real PostScript printers directly. That was some time ago; perhaps Ghostscript can handle PDF sufficiently now. If there's no longer any interest in having the PostScript available, I'll be glad to drop that. I guess I really should come up with a script that pulls the relevant stats from the site logs. -Fred -- Fred L. Drake, Jr. <fred at zope.com> PythonLabs at Zope Corporation

[Andrew MacIntyre]
But some of us are not print shops, and have Postscript printers, which are better fed with Postscript, and do not directly accept PDF. PDF to Postscript converters are not 100% dependable, even if they do the job most of the time. Given `.pdf' and `.ps', for one, I would almost always pick the `.ps' file, to avoid possible fights and trouble. -- François Pinard http://www.iro.umontreal.ca/~pinard

Dethe Elza writes:
At this point, the bzip2 compression has been the most-requested (in terms of emails begging us to add it); the most important aspect that makes it desirable is that the file sizes are so much better. From this perspective, ZIP files are the worst for the formats which cause a lot of individual files to be packaged (most importantly, the HTML and LaTeX source formats). There are still enough people who want to pull the files over slow links that this seems valuable, at least for those two formats. (It may be that it's *only* valuable for those formats, and can be dropped for the PDF and PostScript formats.)
Interesting; I don't recall the last time I had to build my own bzip2. I'm pretty sure I didn't do anything special to get it on RedHat recently. The bandwidth savings aren't nearly so valuable to python.org as they are to end users on metered internet connections; those are the users who were so incredibly vocal that we actually started posting those. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
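The point above about ZIP doing worst on formats made of many individual files can be illustrated with a small sketch. ZIP compresses each member separately, while a tarball is compressed as one stream, so redundancy shared between files is only exploited in the .tar.gz and .tar.bz2 cases. Everything here is an assumption for illustration: the pages are synthetic stand-ins generated in memory, not the real documentation set, and the names `make_pages`, `zip_size`, `tar_size` and `compare` are hypothetical.

```python
import io
import tarfile
import zipfile


def make_pages(count=200):
    """Build many small, similar HTML pages (synthetic stand-ins for a doc set)."""
    template = (
        "<html><head><title>Module {n}</title>"
        '<link rel="stylesheet" href="style.css"></head>'
        "<body><h1>Module {n}</h1>"
        "<p>This page documents module number {n} of the library.</p>"
        '<p><a href="index.html">Up</a> | <a href="module{p}.html">Previous</a></p>'
        "</body></html>\n"
    )
    return {
        "html/module%d.html" % n: template.format(n=n, p=n - 1).encode("ascii")
        for n in range(count)
    }


def zip_size(pages):
    """Size in bytes of a ZIP archive; each member is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in pages.items():
            zf.writestr(name, data)
    return len(buf.getvalue())


def tar_size(pages, mode):
    """Size in bytes of a tarball compressed as a single stream.

    mode is a tarfile write mode such as "w:gz" or "w:bz2".
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tf:
        for name, data in pages.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


def compare(pages):
    """Return the archive size in bytes for each packaging format."""
    return {
        "zip": zip_size(pages),
        "tar.gz": tar_size(pages, "w:gz"),
        "tar.bz2": tar_size(pages, "w:bz2"),
    }


if __name__ == "__main__":
    for fmt, size in compare(make_pages()).items():
        print("%-8s %7d bytes" % (fmt, size))
```

The exact numbers depend on the input, so this says nothing about the real savings for the HTML or LaTeX packages; it only shows why per-member compression is at a disadvantage when the files resemble one another.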

On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote:
No, I'm sure you didn't. bzip2 decompression should be standard on RH9, and there's even a tar option to read and write it. What I don't know is whether bz2 decompression is generally available on MacOSX... minority-platform-ly y'rs, -Barry

Barry Warsaw wrote:
Considering RH hosts the bzip2 site I would hope you could build on their OS. =)
What I don't know is whether bz2 decompression is generally available on MacOSX...
It is; StuffIt can decompress it. I just downloaded the GNU Info docs and had no problem with double-clicking the file and decompressing. -Brett

Barry> What I don't know is whether bz2 decompression is generally
Barry> available on MacOSX...
Fink is your friend:
    % type bzip2
    bzip2 is /sw/bin/bzip2
so, no, it's not standard on Mac OS X.
S

Skip Montanaro <skip@pobox.com> writes:
Just because fink supplies something doesn't mean it didn't come with the base install. Jaguar has bzip2 installed; I don't think 10.1 did. Cheers, mwh -- SCSI is not magic. There are fundamental technical reasons why it is necessary to sacrifice a young goat to your SCSI chain now and then. -- John Woods

Barry> What I don't know is whether bz2 decompression is generally
Barry> available on MacOSX...
Skip> Fink is your friend:
Skip> % type bzip2
Skip> bzip2 is /sw/bin/bzip2
Skip> so, no, it's not standard on Mac OS X.
Sorry, should have used "type -a" so I saw the version in /usr/bin.
Skip
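The difference between "type" and "type -a" is that the former stops at the first match on the search path, so a Fink copy in /sw/bin hides one in /usr/bin. A rough Python equivalent of listing every match is sketched below; `find_all` is a hypothetical helper, and it only approximates what the shell does (it ignores aliases, functions and builtins).

```python
import os


def find_all(program, path=None):
    """Return every executable named `program` on the search path, in order.

    `path` is a search-path string like $PATH; it defaults to the
    current environment's PATH.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries
        candidate = os.path.join(directory, program)
        # Keep regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches


if __name__ == "__main__":
    for location in find_all("bzip2"):
        print(location)
```

The first entry printed is the one a plain "type bzip2" would report; any later entries are the copies it hides.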

There are always machines out there that won't support newer formats out of the box, so may I suggest the following course of action:
1. For now we add bz2 compression, and put that at the top of the list, with gz far below it. If we want to get real fancy we could even put it behind another link "old formats".
2. At some point in the future we look at the http logs to see how many people still use the older format.
.Z files were still very useful to some people long after .gz had become the norm, just because they were stuck on old boxes. And if Python goes out of its way to remain buildable on various old boxes as-is, it would be silly if we required people to download third-party stuff just to decode the documentation...
-- Jack Jansen, <Jack.Jansen@cwi.nl>, http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman

[Jack Jansen]
.Z files were still very useful to some people long after .gz had become the norm, just because they were stuck on old boxes.
Does anybody remember `.z' files? (`pack' and `unpack' were the tools, unless I'm mistaken.) I'm _not_ suggesting that they get supported :-). Although `.Z' is not as old as `.z', the two are not very far apart, once you add some perspective. -- François Pinard http://www.iro.umontreal.ca/~pinard

Jack Jansen writes:
At this point, we've been providing bzip2-compressed tarballs for three years; they became available with Python 1.6 (does anyone even remember that release?).
2. At some point in the future we look at the http logs to see how many people still use the older format.
I'm hoping to write the script to do that this weekend. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation
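A minimal sketch of what such a log-scanning script might look like follows. It is not the script Fred wrote; it assumes the server writes Apache-style "common" log lines, and the name `count_downloads` and the set of suffixes are illustrative choices. The real python.org log format and file layout may differ.

```python
import re
from collections import Counter

# Assumption: Apache "common" log format, where the quoted request line is
# followed by the three-digit status code.
REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')

# Archive suffixes discussed in this thread.
SUFFIXES = (".tar.bz2", ".tar.gz", ".tgz", ".zip")


def count_downloads(lines):
    """Count successful GET requests per archive suffix."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match is None or match.group("status") != "200":
            continue  # not a GET, or not a successful response
        path = match.group("path").split("?", 1)[0]  # drop any query string
        for suffix in SUFFIXES:
            if path.endswith(suffix):
                counts[suffix] += 1
                break
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_downloads.py < access.log
    for suffix, total in count_downloads(sys.stdin).most_common():
        print("%-9s %d" % (suffix, total))
```

Counting only status 200 ignores partial and resumed transfers (206), which matter on slow links, so a real script would probably need to decide how to treat those.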

Fred L. Drake, Jr. <fdrake@acm.org> wrote:
It was included in the RedHat 6.2 distribution, possibly in 6.1 and 6.0 as well, though I can't check that. It hasn't been an "exotic" package in many years, although it's not necessarily installed by default in a "base" install. I see no reason not to use .bz2 as the default format. Charles -- ----------------------------------------------------------------------- Charles Cazabon <python@discworld.dyndns.org> GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ -----------------------------------------------------------------------

Dethe> I've never known a *nix box to have trouble with *.zip or known a Dethe> unix user who had trouble with *.zip. So I'd suggest keeping the Dethe> various flavors of documentation, but standardize on zip Dethe> compression. That will at least remove one variable. Agreed. We did encounter a problem with a zip file in the SpamBayes group recently which we believe (though haven't confirmed - the OP has apparently gone underground) was related to WinZip problems. As I understand it, if you set an option in WinZip to "flatten" a zip file, all future zip files are also flattened. I guess it's a case of setting that option then poking the "Save Options" or "OK" button, then forgetting that other zip files will have structure which shouldn't be eliminated. Skip

participants (16)
- Aahz
- Andrew MacIntyre
- Barry Warsaw
- Brett C.
- Charles Cazabon
- Dethe Elza
- François Pinard
- Fred L. Drake, Jr.
- Fred L. Drake, Jr.
- Gerrit Holl
- Jack Jansen
- Michael Hudson
- Ronald Oussoren
- Skip Montanaro
- Tim Peters
- Tom Emerson