
Hello, For 2.7/3.2, I am in the process of removing modules in Distutils that can be replaced by calls to existing functions in the stdlib. For instance, "dir_util" and "file_util" (old modules from the Python 1.x era) are going away in favor of calls to shutil (and os), so the Distutils package gets lighter. Another module I would like to move out of Distutils is "archive_util". It contains helpers to build archives, whether they are zip or tar files. I propose to move those useful functions into shutil, as this seems the most logical place. I also propose to maintain the "shutil" module from now on (no one is declared as its maintainer in maintainers.rst), since Distutils will become a heavy user of its functions. Any objections or opinions? Regards, Tarek -- Tarek Ziadé | http://ziade.org

On 17/01/2010 19:51, Tarek Ziadé wrote:
I think it's a great idea. :-) Michael -- http://www.ironpythoninaction.com/ http://www.voidspace.org.uk/blog READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

On Sun, Jan 17, 2010 at 8:55 PM, Brett Cannon <brett@python.org> wrote:
In more detail: it allows the creation of gzip, bzip2, tar and zip files through a single API. There's a registry of supported formats, and the API is driven by a format identifier. To do the work it uses the stdlib's compression modules, although it tries the "zip" system command as a fallback if the "zipfile" module is not present. (Note that I removed support for "compress" (.Z) some time ago.) Regards Tarek -- Tarek Ziadé | http://ziade.org
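A minimal sketch of the registry-driven API described above, using `shutil.make_archive` and `shutil.get_archive_formats` as they later shipped in the standard library. The directory layout and file contents are invented for illustration and do not come from this thread.

```python
import os
import shutil
import tempfile

# Build a small directory tree to archive (illustrative content only).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "project")
os.makedirs(os.path.join(src, "pkg"))
with open(os.path.join(src, "pkg", "module.py"), "w") as fh:
    fh.write("print('hello')\n")

# The registry lists (identifier, description) pairs for supported formats.
formats = [name for name, _description in shutil.get_archive_formats()]

# One call per format; only the format identifier changes.
created = {}
for fmt in ("zip", "tar"):
    if fmt in formats:
        base = os.path.join(workdir, "project-" + fmt)
        created[fmt] = shutil.make_archive(base, fmt, root_dir=src)
```

The archives are written next to the source tree rather than inside it, so an archive never ends up containing itself.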

On 1/17/2010 12:09 PM, Tarek Ziadé wrote:
Will it also allow decompression of the said archive types? Distribute has some utility code to handle zip/tar archives. So does PyPM. This is because the `tarfile` and `zipfile` modules do not "just work" due to several issues. See http://gist.github.com/279606
Take note of the following in the above code:
1) _ensure_read_write_access
2) *File.is_valid
3) ZippedFile.extract ... issue 6510
4) ZippedFile.extract ... issue 6609
5) TarredFile.extract ... issue 6584
6) The way unpack() detects the unpacked directory.
-srid

On Sun, Jan 17, 2010 at 10:50 PM, Sridhar Ratnakumar <sridharr@activestate.com> wrote: [..]
Will it also allow decompression of the said archive types?
No, but it would make sense to have this one as well. Distribute/Setuptools' "unpack_archive" (used by easy_install) was implemented using the same principle as Distutils' "make_archive". I can add it as well while I am at it: it has been used successfully for years by easy_install.
Looks like some of these are already fixed (I only looked quickly in the bug tracker, though). If it's not already done or pending, it would be great if you could refactor your fixes into patches for the remaining issues in each of those modules. Regards Tarek -- Tarek Ziadé | http://ziade.org
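For the decompression side discussed above, here is a small round-trip sketch. It assumes the `shutil.unpack_archive` function that was later added to the standard library; the file names are hypothetical, and the sketch does not address the tarfile/zipfile issues listed in the gist.

```python
import os
import shutil
import tempfile

# Create a throwaway source tree (illustrative content only).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src")
os.makedirs(src)
with open(os.path.join(src, "README.txt"), "w") as fh:
    fh.write("example\n")

# Pack the tree, then unpack it somewhere else.
archive = shutil.make_archive(os.path.join(workdir, "bundle"), "tar", root_dir=src)
dest = os.path.join(workdir, "unpacked")
shutil.unpack_archive(archive, dest)  # format is inferred from the ".tar" extension

extracted = sorted(os.listdir(dest))
```

Recent Python versions also accept a `filter` argument for tar archives; it is left at its default here.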

On 17/01/2010 20:04, Antoine Pitrou wrote:
I believe that part of the work that Tarek has been doing has been to make these distutils commands use the Python standard library and not depend on external programs. In which case they seem like *excellent* additions to the shutil module. Of course Tarek can speak for himself... Michael

On Sun, Jan 17, 2010 at 9:15 PM, Michael Foord <fuzzyman@voidspace.org.uk> wrote: [..]
Yes, in the past the "tar" files were created using the "tar" command, but this has been changed. For a while now, they have been portable and use stdlib code only. A recent addition is the ability to define user/group permissions in the tar files, thanks to Lars' work in the tarfile module. There's one remaining external call to "zip", made if the zipfile module is not found, but I am happy to remove it and throw an exception instead, and keep the external "zip" call on the Distutils side, so shutil stays 100% stdlib-powered.
Of course Tarek can speak for himself...
Thanks for explaining it ! :) Regards Tarek -- Tarek Ziadé | http://ziade.org
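The "raise instead of shelling out" policy mentioned above could look roughly like this. It is only a sketch: `make_zip` is a hypothetical helper, not the Distutils or shutil implementation, and the error type is an assumption.

```python
import os
import tempfile

try:
    import zipfile
except ImportError:  # assumption: only happens on unusual builds
    zipfile = None


def make_zip(base_name, root_dir):
    """Hypothetical helper: zip the contents of root_dir using stdlib code only."""
    if zipfile is None:
        # No fallback to an external "zip" command; fail loudly instead.
        raise RuntimeError("the zipfile module is required to build zip archives")
    archive_name = base_name + ".zip"
    with zipfile.ZipFile(archive_name, "w") as zf:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Store paths relative to root_dir inside the archive.
                zf.write(path, os.path.relpath(path, root_dir))
    return archive_name


# Usage with a throwaway tree (illustrative names only).
demo_dir = tempfile.mkdtemp()
demo_src = os.path.join(demo_dir, "src")
os.makedirs(os.path.join(demo_src, "a"))
with open(os.path.join(demo_src, "a", "f.txt"), "w") as fh:
    fh.write("data\n")
demo_archive = make_zip(os.path.join(demo_dir, "out"), demo_src)
```

A caller that still wants the external "zip" command, as Distutils might, could catch the exception and fall back on its own side.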

Tarek Ziadé wrote:
+1 for that approach. These changes all sound like nice additions to shutil, and hooray for every module that gets adopted by an active maintainer :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 18 Jan 2010, at 13:40 , Nick Coghlan wrote:
Isn't it a bit weird to include that to shutil though? shutil advertises itself as "a number of high-level operations on files and collections of files." and from what I understood it was a bunch of shell-type utility functions to easily copy, move or remove files and directories (that's pretty much all there is in it at this time). Wouldn't it make more sense to put those "archive utils" functions/objects in a new module separate from shutil, dealing specifically with cross-archive APIs and linked from the current archive-specific modules (essentially, just take the current archive_util, move it to the toplevel of the stdlib and maybe rename it)? It would also make the module much easier to find when searching through the module listing, I think.

On 18/01/2010 13:46, Doug Hellmann wrote:
Well - isn't what's being proposed "a number of high-level operations on files and collections of files." ?
and from what I understood it was a bunch of shell-type utility functions
like tar and zip? :-)
to easily copy, move or remove files and directories (that's pretty much all there is in it at this time).
I don't think the additions are out of place prima-facie.
Proliferation of modules is itself a bad thing though. Michael

On Mon, Jan 18, 2010 at 2:57 PM, Michael Foord <fuzzyman@voidspace.org.uk> wrote: [..]
I am with Michael here. I think shutil is the right place for this function. As for the discoverability problem, I think docs.python.org is easy to search now, as long as the shutil module documentation has a good set of examples for this new API. We could even add references in the tarfile and zipfile module documentation pointing to these examples. Regards Tarek -- Tarek Ziadé | http://ziade.org

On 18 Jan 2010, at 14:57 , Michael Foord wrote:
Well no those are high-level operations on a very restricted set of file types (archives).
and from what I understood it was a bunch of shell-type utility functions
like tar and zip? :-)
Tar and zip have a module each at this time, so they're considered pretty big. I don't think anybody would consider "mv" big enough to give it a module on its own.
Plus having this as a toplevel module/package would open the window to moving all archive-related modules within it (in the py4 window), à la xml package without having to move itself.

On Mon, Jan 18, 2010 at 3:56 PM, Masklinn <masklinn@masklinn.net> wrote: [..]
Well - isn't what's being proposed "a number of high-level operations on files and collections of files." ?
Well no those are high-level operations on a very restricted set of file types (archives)
not really: make_archive/unpack_archive, are also dealing with files and directories. [..]
I am not sure why this would happen. For instance, shutil has been built on top of the os module for a few major versions IIRC, because it reads and writes files and directories. But it was not moved into the os package (or vice versa). The shutil module uses APIs to read and write files. So if it works with archives, it's just a specific read/write API that is used, but that doesn't mean tarfile and zipfile have to be merged with shutil, imho. If the shutil module is restricted to high-level manipulation of files and directories, working with archives is just another target, I think. But in the end I am 0- on creating a new module, because what really matters to me is to take it out of Distutils :) Regards, Tarek -- Tarek Ziadé | http://ziade.org

On Mon, Jan 18, 2010 at 02:34:14PM +0100, Masklinn wrote:
Isn't it a bit weird to include that to shutil though? shutil advertises itself as "a number of high-level operations on files and collections of files." and from what I understood it was a bunch of shell-type utility functions to easily copy, move or remove files and directories (that's pretty much all there is in it at this time).
Wouldn't it make more sense to put those "archive utils" functions/objects in a new module separate from shutil, dealing specifically with cross-archive APIs and linked from the current archive-specific modules (essentially, just take the current archive_util, move it to the toplevel of the stdlib and maybe rename it)? It would also make the module much easier to find when searching through the module listing, I think.
+1 Oleg. -- Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.

On Jan 18, 2010, at 8:34 AM, Masklinn wrote:
As much of a pain as it is to get new modules accepted, I agree that mixing archiving functions into shutil is not the right way to do it and that a separate archive_util module would make much more sense and would give a logical place to put any extensions to archive handling. S

On Mon, 18 Jan 2010 10:56:05 -0500, "Steve Steiner (listsin)" <listsin@integrateddevcorp.com> wrote:
Looking at the source code and API for both shutil and archive_util, I think that the archive_util methods fit into shutil. shutil currently wraps some standard library facilities with convenience functions for operations you might otherwise perform at the shell command line using OS facilities. As far as I can tell, archive_util does the same, and seems quite within the shutil mission of "high level file operations". So +1 from me for putting these in shutil. -- R. David Murray www.bitdance.com Business Process Automation - Network/Server Management - Routers/Firewalls

2010/1/18 R. David Murray <rdmurray@bitdance.com>:
Conceptually, I'm happy with these going into shutil (and +1 on the rest of Tarek's proposal, too!) To my mind, shutil is a module for higher-level operations on files - the sort of things you'd do in shell commands, like move a batch of files around (mv), create a directory tree (mkdir -p). Tarring or zipping up a batch of files fits nicely into that space. Paul.
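To make the shell analogy concrete, here is a small sketch pairing each command named above with a standard library call. The paths are invented, and `shutil.make_archive` is used as it later shipped in shutil.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()

# mkdir -p base/a/b/c
os.makedirs(os.path.join(base, "a", "b", "c"), exist_ok=True)

# Create a file, then: mv base/a/b/c/notes.txt base/a/
notes = os.path.join(base, "a", "b", "c", "notes.txt")
with open(notes, "w") as fh:
    fh.write("draft\n")
moved = shutil.move(notes, os.path.join(base, "a"))

# Roughly: tar -cf base/backup.tar -C base/a .
backup = shutil.make_archive(
    os.path.join(base, "backup"), "tar", root_dir=os.path.join(base, "a")
)
```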

Paul Moore wrote:
This is also reflected in the way at least Windows handles archives these days - it took them a couple of iterations to get it right (and resolve some of the performance impacts), but Explorer now does a decent job of integrating archives into the directory tree as "folders that happen to be compressed". Are archives as fundamental as directories and files? No. But in the context of shutil, the fact that their internal structure is largely about directories and files makes them more than just another arbitrary file type. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Jan 18, 2010 at 1:21 AM, Tarek Ziadé <ziade.tarek@gmail.com> wrote:
+1 for this. Just make sure that you change the docstring of shutil which now reads as, " shutil - Utility functions for copying files and directory trees." According to this "definition", archives don't fit in there. But the functionality does fit right in, so just need to make sure that it is reflected in the __doc__ .
-- --Anand

participants (13)
- Anand Balachandran Pillai
- Antoine Pitrou
- Brett Cannon
- Doug Hellmann
- Masklinn
- Michael Foord
- Nick Coghlan
- Oleg Broytman
- Paul Moore
- R. David Murray
- Sridhar Ratnakumar
- Steve Steiner (listsin)
- Tarek Ziadé