Re: [Python-Dev] Support for Encrypted Zip as python scripts

There is a standard for encrypting entire zip files. And I was looking at the zip docs the other day and zipfile can already decrypt but not encrypt (assuming my memory is accurate; doing this from my phone on vacation).

On Aug 23, 2009 2:10 PM, "Guido van Rossum" <guido@python.org> wrote:
On Sun, Aug 23, 2009 at 9:09 AM, Shashank Singh <shashank.sunny.singh@gmail.com> wrote: > There is an...
MvL already asked for a patch so I suppose that means he thinks it's useful. Personally I've never encountered an encrypted zipfile, so I just have questions: is there a standard encryption algorithm? What is encrypted? The entire file or individual members? How are you supposed to give the password? Also, I suppose there could be (US) export problems with the code, so it would have to be optional (and we might not be able to build it into binaries we distribute from python.org). -- --Guido van Rossum (home page: http://www.python.org/~guido/)

On Sun, Aug 23, 2009 at 1:24 PM, Brett Cannon<brett@python.org> wrote:
Ah, cool. Then the only issue for the patch presumably is an API to provide the password. Passing it as a command-line flag seems very insecure (though in some cases there may be no choice), so presumably it needs to be prompted and read from stdin. (Though it appears from skimming zipfile.py that it supports encrypted individual archive members, not the zipfile as a whole. Also the docs mention that decryption is "extremely slow as it is implemented in native python rather than C.") Anyway it looks like if someone wants to try this, only the code in runpy.py needs to be touched. -- --Guido van Rossum (home page: http://www.python.org/~guido/)
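
For readers following along, here is a rough sketch of the "prompt instead of a command-line flag" idea using only the existing zipfile API. The helper names (is_encrypted, read_member, demo) are made up for illustration and are not part of any proposed patch; the sketch assumes that bit 0 of a member's general purpose flags marks it as encrypted, which matches the per-member encryption noted above.

```python
import getpass
import io
import zipfile


def is_encrypted(info):
    """Return True if this ZipInfo is flagged as an encrypted member."""
    # Bit 0 of the general purpose bit flags marks an encrypted entry.
    return bool(info.flag_bits & 0x1)


def read_member(zf, name, prompt=getpass.getpass):
    """Read one member, prompting for a password only when it is needed."""
    info = zf.getinfo(name)
    pwd = None
    if is_encrypted(info):
        # getpass reads from the terminal without echoing the password.
        pwd = prompt("Password for %s: " % name).encode("utf-8")
    return zf.read(name, pwd=pwd)


def demo():
    """Build an unencrypted archive in memory and read it back."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("__main__.py", "print('hi')\n")
    with zipfile.ZipFile(buf) as zf:
        return read_member(zf, "__main__.py")
```

The demo only exercises the unencrypted path, since the standard library cannot create encrypted archives. Note that a prompt like this lives outside the import machinery, so it does not answer the question of how zipimport itself would obtain the password.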

Guido van Rossum wrote:
Anyway it looks like if someone wants to try this, only the code in runpy.py needs to be touched.
The necessary work would actually be in zipimport. runpy doesn't know anything about the details of where the module code comes from, it just asks the relevant importer for the details. For zipfile and directory execution, they get added to the start of sys.path and then runpy is invoked to look for the module "__main__". From that point on most of the heavy lifting is handled by the regular import machinery (aside from using the pkgutil emulation for the basic import behaviour that isn't fully exposed by the imp module). I added a -1 to the tracker issue as well. That's due both to my opinion on the inherent idiocy of DRM (since shared secrets don't provide any security when the attacker in your threat model is one of the people you are sharing the secret with) and to the fact that associating passwords with the relevant zipfile entries on sys.path would get messy fairly quickly. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
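
The zipfile execution path described above can be tried from Python itself. This is a minimal sketch, assuming a Python version that provides runpy.run_path (it is not available in the 2.6 release discussed in this thread); the archive name app.zip and the helper run_zip_demo are made up for the example.

```python
import os
import runpy
import tempfile
import zipfile


def run_zip_demo():
    """Create a zip containing __main__.py and execute it by path."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "app.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            # The archive is located through the import system, which
            # looks for a top-level module named __main__ inside it.
            zf.writestr("__main__.py", "answer = 6 * 7\n")
        # run_path puts the archive on sys.path while __main__ is found
        # and executed, then returns the resulting module globals.
        return runpy.run_path(archive)
```

Nothing in this flow ever sees a password, which is why the work for encrypted archives would land in the importer rather than in runpy.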

On Mon, Aug 24, 2009 at 7:39 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- Regards Shashank Singh Senior Undergraduate, Department of Computer Science and Engineering Indian Institute of Technology Bombay shashank.sunny.singh@gmail.com http://www.cse.iitb.ac.in/~shashanksingh

oops..sorry for the empty mail :P On Mon, Aug 24, 2009 at 8:09 AM, Shashank Singh < shashank.sunny.singh@gmail.com> wrote:
That is where I see the problem in creating a natural approach. Correct me if I am wrong here, but since runpy doesn't know anything about the script being a zip file, to add such support we will have to break the current delegation mechanism and bring runpy into the loop too. Also, since a zip file is automatically checked for (I believe there are no switches to specify that the script is a zip), will it not be a two-trip mechanism: you naively try to run a zip, get an error (say ERR_ZIP_ENCRYPTED), and then ask for the password?
-- Regards Shashank Singh Senior Undergraduate, Department of Computer Science and Engineering Indian Institute of Technology Bombay shashank.sunny.singh@gmail.com http://www.cse.iitb.ac.in/~shashanksingh

OMG, the use case is actually running a script without giving the user access to the script's source? Agreed that's a big -1. I thought it was just for running a zip containing code so secret you don't want to leave it around on your hard drive without encryption (say, the program you use to compute your employees' bonuses, or perhaps a patented algorithm for detecting spam). That use case would make a small amount of sense, though I personally don't care enough to write the code to support it. --Guido On Sun, Aug 23, 2009 at 7:09 PM, Nick Coghlan<ncoghlan@gmail.com> wrote:
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Actually, the issue posting doesn't say either way - it doesn't provide any real use cases at all. For local protection of confidential information there are already much better solutions out there (e.g. whole disk encryption, OS file permissions, OS folder encryption), so a poor-man's DRM was the only remaining remotely plausible use case I could see (and that's a bad idea for all the reasons that DRM is almost always a bad idea). Now, that could just be a failure of imagination on my part, but genuine use case suggestions for the feature have been nonexistent so far. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

A little off topic, but the zipfile doc says: "Decryption is extremely slow as it is implemented in native python rather than C". Why is this limitation there? I mean, is there any specific reason for not implementing it in C? On Mon, Aug 24, 2009 at 8:45 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
-- Regards Shashank Singh Senior Undergraduate, Department of Computer Science and Engineering Indian Institute of Technology Bombay shashank.sunny.singh@gmail.com http://www.cse.iitb.ac.in/~shashanksingh

Because it is easier to write in Python, and (as Greg explained) the encryption is so lousy that you're unlikely to find heavy use of it. Therefore nobody (so far) has cared to write an accelerator in C. On Sun, Aug 23, 2009 at 9:23 PM, Shashank Singh<shashank.sunny.singh@gmail.com> wrote:
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

Guido van Rossum wrote:
Anyway it looks like if someone wants to try this, only the code in runpy.py needs to be touched.
Where is runpy.py to be found? I'm trying to find whatever implements python -m and the other python command line options... Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk

Benjamin Peterson wrote:
Heh, grep beats Mk I eyeball ;-) (I did actually look in Lib...) Anyway, so how is the stuff in runpy.py wired up to the command line options passed to the interpreter? Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk

Benjamin Peterson wrote:
The most relevant functions in there are "RunMainFromImporter()" (attempting zipfile/directory execution) and "RunModule()" (-m switch and also called for zipfile/directory execution). The latter function just uses normal C API calls to actually invoke the runpy code (specifically "runpy._run_module_as_main()"). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
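
As a rough Python-level illustration of what the -m switch ends up doing, the public runpy.run_module() can be asked to run a module under the name "__main__". This is a sketch only: it uses the public function rather than the private _run_module_as_main() named above, and the module name hello_mod and both helper functions are invented for the example.

```python
import os
import runpy
import sys
import tempfile


def run_as_main(module_name, search_dir):
    """Locate module_name via the import system and run it as __main__."""
    sys.path.insert(0, search_dir)
    try:
        # run_name sets __name__ in the executed module's namespace.
        return runpy.run_module(module_name, run_name="__main__")
    finally:
        sys.path.remove(search_dir)


def demo():
    """Write a throwaway module to a temp directory and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "hello_mod.py"), "w") as f:
            f.write("ran_as = __name__\n")
        return run_as_main("hello_mod", tmp)
```

Unlike the real -m handling, this does not replace the running __main__ module or adjust sys.argv; it only shows the lookup-then-execute step.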

Take a look at two PEPs referenced in runpy doc, http://docs.python.org/3.1/library/runpy.html : PEP 338 - Executing modules as scripts PEP written and implemented by Nick Coghlan. PEP 366 - Main module explicit relative imports PEP written and implemented by Nick Coghlan. (Nick is too modest to self-reference, but these two PEPs give an excellent exposition. :-) On Tue, Aug 25, 2009 at 5:01 AM, Nick Coghlan<ncoghlan@gmail.com> wrote:

Alexander Belopolsky wrote:
The PEPs don't go into the process of how we actually hook the command line up to the runpy module though - that's something you need to dig into the main.c code to really understand. The command line documentation is also relevant since it defines the intended behaviour: http://docs.python.org/dev/using/cmdline.html#command-line (Drop the /dev from the URL to see the defined behaviour for 2.6) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Nick Coghlan wrote:
Yeah, main.c does quite a lot... ;-) This all spawned from a suggestion by Jim Fulton over on the distutils-sig that it would be nice if there was a python module that did all of the various types of launching found in main.c. His use case is so that buildout scripts can easily use the same functionality that the interpreter startup uses. I didn't spot any, but does anyone know of code in that mix that couldn't be moved to a pure python module like runpy? If not, how would people feel about the various types of launching all moving to runpy rather than just the -m stuff being there? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk

Chris Withers wrote:
I haven't timed it, but I believe runpy is a fair bit slower than the native C functions in main. (That first part of the comment means I could easily be wrong though - it's definitely possible that overall interpreter startup time will dwarf any difference between the two launch mechanisms). That said, while actually ditching the C code might cause an argument, expanding runpy with Python equivalents of the C level functionality (i.e. run script by name, run directory/zipfile by name, '-c' switch, and other odds and ends that I'm probably forgetting right now, with all associated modifications to sys.argv and the __main__ module attributes) should be far less controversial. For example, _run_module_as_main() has survived long enough now without anyone poking holes in it (unlike the holes in the original run_module() that PJE drove a truck through!) that I could probably be talked into removing the comment I put on it and making it public :) As you say, making all of that functionality accessible from Python would allow launch scripts to be far more flexible in handling arguments as if they were the normal interpreter. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
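
To make the "Python equivalents of the C level functionality" idea a little more concrete, here is a hypothetical sketch of what a pure-Python analogue of the '-c' switch might look like. The function run_command is invented for this illustration and is not an existing runpy API; it only mimics the sys.argv handling and the __main__ namespace name, not everything main.c does.

```python
import sys


def run_command(source, argv=()):
    """Execute source roughly the way `python -c source args...` would."""
    # The code runs in a fresh namespace that calls itself __main__.
    namespace = {"__name__": "__main__"}
    saved_argv = sys.argv
    # With -c, the interpreter sets sys.argv[0] to the string "-c".
    sys.argv = ["-c"] + list(argv)
    try:
        exec(compile(source, "<string>", "exec"), namespace)
    finally:
        # Restore the caller's argv even if the code raised.
        sys.argv = saved_argv
    return namespace
```

A launcher script could call something like this after doing its own argument handling, which is the kind of flexibility mentioned above.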

On Mon, Aug 31, 2009 at 06:36, Nick Coghlan<ncoghlan@gmail.com> wrote:
That's quite possible. If you benchmark it you might be able to convince people.
It also has the perk of letting alternative VMs not have to implement all of that stuff themselves, potentially helping to unify even the command-line interfaces for all the VMs. -Brett

Brett Cannon wrote:
I created a tracker item for the idea so I don't forget about it: http://bugs.python.org/issue6816 Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

participants (7)
- Alexander Belopolsky
- Benjamin Peterson
- Brett Cannon
- Chris Withers
- Guido van Rossum
- Nick Coghlan
- Shashank Singh