On 28 May 2015 at 16:58, Barry Warsaw firstname.lastname@example.org wrote:
On May 28, 2015, at 11:39 AM, Donald Stufft wrote:
You don’t need a "fully functioning Python" for a single file binary, you only need enough to actually run your application. For example, if you're making an application that can download files over HTTP, you don't need to include parts of the stdlib like xmlrpc, pickle, shelve, marshal, sqlite, csv, email, mailcap, mailbox, imaplib, nntplib, etc.
There are actually two related but different use cases to "single file executables".
The first is nicely solved by tools like pex, where you don't need to include a fully functional Python at the head of the zip file because the environment you're deploying it into will have enough Python to make the zip work. This can certainly result in smaller zip files. This is the approach I took with Snappy Ubuntu Core support for Python 3, based on the current situation that the atomic upgrade client is written in Python 3. If that changes and Python 3 is removed from the image, then this approach won't work.
pex (and others) does a great job at this, so unless there are things better refactored into upstream Python, I don't think we need to do much here.
One problem with pex is that it doesn't appear to work on Windows (I just gave it a try, and got errors because it relies on symlinks).
IMO, any solution to "distributing Python applications" that is intended to compete with the idea that "go produces nice single-file executables" needs to be cross-platform. At the moment, zipapp (and in general, the core support for running applications from a zip file) handles this for the case where you're allowed to assume an already installed Python interpreter. The proviso here, as Donald pointed out, is that it doesn't handle C extensions.
The biggest problem with 3rd-party solutions is that they don't always support the full range of platforms that Python supports. That's fine for a 3rd party tool, but if we want to have a response to people asking how to bundle their application written in Python, we need a better answer than "if you're on Windows, use py2exe, or if you're on Unix use pex, or maybe..."
Python has core support for the equivalent of Java's jar format in zipapp. It's not well promoted (and doesn't support C extensions) but it's a pretty viable option for a lot of situations.
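For anyone who hasn't tried it, here is a minimal sketch of what the zipapp route looks like. The directory name "myapp", the helper build_and_run and the shebang line are just illustrative choices, and the result still assumes an already installed Python on the target machine:

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run(message: str) -> str:
    """Build a tiny .pyz archive with zipapp and run it (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        # A zipapp source directory only needs a __main__.py entry point.
        src = Path(tmp) / "myapp"
        src.mkdir()
        (src / "__main__.py").write_text(f"print({message!r})\n")

        # The interpreter argument only writes a shebang line at the head of
        # the archive; the interpreter itself is NOT bundled.
        target = Path(tmp) / "myapp.pyz"
        zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

        # Run the archive with the current interpreter so the sketch does not
        # depend on shebang handling (which differs on Windows).
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run("hello from a zipapp"))
```

The same archive can be built from the command line with "python -m zipapp myapp -o myapp.pyz". As noted above, pure Python code works this way, but C extensions do not.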
The second use case is as you describe: put a complete functional Python environment at the head of the zip file so you don't need anything in the target deployment environment. "Complete" can easily mean the entire stdlib, and although that would usually be more bloated than you normally need, hey, it's just some extra unused bits so who cares? <wink>. I think this would be an excellent starting point which can be optimized to trim unnecessary bits later, maybe by third party tools.
Tools like py2exe and cx_Freeze do this, and are pretty commonly used on Windows. An obvious example of use is Mercurial. If you're looking at this scenario, a good place to start would probably be understanding why cx_Freeze isn't more commonly used on Unix (AFAIK, it supports Unix, but I've only ever really heard of it being used on Windows).
I suspect "single file executables" just aren't viewed as a desirable solution on Unix. Although Donald referred to a 4K binary, which probably means just a stub exe that depends on system-installed .so files, likely including Python (I'm just guessing here). It's easy to do something similar on Windows, but it's *not* what most Windows users think of when you say a "single file executable for a Python program" (because there's no system package manager doing dependencies for you).
Again, platform-specific answers are one thing, and are relatively common, but having a good cross-platform answer at the language level (a section on docs.python.org "How to ship your Python program") is much harder.
Of course, deciding which pieces to include in the zip file you're appending to the end of Python is up to whatever tool builds the executable, and that tool doesn't need to be part of Python itself. If Python itself gained the ability to operate in that manner, then third party tools could handle the optimizations, including only the parts of the stdlib the application actually needs and excluding the rest. The key thing here is that since you're doing a single file binary, you don't need a Python which is suitable to execute random Python code; you only need one that is suitable to execute this particular code, so you can specialize what it includes.
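The "zip appended to something else" part already works today, because zip archives are located from the end of the file. The following is only a sketch of that mechanism: the prefix bytes stand in for a real interpreter binary (which this does not provide), and run_prefixed_zip and bundle.bin are made-up names:

```python
import io
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def run_prefixed_zip(prefix: bytes) -> str:
    """Prepend arbitrary bytes to a zip and run the result (hypothetical helper)."""
    # Build a small zip in memory containing only an entry point.
    payload = io.BytesIO()
    with zipfile.ZipFile(payload, "w") as zf:
        zf.writestr("__main__.py", "print('running from the appended zip')\n")

    with tempfile.TemporaryDirectory() as tmp:
        # The prefix is a placeholder for whatever a build tool would put at
        # the head of the file; here it is just junk bytes.
        bundle = Path(tmp) / "bundle.bin"
        bundle.write_bytes(prefix + payload.getvalue())

        # An installed Python still does the work in this sketch; it finds
        # the zip data despite the leading bytes.
        result = subprocess.run(
            [sys.executable, str(bundle)],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_prefixed_zip(b"placeholder-for-an-interpreter\n" * 64))
```

What is missing is the other half: a stub at the head of the file that can start up using only what is in the appended zip.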
I'd love to see Python itself gain such a tool, but if it had the critical pieces to execute in this way, that would enable a common approach to supporting this in third party tools, on a variety of platforms.
Stripping out unused code is a hard problem in a language as dynamic as Python. It would be great to see it happen, but I'm not sure how much better we can do than existing tools like modulefinder. (Consider that stripping out parts of the stdlib is the same in principle as stripping out unused bits of a 3rd party library like requests. When this issue comes up, people often talk about slimming down the stdlib to just what's needed, but why not take out the json support from requests if you don't use it?)
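To make the limitation concrete, here is a rough sketch using modulefinder from the stdlib. The helper name imported_modules and the two sample scripts are invented for illustration. Static analysis sees "import json", but has no way to see a module whose name is only computed at run time:

```python
import tempfile
from modulefinder import ModuleFinder
from pathlib import Path


def imported_modules(script_source: str) -> set[str]:
    """Return the module names modulefinder discovers for a script (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "app.py"
        script.write_text(script_source)
        finder = ModuleFinder()
        # run_script analyses the bytecode for import statements; it does not
        # actually execute the script.
        finder.run_script(str(script))
        return set(finder.modules)


if __name__ == "__main__":
    static = imported_modules("import json\n")
    print("json found:", "json" in static)

    # The module name is only known at run time, so it is invisible to
    # modulefinder and would be stripped by a naive tool.
    dynamic = imported_modules("name = 'csv'\nmod = __import__(name)\n")
    print("csv found:", "csv" in dynamic)
```

Any tool that trims the stdlib (or a library like requests) on this basis needs some way for the user to list the dynamically loaded modules by hand.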
I do think single-file executables are an important piece to Python's long-term competitiveness.
Agreed. But also, I think that "single-file" executables (single-directory, in practice) are *already* important - as I say, for projects like Mercurial. Doing better is great, but we could do worse than start by asking the Mercurial/TortoiseHg project and others what are the problems with the current situation that changes to the core could help to improve. I doubt "please make pythonXY.zip 50% smaller" would be the key issue :-)