On 1 February 2014 05:31, Brian Wickman <wickman@gmail.com> wrote:
This is in response to Vinay's thread but since I wasn't subscribed to distutils-sig, I couldn't easily respond directly to it.
Vinay's right, the technology here isn't revolutionary but what's notable is that we've been using it in production for almost 3 years at Twitter. It's also been open-sourced for a couple years at https://github.com/twitter/commons/tree/master/src/python/twitter/common/pyt... but not widely announced (it is, after all, just a small subdirectory in a fairly large mono-repo, and was only recently published independently to PyPI as twitter.common.python.)
PEX files are just executable zip files with hashbangs containing a carefully constructed __main__.py and a PEX-INFO, which is a JSON-encoded dictionary describing how to scrub and bootstrap sys.path and the like. They work equally well unpacked into a standalone directory.
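For readers who haven't seen the layout described above, here is a minimal sketch of an executable zip with a hashbang, a __main__.py and a JSON metadata file. It is not the PEX implementation: the file name APP-INFO, the "entry_point" key and the helper names build_archive and run_archive are all made up for illustration, and the real PEX-INFO contents are not shown in this thread.

```python
import json
import os
import stat
import subprocess
import sys
import tempfile
import zipfile

# Hypothetical bootstrap code; a real PEX __main__.py does far more
# (scrubbing and bootstrapping sys.path, for example).
MAIN_SOURCE = '''\
import json
import sys
import zipfile

# When Python executes a zip directly, sys.argv[0] is the archive path.
with zipfile.ZipFile(sys.argv[0]) as archive:
    info = json.loads(archive.read("APP-INFO").decode("utf-8"))
print("entry point: " + info["entry_point"])
'''


def build_archive(path, info):
    """Write a hashbang line followed by a zip holding __main__.py and APP-INFO."""
    with open(path, "wb") as fp:
        fp.write(b"#!/usr/bin/env python\n")
        # zipfile appends the archive after the hashbang already in the file.
        with zipfile.ZipFile(fp, "w") as archive:
            archive.writestr("__main__.py", MAIN_SOURCE)
            archive.writestr("APP-INFO", json.dumps(info))
    # Mark the file executable so it can be run directly on POSIX systems.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)


def run_archive(path):
    """Run the archive with the current interpreter and return its output."""
    return subprocess.check_output([sys.executable, path]).decode("utf-8")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "demo.pex")
    build_archive(target, {"entry_point": "demo:main"})
    print(run_archive(target))
```

Deploying the result is then a plain file copy, since everything the bootstrap code needs travels inside the one file.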
In practice PEX files are simultaneously our replacement for virtualenv and also our way of distributing Python applications to production. Now we could use virtualenv to do this but it's hard to argue with a deployment process that is literally "cp". Furthermore, most of our machines don't have compiler toolchains or external network access, so hermetically sealing all dependencies once at build time (possibly for multiple platforms since all developers use Macs) has huge appeal. This is even more important at Twitter where it's common to run a dozen different Python applications on the same box at the same time, some using 2.6, some 2.7, some PyPy, but all with varying versions of underlying dependencies.
Ah, very interesting - this is exactly the kind of thing we were trying to enable with the executable directory/zip file support in Python 2.6, and then we went and forgot to cover it in the original "What's New in Python 2.6?" doc, so an awful lot of people never learned about the existence of the feature :P As Daniel noted, PEP 441 is at least as much about letting people know "Hey, Python has supported direct execution of directories and zip archives since Python 2.6!" as it is about actually providing some tools that better support doing things that way :)
Speaking to recent distutils-sig threads, we used to go way out of our way to never hit disk (going so far as building our own .egg packager and pure python recursive zipimport implementation so that we could import from eggs within zips, write ephemeral .so's to dlopen and unlink) but we've since moved away from that position for simplicity's sake. For practical reasons we've always needed "not zip-safe" PEX files where all code is written to disk prior to execution (ex: legacy Django applications that have __file__-relative business logic) so we decided to just throw away the magical zipimport stuff and embrace using disk as a pure cache. This seems more compatible philosophically with the direction wheels are going for example.
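As a rough illustration of the "disk as a pure cache" idea above, the sketch below unpacks an archive into a directory named after its content hash and reuses that directory on later runs, so code that relies on __file__ sees real files. This is an assumption about how such a cache could look, not a description of what PEX does; ensure_unpacked, archive_digest and the cache layout are hypothetical.

```python
import hashlib
import os
import shutil
import sys
import tempfile
import zipfile


def archive_digest(path):
    """Return the SHA-1 hex digest of the archive's bytes."""
    digest = hashlib.sha1()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ensure_unpacked(archive_path, cache_root):
    """Unpack the archive under cache_root once and return the directory."""
    os.makedirs(cache_root, exist_ok=True)
    target = os.path.join(cache_root, archive_digest(archive_path))
    if not os.path.isdir(target):
        # Extract into a scratch directory first so a half-written tree
        # is never visible under the final name.
        staging = tempfile.mkdtemp(dir=cache_root)
        with zipfile.ZipFile(archive_path) as archive:
            archive.extractall(staging)
        try:
            os.rename(staging, target)
        except OSError:
            # Another process populated the cache first; drop our copy.
            shutil.rmtree(staging, ignore_errors=True)
    return target
```

A caller would then do something like sys.path.insert(0, ensure_unpacked(archive, cache_dir)) before importing the application. Because the directory name is derived from the archive's contents, a stale or deleted cache entry costs only a re-extraction.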
Since there's been more movement in the PEP space recently, we've been evolving PEX in order to be as standards-compliant as possible, which is why I've been more visible recently re: talks, .whl support and the like. I'd also love to chat more about PEX and how it relates to things like PEP 441 and/or other attempts like pyzzer.
I think it's very interesting and relevant indeed :) The design in PEP 441 just involves providing some very basic infrastructure around making use of the direct execution support, mostly in order to make the feature more discoverable in the first place, but it sounds like Twitter have a much richer and battle-hardened approach in PEX. It may still prove to be more appropriate to keep the stdlib infrastructure very basic (i.e. more at the PEP 441 level), leaving something like PEX free to provide cross-version consistency. I'll wait and see how the public documentation for PEX evolves before forming a firm opinion one way or the other, but one possible outcome would be a situation akin to the pyvenv vs virtualenv distinction, where pyvenv has the benefit of always being available even in environments where it is difficult to get additional third party tools approved, but also has the inherent downside of Python version dependent differences in available features and behaviour.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia