On 5/2/18 2:24 PM, Barry Warsaw wrote:
On May 2, 2018, at 09:42, Gregory Szorc <gregory.szorc@gmail.com> wrote:
As for things Python could do to make things better, one idea is "package bundles." Instead of keeping .py, .pyc, .so, etc. as separate files on the filesystem, allow Python packages to be distributed as standalone "archive" files.
Of course, .so files have to be extracted to the file system, because we have to live with dlopen()’s API. In our first release of shiv, we had a loader that did exactly that for just .so files. We ended up just doing .pyz file unpacking unconditionally, ignoring zip-safe, mostly because too many packages still use __file__, which doesn’t work in a zipapp.
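To illustrate the "extract only the .so files" approach described above, here is a minimal sketch. It is not shiv's actual loader; the function name, the suffix list, and the cache directory layout are all assumptions made for the example. It copies only native-library members out of a zip archive so that dlopen() has a real path to work with.

```python
import zipfile
from pathlib import Path

# Assumption: these suffixes identify native libraries; adjust per platform.
NATIVE_SUFFIXES = (".so", ".pyd", ".dylib")


def extract_native_members(archive, cache_dir):
    """Extract only native-library members of a zip archive into cache_dir.

    Hypothetical helper, not part of shiv or the standard library.
    Returns the list of extracted filesystem paths.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith(NATIVE_SUFFIXES):
                # ZipFile.extract() returns the path it created on disk.
                extracted.append(Path(zf.extract(name, path=cache_dir)))
    return extracted
```

Pure-Python members stay inside the archive in this sketch, which is exactly where __file__-based code breaks, and why unpacking everything unconditionally is the simpler choice.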
FWIW, Google has a patched glibc that implements dlopen_with_offset(). It allows you to do things like memory map the current binary and then dlopen() a shared library embedded in an ELF section. I've seen the code in the branch at https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/google/grte/.... It likely exists elsewhere. An attempt to upstream it occurred at https://sourceware.org/bugzilla/show_bug.cgi?id=11767. It is probably well worth someone's time to pick up the torch and get this landed in glibc so everyone can be a massive step closer to self-contained, single binary applications. Of course, it will take years before you can rely on a glibc version with this API being deployed universally. But the sooner this lands...
I’ll plug shiv and importlib.resources (and the standalone importlib_resources) again here. :)
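For readers who have not tried it, a small sketch of why importlib.resources helps where __file__ does not. The package name demo_pkg and its data file are made up for the example, and the files() API used here needs Python 3.9 or later (the 2018-era equivalent was importlib.resources.read_text). The sketch builds a zip, imports a package from it, and reads a data file both ways.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Create a zip holding a hypothetical package 'demo_pkg' with one data file."""
    archive = os.path.join(directory, "demo.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("demo_pkg/__init__.py", "")
        zf.writestr("demo_pkg/greeting.txt", "hello from inside the zip\n")
    return archive


def read_via_file(package):
    """The __file__-relative idiom: it needs a real directory on disk."""
    path = os.path.join(os.path.dirname(package.__file__), "greeting.txt")
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def read_via_resources(package):
    """importlib.resources asks the loader, so zip imports work too."""
    resource = importlib.resources.files(package).joinpath("greeting.txt")
    return resource.read_text(encoding="utf-8")


def demo():
    """Return (did the __file__ idiom work?, text read via importlib.resources)."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = build_demo_archive(tmp)
        sys.path.insert(0, archive)
        try:
            package = importlib.import_module("demo_pkg")
            try:
                read_via_file(package)
                file_idiom_worked = True
            except OSError:
                # The "path" runs through the zip file, so open() fails.
                file_idiom_worked = False
            text = read_via_resources(package)
        finally:
            sys.path.remove(archive)
            sys.modules.pop("demo_pkg", None)
    return file_idiom_worked, text
```

Calling demo() should show the __file__ idiom failing while the importlib.resources read succeeds, without anything being unpacked to disk.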
If you go this route, please don't require the use of zlib for file compression, as zlib is painfully slow compared to alternatives like lz4 and zstandard.
shiv works in a similar manner to pex, although it’s a completely new implementation that doesn’t suffer from huge sys.paths or the use of pkg_resources. shiv + importlib.resources saves us 25-50% of warm-cache startup time. That makes things better, but still not ideal. Ultimately, though, that means we don’t suffer from the slowness of zlib, since we don’t count cold-cache times (i.e. before the initial pyz unpacking operation).
Cheers, -Barry