On Fri, Nov 20, 2020 at 8:48 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
> On 20/11/20 9:17 pm, Andrew Svetlov wrote:
> > Digging into the problem more, I've figured out that PyInstaller has
> > hooks
> > <https://github.com/pyinstaller/pyinstaller/tree/develop/PyInstaller/hooks>
> > for a bunch of popular libraries to make them work.
>
> I've seen this sort of thing in other app bundlers too, e.g.
> last time I looked py2app and py2exe had a bunch of special casing
> for various libraries.
>
> This is quite a big problem, IMO. It makes these tools very
> flakey.
>
> What is it about app bundling in Python that makes these things
> necessary? Something seems very wrong somewhere.


Sorry Greg, forgot to reply-all initially.

Speaking as someone who wrote PyInstaller hooks for a large framework, there are multiple reasons for hooks. PyInstaller seems to build an import tree by analyzing your script and every dependency it can see being imported, and it includes those modules in the packaged app. From https://pyinstaller.readthedocs.io/en/stable/operating-mode.html:

PyInstaller reads a Python script written by you. It analyzes your code to discover every other module and library your script needs in order to execute. Then it collects copies of all those files – including the active Python interpreter! – and puts them with your script in a single folder, or optionally in a single executable file.

So in a framework with many backends, or anywhere you have indirect imports or other import tricks, PyInstaller can't know that some modules are imported, so they have to be listed manually.
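To illustrate why indirect imports get missed, here is a minimal sketch of a static import scan using the stdlib ast module. This is not PyInstaller's actual analysis, just the general idea, and the foo.backends package and load_backend function are made up for the example:

```python
import ast

# Hypothetical framework code: two ordinary imports plus one backend
# that is chosen by name at run time.
SOURCE = '''
import os
from collections import OrderedDict
import importlib

def load_backend(name):
    # The module name is only known when this runs.
    return importlib.import_module("foo.backends." + name)
'''

def static_imports(source):
    """Return the module names that appear in import statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found

found = static_imports(SOURCE)
print(sorted(found))  # ['collections', 'importlib', 'os']
```

Nothing under foo.backends shows up in the result, because the backend name only exists as a string at run time. That is the gap a hook has to fill.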

Additionally, distributing compiled extension packages or binary dependencies in Python has always been difficult and not very clear, although wheels and easy CI have made things much better. Still, if your framework depends on binary dependencies, you currently need to tell PyInstaller to include all their files. E.g. if framework foo installs binary blobs under python/share/foo (e.g. on Windows), you need to tell PyInstaller to include them. The same goes for any assets your package ships.
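Roughly, a hook is a small Python file named hook-<package>.py that sets a few module-level lists PyInstaller reads. A sketch for the hypothetical framework foo follows. The module names and paths are invented for illustration, and real hooks usually compute these lists with the helpers in PyInstaller.utils.hooks rather than hard-coding them:

```python
# hook-foo.py: hypothetical hook for a hypothetical package "foo"
import os
import sys

# Assumed install location of foo's non-Python files.
share_dir = os.path.join(sys.prefix, "share", "foo")

# Modules that are imported indirectly, so the import analysis misses them.
hiddenimports = [
    "foo.backends.sdl2",
    "foo.backends.gstreamer",
]

# Each entry is (source path, destination folder inside the bundle).
datas = [
    (os.path.join(share_dir, "assets", "logo.png"), "share/foo/assets"),
]

# Same tuple format, but for shared libraries and other binaries.
binaries = [
    (os.path.join(share_dir, "bin", "foo_native.dll"), "share/foo/bin"),
]
```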

You may wonder why PyInstaller doesn't simply include every file that pip installed for the package. I don't know the history, but that would definitely result in a compiled exe that includes too much stuff and is too big. As I understand it, PyInstaller aims for the minimal package.

Finally, there are runtime hooks that frameworks can provide, because sometimes your package needs to be set up in a certain way before it can run (e.g. a gstreamer environment variable needs to be set).
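A runtime hook is just a script that runs inside the packaged app before your own code. A minimal sketch, assuming the gstreamer plugins were bundled into a folder called gst-plugins (my invention) and that GST_PLUGIN_PATH is the variable that needs setting:

```python
# rthook_foo.py: hypothetical runtime hook, executed before the app's code
import os
import sys

# In a PyInstaller-frozen app, sys._MEIPASS points at the folder holding the
# bundled files. Fall back to the current directory when run unfrozen.
bundle_dir = getattr(sys, "_MEIPASS", os.path.abspath("."))

# Assumption: the plugins live in "gst-plugins" inside the bundle.
os.environ["GST_PLUGIN_PATH"] = os.path.join(bundle_dir, "gst-plugins")
```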

As for shipping hooks for so many packages: previously, PyInstaller's approach was to accept hooks for any popular library that submitted a PR, which means a lot of maintenance work for them whenever those hooks need to change. Recently, however, they added an entry point mechanism [1] so that packages can register their hooks with PyInstaller automatically and keep them in the package itself, and they seem to be moving away from maintaining hooks for other libraries.

[1] https://pyinstaller.readthedocs.io/en/stable/hooks.html#providing-pyinstaller-hooks-with-your-package
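As I read [1], the package side of that mechanism is a small function that tells PyInstaller where the hook files live, registered under the pyinstaller40 entry point group. A sketch for the hypothetical package foo, where the foo/__pyinstaller location is only a convention and an assumption here:

```python
# foo/__pyinstaller/__init__.py: hypothetical location inside package "foo"
#
# Registered in the package metadata, e.g. in setup.cfg:
#
#   [options.entry_points]
#   pyinstaller40 =
#       hook-dirs = foo.__pyinstaller:get_hook_dirs
import os


def get_hook_dirs():
    """Return the folders in which PyInstaller should look for hook-*.py files."""
    # The hook files are assumed to sit next to this module.
    return [os.path.dirname(os.path.abspath(__file__))]
```

With that in place the hooks ship and get updated together with foo, instead of living in the PyInstaller repository.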

Matt