[Python-ideas] PEP: Distributing a Subset of the Standard Library

Nick Coghlan ncoghlan at gmail.com
Tue Nov 29 21:56:19 EST 2016


On 30 November 2016 at 04:33, Brett Cannon <brett at python.org> wrote:
> On Tue, 29 Nov 2016 at 06:49 Nick Coghlan <ncoghlan at gmail.com> wrote:
>>
>> On 29 November 2016 at 20:54, Tomas Orsava <torsava at redhat.com> wrote:
>> > With a metapath hook, .missing.py files are probably overkill, and the
>> > hook can just look at one file (or a static compiled-in list) of
>> > ModuleNotFound/ImportError messages for all missing modules, as M.-A.
>> > Lemburg and others are suggesting. We'll just need to think about
>> > coordinating how the list is generated/updated: the current PEP
>> > implicitly allows other parties, besides Python and the distributors,
>> > to step in cleanly if they need to; needing to update a single list
>> > could lead to messy hacks.
>>
>> What if, rather than using an explicitly file-based solution, this was
>> instead defined as a new protocol module, where the new metapath hook
>> imported a "__missing__" module and called a particular function in it
>> (e.g. "__missing__.module_not_found(modname)")?
>
>
> You can answer this question the best, Nick, but would it be worth defining
> a _stdlib.py that acts as both a marker for where the stdlib is installed --
> instead of os.py which is the current marker -- and which also stores
> metadata like an attribute called `missing` which is a dict that maps
> modules to ModuleNotFoundError messages? Although maybe this is too specific
> of a solution (or still too general and we use an e.g. missing.json off of
> sys.path which contains the same mapping).

Really, I think the ideal solution from a distro perspective would be
to enable something closer to what bash and other shells support for
failed CLI calls:

    $ blender
    bash: blender: command not found...
    Install package 'blender' to provide command 'blender'? [N/y] n

This would allow redistributors to point folks towards platform
packages (via apt/yum/dnf/PyPM/conda/Canopy/etc) for the components
they provide, and towards pip/PyPI for everything else (and while we
don't have a dist-lookup-by-module-name service for PyPI *today*, it's
something I hope we'll find a way to provide sometime in the next few
years).
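
As a rough sketch of the kind of lookup I have in mind (the mapping
contents, the `suggest_install` helper and its `distro_tool` parameter
are all made up for illustration; a real redistributor would generate
the mapping from its own package metadata):

```python
# Hypothetical mapping from top-level module name to the platform package
# that provides it. The entries are illustrative only.
DISTRO_PACKAGES = {
    "tkinter": "python3-tkinter",
    "test": "python3-test",
}


def suggest_install(modname, distro_tool="dnf"):
    """Return a hint for obtaining the top-level module behind `modname`."""
    top_level = modname.partition(".")[0]
    package = DISTRO_PACKAGES.get(top_level)
    if package is not None:
        return f"Install package '{package}' via '{distro_tool} install {package}'"
    # There is no dist-lookup-by-module-name service for PyPI, so this
    # fallback only guesses that the distribution shares the module's name.
    return f"Try 'pip install {top_level}' (assuming the PyPI name matches)"
```

The platform mapping is consulted first, so pip/PyPI is only suggested
for modules the redistributor doesn't claim to provide.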

I didn't suggest that during the Fedora-level discussions of this PEP
because it didn't occur to me - the elegant simplicity of the new
import suffix as a tactical solution to the immediate "splitting the
standard library" problem [1] meant I missed that it was really a
special case of the general "provide guidance on obtaining missing
modules from the system package manager" concept.

The problem with that idea, however, is that while it provides the best
possible interactive user experience, it's potentially really slow,
and hence too expensive to do for every import error. We would
instead need to find a way to run with Wolfgang Maier's suggestion of
only doing this for *unhandled* import errors.

Fortunately, we do have the appropriate mechanisms in place to support
that approach:

1. For interactive use, we have sys.excepthook
2. For non-interactive use, we have the atexit module

As a simple example of the former:

    >>> import sys
    >>> def module_missing(modname):
    ...     return f"Module not found: {modname}"
    ...
    >>> def my_except_hook(exc_type, exc_value, exc_tb):
    ...     if isinstance(exc_value, ModuleNotFoundError):
    ...         print(module_missing(exc_value.name))
    ...     else:
    ...         # Leave every other exception to the default hook
    ...         sys.__excepthook__(exc_type, exc_value, exc_tb)
    ...
    >>> sys.excepthook = my_except_hook
    >>> import foo
    Module not found: foo
    >>> import foo.bar
    Module not found: foo
    >>> import sys.bar
    Module not found: sys.bar

For the atexit handler, that could be installed by the `site` module,
so the existing mechanisms for disabling site module processing would
also disable any default exception reporting hooks. Folks could also
register their own handlers via either `sitecustomize.py` or
`usercustomize.py`.

And at that point the problem starts looking less like "Customise the
handling of missing modules" and more like "Customise the rendering
and reporting of particular types of unhandled exceptions". For
example, a custom handler for subprocess.CalledProcessError could
introspect the original command and use `shutil.which` to see if the
requested command was even visible from the current process (and, in a
redistributor provided Python, indicate which system packages to
install to obtain the requested command).
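
A minimal sketch of that CalledProcessError case, assuming the hint is
built by a helper that a hook would call (`describe_called_process_error`
is a hypothetical name, and the messages are placeholders for whatever
a redistributor would actually print):

```python
import os
import shlex
import shutil
import subprocess


def describe_called_process_error(exc):
    """Return a hint saying whether the failed command is visible on PATH."""
    cmd = exc.cmd
    # `cmd` may be a shell-style string or a sequence of arguments.
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    if not args:
        return "Command failed: (empty command)"
    program = os.fsdecode(args[0])
    if shutil.which(program) is None:
        # A redistributor-provided Python could name the system package
        # to install here instead of just reporting the absence.
        return f"Command {program!r} was not found on PATH"
    return f"Command {program!r} exited with status {exc.returncode}"
```

Note that for shell-style strings this only inspects the first word,
so pipelines and shell builtins would need more careful handling.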

> My personal vote is a callback called at
> https://github.com/python/cpython/blob/master/Lib/importlib/_bootstrap.py#L948
> with a default implementation that raises ModuleNotFoundError just like the
> current line does.

Ethan's observation about try/except import chains has got me thinking
that limiting this to handling errors within the context of a single
import statement will be problematic, especially given that folks can
already write their own metapath hook for that case if they really
want to.
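
For reference, such a hook can be quite small. This is only a sketch:
the `MissingModuleFinder` class, the module name and the message are
all invented for illustration:

```python
import importlib.abc
import sys


class MissingModuleFinder(importlib.abc.MetaPathFinder):
    """Last-resort finder raising a tailored error for known-missing modules."""

    def __init__(self, messages):
        self._messages = dict(messages)

    def find_spec(self, fullname, path=None, target=None):
        message = self._messages.get(fullname)
        if message is None:
            return None  # not ours: let the import fail in the usual way
        raise ModuleNotFoundError(message, name=fullname)


# Appended rather than prepended, so it only runs after every real
# finder has failed to locate the module.
finder = MissingModuleFinder({
    "hypothetical_split_module":
        "hypothetical_split_module is provided by a separate system package",
})
sys.meta_path.append(finder)
```

The catch is that this fires for every failed import of a listed name,
including ones inside a try/except chain that would have handled the
error anyway, which is exactly the limitation described above.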

Cheers,
Nick.

[1] For folks wondering "This problem has existed for years, why
suddenly worry about it now?", Fedora's in the process of splitting
out an even more restricted subset of the standard library for system
tools to use: https://fedoraproject.org/wiki/Changes/System_Python

That means "You're relying on a missing stdlib module" is going to
come up more often for system tools developers trying to stick within
that restricted subset.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

