[Import-SIG] Round 2 for "A ModuleSpec Type for the Import System"

Eric Snow ericsnowcurrently at gmail.com
Tue Aug 13 06:17:58 CEST 2013


On Sun, Aug 11, 2013 at 2:08 PM, Brett Cannon <brett at python.org> wrote:

> On Fri, Aug 9, 2013 at 6:58 PM, Eric Snow <ericsnowcurrently at gmail.com>wrote:
>
>>
>> ``is_package``
>>
>> Whether or not the module is a package.  This dynamic attribute is True
>> if ``path`` is set (even if empty), else it is false.
>>
>
> "is True if ``path`` is not None (e.g. the empty list is a "true" value),
> else it is False".
>

Thanks.  That is clearer.


>
>
>>
>> ``origin``
>>
>> A string for the location from which the module originates.  If
>> ``filename`` is set, ``origin`` should be set to the same value unless
>> some other value is more appropriate.  ``origin`` is used in
>> ``module_repr()`` if it does not match the value of ``filename``.
>>
>> Using ``filename`` for this meaning would be inaccurate, since not all
>> modules have path-based locations.  For instance, built-in modules do
>> not have ``__file__`` set.  Yet it is useful to have a descriptive
>> string indicating that it originated from the interpreter as a built-in
>> module.  So built-in modules will have ``origin`` set to ``"built-in"``.
>>
>
> I still don't know what you would put there for a zipfile-based loader.
> Would you still put __file__ or would you put the zipfile? I ask because I
> would want a way to pass along in a zipfile finder to the loader where the
> zipfile is located and then the internal location of the file. Otherwise
> you need to pass in the zip path separately from the internal path to the
> loader constructor instead of simply passing in a ModuleSpec (e.g. see
> _split_path in http://bugs.python.org/file30660/zip_importlib.diff).
>

For me origin makes the most sense as the "string for the location from
which the module originates".  I'd think it would be the same as gets put
into __file__ right now.  However, you're right that there's more useful
info that could be stored on the spec.  In this case I'd expect it to be
added as an extra attribute on the spec rather than as part of the normal
ModuleSpec attributes.  However, as Nick pointed out, custom attributes
currently don't have a good strategy for avoiding collisions with future
normal ModuleSpec attributes.


> ``path``
>>
>> The list of path entries in which to search for submodules if this
>> module is a package.  Otherwise it is ``None``.
>>
>> .. XXX add a path-based subclass?
>>
>
> You mean like namespace package's __path__ object? Or are you saying you
> want ModuleSpec vs. PackageSpec?
>

More like ModuleSpec and PathModuleSpec.  PathModuleSpec would have
filename, cached, and path (an associated handling), while ModuleSpec would
not.  At the same time I like having a one-size-fits-all ModuleSpec if
possible, since it should probably pretty closely follow the
one-size-fits-all module type.


>
>
>>
>> ModuleSpec Methods
>> ------------------
>>
>> ``from_loader(name, loader, *, is_package=None, origin=None,
>> filename=None, cached=None, path=None)``
>>
>> .. XXX use a different name?
>>
>> A factory classmethod that returns a new ``ModuleSpec`` derived from the
>> arguments.  ``is_package`` is used inside the method to indicate that
>> the module is a package.
>>
>
> Why is this parameter instead of the other than inferring from 'path' or
> loader.is_package() as you fall back on? What's the motivation?
>

In part it's intended to lower the barrier to entry for people learning
about the import system and getting their hands dirty.  It's just more
obvious as an explicit parameter.  Of course, it means there are two
parameters that basically accomplish the same thing, so perhaps it's not
worth it.  Furthermore, `from_loader()` may go the way of the dodo since
the motivation for it has mostly gone away with other API changes.


> Just to be clear,
>> here is a more detailed description of those calculations::
>>
>>    If not passed in, ``filename`` is to the result of calling the
>>    loader's ``get_filename()``, if available.  Otherwise it stays
>>    unset (``None``).
>>
>>    If not passed in, ``path`` is set to an empty list if
>>    ``is_package`` is true.  Then the directory from ``filename`` is
>>    appended to it, if possible.  If ``is_package`` is false, ``path``
>>    stays unset.
>>
>>    If ``cached`` is not passed in and ``filename`` is passed in,
>>    ``cached`` is derived from it.  For filenames with a source suffix,
>>    it set to the result of calling
>>    ``importlib.util.cache_from_source()``.  For bytecode suffixes (e.g.
>>    ``.pyc``), ``cached`` is set to the value of ``filename``.  If
>>    ``filename`` is not passed in or ``cache_from_source()`` raises
>>    ``NotImplementedError``, ``cached`` stays unset.
>>
>>    If not passed in, ``origin`` is set to ``filename``.  Thus if
>>    ``filename`` is unset, ``origin`` stays unset.
>>
>
> Why is this a static constructor instead of a method like infer_values()
> or an infer_values keyword-only argument to the constructor to do this if
> requested?
>

Good point.  I was already planning on yanking `from_loader()`.  That
kw-only argument would probably be a good fit.  I'll try it out.


>
>
>>
>> ``module_repr()``
>>
>> Returns a repr string for the module if ``origin`` is set and
>> ``filename`` is not set.  The string refers to the value of ``origin``.
>> Otherwise ``module_repr()`` returns None.  This indicates to the module
>> type's ``__repr__()`` that it should fall back to the default repr.
>>
>
> This makes me think that origin is an odd name if all it affects is
> module_repr().
>

It's also informational, of course.


>
>
>>
>> We could also have ``module_repr()`` produce the repr for the case where
>> ``filename`` is set or where ``origin`` is not set, mirroring the repr
>> that the module type produces directly.  However, the repr string is
>> derived from the import-related module attributes, which might be out of
>> sync with the spec.
>>
>
>
>  [SNIP]
>
>
>> .. XXX add reload(module=None) and drop load()'s parameters entirely?
>>
>
> If you are going to make these semantics of making the module argument
> only good for reloading then I say yes, make it a separate method.
>

Yeah, I think it's settled.  I like Nick's suggestion of calling it
`exec()`.


>
>> .. XXX add more of importlib.reload()'s boilerplate to load()/reload()?
>>
>> Backward Compatibility
>> ----------------------
>>
>> Since ``Finder.find_module()`` methods would now return a module spec
>> instead of loader, specs must act like the loader that would have been
>> returned instead.  This is relatively simple to solve since the loader
>> is available as an attribute of the spec.  We will use ``__getattr__()``
>> to do it.
>>
>> However, ``ModuleSpec.is_package`` (an attribute) conflicts with
>> ``InspectLoader.is_package()`` (a method).  Working around this requires
>> a more complicated solution but is not a large obstacle.  Simply making
>> ``ModuleSpec.is_package`` a method does not reflect that is a relatively
>> static piece of data.
>>
>
> Maybe, but depending on what your "more complicated solution" it it might
> be best to just give up the purity and go with the practicality.
>

It's not that complicated, but not exactly pretty:

class _TruthyFunction:
    def __init__(self, func, is_true):
        self.func = func
        self._is_true = bool(is_true)
    def __repr__(self):
        return repr(self._is_true)
    def __bool__(self):
        return self._is_true
    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

class ModuleSpec:
    ...
    @property
    def is_package(self):
        loader = self.loader
        is_package = False
        if self.path is not None:
            is_package = True
        elif hasattr(self.loader, 'is_package'):
            try:
                is_package = loader.is_package(self.name)
            except ImportError:
                pass
        # Since InspectLoader also has is_package(), we have to
        # accommodate the use of the return value as a function.
        def func(*args, **kwargs):
            # XXX Throw a DeprecationWarning here?
            return self.loader.is_package(*args, **kwargs)
        return _TruthyFunction(func, is_package)


>
>>  ``module_repr()`` also conflicts with the same
>> method on loaders, but that workaround is not complicated since both are
>> methods.
>>
>> Unfortunately, the ability to proxy does not extend to ``id()``
>> comparisons and ``isinstance()`` tests.  In the case of the return value
>> of ``find_module()``, we accept that break in backward compatibility.
>> However, we will mitigate the problem with ``isinstance()`` somewhat by
>> registering ``ModuleSpec`` on the loaders in ``importlib.abc``.
>>
>
> Actually, ModuleSpec doesn't even need to register; __instancecheck__ and
> __subclasscheck__ can just be defined and delegate by calling
> issubclass/isinstance on the loader as appropriate.
>

Do you mean add custom versions of those methods to importlib.abc.Loader?
 That should work as well as the register approach.  It won't work for all
loaders but should be good enough.  I was just planning on registering
ModuleSpec on the loader in the setter for a `loader` property on
ModuleSpec.

-eric
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/import-sig/attachments/20130812/5c5df9ef/attachment-0001.html>


More information about the Import-SIG mailing list