Draft PEP: "Simplified Package Layout and Partitioning"
So, over on the Import-SIG, we were talking about the implementation
and terminology for PEP 382, and it became increasingly obvious that
things were, well, not entirely okay in the "implementation is easy to
explain" department.

Anyway, to make a long story short, we came up with an alternative
implementation plan that actually solves some other problems besides
the one that PEP 382 sets out to solve, and whose implementation is a
bit easier to explain. (In fact, for users coming from various other
languages, it hardly needs any explanation at all.)

However, for long-time users of Python, the approach may require a bit
more justification, which is why roughly 2/3rds of the PEP consists of
a detailed rationale, specification overview, rejected alternatives,
and backwards-compatibility discussion... which is still a lot less
verbiage than reading through the lengthy Import-SIG threads that led
up to the proposal. ;-)

(The remaining 1/3rd of the PEP is the short, sweet, and
easy-to-explain implementation detail.)

Anyway, the PEP has already been discussed on the Import-SIG, and is
proposed as an alternative to PEP 382 ("Namespace packages"). We
expect, however, that many people will be interested in it for reasons
having little to do with the namespace packaging use case.

So, we would like to submit this for discussion, hole-finding, and
eventual Pronouncement. As Barry put it, "I think it's certainly
worthy of posting to python-dev to see if anybody else can shoot holes
in it, or come up with useful solutions to open questions. I'll be
very interested to see Guido's reaction to it. :)"

So, without further ado, here it is:

PEP: XXX
Title: Simplified Package Layout and Partitioning
Version: $Revision$
Last-Modified: $Date$
Author: P.J. Eby
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Jul-2011
Python-Version: 3.3
Post-History:
Replaces: 382

Abstract
========

This PEP proposes an enhancement to Python's package importing
to:

* Surprise users of other languages less,
* Make it easier to convert a module into a package, and
* Support dividing packages into separately installed components
  (a la "namespace packages", as described in PEP 382)

The proposed enhancements do not change the semantics of any
currently-importable directory layouts, but make it possible for
packages to use a simplified directory layout (that is not importable
currently).

However, the proposed changes do NOT add any performance overhead to
the importing of existing modules or packages, and performance for the
new directory layout should be about the same as that of previous
"namespace package" solutions (such as ``pkgutil.extend_path()``).


The Problem
===========

.. epigraph::

    "Most packages are like modules. Their contents are highly
    interdependent and can't be pulled apart. [However,] some
    packages exist to provide a separate namespace. ... It should
    be possible to distribute sub-packages or submodules of these
    [namespace packages] independently."

    -- Jim Fulton, shortly before the release of Python 2.3 [1]_

When new users come to Python from other languages, they are often
confused by Python's packaging semantics. At Google, for example,
Guido received complaints from "a large crowd with pitchforks" [2]_
that the requirement for packages to contain an ``__init__`` module
was a "misfeature", and should be dropped.

In addition, users coming from languages like Java or Perl are
sometimes confused by a difference in Python's import path searching.

In most other languages that have a similar path mechanism to Python's
``sys.path``, a package is merely a namespace that contains modules
or classes, and can thus be spread across multiple directories in
the language's path.
In Perl, for instance, a ``Foo::Bar`` module will be searched for in
``Foo/`` subdirectories all along the module include path, not just in
the first such subdirectory found.

Worse, this is not just a problem for new users: it prevents *anyone*
from easily splitting a package into separately-installable
components. In Perl terms, it would be as if every possible ``Net::``
module on CPAN had to be bundled up and shipped in a single tarball!

For that reason, various workarounds for this latter limitation exist,
circulated under the term "namespace packages". The Python standard
library has provided one such workaround since Python 2.3 (via the
``pkgutil.extend_path()`` function), and the "setuptools" package
provides another (via ``pkg_resources.declare_namespace()``).

The workarounds themselves, however, fall prey to a *third* issue with
Python's way of laying out packages in the filesystem.

Because a package *must* contain an ``__init__`` module, any attempt
to distribute modules for that package must necessarily include that
``__init__`` module, if those modules are to be importable.

However, the very fact that each distribution of modules for a package
must contain this (duplicated) ``__init__`` module, means that OS
vendors who package up these module distributions must somehow handle
the conflict caused by several module distributions installing that
``__init__`` module to the same location in the filesystem.

This led to the proposing of PEP 382 ("Namespace Packages") - a way
to signal to Python's import machinery that a directory was
importable, using unique filenames per module distribution.

However, there was more than one downside to this approach.
Performance for all import operations would be affected, and the
process of designating a package became even more complex. New
terminology had to be invented to explain the solution, and so on.
As terminology discussions continued on the Import-SIG, it soon became
apparent that the main reason it was so difficult to explain the
concepts related to "namespace packages" was that Python's
current way of handling packages is somewhat underpowered, when
compared to other languages.

That is, in other popular languages with package systems, no special
term is needed to describe "namespace packages", because *all*
packages generally behave in the desired fashion.

Rather than being an isolated single directory with a special marker
module (as in Python), packages in other languages are typically just
the *union* of appropriately-named directories across the *entire*
import or inclusion path.

In Perl, for example, the module ``Foo`` is always found in a
``Foo.pm`` file, and a module ``Foo::Bar`` is always found in a
``Foo/Bar.pm`` file. (In other words, there is One Obvious Way to
find the location of a particular module.)

This is because Perl considers a module to be *different* from a
package: the package is purely a *namespace* in which other modules
may reside, and is only *coincidentally* the name of a module as well.

In current versions of Python, however, the module and the package are
more tightly bound together. ``Foo`` is always a module -- whether it
is found in ``Foo.py`` or ``Foo/__init__.py`` -- and it is tightly
linked to its submodules (if any), which *must* reside in the exact
same directory where the ``__init__.py`` was found.

On the positive side, this design choice means that a package is quite
self-contained, and can be installed, copied, etc. as a unit just by
performing an operation on the package's root directory.

On the negative side, however, it is non-intuitive for beginners, and
requires a more complex step to turn a module into a package. If
``Foo`` begins its life as ``Foo.py``, then it must be moved and
renamed to ``Foo/__init__.py``.

Conversely, if you intend to create a ``Foo.Bar`` module from the
start, but have no particular module contents to put in ``Foo``
itself, then you have to create an empty and seemingly-irrelevant
``Foo/__init__.py`` file, just so that ``Foo.Bar`` can be imported.

(And these issues don't just confuse newcomers to the language,
either: they annoy many experienced developers as well.)

So, after some discussion on the Import-SIG, this PEP was created
as an alternative to PEP 382, in an attempt to solve *all* of the
above problems, not just the "namespace package" use cases.

And, as a delightful side effect, the solution proposed in this PEP
does not affect the import performance of ordinary modules or
self-contained (i.e. ``__init__``-based) packages.


The Solution
============

In the past, various proposals have been made to allow more intuitive
approaches to package directory layout. However, most of them failed
because of an apparent backward-compatibility problem.

That is, if the requirement for an ``__init__`` module were simply
dropped, it would open up the possibility for a directory named, say,
``string`` on ``sys.path``, to block importing of the standard library
``string`` module.

Paradoxically, however, the failure of this approach does *not* arise
from the elimination of the ``__init__`` requirement!

Rather, the failure arises because the underlying approach takes for
granted that a package is just ONE thing, instead of two.

In truth, a package comprises two separate, but related entities: a
module (with its own, optional contents), and a *namespace* where
*other* modules or packages can be found.

In current versions of Python, however, the module part (found in
``__init__``) and the namespace for submodule imports (represented
by the ``__path__`` attribute) are both initialized at the same time,
when the package is first imported.
And, if you assume this is the *only* way to initialize these two
things, then there is no way to drop the need for an ``__init__``
module, while still being backwards-compatible with existing directory
layouts.

After all, as soon as you encounter a directory on ``sys.path``
matching the desired name, that means you've "found" the package, and
must stop searching, right?

Well, not quite.


A Thought Experiment
--------------------

Let's hop into the time machine for a moment, and pretend we're back
in the early 1990s, shortly before Python packages and ``__init__.py``
have been invented. But, imagine that we *are* familiar with
Perl-like package imports, and we want to implement a similar system
in Python.

We'd still have Python's *module* imports to build on, so we could
certainly conceive of having ``Foo.py`` as a parent ``Foo`` module
for a ``Foo`` package. But how would we implement submodule and
subpackage imports?

Well, if we didn't have the idea of ``__path__`` attributes yet,
we'd probably just search ``sys.path`` looking for ``Foo/Bar.py``.

But we'd *only* do it when someone actually tried to *import*
``Foo.Bar``.

NOT when they imported ``Foo``.

And *that* lets us get rid of the backwards-compatibility problem
of dropping the ``__init__`` requirement, back here in 2011.

How?

Well, when we ``import Foo``, we're not even *looking* for ``Foo/``
directories on ``sys.path``, because we don't *care* yet. The only
point at which we care, is the point when somebody tries to actually
import a submodule or subpackage of ``Foo``.

That means that if ``Foo`` is a standard library module (for example),
and I happen to have a ``Foo`` directory on ``sys.path`` (without
an ``__init__.py``, of course), then *nothing breaks*. The ``Foo``
module is still just a module, and it's still imported normally.


Self-Contained vs. "Virtual" Packages
-------------------------------------

Of course, in today's Python, trying to ``import Foo.Bar`` will
fail if ``Foo`` is just a ``Foo.py`` module (and thus lacks a
``__path__`` attribute).

So, this PEP proposes to *dynamically* create a ``__path__``, in the
case where one is missing.

That is, if I try to ``import Foo.Bar`` the proposed change to the
import machinery will notice that the ``Foo`` module lacks a
``__path__``, and will therefore try to *build* one before proceeding.

And it will do this by making a list of all the existing ``Foo/``
subdirectories of the directories listed in ``sys.path``.

If the list is empty, the import will fail with ``ImportError``, just
like today. But if the list is *not* empty, then it is saved in
a new ``Foo.__path__`` attribute, making the module a "virtual
package".

That is, because it now has a valid ``__path__``, we can proceed
to import submodules or subpackages in the normal way.

Now, notice that this change does not affect "classic", self-contained
packages that have an ``__init__`` module in them. Such packages
already *have* a ``__path__`` attribute (initialized at import time)
so the import machinery won't try to create another one later.

This means that (for example) the standard library ``email`` package
will not be affected in any way by you having a bunch of unrelated
directories named ``email`` on ``sys.path``. (Even if they contain
``*.py`` files.)

But it *does* mean that if you want to turn your ``Foo`` module into a
``Foo`` package, all you have to do is add a ``Foo/`` directory
somewhere on ``sys.path``, and start adding modules to it.

But what if you only want a "namespace package"? That is, a package
that is *only* a namespace for various separately-distributed
submodules and subpackages?

For example, if you're Zope Corporation, distributing dozens of
separate tools like ``zc.buildout``, each in packages under the ``zc``
namespace, you don't want to have to make and include an empty
``zc.py`` in every tool you ship. (And, if you're a Linux or other
OS vendor, you don't want to deal with the package installation
conflicts created by trying to install ten copies of ``zc.py`` to the
same location!)

No problem. All we have to do is make one more minor tweak to the
import process: if the "classic" import process fails to find a
self-contained module or package (e.g., if ``import zc`` fails to find
a ``zc.py`` or ``zc/__init__.py``), then we once more try to build a
``__path__`` by searching for all the ``zc/`` directories on
``sys.path``, and putting them in a list.

If this list is empty, we raise ``ImportError``. But if it's
non-empty, we create an empty ``zc`` module, and put the list in
``zc.__path__``. Congratulations: ``zc`` is now a namespace-only,
"pure virtual" package! It has no module contents, but you can still
import submodules and subpackages from it, regardless of where they're
located on ``sys.path``.

(By the way, both of these additions to the import protocol (i.e. the
dynamically-added ``__path__``, and dynamically-created modules)
apply recursively to child packages, using the parent package's
``__path__`` in place of ``sys.path`` as a basis for generating a
child ``__path__``. This means that self-contained and virtual
packages can contain each other without limitation, with the caveat
that if you put a virtual package inside a self-contained one, it's
gonna have a really short ``__path__``!)


Backwards Compatibility and Performance
---------------------------------------

Notice that these two changes *only* affect import operations that
today would result in ``ImportError``. As a result, the performance
of imports that do not involve virtual packages is unaffected, and
potential backward compatibility issues are very restricted.
Today, if you try to import submodules or subpackages from a module
with no ``__path__``, it's an immediate error. And of course, if you
don't have a ``zc.py`` or ``zc/__init__.py`` somewhere on ``sys.path``
today, ``import zc`` would likewise fail.

Thus, the only potential backwards-compatibility issues are:

1. Tools that expect package directories to have an ``__init__``
   module, that expect directories without an ``__init__`` module
   to be unimportable, or that expect ``__path__`` attributes to be
   static, will not recognize virtual packages as packages.

   (In practice, this just means that tools will need updating to
   support virtual packages, e.g. by using ``pkgutil.walk_packages()``
   instead of using hardcoded filesystem searches.)

2. Code that *expects* certain imports to fail may now do something
   unexpected. This should be fairly rare in practice, as most sane,
   non-test code does not import things that are expected not to
   exist!

The biggest likely exception to the above would be when a piece of
code tries to check whether some package is installed by importing
it. If this is done *only* by importing a top-level module (i.e., not
checking for a ``__version__`` or some other attribute), *and* there
is a directory of the same name as the sought-for package on
``sys.path`` somewhere, *and* the package is not actually installed,
then such code could *perhaps* be fooled into thinking a package is
installed that really isn't.

However, even in the rare case where all these conditions line up to
happen at once, the failure is more likely to be annoying than
damaging. In most cases, after all, the code will simply fail a
little later on, when it actually tries to DO something with the
imported (but empty) module. (And code that checks ``__version__``
attributes or for the presence of some desired function, class, or
module in the package will not see a false positive result in the
first place.)
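To make the "false positive" scenario concrete, here is a minimal
sketch of a presence check that looks for an expected attribute
instead of trusting a bare import. The helper name
``is_really_installed`` and its ``required_attr`` default are
assumptions made for this illustration; they are not part of the PEP
or of any library.

```python
import importlib


def is_really_installed(name, required_attr="__version__"):
    """Return True only if `name` imports AND exposes `required_attr`.

    Hypothetical helper: a stray, empty directory that happens to be
    importable would pass a bare-import check, but it would not carry
    the attribute that the real package is expected to define.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False
    return hasattr(module, required_attr)
```

A caller would pass whatever attribute the real package is known to
provide, e.g. ``is_really_installed("json", required_attr="dumps")``.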
Meanwhile, tools that expect to locate packages and modules by
walking a directory tree can be updated to use the existing
``pkgutil.walk_packages()`` API, and tools that need to inspect
packages in memory should use the other APIs described in the
`Standard Library Changes/Additions`_ section below.


Specification
=============

Two changes are made to the existing import process.

First, the built-in ``__import__`` function must not raise an
``ImportError`` when importing a submodule of a module with no
``__path__``. Instead, it must attempt to *create* a ``__path__``
attribute for the parent module first, as described in `__path__
creation`_, below.

Second, if searching ``sys.meta_path`` and ``sys.path`` (or a parent
package ``__path__``) fails to find a module being imported, the
import process must attempt to create a ``__path__`` attribute for the
missing module. If the attempt succeeds, an empty module is created
and its ``__path__`` is set. Otherwise, importing fails.

In both of the above cases, if a non-empty ``__path__`` is created,
the name of the module whose ``__path__`` was created is added to
``sys.virtual_packages`` -- an initially-empty ``set()`` of package
names.

(This way, code that extends ``sys.path`` at runtime can find out
what virtual packages are currently imported, and thereby add any
new subdirectories to those packages' ``__path__`` attributes. See
`Standard Library Changes/Additions`_ below for more details.)

Conversely, if an empty ``__path__`` results, an ``ImportError``
is immediately raised, and the module is not created or changed, nor
is its name added to ``sys.virtual_packages``.


``__path__`` Creation
---------------------

A virtual ``__path__`` is created by obtaining a PEP 302 "importer"
object for each of the path entries found in ``sys.path`` (for a
top-level module) or the parent ``__path__`` (for a submodule).

(Note: because ``sys.meta_path`` importers are not associated with
``sys.path`` or ``__path__`` entry strings, such importers do *not*
participate in this process.)

Each importer is checked for a ``get_subpath()`` method, and if
present, the method is called with the full name of the module/package
the ``__path__`` is being constructed for. The return value is either
a string representing a subdirectory for the requested package, or
``None`` if no such subdirectory exists.

The strings returned by the importers are added to the ``__path__``
being built, in the same order as they are found. (``None`` values
and missing ``get_subpath()`` methods are simply skipped.)

In Python code, the algorithm would look something like this::

    import pkgutil
    import sys

    def get_virtual_path(modulename, parent_path=None):

        if parent_path is None:
            parent_path = sys.path

        path = []

        for entry in parent_path:
            # Obtain a PEP 302 importer object - see pkgutil module
            importer = pkgutil.get_importer(entry)

            if hasattr(importer, 'get_subpath'):
                subpath = importer.get_subpath(modulename)
                if subpath is not None:
                    path.append(subpath)

        return path

And a function like this one should be exposed in the standard
library as e.g. ``imp.get_virtual_path()``, so that people creating
``__import__`` replacements or ``sys.meta_path`` hooks can reuse it.


Standard Library Changes/Additions
----------------------------------

The ``pkgutil`` module should be updated to handle this
specification appropriately, including any necessary changes to
``extend_path()``, ``iter_modules()``, etc.

Specifically the proposed changes and additions to ``pkgutil`` are:

* A new ``extend_virtual_paths(path_entry)`` function, to extend
  existing, already-imported virtual packages' ``__path__`` attributes
  to include any portions found in a new ``sys.path`` entry. This
  function should be called by applications extending ``sys.path``
  at runtime, e.g. when adding a plugin directory or an egg to the
  path.

  The implementation of this function does a simple top-down traversal
  of ``sys.virtual_packages``, and performs any necessary
  ``get_subpath()`` calls to identify what path entries need to
  be added to each package's ``__path__``, given that `path_entry`
  has been added to ``sys.path``. (Or, in the case of sub-packages,
  adding a derived subpath entry, based on their parent namespace's
  ``__path__``.)

* A new ``iter_virtual_packages(parent='')`` function to allow
  top-down traversal of virtual packages in ``sys.virtual_packages``,
  by yielding the child virtual packages of `parent`. For example,
  calling ``iter_virtual_packages("zope")`` might yield ``zope.app``
  and ``zope.products`` (if they are imported virtual packages listed
  in ``sys.virtual_packages``), but **not** ``zope.foo.bar``.
  (This function is needed to implement ``extend_virtual_paths()``,
  but is also potentially useful for other code that needs to inspect
  imported virtual packages.)

* ``ImpImporter.iter_modules()`` should be changed to also detect and
  yield the names of modules found in virtual packages.

In addition to the above changes, the ``zipimport`` importer should
have its ``iter_modules()`` implementation similarly changed. (Note:
current versions of Python implement this via a shim in ``pkgutil``,
so technically this is also a change to ``pkgutil``.)

Last, but not least, the ``imp`` module (or ``importlib``, if
appropriate) should expose the algorithm described in the `__path__
creation`_ section above, as a
``get_virtual_path(modulename, parent_path=None)`` function, so that
creators of ``__import__`` replacements can use it.


Implementation Notes
--------------------

For users, developers, and distributors of virtual packages:

* While virtual packages are easy to set up and use, there is still
  a time and place for using self-contained packages. While it's not
  strictly necessary, adding an ``__init__`` module to your
  self-contained packages lets users of the package (and Python
  itself) know that *all* of the package's code will be found in
  that single subdirectory. In addition, it lets you define
  ``__all__``, expose a public API, provide a package-level docstring,
  and do other things that make more sense for a self-contained
  project than for a mere "namespace" package.

* ``sys.virtual_packages`` is allowed to contain non-existent or
  not-yet-imported package names; code that uses its contents should
  not assume that every name in this set is also present in
  ``sys.modules`` or that importing the name will necessarily succeed.

* If you are changing a currently self-contained package into a
  virtual one, it's important to note that you can no longer use its
  ``__file__`` attribute to locate data files stored in a package
  directory. Instead, you must search ``__path__`` or use the
  ``__file__`` of a submodule adjacent to the desired files, or of a
  self-contained subpackage that contains the desired files.

  (Note: this caveat is already true for existing users of "namespace
  packages" today. That is, it is an inherent result of being able
  to partition a package, that you must know *which* partition the
  desired data file lives in. We mention it here simply so that
  *new* users converting from self-contained to virtual packages will
  also be aware of it.)

* XXX what is the __file__ of a "pure virtual" package? ``None``?
  Some arbitrary string? The path of the first directory with a
  trailing separator? No matter what we put, *some* code is
  going to break, but the last choice might allow some code to
  accidentally work. Is that good or bad?
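As a hedged illustration of the "search ``__path__``" advice above,
the sketch below looks for a data file in every portion of a package.
The helper name ``find_package_data`` is hypothetical, and the code
assumes each ``__path__`` entry is an ordinary filesystem directory
(it would not find files stored inside a zip archive, for example).

```python
import os


def find_package_data(package, filename):
    """Return the first existing path to `filename` among the
    directories listed in `package.__path__`, or None if absent.

    Hypothetical helper: `package` is any imported module object; a
    module without a __path__ is treated as having no portions.
    """
    for directory in getattr(package, "__path__", []):
        candidate = os.path.join(directory, filename)
        if os.path.exists(candidate):
            return candidate
    return None
```

The portions are tried in ``__path__`` order, so the first portion
that contains the file wins, mirroring how submodule imports are
resolved.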
For those implementing PEP 302 importer objects:

* Importers that support the ``iter_modules()`` method (used by
  ``pkgutil`` to locate importable modules and packages) and want to
  add virtual package support should modify their ``iter_modules()``
  method so that it discovers and lists virtual packages as well as
  standard modules and packages. To do this, the importer should
  simply list all immediate subdirectory names in its jurisdiction
  that are valid Python identifiers.

  XXX This might list a lot of not-really-packages. Should we
  require importable contents to exist? If so, how deep do we
  search, and how do we prevent e.g. link loops, or traversing onto
  different filesystems, etc.? Ick.

* "Meta" importers (i.e., importers placed on ``sys.meta_path``) do
  not need to implement ``get_subpath()``, because the method
  is only called on importers corresponding to ``sys.path`` entries
  and ``__path__`` entries. If a meta importer wishes to support
  virtual packages, it must do so entirely within its own
  ``find_module()`` implementation.

  Unfortunately, it is unlikely that any such implementation will be
  able to merge its package subpaths with those of other meta
  importers or ``sys.path`` importers, so the meaning of "supporting
  virtual packages" for a meta importer is currently undefined!

  (However, since the intended use case for meta importers is to
  replace Python's normal import process entirely for some subset of
  modules, and the number of such importers currently implemented is
  quite small, this seems unlikely to be a big issue in practice.)


References
==========

.. [1] "namespace" vs "module" packages (mailing list thread)
   (http://mail.zope.org/pipermail/zope3-dev/2002-December/004251.html)

.. [2] "Dropping __init__.py requirement for subpackages"
   (http://mail.python.org/pipermail/python-dev/2006-April/064400.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
On Wed, Jul 20, 2011 at 1:58 PM, P.J. Eby <pje@telecommunity.com> wrote:
> So, without further ado, here it is:
I pushed this version up to the PEPs repo, so it now has a number
(402) and can be read in prettier HTML format:
http://www.python.org/dev/peps/pep-0402/

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
+1 (and yay!) -- Piotr Ożarowski Debian GNU/Linux Developer www.ozarowski.pl www.griffith.cc www.debian.org GPG Fingerprint: 1D2F A898 58DA AF62 1786 2DF7 AEF6 F1A2 A745 7645
On 7/19/2011 8:58 PM, P.J. Eby wrote:
> Standard Library Changes/Additions
> ----------------------------------
>
> The ``pkgutil`` module should be updated to handle this specification appropriately, including any necessary changes to ``extend_path()``, ``iter_modules()``, etc.
>
> Specifically the proposed changes and additions to ``pkgutil`` are:
>
> * A new ``extend_virtual_paths(path_entry)`` function, to extend existing, already-imported virtual packages' ``__path__`` attributes to include any portions found in a new ``sys.path`` entry. This function should be called by applications extending ``sys.path`` at runtime, e.g. when adding a plugin directory or an egg to the path.
>
>   The implementation of this function does a simple top-down traversal of ``sys.virtual_packages``, and performs any necessary ``get_subpath()`` calls to identify what path entries need to be added to each package's ``__path__``, given that `path_entry` has been added to ``sys.path``. (Or, in the case of sub-packages, adding a derived subpath entry, based on their parent namespace's ``__path__``.)
When I read about creating __path__ from sys.path, I immediately thought of the issue of programs that extend sys.path, and the above is the "workaround" for such programs. But it requires such programs to do work, and there are a lot of such programs (I, a relative newbie, have had to write some). As it turns out, I can't think of a situation where I have extended sys.path that would result in a problem for fancy namespace packages, because so far I've only written modules, not packages, and only modules are on the paths that I add to sys.path. But that does not make for a general solution.

Is there some way to create a new __path__ that would reflect the fact that it has been dynamically created, rather than set from __init__.py, and then when it is referenced, calculate (and cache?) a new value of __path__ to actually search?
At 02:24 AM 7/20/2011 -0700, Glenn Linderman wrote:
> When I read about creating __path__ from sys.path, I immediately thought of the issue of programs that extend sys.path, and the above is the "workaround" for such programs. But it requires such programs to do work, and there are a lot of such programs (I, a relative newbie, have had to write some). As it turns out, I can't think of a situation where I have extended sys.path that would result in a problem for fancy namespace packages, because so far I've only written modules, not packages, and only modules are on the paths that I add to sys.path. But that does not make for a general solution.
Most programs extend sys.path in order to import things. If those things aren't yet imported, they don't have a __path__ yet, and so don't need to be fixed. Only programs that modify sys.path *after* importing something that has a dynamic __path__ would need to do anything about that.
> Is there some way to create a new __path__ that would reflect the fact that it has been dynamically created, rather than set from __init__.py, and then when it is referenced, calculate (and cache?) a new value of __path__ to actually search?
That's what extend_virtual_paths() is for. It updates the __path__ of all currently-imported virtual packages. Where before you wrote:

    sys.path.append('foo')

You would now write:

    sys.path.append('foo')
    pkgutil.extend_virtual_paths('foo')

...assuming you have virtual packages you've already imported. If you don't, there's no reason to call extend_virtual_paths(). But it doesn't hurt anything if you call it unnecessarily, because it uses sys.virtual_packages to find out what to update, and if you haven't imported any virtual packages, there's nothing to update and the call will be a quick no-op.
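For readers who want to see the shape of that update step, here is a
rough, filesystem-only sketch. Neither ``sys.virtual_packages`` nor
``get_subpath()`` exists in any released Python, so the module-level
``virtual_packages`` set and the ``os.path.isdir()`` check below are
stand-ins assumed for illustration; the ``modules`` parameter is
likewise an addition that makes the sketch easy to try out. It is not
the proposed ``pkgutil`` implementation.

```python
import os
import sys

# Stand-in for the proposed ``sys.virtual_packages`` set (illustrative only).
virtual_packages = set()


def extend_virtual_paths(path_entry, modules=None):
    """Sketch: add portions found under `path_entry` to the __path__ of
    already-imported virtual packages.

    Assumes plain directories; os.path.isdir() plays the role of the
    proposed importer ``get_subpath()`` method.
    """
    if modules is None:
        modules = sys.modules
    # Maps a package name to the directory newly added for it; the empty
    # name stands for the top level, whose new entry is `path_entry`.
    added = {"": path_entry}
    # Fewer dots first, so parents are handled before their children.
    for name in sorted(virtual_packages, key=lambda n: n.count(".")):
        parent, _, child = name.rpartition(".")
        base = added.get(parent)
        module = modules.get(name)
        if base is None or module is None:
            # Parent gained nothing new, or the package is not imported.
            continue
        subdir = os.path.join(base, child)
        if os.path.isdir(subdir):
            if subdir not in module.__path__:
                module.__path__.append(subdir)
            added[name] = subdir
```

With an empty ``virtual_packages`` set the loop body never runs, which
matches the "quick no-op" behaviour described above. New portions are
appended last, so the sketch only mirrors ``sys.path`` order when the
new entry was itself appended to ``sys.path``.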
On 7/20/2011 6:05 AM, P.J. Eby wrote:
> At 02:24 AM 7/20/2011 -0700, Glenn Linderman wrote:
>> When I read about creating __path__ from sys.path, I immediately thought of the issue of programs that extend sys.path, and the above is the "workaround" for such programs. But it requires such programs to do work, and there are a lot of such programs (I, a relative newbie, have had to write some). As it turns out, I can't think of a situation where I have extended sys.path that would result in a problem for fancy namespace packages, because so far I've only written modules, not packages, and only modules are on the paths that I add to sys.path. But that does not make for a general solution.
>
> Most programs extend sys.path in order to import things. If those things aren't yet imported, they don't have a __path__ yet, and so don't need to be fixed. Only programs that modify sys.path *after* importing something that has a dynamic __path__ would need to do anything about that.
Sure. But there are a lot of things already imported by Python itself, and if this mechanism gets used in the stdlib, a program wouldn't know whether it is safe to skip the pkgutil.extend_virtual_paths() call. Plus, that requires importing pkgutil, which isn't necessarily done by every program that extends sys.path ("import sys" is sufficient at present). Plus, if some 3rd party packages are imported before sys.path is extended, knowledge of how they are implemented is required to decide whether it is necessary to import pkgutil and call extend_virtual_paths. So I am still left with my original question:
Is there some way to create a new __path__ that would reflect the fact that it has been dynamically created, rather than set from __init__.py, and then when it is referenced, calculate (and cache?) a new value of __path__ to actually search?
That's what extend_virtual_paths() is for. It updates the __path__ of all currently-imported virtual packages. Where before you wrote:
    sys.path.append('foo')
You would now write:
    sys.path.append('foo')
    pkgutil.extend_virtual_paths('foo')
...assuming you have virtual packages you've already imported. If you don't, there's no reason to call extend_virtual_paths(). But it doesn't hurt anything if you call it unnecessarily, because it uses sys.virtual_packages to find out what to update, and if you haven't imported any virtual packages, there's nothing to update and the call will be a quick no-op.
I think I would have to write
    sys.path.append('foo')
    import pkgutil
    pkgutil.extend_virtual_paths('foo')
or I'd get an error.
And, in the absence of knowing (because I didn't write them) whether any of the packages I imported before extending sys.path are virtual packages or not, I would have to do this every time I extend sys.path. And so it becomes a burden on writing programs.
If the code is so boilerplate as you describe, should sys.path become an object that acts like a list, instead of a list, and have its append method automatically do the pkgutil.extend_virtual_paths for me? Then I wouldn't have to worry about whether any of the packages I imported were virtual packages or not.
At 03:09 PM 7/20/2011 -0700, Glenn Linderman wrote:
Sure. But there are a lot of things already imported by Python itself, and if this mechanism gets used in the stdlib, a program wouldn't know whether or not it is safe to skip the pkgutil.extend_virtual_paths() call.
I'm not sure I see how the mechanism could meaningfully be used in the stdlib, since IIUC we're not going for Perl-style package naming. So, all stdlib packages would be self-contained.
Plus, that requires importing pkgutil, which isn't necessarily done by every program that extends the sys.path ("import sys" is sufficient at present).
Plus, if some 3rd party packages are imported before sys.path is extended, knowledge of how they are implemented is required to decide whether it is necessary to import pkgutil and call extend_virtual_paths.
I'd recommend *always* using it, outside of simple startup code.
So I am still left with my original question:
Is there some way to create a new __path__ that would reflect the fact that it has been dynamically created, rather than set from __init__.py, and then when it is referenced, calculate (and cache?) a new value of __path__ to actually search?
Hm. Yes, there is a way to do something like that, but it would complicate things a bit. We'd need to:
1. Leave __path__ off of the modules, and always pull them from sys.virtual_package_paths, and
2. Before using a value in sys.virtual_package_paths, we'd need to check whether sys.path had changed since we last cached anything, and if so, clear sys.virtual_package_paths first, to force a refresh.
This doesn't sound particularly forbidding, but there are various unpleasant consequences, like being unable to tell whether a module is a package or not, and whether it's a virtual package or not. We'd have to invent new ways to denote these things.
On the bright side, though, it *would* allow transparent live updates to virtual package paths, so it might be worth considering.
By the way, the reason we have to get rid of __path__ is that if we kept it, then code could change it, and then we wouldn't know if it was actually safe to change it automatically... even if no code had actually changed it.
In principle, we could keep __path__ attributes around, and automatically update them in the case where sys.path has changed, so long as user code hasn't directly altered or replaced the __path__. But it seems to me to be a dangerous corner case; I'd rather that code which touches __path__ be taking responsibility for that path's correctness from then on, rather than having it get updated (possibly incorrectly) behind its back. So, I'd say that for this approach, we'd have to actually leave __path__ off of virtual packages' parent modules.
Anyway, it seems worth considering. We just need to sort out what the downsides are for any current tools thinking that such modules aren't packages. (But hey, at least it'll be consistent with what such tools would think of the on-disk representation! That is, a tool that thinks foo.py alongside a foo/ subdirectory is just a module with no package, will also think that 'foo', once imported, is a module with no package.)
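The "check whether sys.path changed, then clear the cache" step can be sketched roughly as below. This is only an illustration under stated assumptions: `_virtual_package_paths` is a hypothetical stand-in for the proposed sys.virtual_package_paths, and comparing a tuple snapshot of sys.path is one possible way to detect changes, not something the proposal specifies.

```python
import os
import sys

# Hypothetical cache standing in for the proposed sys.virtual_package_paths.
_virtual_package_paths = {}
_last_seen_sys_path = None


def get_virtual_path(name, compute):
    """Return the cached path list for name, refreshing if sys.path changed."""
    global _last_seen_sys_path
    snapshot = tuple(sys.path)
    if snapshot != _last_seen_sys_path:
        # sys.path was mutated or rebound since the last lookup: drop the cache.
        _virtual_package_paths.clear()
        _last_seen_sys_path = snapshot
    if name not in _virtual_package_paths:
        _virtual_package_paths[name] = compute(name)
    return _virtual_package_paths[name]


calls = []


def fake_compute(name):
    """Record the call and build candidate directories from sys.path."""
    calls.append(name)
    return [os.path.join(entry, name) for entry in sys.path
            if isinstance(entry, str)]


get_virtual_path("foo", fake_compute)
get_virtual_path("foo", fake_compute)              # served from the cache
sys.path.append("extra_dir")
refreshed = get_virtual_path("foo", fake_compute)  # recomputed after the change
sys.path.remove("extra_dir")
```

Because the snapshot is taken from sys.path on every lookup, this style of check would also notice a rebinding such as 'sys.path = [...]', at the cost of an O(len(sys.path)) comparison per lookup.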
And, in the absence of knowing (because I didn't write them) whether any of the packages I imported before extending sys.path are virtual packages or not, I would have to do this every time I extend sys.path. And so it becomes a burden on writing programs.
If the code is so boilerplate as you describe, should sys.path become an object that acts like a list, instead of a list, and have its append method automatically do the pkgutil.extend_virtual_paths for me? Then I wouldn't have to worry about whether any of the packages I imported were virtual packages or not.
Well, then we'd have to worry about other mutation methods, and things like 'sys.path = [blah, blah]', as well. So if we're going to ditch the need for extend_virtual_paths(), we should probably do it via the absence of __path__ attributes.
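To make the trade-off concrete, here is a small sketch of the list-like object idea. The class name `NotifyingPath` and the `on_change` callback are invented for this illustration; nothing like it is part of the proposal or of Python itself. It hooks only three mutators, which is exactly the limitation raised above.

```python
class NotifyingPath(list):
    """List that calls on_change(entry) after in-place additions (a sketch)."""

    def __init__(self, iterable=(), on_change=None):
        super().__init__(iterable)
        self._on_change = on_change or (lambda entry: None)

    def append(self, entry):
        super().append(entry)
        self._on_change(entry)

    def insert(self, index, entry):
        super().insert(index, entry)
        self._on_change(entry)

    def extend(self, entries):
        entries = list(entries)
        super().extend(entries)
        for entry in entries:
            self._on_change(entry)

    # Slice assignment, +=, and other mutators are NOT hooked here; each
    # would need its own override.


seen = []
path = NotifyingPath(["a"], on_change=seen.append)
path.append("b")
path.extend(["c", "d"])
path.insert(0, "e")
path = ["replaced"]  # rebinding the name bypasses the hook entirely
```

The last line shows the 'sys.path = [blah, blah]' case: once the name is rebound to a plain list, no callback can fire.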
On 7/20/2011 4:03 PM, P.J. Eby wrote:
I'd recommend *always* using it, outside of simple startup code.
So that is a burden on every program. Documentation would help, but it certainly makes updating sys.path much more complex -- 3 lines (counting import of pkgutil) instead of one, and the complexity of understanding why there is a need for it, when in simple cases the single line works fine, but it would be bug prone to have both ways.
So I am still left with my original question:
Is there some way to create a new __path__ that would reflect the fact that it has been dynamically created, rather than set from __init__.py, and then when it is referenced, calculate (and cache?) a new value of __path__ to actually search?
Hm. Yes, there is a way to do something like that, but it would complicate things a bit
From what you said, it would complicate the solution for complex packaging tasks, but would return simple extensions of sys.path to being simple again. Sounds like a good tradeoff, but I'll leave that to you and other more knowledgeable people to figure out the details and implementation... I snipped the explanation, because it is beyond my present knowledge base.
Anyway, it seems worth considering. We just need to sort out what the downsides are for any current tools thinking that such modules aren't packages. (But hey, at least it'll be consistent with what such tools would think of the on-disk representation! That is, a tool that thinks foo.py alongside a foo/ subdirectory is just a module with no package, will also think that 'foo', once imported, is a module with no package.)
Please consider it. I think your initial proposal solves some problems, but a version that doesn't complicate the normal, simple, extension of sys.path would be a much better solution, so I am happy to hear that you have ideas in that regard. Hopefully, they don't complicate things too much more. So far, I haven't gotten my head around packages as they presently exist (this __init__.py stuff seems much more complex than the simplicity of Perl imports that I was used to, although I certainly like many things about Python better than Perl, and have switched whole-heartedly, although I still have a fair bit of Perl code to port in the fullness of time). I think your proposal here, although maintaining some amount of backward-compatibility may require complexity of implementation, can simplify the requirements for creating new packages, to the extent I understand it.
On Thu, Jul 21, 2011 at 9:03 AM, P.J. Eby <pje@telecommunity.com> wrote:
Hm. Yes, there is a way to do something like that, but it would complicate things a bit. We'd need to:
1. Leave __path__ off of the modules, and always pull them from sys.virtual_package_paths, and
Setting __path__ to a sentinel value (imp.VirtualPath?) would break less code, as hasattr(mod, '__path__') checks would still work.
Even better would be for these (and sys.path) to be list subclasses that did the right thing under the hood as Glenn suggested. Code that *replaces* rather than modifies these attributes would still potentially break virtual packages, but code that modifies them in place would do the right thing automatically. (Note that all code that manipulates sys.path and __path__ attributes requires explicit calls to correctly support current namespace package mechanisms, so this would actually be an improvement on the status quo rather than making anything worse).
I'll note that this kind of thing is one of the key reasons the import state should some day move to a real class - state coherency is one of the major use cases for the descriptor protocol, which is unavailable when interdependent state is stored as module attributes. (Don't worry, that day is a very long way away, if it ever happens at all.)
2. Before using a value in sys.virtual_package_paths, we'd need to check whether sys.path had changed since we last cached anything, and if so, clear sys.virtual_package_paths first, to force a refresh.
This doesn't sound particularly forbidding, but there are various unpleasant consequences, like being unable to tell whether a module is a package or not, and whether it's a virtual package or not. We'd have to invent new ways to denote these things.
Trying to change how packages are identified at the Python level makes PEP 382 sound positively appealing. __path__ needs to stay :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Jul 20, 2011 at 7:52 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Even better would be for these (and sys.path) to be list subclasses that did the right thing under the hood as Glenn suggested. Code that *replaces* rather than modifies these attributes would still potentially break virtual packages, but code that modifies them in place would do the right thing automatically. (Note that all code that manipulates sys.path and __path__ attributes requires explicit calls to correctly support current namespace package mechanisms, so this would actually be an improvement on the status quo rather than making anything worse).
+1 as a solution to the problem Glenn brought up. However, I'm still not clear on how much code out there changes sys.path in the offending way, forcing the need to provide a more implicit solution in this PEP than extend_virtual_paths(). And in cases where sys.path *is* changed, and it impacts some virtual package, how many places is that going to happen in one project? My guess is not many (and so not many "boilerplate" calls). Is it worth adding implicit __path__ updates for that use case, rather than just the extend_virtual_paths() function? As an aside, my first reaction to Glenn's suggestion was "that would be cool". Would it be a pursuable option? We can take this over to import-sig if it is. -eric
On Tue, 19 Jul 2011 23:58:55 -0400, "P.J. Eby" <pje@telecommunity.com> wrote:
Worse, this is not just a problem for new users: it prevents *anyone* from easily splitting a package into separately-installable components. In Perl terms, it would be as if every possible ``Net::`` module on CPAN had to be bundled up and shipped in a single tarball!
In general the simplicity of the proposed mechanism and implementation is attractive. However, this bit of discussion struck me as sending the wrong message. We don't *want* something like the CPAN module hierarchy. I prefer to keep things as flat as practical. Namespace packages clearly have utility, but please let's not descend into Java-esque package hierarchies. -- R. David Murray http://www.bitdance.com
I wonder if this fixes the long-standing issue in OS vendors' distributions. In Fedora, for example, there are both arch-specific and non-arch directories: /usr/lib/python2.7 + /usr/lib64/python2.7, for example. Pure python goes into /usr/lib/python2.7, and code including binaries goes into /usr/lib64/python2.7. But if a package has both, it all has to go into /usr/lib64/python2.7, because the current loader can't find pieces in 2 different directories. You can't have both /usr/lib/python2.7/site-packages/foo and /usr/lib64/python2.7/site-packages/foo. So if this PEP will allow pieces of foo to be found in 2 different places, that would be helpful, IMO.
At 10:40 AM 7/20/2011 -0400, Neal Becker wrote:
I wonder if this fixes the long-standing issue in OS vendors' distributions. In Fedora, for example, there are both arch-specific and non-arch directories: /usr/lib/python2.7 + /usr/lib64/python2.7, for example. Pure python goes into /usr/lib/python2.7, and code including binaries goes into /usr/lib64/python2.7. But if a package has both, it all has to go into /usr/lib64/python2.7, because the current loader can't find pieces in 2 different directories.
You can't have both /usr/lib/python2.7/site-packages/foo and /usr/lib64/python2.7/site-packages/foo.
So if this PEP will allow pieces of foo to be found in 2 different places, that would be helpful, IMO.
It's more of a long-term solution than a short-term one. In order for it to work the way you want, 'foo' would need to have its main code in foo.py rather than foo/__init__.py. You could of course make that change on the author's behalf for your distro, or remove __init__.py altogether if it doesn't contain any actual code. However, if you're going to make changes anyway, you could instead change its __init__.py to append extra directories to the module's __path__... and that's something you can do right now.
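A minimal sketch of that __init__.py approach, using temporary directories in place of the /usr/lib and /usr/lib64 locations. The package name `splitdemo` and the module names are made up for the demonstration; only the first root is placed on sys.path, and the package's own __init__.py appends the second location to __path__.

```python
import importlib
import os
import sys
import tempfile

# Two separate install locations, standing in for /usr/lib and /usr/lib64.
pure_root = tempfile.mkdtemp()
arch_root = tempfile.mkdtemp()

pkg_name = "splitdemo"  # hypothetical package name used only for this demo
os.mkdir(os.path.join(pure_root, pkg_name))
os.mkdir(os.path.join(arch_root, pkg_name))

# The package's __init__.py appends the second location to its own __path__.
with open(os.path.join(pure_root, pkg_name, "__init__.py"), "w") as f:
    f.write("__path__.append(%r)\n" % os.path.join(arch_root, pkg_name))
with open(os.path.join(pure_root, pkg_name, "pure.py"), "w") as f:
    f.write("KIND = 'pure python'\n")
with open(os.path.join(arch_root, pkg_name, "binary.py"), "w") as f:
    f.write("KIND = 'arch specific'\n")

sys.path.insert(0, pure_root)  # only the first root is on sys.path
importlib.invalidate_caches()

# Both submodules resolve, even though they live in different directories.
pure = importlib.import_module(pkg_name + ".pure")
binary = importlib.import_module(pkg_name + ".binary")
```

A distro patch would presumably hard-code the second directory rather than compute it, but the mechanism is the same.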
On Tue, Jul 19, 2011 at 8:58 PM, P.J. Eby <pje@telecommunity.com> wrote:
The biggest likely exception to the above would be when a piece of code tries to check whether some package is installed by importing it. If this is done *only* by importing a top-level module (i.e., not checking for a ``__version__`` or some other attribute), *and* there is a directory of the same name as the sought-for package on ``sys.path`` somewhere, *and* the package is not actually installed, then such code could *perhaps* be fooled into thinking a package is installed that really isn't.
This part worries me slightly. Imagine a program as such:
    datagen.py
    json/foo.js
    json/bar.js
datagen.py uses the files in json/ to generate sample data for a database. In datagen.py is the following code:
    try:
        import json
    except ImportError:
        import simplejson as json
Currently, this works just fine, but it will break (as I understand it) under the PEP because the json directory will become a virtual package and no ImportError will be raised. Is there a mitigation for this in the PEP that I've missed?
However, even in the rare case where all these conditions line up to happen at once, the failure is more likely to be annoying than damaging. In most cases, after all, the code will simply fail a little later on, when it actually tries to DO something with the imported (but empty) module. (And code that checks ``__version__`` attributes or for the presence of some desired function, class, or module in the package will not see a false positive result in the first place.)
It may only be annoying, but it's still a breaking change, and a subtle one at that. Checking __version__ is of course possible, but it's never been necessary before, so it's unlikely there's much code that does it. It also makes the fallback code significantly less neat. - Jeff
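For comparison, here is one way the fallback could be written so that an empty placeholder module does not count as success. This is only a sketch of the attribute-check idea mentioned in the quoted PEP text; the choice of `dumps` as the probe attribute is an assumption made for this example, and simplejson is a third-party package that would have to be installed for the fallback branch to work.

```python
def import_json_like():
    """Return a module offering dumps(), preferring the stdlib json module."""
    try:
        import json
    except ImportError:
        json = None
    # An empty virtual package would import fine but lack the real API,
    # so probe for an attribute the caller actually needs.
    if json is None or not hasattr(json, "dumps"):
        import simplejson as json  # third-party fallback, assumed installed
    return json


json_mod = import_json_like()
encoded = json_mod.dumps([1, 2])
```

This is visibly less neat than the plain try/except, which is the cost being pointed out above.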
On Wed, Jul 20, 2011 at 11:56 AM, Jeff Hardy <jdhardy@gmail.com> wrote:
This part worries me slightly. Imagine a program as such:
    datagen.py
    json/foo.js
    json/bar.js
datagen.py uses the files in json/ to generate sample data for a database. In datagen.py is the following code:
    try:
        import json
    except ImportError:
        import simplejson as json
Currently, this works just fine, but it will break (as I understand it) under the PEP because the json directory will become a virtual package and no ImportError will be raised. Is there a mitigation for this in the PEP that I've missed?
This problem was brought up a few times on import-sig, but I don't think a solution was ever decided on. The best solution I can think of would be to have a way for a module to mark itself as "finalized" (I'm not sure if that's the best term--just the first that popped into my head). This would prevent its __path__ from being created or extended in any way. For example, if the json module contains `__finalized__ = True` or something of the like, any `import json.foo` would immediately fail. Of course, this would put all the onus on the json module to solve this problem, and other modules might actually wish to be extendable into packages, in which case you'd still have this problem. In that case there would need to be a way to mark a directory as not containing importable code. Not sure what the best approach to that would be, especially since one of the goals of this PEP seems to be to avoid marker files. Erik
On Tue, 19 Jul 2011 23:58:55 -0400 "P.J. Eby" <pje@telecommunity.com> wrote:
Anyway, to make a long story short, we came up with an alternative implementation plan that actually solves some other problems besides the one that PEP 382 sets out to solve, and whose implementation is a bit easier to explain. (In fact, for users coming from various other languages, it hardly needs any explanation at all.)
I have a question.
If I have (on sys.path) a module "x.py" containing, say:
    y = 5
and (also on sys.path), a directory "x" containing a "y.py" module.
What is "from x import y" supposed to do?
(currently, it would bind "y" to its value in x.py)
Regards Antoine.
On Fri, Jul 22, 2011 at 9:35 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I have a question.
If I have (on sys.path) a module "x.py" containing, say:
y = 5
and (also on sys.path), a directory "x" containing a "y.py" module.
What is "from x import y" supposed to do?
(currently, it would bind "y" to its value in x.py)
It would behave the same as it does today: the imported value of 'y' would be 5. Virtual packages only kick in if an import would otherwise fail. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Friday, 22 July 2011 at 09:53 +1000, Nick Coghlan wrote:
It would behave the same as it does today: the imported value of 'y' would be 5.
Virtual packages only kick in if an import would otherwise fail.
Wouldn't it produce confusing situations like the above example? Regards Antoine.
On 7/21/2011 5:00 PM, Antoine Pitrou wrote:
Virtual packages only kick in if an import would otherwise fail.
Wouldn't it produce confusing situations like the above example?
Regards Antoine.
If I have (on sys.path), a directory "x" containing a "y.py" module, and later (on sys.path), another directory "x" containing a "y.py" module, what is "from x import y" supposed to do?
OR
If I have (on sys.path), a module "x.py" containing, say:
    y = 5
and later (on sys.path), another module "x.py" containing, say:
    y = 6
what is "from x import y" supposed to do?
I guess I don't see how this new proposal makes anything more confusing than it already is?
On Thu, 21 Jul 2011 17:31:04 -0700 Glenn Linderman <v+python@g.nevcal.com> wrote:
If I have (on sys.path), a directory "x" containing a "y.py" module, and later (on sys.path), another directory "x" containing a "y.py" module, what is "from x import y" supposed to do?
OR
If I have (on sys.path), a module "x.py" containing, say:
y = 5
and later (on sys.path), another module "x.py" containing, say:
y = 6
what is "from x import y" supposed to do?
I guess I don't see how this new proposal makes anything more confusing than it already is?
It does. In your two examples, the "x.py" files (or the "x" directories) live in two different base directories; imports are then resolved in sys.path order, which is expected and intuitive. However, you can have a "x.py" file and a "x" directory *in the same base directory which is present in sys.path*, meaning sys.path can't help disambiguate in this case. Regards Antoine.
On 7/21/2011 5:38 PM, Antoine Pitrou wrote:
However, you can have a "x.py" file and a "x" directory *in the same base directory which is present in sys.path*, meaning sys.path can't help disambiguate in this case.
Ah yes. It means there has to be one more rule for disambiguation, which Nick supplied. Your case wasn't clear to me from your first description, however. As long as there is an ordering, and it is documented, it is not particularly confusing.
On Fri, Jul 22, 2011 at 10:53 AM, Glenn Linderman <v+python@g.nevcal.com> wrote:
Ah yes. It means there has to be one more rule for disambiguation, which Nick supplied. Your case wasn't clear to me from your first description, however. As long as there is an ordering, and it is documented, it is not particularly confusing.
The genuinely confusing part is that x.py still takes precedence, even if it appears on sys.path *after* x/y.py. However, we're forced into that behaviour by backwards compatibility requirements. The alternative of allowing x/y.py to take precedence has been rejected on those grounds more than once. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Jul 22, 2011 at 10:00 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Wouldn't it produce confusing situations like the above example?
I don't see how it is any more confusing than any other form of module shadowing. For backwards compatibility reasons, the precedence model will be:
1. Modules and self-contained packages that can satisfy the import request are checked for first (along the whole length of sys.path).
2. If that fails, the virtual package mechanism is checked.
PEP 402 eliminates some cases of package shadowing by making __init__.py files optional, so your scenario will actually *work*, so long as the submodule name doesn't conflict with a module attribute.
*Today* if you have:
    x.py
    x.pyd
    x.so
    x/__init__.py
in the same sys.path directory, x.py wins (search order is controlled by the internal order of checks within the import system - and source files are first on that list). With PEP 402, x.py still wins, but the submodules within the x directory become accessible so long as they don't conflict with *actual* attributes set in the x module.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Friday, 22 July 2011 at 10:58 +1000, Nick Coghlan wrote:
On Fri, Jul 22, 2011 at 10:00 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Wouldn't it produce confusing situations like the above example?
I don't see how it is any more confusing than any other form of module shadowing.
The additional confusion lies in the fact that a module can be shadowed by something which is not a module (a mere global variable). I find it rather baffling. Regards Antoine.
At 03:04 AM 7/22/2011 +0200, Antoine Pitrou wrote:
The additional confusion lies in the fact that a module can be shadowed by something which is not a module (a mere global variable). I find it rather baffling.
If you move x.py to x/__init__.py, it does *exactly the same thing* in current versions of Python:
    Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from x import y
    >>> import x.y
    >>> x.y
    <module 'x.y' from 'x\y.py'>
    >>> y
    5
The PEP does nothing new or different here. If something is baffling you, it's the behavior of "from ... import", not the actual importing process. "from x import y" means "import x; y = x.y". The PEP does not propose we change this. ;-)
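The session above can be reproduced as a script along the following lines. The package is named `shadowdemo` rather than `x` purely to avoid clashing with anything already importable; it is built in a temporary directory with an __init__.py that sets `y = 5` next to a y.py submodule.

```python
import importlib
import os
import sys
import tempfile
import types

# Build shadowdemo/__init__.py (defining y = 5) next to shadowdemo/y.py.
base = tempfile.mkdtemp()
pkg_dir = os.path.join(base, "shadowdemo")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("y = 5\n")
with open(os.path.join(pkg_dir, "y.py"), "w") as f:
    f.write("# submodule with the same name as the package attribute\n")

sys.path.insert(0, base)
importlib.invalidate_caches()

# "from ... import y" finds the existing attribute, so the submodule
# is not imported at this point.
from shadowdemo import y
first_y = y

# An explicit submodule import loads y.py and rebinds the package attribute.
import shadowdemo.y
submodule = shadowdemo.y
```

After this runs, the local name `y` still holds the integer, while `shadowdemo.y` refers to the submodule, matching the interactive session quoted above.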
On Fri, Jul 22, 2011 at 11:04 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
The additional confusion lies in the fact that a module can be shadowed by something which is not a module (a mere global variable). I find it rather baffling.
It's still an improvement on current Python. There a submodule can be shadowed uselessly by something that doesn't even exist. For example:
    x.py           <-- No 'y' attribute
    x/__init__.py  <-- not needed in PEP 402
    x/y.py
    from x import y  <-- ImportError now, but would work in PEP 402
However, this does highlight an interesting corner case not yet covered by the PEP: when building a virtual path to add to an existing module, what do we do with directories that contain __init__.py[co] files?
1. Ignore the entire directory (i.e. leave it out of the created path)? (always emit ImportWarning)
2. Ignore the file and add the directory to the created path anyway? (never emit ImportWarning)
3. Ignore the file and add the directory to the created path anyway? (emit ImportWarning if __init__.py is not empty)
4. Ignore the file only if it is empty, otherwise ignore the whole directory? (emit ImportWarning if __init__.py is not empty)
5. Execute the file in the namespace of the existing module?
I suspect option 1 will lead to the fewest quirks, since it preserves current shadowing behaviour for modules and self-contained packages.
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Antoine Pitrou wrote:
The additional confusion lies in the fact that a module can be shadowed by something which is not a module (a mere global variable). I find it rather baffling.
I think we're stuck with that as long as we use the same syntax for importing a submodule and importing a non-module name from a module, i.e. 'from x import y'. -- Greg
Hi, sorry for nitpicking, but... On Wed, Jul 20, 2011 at 05:58, P.J. Eby <pje@telecommunity.com> wrote: ...
For those implementing PEP \302 importer objects:
the '\' should be removed, right? Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi
Hi Sandro,
On Wed, Jul 20, 2011 at 05:58, P.J. Eby <pje@telecommunity.com> wrote:
For those implementing PEP \302 importer objects:
the '\' should be removed, right?
No. Philip used backslashes to prevent the HTML conversion from transforming each and every instance of “PEP \d+” into a link, which gets annoying after the first few hundred times. (It was discussed a few months ago, probably on web-sig or python-dev, for PEP 333 or 3333, if memory serves.) Cheers
On Sat, Jul 30, 2011 at 14:57, Éric Araujo <merwok@netwok.org> wrote:
No. Philip used backslashes to prevent the HTML conversion from transforming each and every instance of “PEP \d+” into a link, which gets annoying after the first few hundred times. (It was discussed a few months ago, probably on web-sig or python-dev, for PEP 333 or 3333, if memory serves.)
Gaah, sorry for the noise then! (but at least I learnt a new thing!) Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi
participants (13)
- Antoine Pitrou
- Eric Snow
- Erik
- Glenn Linderman
- Greg Ewing
- Jeff Hardy
- Neal Becker
- Nick Coghlan
- P.J. Eby
- Piotr Ożarowski
- R. David Murray
- Sandro Tosi
- Éric Araujo