Mailman 3 October 2011 - Python-ideas

Concurrent safety?
by Mike Meyer Nov. 1, 2011

Nov. 1, 2011

This is a blue sky idea. It can't happen in Python 3.x, and possibly not ever in cPython. I'm mostly hoping to get smarter people than me considering the issue. Synopsis Python, as a general rule, tries to be "safe" about things. If something isn't obviously correct, it tends to throw runtime errors to let you know that you need to be explicit about what you want. When there's not an obvious choice for type conversions, it raises an exception. You generally don't have to worry about … [View More]resource allocation if you stay in python. And so on. The one glaring exception is in concurrent programs. While the tools python has for dealing with such are ok, there isn't anything to warn you when you fail to use those tools and should be. The goal of this proposal is to fix that, and get the Python interpreter to help locate code that isn't safe to use in concurrent programs. Existence Proof This is possible. Clojure is a dynamic language in the LISP family that will throw exceptions if you try mutating variables without properly protecting them against concurrent access. This is not to say that the Clojure solution is the solution, or even the right solution for Python. It's just to demonstrate that this can be done. Object Changes Object semantics don't need to change very much. The existing immutable types will work well in this environment exactly as is. The mutable types - well, we can no longer go changing them willy-nilly. But any language needs mutable types, and there's nothing wrong with the ones we have. Since immutable types don't require access protection *at all*, it might be worthwhile to add a new child of object, "immutable". Instances of this type would be immutable after creation. Presumably, the __new__ method of Python classes inheriting from immutable would be used to set the initial attributes, but the __init__ method might also be able to handle that role. However, this is a performance tweak, allowing user-written classes to skip any runtime checks for being mutated. Binding Changes One of the way objects are mutated is by changing their bindings. As such, some of the bindings might need to be protected. Local variables are fine. We normally can't export those bindings to other functions, just the values bound to them. So changing the binding can stay the same. The bound object can be exported to other threads of execution, but changing it will fall under the rules for changing objects. Ditto for nonlocals. On the other hand, rebindings of module and class and instance variables can be visible in other threads of execution, so they require protection, just like changing mutable objects. New Syntax The protection mechanism is the change to the language. I propose a single new keyword, "locking", that acts similar to the "try" keyword. The syntax is: 'locking' value [',' value]*':' suite The list of values are the objects that can be mutated in this lock. An immutable object showing up in the list of values is a TypeError. It's not clear that function calls should be allowed in the list of values. On the other hand, indexing and attributes clearly should be, and those can turn into function calls, so it's not clear they shouldn't be allowed, either. The locked values can be mutated during the body of the locking suite. For the builtin mutable types, this means invoking their mutating methods. For modules, classes and object instances, it means rebinding their attributes. Locked objects stay locked during function invocations in the suite. This means you can write utility functions that expect to be passed locked objects to mutate. A locking statement can be used inside of another locking statement. See the Implementation section for possible restrictions on this. Any attempt to mutate an object that isn't currently locked will raise an exception. Possibly ValueError, possibly a new exception class just for this purpose. This includes rebinding attributes of objects that aren't locked. Implementation There are at least two ways this can be implemented, both with different restrictions on the suite. While both of them can probably be optimized if it's know that there are no other threads of execution, checking for attempts to mutate unlocked objects should still happen. 1) Conventional locking All the objects being locked have locks attached to them, which are locked when before entering the suite. The implementation must order the locked object in some repeatable way, so that two locking statements that have more than one locked object in common will obtain the locks on those objects in the same order. This will prevent deadlocks. This method will require that the initial locking statement lock all objects that may be locked during the execution of it's suite. This may be a reason for allowing functions as locking values, as a way to get locks on objects that code called in the suite is going to need. Another downside is that the programmer needs to handle exceptions raised during the suite to insure that a set of related changes leaves the relevant objects in a consistent state. In this case, an optional 'except' clause should be added to the locking statement to hold such code. 2) Software Transactional Memory In an STM implementation, copies of the locked objects are created by the locking statement, and they original are "fingerprinted" in some way. The locking suite then runs. When the suite completes, the fingerprints of the originals are checked to see if some other thread of execution has changed them. If they haven't changed, they are replaced by the copies, and execution continues. If the originals have changed, the entire process starts over. In this implementation, the only actual locking is during the original fingerprinting process (to insure that a consistent state is captured) and at the end of the suite. FWIW, this is one of the models provided by Clojure. The restriction on the suite in this case is that running it twice - except for changes to the locked objects - needs to be acceptable. In this case, exceptions don't need to be handled by the programmer to insure consistency. If an exception happens during the execution of the suite, the original values are never replaced. <mike -- Mike Meyer <mwm(a)mired.org> http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org [View Less]

8 16

PEP 395 - Module Aliasing
by Nick Coghlan Oct. 31, 2011

Oct. 31, 2011

I've updated the module aliasing PEP to be based on the terminology in Antoine's qualified names PEP. The full text is included below, or you can read it on python.org: http://www.python.org/dev/peps/pep-0395/ Cheers, Nick. ================================ PEP: 395 Title: Module Aliasing Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan <ncoghlan(a)gmail.com> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 4-Mar-2011 Python-Version: 3.3 Post-History: … [View More]5-Mar-2011, 30-Oct-2011 Abstract ======== This PEP proposes new mechanisms that eliminate some longstanding traps for the unwary when dealing with Python's import system, the pickle module and introspection interfaces. It builds on the "Qualified Name" concept defined in PEP 3155. What's in a ``__name__``? ========================= Over time, a module's ``__name__`` attribute has come to be used to handle a number of different tasks. The key use cases identified for this module attribute are: 1. Flagging the main module in a program, using the ``if __name__ == "__main__":`` convention. 2. As the starting point for relative imports 3. To identify the location of function and class definitions within the running application 4. To identify the location of classes for serialisation into pickle objects which may be shared with other interpreter instances Traps for the Unwary ==================== The overloading of the semantics of ``__name__`` have resulted in several traps for the unwary. These traps can be quite annoying in practice, as they are highly unobvious and can cause quite confusing behaviour. A lot of the time, you won't even notice them, which just makes them all the more surprising when they do come up. Importing the main module twice ------------------------------- The most venerable of these traps is the issue of (effectively) importing ``__main__`` twice. This occurs when the main module is also imported under its real name, effectively creating two instances of the same module under different names. This problem used to be significantly worse due to implicit relative imports from the main module, but the switch to allowing only absolute imports and explicit relative imports means this issue is now restricted to affecting the main module itself. Why are my relative imports broken? ----------------------------------- PEP 366 defines a mechanism that allows relative imports to work correctly when a module inside a package is executed via the ``-m`` switch. Unfortunately, many users still attempt to directly execute scripts inside packages. While this no longer silently does the wrong thing by creating duplicate copies of peer modules due to implicit relative imports, it now fails noisily at the first explicit relative import, even though the interpreter actually has sufficient information available on the filesystem to make it work properly. <TODO: Anyone want to place bets on how many Stack Overflow links I could find to put here if I really went looking?> In a bit of a pickle -------------------- Something many users may not realise is that the ``pickle`` module serialises objects based on the ``__name__`` of the containing module. So objects defined in ``__main__`` are pickled that way, and won't be unpickled correctly by another python instance that only imported that module instead of running it directly. This behaviour is the underlying reason for the advice from many Python veterans to do as little as possible in the ``__main__`` module in any application that involves any form of object serialisation and persistence. Similarly, when creating a pseudo-module\*, pickles rely on the name of the module where a class is actually defined, rather than the officially documented location for that class in the module hierarchy. While this PEP focuses specifically on ``pickle`` as the principal serialisation scheme in the standard library, this issue may also affect other mechanisms that support serialisation of arbitrary class instances. \*For the purposes of this PEP, a "pseudo-module" is a package designed like the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These packages are documented as if they were single modules, but are in fact internally implemented as a package. This is *supposed* to be an implementation detail that users and other implementations don't need to worry about, but, thanks to ``pickle`` (and serialisation in general), the details are exposed and effectively become part of the public API. Where's the source? ------------------- Some sophisticated users of the pseudo-module technique described above recognise the problem with implementation details leaking out via the ``pickle`` module, and choose to address it by altering ``__name__`` to refer to the public location for the module before defining any functions or classes (or else by modifying the ``__module__`` attributes of those objects after they have been defined). This approach is effective at eliminating the leakage of information via pickling, but comes at the cost of breaking introspection for functions and classes (as their ``__module__`` attribute now points to the wrong place). Forkless Windows ---------------- To get around the lack of ``os.fork`` on Windows, the ``multiprocessing`` module attempts to re-execute Python with the same main module, but skipping over any code guarded by ``if __name__ == "__main__":`` checks. It does the best it can with the information it has, but is forced to make assumptions that simply aren't valid whenever the main module isn't an ordinary directly executed script or top-level module. Packages and non-top-level modules executed via the ``-m`` switch, as well as directly executed zipfiles or directories, are likely to make multiprocessing on Windows do the wrong thing (either quietly or noisily) when spawning a new process. While this issue currently only affects Windows directly, it also impacts any proposals to provide Windows-style "clean process" invocation via the multiprocessing module on other platforms. Proposed Changes ================ The following changes are interrelated and make the most sense when considered together. They collectively either completely eliminate the traps for the unwary noted above, or else provide straightforward mechanisms for dealing with them. A rough draft of some of the concepts presented here was first posted on the python-ideas list [1], but they have evolved considerably since first being discussed in that thread. Fixing dual imports of the main module -------------------------------------- Two simple changes are proposed to fix this problem: 1. In ``runpy``, modify the implementation of the ``-m`` switch handling to install the specified module in ``sys.modules`` under both its real name and the name ``__main__``. (Currently it is only installed as the latter) 2. When directly executing a module, install it in ``sys.modules`` under ``os.path.splitext(os.path.basename(__file__))[0]`` as well as under ``__main__``. With the main module also stored under its "real" name, attempts to import it will pick it up from the ``sys.modules`` cache rather than reimporting it under the new name. Fixing direct execution inside packages --------------------------------------- To fix this problem, it is proposed that an additional filesystem check be performed before proceeding with direct execution of a ``PY_SOURCE`` or ``PY_COMPILED`` file that has been named on the command line. This additional check would look for an ``__init__`` file that is a peer to the specified file with a matching extension (either ``.py``, ``.pyc`` or ``.pyo``, depending what was passed on the command line). If this check fails to find anything, direct execution proceeds as usual. If, however, it finds something, execution is handed over to a helper function in the ``runpy`` module that ``runpy.run_path`` also invokes in the same circumstances. That function will walk back up the directory hierarchy from the supplied path, looking for the first directory that doesn't contain an ``__init__`` file. Once that directory is found, it will be set to ``sys.path[0]``, ``sys.argv[0]`` will be set to ``-m`` and ``runpy._run_module_as_main`` will be invoked with the appropriate module name (as calculated based on the original filename and the directories traversed while looking for a directory without an ``__init__`` file). The two current PEPs for namespace packages (PEP 382 and PEP 402) would both affect this part of the proposal. For PEP 382 (with its current suggestion of "*.pyp" package directories, this check would instead just walk up the supplied path, looking for the first non-package directory (this would not require any filesystem stat calls). Since PEP 402 deliberately omits explicit directory markers, it would need an alternative approach, based on checking the supplied path against the contents of ``sys.path``. In both cases, the direct execution behaviour can still be corrected. Fixing pickling without breaking introspection ---------------------------------------------- To fix this problem, it is proposed to add a new optional module level attribute: ``__qname__``. This abbreviation of "qualified name" is taken from PEP 3155, where it is used to store the naming path to a nested class or function definition relative to the top level module. By default, ``__qname__`` will be the same as ``__name__``, which covers the typical case where there is a one-to-one correspondence between the documented API and the actual module implementation. Functions and classes will gain a corresponding ``__qmodule__`` attribute that refers to their module's ``__qname__``. Pseudo-modules that adjust ``__name__`` to point to the public namespace will leave ``__qname__`` untouched, so the implementation location remains readily accessible for introspection. In the main module, ``__qname__`` will automatically be set to the main module's "real" name (as described above under the fix to prevent duplicate imports of the main module) by the interpreter. At the interactive prompt, both ``__name__`` and ``__qname__`` will be set to ``"__main__"``. These changes on their own will fix most pickling and serialisation problems, but one additional change is needed to fix the problem with serialisation of items in ``__main__``: as a slight adjustment to the definition process for functions and classes, in the ``__name__ == "__main__"`` case, the module ``__qname__`` attribute will be used to set ``__module__``. ``pydoc`` and ``inspect`` would also be updated appropriately to: - use ``__qname__`` instead of ``__name__`` and ``__qmodule__`` instead of ``__module__``where appropriate (e.g. ``inspect.getsource()`` would prefer the qualified variants) - report both the public names and the qualified names for affected objects Fixing multiprocessing on Windows --------------------------------- With ``__qname__`` now available to tell ``multiprocessing`` the real name of the main module, it should be able to simply include it in the serialised information passed to the child process, eliminating the need for dubious reverse engineering of the ``__file__`` attribute. Reference Implementation ======================== None as yet. References ========== .. [1] Module aliases and/or "real names" (http://mail.python.org/pipermail/python-ideas/2011-January/008983.html) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: -- Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia [View Less]

2 1

When I use Python under Windows. I found some file handles are not closed,
by 罗勇刚(Yonggang Luo) Oct. 31, 2011

Oct. 31, 2011

How did detecting where those handlers are created to tracing it and close it. Mainly because I was using C binding library(subvertpy) and file is not closed. -- 此致礼罗勇刚 Yours sincerely, Yonggang Luo

1 0

Should this be considered a bug?
by Mike Meyer Oct. 31, 2011

Oct. 31, 2011

h = [1, 2, 3] d = dict(a=1, b=2) h += d # works h = h + d # exception -- Sent from my Android tablet with K-9 Mail. Please excuse my brevity.

6 14

Allow iterable argument to os.walk()
by John O'Connor Oct. 31, 2011

Oct. 31, 2011

Given the push towards iterators in 3.0, is anyone in support of allowing an iterable for the "top" argument in os.walk? It seems like it would be common to look in more than one directory at once. - John

4 4

Draft PEP for virtualenv in the stdlib
by Carl Meyer Oct. 28, 2011

Oct. 28, 2011

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all, Vinay Sajip and I are working on a PEP for making "virtual Python environments" a la virtualenv [1] a built-in feature of Python 3.3. This idea was first proposed on python-dev by Ian Bicking in February 2010 [2]. It was revived at PyCon 2011 and has seen discussion on distutils-sig [3] and more recently again on python-dev [4] [5]. Given all this (mostly positive) prior discussion, we may be at a point where further discussion should … [View More]happen on python-dev rather than python-ideas. But in order to observe the proper PEP 1 process, I'm posting the draft PEP here first for pre-review and comment before I send it to the PEP editors and post it on python-dev. Full text of the draft PEP is pasted below, and also available on Bitbucket [6]. [1] http://virtualenv.org [2] http://mail.python.org/pipermail/python-dev/2010-February/097787.html [3] http://mail.python.org/pipermail/distutils-sig/2011-March/017498.html [4] http://mail.python.org/pipermail/python-dev/2011-June/111903.html [5] http://mail.python.org/pipermail/python-dev/2011-October/113883.html [6] https://bitbucket.org/carljm/pythonv-pep/src/ PEP: XXX Title: Python Virtual Environments Version: $Revision$ Last-Modified: $Date$ Author: Carl Meyer <carl(a)oddbird.net> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 13-Jun-2011 Python-Version: 3.3 Post-History: 14-Jun-2011 Abstract ======== This PEP proposes to add to Python a mechanism for lightweight "virtual environments" with their own site directories, optionally isolated from system site directories. Each virtual environment has its own Python binary (allowing creation of environments with various Python versions) and can have its own independent set of installed Python packages in its site directories. Motivation ========== The utility of Python virtual environments has already been well established by the popularity of existing third-party virtual-environment tools, primarily Ian Bicking's `virtualenv`_. Virtual environments are already widely used for dependency management and isolation, ease of installing and using Python packages without system-administrator access, and automated testing of Python software across multiple Python versions, among other uses. Existing virtual environment tools suffer from lack of support from the behavior of Python itself. Tools such as `rvirtualenv`_, which do not copy the Python binary into the virtual environment, cannot provide reliable isolation from system site directories. Virtualenv, which does copy the Python binary, is forced to duplicate much of Python's ``site`` module and manually copy an ever-changing set of standard-library modules into the virtual environment in order to perform a delicate boot-strapping dance at every startup. The ``PYTHONHOME`` environment variable, Python's only existing built-in solution for virtual environments, requires copying the entire standard library into every environment; not a lightweight solution. A virtual environment mechanism integrated with Python and drawing on years of experience with existing third-party tools can be lower maintenance, more reliable, and more easily available to all Python users. .. _virtualenv: http://www.virtualenv.org .. _rvirtualenv: https://github.com/kvbik/rvirtualenv Specification ============= When the Python binary is executed, it attempts to determine its prefix (which it stores in ``sys.prefix``), which is then used to find the standard library and other key files, and by the ``site`` module to determine the location of the site-package directories. Currently the prefix is found (assuming ``PYTHONHOME`` is not set) by first walking up the filesystem tree looking for a marker file (``os.py``) that signifies the presence of the standard library, and if none is found, falling back to the build-time prefix hardcoded in the binary. This PEP proposes to add a new first step to this search. If an ``env.cfg`` file is found either adjacent to the Python executable, or one directory above it, this file is scanned for lines of the form ``key = value``. If a ``home`` key is found, this signifies that the Python binary belongs to a virtual environment, and the value of the ``home`` key is the directory containing the Python executable used to create this virtual environment. In this case, prefix-finding continues as normal using the value of the ``home`` key as the effective Python binary location, which results in ``sys.prefix`` being set to the system installation prefix, while ``sys.site_prefix`` is set to the directory containing ``env.cfg``. (If ``env.cfg`` is not found or does not contain the ``home`` key, prefix-finding continues normally, and ``sys.site_prefix`` will be equal to ``sys.prefix``.) The ``site`` and ``sysconfig`` standard-library modules are modified such that site-package directories ("purelib" and "platlib", in ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while other directories (the standard library, include files) are still found relative to ``sys.prefix``. Thus, a Python virtual environment in its simplest form would consist of nothing more than a copy of the Python binary accompanied by an ``env.cfg`` file and a site-packages directory. Since the ``env.cfg`` file can be located one directory above the executable, a typical virtual environment layout, mimicking a system install layout, might be:: env.cfg bin/python3 lib/python3.3/site-packages/ Isolation from system site-packages - ----------------------------------- In a virtual environment, the ``site`` module will normally still add the system site directories to ``sys.path`` after the virtual environment site directories. Thus system-installed packages will still be importable, but a package of the same name installed in the virtual environment will take precedence. If the ``env.cfg`` file also contains a key ``include-system-site`` with a value of ``false`` (not case sensitive), the ``site`` module will omit the system site directories entirely. This allows the virtual environment to be entirely isolated from system site-packages. Creating virtual environments - ----------------------------- This PEP also proposes adding a new ``venv`` module to the standard library which implements the creation of virtual environments. This module would typically be executed using the ``-m`` flag:: python3 -m venv /path/to/new/virtual/environment Running this command creates the target directory (creating any parent directories that don't exist already) and places an ``env.cfg`` file in it with a ``home`` key pointing to the Python installation the command was run from. It also creates a ``bin/`` (or ``Scripts`` on Windows) subdirectory containing a copy of the ``python3`` executable, and the ``pysetup3`` script from the ``packaging`` standard library module (to facilitate easy installation of packages from PyPI into the new virtualenv). And it creates an (initially empty) ``lib/pythonX.Y/site-packages`` subdirectory. If the target directory already exists an error will be raised, unless the ``--clear`` option was provided, in which case the target directory will be deleted and virtual environment creation will proceed as usual. If ``venv`` is run with the ``--no-site-packages`` option, the key ``include-system-site = false`` is also included in the created ``env.cfg`` file. Multiple paths can be given to ``venv``, in which case an identical virtualenv will be created, according to the given options, at each provided path. API - --- The high-level method described above will make use of a simple API which provides mechanisms for third-party virtual environment creators to customize environment creation according to their needs. The ``venv`` module will contain an ``EnvBuilder`` class which accepts the following keyword arguments on instantiation:: * ``nosite`` - A Boolean value indicating that isolation of the environment from the system Python is required (defaults to ``False``). * ``clear`` - A Boolean value which, if True, will delete any existing target directory instead of raising an exception (defaults to ``False``). The returned env-builder is an object which is expected to have a single method, ``create``, which takes as required argument the path (absolute or relative to the current directory) of the target directory which is to contain the virtual environment. The ``create`` method will either create the environment in the specified directory, or raise an appropriate exception. Creators of third-party virtual environment tools will be free to use the provided ``EnvBuilder`` class as a base class. The ``venv`` module will also provide a module-level function as a convenience:: def create(env_dir, nosite=False, clear=False): builder = EnvBuilder(nosite=nosite, clear=clear) builder.create(env_dir) The ``create`` method of the ``EnvBuilder`` class illustrates the hooks available for customization: def create(self, env_dir): """ Create a virtualized Python environment in a directory. :param env_dir: The target directory to create an environment in. """ env_dir = os.path.abspath(env_dir) context = self.create_directories(env_dir) self.create_configuration(context) self.setup_python(context) self.setup_packages(context) self.setup_scripts(context) Each of the methods ``create_directories``, ``create_configuration``, ``setup_python``, ``setup_packages`` and ``setup_scripts`` can be overridden. The functions of these methods are:: * ``create_directories`` - creates the environment directory and all necessary directories, and returns a context object. This is just a holder for attributes (such as paths), for use by the other methods. * ``create_configuration`` - creates the ``env.cfg`` configuration file in the environment. * ``setup_python`` - creates a copy of the Python executable (and, under Windows, DLLs) in the environment. * ``setup_packages`` - A placeholder method which can be overridden in third party implementations to pre-install packages in the virtual environment. * ``setup_scripts`` - A placeholder methd which can be overridden in third party implementations to pre-install scripts (such as activation and deactivation scripts) in the virtual environment. The ``DistributeEnvBuilder`` subclass in the reference implementation illustrates how these last two methods can be used in practice. It's not envisaged that ``DistributeEnvBuilder`` will be actually added to Python core, but it makes the reference implementation more immediately useful for testing and exploratory purposes. * The ``setup_packages`` method installs Distribute in the target environment. This is needed at the moment in order to actually install most packages in an environment, since most packages are not yet packaging / setup.cfg based. * The ``setup_scripts`` method installs activation and pysetup3 scripts in the environment. This is also done in a configurable way: A ``scripts`` property on the builder is expected to provide a buffer which is a base64-encoded zip file. The zip file contains directories "common", "linux2", "darwin", "win32", each containing scripts destined for the bin directory in the environment. The contents of "common" and the directory corresponding to ``sys.platform`` are copied after doing some text replacement of placeholders: * ``__VIRTUAL_ENV__`` is replaced with absolute path of the environment directory. * ``__VIRTUAL_PROMPT__`` is replaced with the environment prompt prefix. * ``__BIN_NAME__`` is replaced with the name of the bin directory. * ``__ENV_PYTHON__`` is replaced with the absolute path of the environment's executable. No doubt the process of PEP review will show up any customization requirements which have not yet been considered. Open Questions ============== Why not modify sys.prefix? - -------------------------- Any virtual environment tool along these lines is proposing a split between two different meanings (among others) that are currently both wrapped up in ``sys.prefix``: the answers to the questions "Where is the standard library?" and "Where is the site-packages location where third-party modules should be installed?" This split could be handled by introducing a new value for either the former question or the latter question. Either option potentially introduces some backwards-incompatibility with software written to assume the other meaning for ``sys.prefix``. Since it was unable to modify `distutils`, `virtualenv`_ has to re-point ``sys.prefix`` at the virtual environment, which requires that it also provide a symlink from inside the virtual environment to the Python header files, and that it copy some portions of the standard library into the virtual environment. The `documentation`__ for ``sys.prefix`` describes it as "A string giving the site-specific directory prefix where the platform independent Python files are installed," and specifically mentions the standard library and header files as found under ``sys.prefix``. It does not mention ``site-packages``. __ http://docs.python.org/dev/library/sys.html#sys.prefix It is more true to this documented definition of ``sys.prefix`` to leave it pointing to the system installation (which is where the standard library and header files are found), and introduce a new value in ``sys`` (``sys.site_prefix``) to point to the prefix for ``site-packages``. The justification for reversing this choice would be if it can be demonstrated that the bulk of third-party code referencing ``sys.prefix`` is, in fact, using it to find ``site-packages``, and not the standard library or header files or anything else. The most notable case is probably `setuptools`_ and its fork `distribute`_, which do use ``sys.prefix`` to build up a list of site directories for pre-flight checking where ``pth`` files can usefully be placed. It would be trivial to modify these tools (currently only `distribute`_ is Python 3 compatible) to check ``sys.site_prefix`` and fall back to ``sys.prefix`` if it doesn't exist. If Distribute is modified in this way and released before Python 3.3 is released with the ``venv`` module, there would be no likely reason for an older version of Distribute to ever be installed in a virtual environment. In terms of other third-party usage, a `Google Code Search`_ turns up what appears to be a roughly even mix of usage between packages using ``sys.prefix`` to build up a site-packages path and packages using it to e.g. eliminate the standard-library from code-execution tracing. Either choice that's made here will require one or the other of these uses to be updated. Another argument for reversing this choice and modifying ``sys.prefix`` to point at the virtual environment is that virtualenv currently does this, and it doesn't appear to have caused major problems. .. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools .. _distribute: http://packages.python.org/distribute/ .. _Google Code Search: http://www.google.com/codesearch#search/&q=sys\.prefix&p=1&type=cs What about include files? - ------------------------- For example, ZeroMQ installs zmq.h and zmq_utils.h in $VE/include, whereas SIP (part of PyQt4) installs sip.h by default in $VE/include/pythonX.Y. With virtualenv, everything works because the PythonX.Y include is symlinked, so everything that's needed is in $VE/include. At the moment pythonv doesn't do anything with include files, besides creating the include directory; this might need to change, to copy/symlink $VE/include/pythonX.Y. I guess this would go into ``venv.py``. As in Python there's no abstraction for a site-specific include directory, other than for platform-specific stuff, then the user expectation would seem to be that all include files anyone could ever want should be found in one of just two locations, with sysconfig labels "include" & "platinclude". There's another issue: what if includes are Python-version-specific? For example, SIP installs by default into $VE/include/pythonX.Y rather than $VE/include, presumably because there's version-specific stuff in there - but even if that's not the case with SIP, it could be the case with some other package. And the problem that gives is that you can't just symlink the include/pythonX.Y directory, but actually have to provide a writable directory and symlink/copy the contents from the system include/pythonX.Y. Of course this is not hard to do, but it does seem inelegant. OTOH it's really because there's no supporting concept in Python/sysconfig. Interface with packaging tools - ------------------------------ Some work will be needed in packaging tools (Python 3.3 packaging, Distribute) to support implementation of this PEP. For example: * How Distribute and packaging use sys.prefix and/or sys.site_prefix. Clearly, in practice we'll need to use Distribute for a while, until packages have migrated over to usage of setup.cfg. * How packaging and Distribute set up shebang lines in scripts which they install in virtual environments. Add a script? - ------------- Perhaps a ``pyvenv`` script should be added as a more convienent and discoverable alternative to ``python -m venv``. Testability and Source Build Issues - ----------------------------------- In order to be able to test the ``venv`` module in the Python regression test suite, some anomalies in how sysconfig data is configured in source builds will need to be removed. For example, sysconfig.get_paths() in a source build gives (partial output): { 'include': '/home/vinay/tools/pythonv/Include', 'libdir': '/usr/lib ; or /usr/lib64 on a multilib system', 'platinclude': '/home/vinay/tools/pythonv', 'platlib': '/usr/local/lib/python3.3/site-packages', 'platstdlib': '/usr/local/lib/python3.3', 'purelib': '/usr/local/lib/python3.3/site-packages', 'stdlib': '/usr/local/lib/python3.3' } Activation and Utility Scripts - ------------------------------ Virtualenv currently provides shell "activation" scripts as a user convenience, to put the virtual environment's Python binary first on the shell PATH. This is a maintenance burden, as separate activation scripts need to be provided and maintained for every supported shell. For this reason, this PEP proposes to leave such scripts to be provided by third-party extensions; virtual environments created by the core functionality would be used by directly invoking the environment's Python binary. If we are going to rely on external code to provide these conveniences, we need to check with existing third-party projects in this space (virtualenv, zc.buildout) and ensure that the proposed API meets their needs. (Virtualenv would be fine with the proposed API; it would become a relatively thin wrapper with a subclass of the env builder that adds shell activation and automatic installation of ``pip`` inside the virtual environment). Ensuring that sys.site_prefix and sys.site_exec_prefix are always set? - ---------------------------------------------------------------------- Currently the reference implementation's modifications to standard library code use the idiom ``getattr(sys, "site_prefix", sys.prefix)``. Do we want this to be the long-term pattern, or should the sys module ensure that the ``site_*`` attributes are always set to something (by default the same as the regular prefix attributes), even if ``site.py`` does not run? Reference Implementation ======================== The in-progress reference implementation is found in `a clone of the CPython Mercurial repository`_. To test it, build and install it (the virtual environment tool currently does not run from a source tree). - From the installed Python, run ``bin/python3 -m venv /path/to/new/virtualenv`` to create a virtual environment. The reference implementation (like this PEP!) is a work in progress. .. _a clone of the CPython Mercurial repository: https://bitbucket.org/vinay.sajip/pythonv References ========== Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6lrJMACgkQ8W4rlRKtE2dz4wCgqxtiHQr3ZEH/s1h069e15bu7 c70AoOSTd7drIp1g6z2QiuDKoTok6TRw =9XEL -----END PGP SIGNATURE----- [View Less]

14 44

Re: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
by Steven D'Aprano Oct. 24, 2011

Oct. 24, 2011

Mike Meyer wrote: >> - And testing. If code isn't tested, you should assume it is buggy. >> In an ideal world, there should never be any such thing as code >> that's used once: it should always be used at least twice, once in >> the application and once in the test suite. I realise that in >> practice we often fall short of that ideal, but we don't need more >> syntax that *encourages* developers to fail to test non-trivial >> code blocks. > > … [View More]

1 0

PEP 355 (overloading boolean operations) and chained comparisons
by Nick Coghlan Oct. 24, 2011

Oct. 24, 2011

I had a chance to speak to Travis Oliphant (NumPy core dev) at PyCodeConf and asked him his opinion of PEP 355. His answer was that he didn't really care about overloading boolean operations in general (the bitwise operation overloads with the appropriate objects in the arrays were adequate for most purposes), but the fact that chained comparisons don't work for NumPy arrays was genuinely annoying. That is, if you have a NumPy array, you cannot write: x = A < B < C Since, under the … [View More]

9 14

raise EXC from None (issue #6210)
by Jan Kaliszewski Oct. 23, 2011

Oct. 23, 2011

Hello. Some time ago I encountered the problem described in PEP 3134 as "Open Issue: Suppressing Context" ("this PEP makes it impossible to suppress '__context__', since setting exc.__context__ to None in an 'except' or 'finally' clause will only result in it being set again when exc is raised."). An idea that then appeared in my brain was: raise SomeException(some, arguments) from None ...and I see the same idea has been proposed by Patrick Westerhoff here: http://bugs.python.org/… [View More]

4 3

An illustrative example for PEP 3150's statement local namespaces
by Nick Coghlan Oct. 21, 2011

Oct. 21, 2011

A task I actually ran into at work this week: get a filtered list of subdirectories, exclude some based on a list of names to be ignored, sort the remainder by their modification times. (This problem was actually also the origin of my recent filter_walk recipe: http://code.activestate.com/recipes/577913-selective-directory-walking/) Translated to 3.x (i.e. the generator's .next() method is replaced by the next() builtin), the code would look roughly like this: # Generate list of … [View More]candidate directories sorted by modification time candidates = next(filter_walk(base_dir, dir_pattern=dir_filter, depth=0)).subdirs candidates = (subdir for subdir in candidates if not any(d in subdir for d in dirs_to_ignore)) def get_mtime(path): stat_path = os.path.join(base_dir, path) return os.stat(stat_path).st_mtime candidates = sorted(candidates, key=get_mtime) Now, that could theoretically be split out to a separate function (passing base_dir, dir_filter and dirs_to_ignore as arguments), but the details are going to vary too much from use case to use case to make reusing it practical. Even factoring out "get_mtime" would be a waste, since you end up with a combinatorial explosion of functions if you try to do things like that (it's the local code base equivalent of "not every 3 line function needs to be in the standard library"). I can (and do) use vertical white space to give some indication that the calculation is complete, but PEP 3150 would allow me to be even more explicit by indenting every step in the calculation except the last one: candidate_dirs = sorted(candidate_dirs, key=get_mtime) given: candidate_dirs = next(filter_walk(base_dir, dir_pattern=dir_filter, depth=0)).subdirs candidate_dirs = (subdir for subdir in candidates if not any(d in subdir for d in dirs_to_ignore)) def get_mtime(path): stat_path = os.path.join(base_dir, path) return os.stat(stat_path).st_mtime Notice how the comment from the original version becomes redundant in the second version? It's just repeating what the actual header line right below it says, so I got rid of it. In the original version it was necessary because there was no indentation in the code to indicate that this was all just different stages of one internal calculation leading up to that final step to create the sorted list. Cheers, Nick. -- Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia [View Less]

3 2