On Tue, Oct 25, 2011 at 4:21 AM, Carl Meyer
Existing virtual environment tools suffer from lack of support from the behavior of Python itself. Tools such as `rvirtualenv`_, which do not copy the Python binary into the virtual environment, cannot provide reliable isolation from system site directories. Virtualenv, which does copy the Python binary, is forced to duplicate much of Python's ``site`` module and manually copy an ever-changing set of standard-library modules into the virtual environment in order to perform a delicate boot-strapping dance at every startup. The ``PYTHONHOME`` environment variable, Python's only existing built-in solution for virtual environments, requires copying the entire standard library into every environment; not a lightweight solution.
The repeated references to copying binaries and Python files throughout the PEP are annoying and need to be justified. Python 3.2+ supports symlinking on Windows Vista and above as well as on *nix systems, so there needs to be a short section somewhere explaining why symlinks are not an adequate lightweight solution (pointing out the fact that creating symlinks on Windows often requires administrator privileges would be sufficient justification for me). (Obviously, 3rd party virtual environment solutions generally *do* have to copy things around, since Python 2.x doesn't expose symlink support on Windows at all.)
A virtual environment mechanism integrated with Python and drawing on years of experience with existing third-party tools can be lower maintenance, more reliable, and more easily available to all Python users.
It can also take advantage of the native symlink support to minimise copying.
Specification
=============
When the Python binary is executed, it attempts to determine its prefix (which it stores in ``sys.prefix``), which is then used to find the standard library and other key files, and by the ``site`` module to determine the location of the site-package directories. Currently the prefix is found (assuming ``PYTHONHOME`` is not set) by first walking up the filesystem tree looking for a marker file (``os.py``) that signifies the presence of the standard library, and if none is found, falling back to the build-time prefix hardcoded in the binary.
This PEP proposes to add a new first step to this search. If an ``env.cfg`` file is found either adjacent to the Python executable, or one directory above it, this file is scanned for lines of the form ``key = value``. If a ``home`` key is found, this signifies that the Python binary belongs to a virtual environment, and the value of the ``home`` key is the directory containing the Python executable used to create this virtual environment.
Currently, the PEP uses a mish-mash of 'env', 'venv', 'pyvenv', 'pythonv' and 'site' to refer to different aspects of the proposed feature. I suggest standardising on 'venv' wherever the Python relationship is implied, and 'pyvenv' wherever the Python relationship needs to be made explicit. So I think the name of the configuration file should be "pyvenv.cfg". (Tangent: 'setup' lost its Python connection when it went from 'setup.py' to 'setup.cfg'. I wish the latter had been called 'pysetup.cfg' instead.)
In this case, prefix-finding continues as normal using the value of the ``home`` key as the effective Python binary location, which results in ``sys.prefix`` being set to the system installation prefix, while ``sys.site_prefix`` is set to the directory containing ``env.cfg``.
(If ``env.cfg`` is not found or does not contain the ``home`` key, prefix-finding continues normally, and ``sys.site_prefix`` will be equal to ``sys.prefix``.)
'site' is *way* too overloaded already, let's not make it worse. I suggest "sys.venv_prefix".
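To make sure I'm reading the lookup correctly, here is a rough pure-Python sketch of the new first step as I understand it. It is not the proposed implementation (that would live in the interpreter startup code); the function names are made up for illustration, the file and key names are the draft's current ones, and stopping at the first config file found is my assumption, since the draft doesn't say what happens when both locations have one.

```python
import os
from pathlib import Path


def parse_env_cfg(text):
    """Collect ``key = value`` lines into a dict; other lines are ignored."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            settings[key.strip()] = value.strip()
    return settings


def find_venv(executable, filename="env.cfg"):
    """Return (home, venv_prefix), or (None, None) outside a virtual env.

    Hypothetical helper: checks the directory holding the executable,
    then the directory one level above it.
    """
    # abspath() rather than realpath(): a symlinked binary must be located
    # where the link lives, not where it points.
    exe_dir = Path(os.path.abspath(executable)).parent
    for candidate in (exe_dir, exe_dir.parent):
        cfg_path = candidate / filename
        if cfg_path.is_file():
            home = parse_env_cfg(cfg_path.read_text()).get("home")
            if home:
                return home, str(candidate)
            # Assumption: a config file without ``home`` ends the search.
            return None, None
    return None, None
```

With a result in hand, normal prefix-finding would continue from ``home``, and the second value would become the new prefix attribute (whatever it ends up being called).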
The ``site`` and ``sysconfig`` standard-library modules are modified such that site-package directories ("purelib" and "platlib", in ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while other directories (the standard library, include files) are still found relative to ``sys.prefix``.
Thus, a Python virtual environment in its simplest form would consist of nothing more than a copy of the Python binary accompanied by an ``env.cfg`` file and a site-packages directory. Since the ``env.cfg`` file can be located one directory above the executable, a typical virtual environment layout, mimicking a system install layout, might be::
    env.cfg
    bin/python3
    lib/python3.3/site-packages/
The builtin virtual environment mechanism should be specified to symlink things by default, and only copy things if the user specifically requests it. System administrators rightly fear the proliferation of multiple copies of binaries, since it can cause major hassles when it comes time to install security updates.
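As a sketch of what "symlink by default" could look like in the creation code, something along these lines would do. The helper name and the ``use_symlinks`` flag are hypothetical, and falling back to a copy when the link can't be created (e.g. on Windows without the necessary privilege) is just one possible policy; a stricter implementation could raise an error instead so that copies only ever happen on explicit request.

```python
import os
import shutil


def link_or_copy(source, target, use_symlinks=True):
    """Place ``source`` at ``target``; return "symlink" or "copy"."""
    if use_symlinks:
        try:
            os.symlink(source, target)
            return "symlink"
        except (OSError, NotImplementedError):
            # Assumed policy: fall back to copying when the platform or
            # the user's privileges do not allow creating the link.
            pass
    shutil.copy2(source, target)
    return "copy"
```

Returning the mode actually used lets the caller report when a copy was made, which is the case sysadmins will want to know about.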
Isolation from system site-packages
-----------------------------------
In a virtual environment, the ``site`` module will normally still add the system site directories to ``sys.path`` after the virtual environment site directories. Thus system-installed packages will still be importable, but a package of the same name installed in the virtual environment will take precedence.
If the ``env.cfg`` file also contains a key ``include-system-site`` with a value of ``false`` (not case sensitive), the ``site`` module will omit the system site directories entirely. This allows the virtual environment to be entirely isolated from system site-packages.
"site" is ambiguous here - rather than abbreviating, I suggest making the option "include-system-site-packages".
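For reference, the behaviour described above boils down to something like the following sketch. The function names are invented for illustration, the key name is the draft's current spelling (passed as a parameter so it is easy to swap), and the ``settings`` dict is assumed to come from parsing the config file.

```python
def include_system_site(settings, key="include-system-site"):
    """Return False only when the key is present with the value 'false'."""
    return settings.get(key, "true").strip().lower() != "false"


def build_site_dirs(venv_site_dirs, system_site_dirs, settings):
    """Order the site directories as they would be added to sys.path.

    Virtual environment directories come first, so a package installed
    there takes precedence over a system package of the same name.
    """
    dirs = list(venv_site_dirs)
    if include_system_site(settings):
        dirs.extend(d for d in system_site_dirs if d not in dirs)
    return dirs
```

Note that anything other than "false" (in any case) leaves the system directories in place, which matches the draft's default.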
Creating virtual environments
-----------------------------
This PEP also proposes adding a new ``venv`` module to the standard library which implements the creation of virtual environments. This module would typically be executed using the ``-m`` flag::
    python3 -m venv /path/to/new/virtual/environment
Running this command creates the target directory (creating any parent directories that don't exist already) and places an ``env.cfg`` file in it with a ``home`` key pointing to the Python installation the command was run from. It also creates a ``bin/`` (or ``Scripts`` on Windows) subdirectory containing a copy of the ``python3`` executable, and the ``pysetup3`` script from the ``packaging`` standard library module (to facilitate easy installation of packages from PyPI into the new virtualenv). And it creates an (initially empty) ``lib/pythonX.Y/site-packages`` subdirectory.
As noted above, those should be symlinks rather than copies, with copying behaviour explicitly requested via a command line option. Also, why "Scripts" rather than "bin" on Windows? The Python binary isn't a script. I'm actually not seeing the rationale for the obfuscated FHS inspired layout in the first place - why not dump the binaries adjacent to the config file, with a simple "site-packages" directory immediately below that? If there are reasons for a more complex default layout, they need to be articulated in the PEP. If the problem is wanting to allow cross platform computation of things like the site-packages directory location and other paths, then the answer to that seems to lie in better helper methods (whether in sysconfig, site, venv or elsewhere) rather than Linux specific layouts inside language level virtual environments.
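To illustrate the kind of helper I mean, here is a minimal sketch that computes the layout in one place and creates the skeleton from it. The function names are hypothetical, the directory names simply follow the draft text quoted above (``bin`` or ``Scripts``, ``lib/pythonX.Y/site-packages``, ``env.cfg``), and placing the binaries themselves is deliberately left out.

```python
import ntpath
import os
import posixpath
import sys


def venv_layout(env_dir, version=None, windows=None):
    """Return the key paths of a virtual environment as a dict."""
    if windows is None:
        windows = os.name == "nt"
    if version is None:
        version = "%d.%d" % sys.version_info[:2]
    path = ntpath if windows else posixpath
    return {
        "config": path.join(env_dir, "env.cfg"),
        "bin": path.join(env_dir, "Scripts" if windows else "bin"),
        "site_packages": path.join(
            env_dir, "lib", "python" + version, "site-packages"),
    }


def create_skeleton(env_dir, home=None):
    """Create the directories and config file for a new environment."""
    layout = venv_layout(env_dir)
    # makedirs() also creates any missing parent directories.
    os.makedirs(layout["bin"], exist_ok=True)
    os.makedirs(layout["site_packages"], exist_ok=True)
    if home is None:
        # Directory containing the Python used to create the environment.
        home = os.path.dirname(os.path.abspath(sys.executable))
    with open(layout["config"], "w") as cfg:
        cfg.write("home = %s\n" % home)
    return layout
```

With the paths computed by one function, tools can ask for "the bin directory" or "the site-packages directory" instead of hardcoding a platform's layout, and changing the default layout later touches a single place.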
The ``venv`` module will contain an ``EnvBuilder`` class which accepts the following keyword arguments on instantiation::
* ``nosite`` - A Boolean value indicating that isolation of the environment from the system Python is required (defaults to ``False``).
Yikes, double negatives in APIs are bad news (especially when the corresponding config file option is expressed positively). I suggest this parameter should be declared as "system_site_packages=True".
* ``clear`` - A Boolean value which, if True, will delete any existing target directory instead of raising an exception (defaults to ``False``).
In line with my above comments, I think there should be a third parameter here declared as "use_symlinks=True".

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia