[Python-ideas] Draft PEP for virtualenv in the stdlib

Nick Coghlan ncoghlan at gmail.com
Tue Oct 25 00:11:30 CEST 2011


On Tue, Oct 25, 2011 at 4:21 AM, Carl Meyer <carl at oddbird.net> wrote:
> Existing virtual environment tools suffer from lack of support from
> the behavior of Python itself.  Tools such as `rvirtualenv`_, which do
> not copy the Python binary into the virtual environment, cannot
> provide reliable isolation from system site directories.  Virtualenv,
> which does copy the Python binary, is forced to duplicate much of
> Python's ``site`` module and manually copy an ever-changing set of
> standard-library modules into the virtual environment in order to
> perform a delicate boot-strapping dance at every startup. The
> ``PYTHONHOME`` environment variable, Python's only existing built-in
> solution for virtual environments, requires copying the entire
> standard library into every environment; not a lightweight solution.

The repeated references to copying binaries and Python files
throughout the PEP are annoying, and need to be justified. Python 3.2+
supports symlinking on Windows Vista and above as well as on *nix
systems, so there needs to be a short section somewhere explaining why
symlinks are not an adequate lightweight solution (pointing out the
fact that creating symlinks on Windows often requires administrator
privileges would be sufficient justification for me).

(Obviously, 3rd party virtual environment solutions generally *do*
have to copy things around, since Python 2.x doesn't expose symlink
support on Windows at all)
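
To make the symlink-versus-copy trade-off concrete, here is a minimal
sketch of "symlink if possible, otherwise copy". The helper name
``link_or_copy`` and its ``use_symlinks`` parameter are hypothetical,
made up for illustration; only ``os.symlink`` and ``shutil.copy2`` are
real standard-library calls.

```python
import os
import shutil


def link_or_copy(src, dst, use_symlinks=True):
    """Place src at dst, preferring a symlink. Returns 'symlink' or 'copy'.

    Hypothetical helper: falls back to copying when the symlink cannot
    be created (for example, missing privileges on Windows, or a
    filesystem without symlink support).
    """
    if use_symlinks:
        try:
            os.symlink(src, dst)
            return "symlink"
        except (OSError, NotImplementedError):
            # Symlink creation failed; fall through to a plain copy.
            pass
    shutil.copy2(src, dst)
    return "copy"
```

The return value lets the caller report which strategy was actually
used, which matters to anyone tracking how many real copies of a
binary exist on a machine.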

> A virtual environment mechanism integrated with Python and drawing on
> years of experience with existing third-party tools can be lower
> maintenance, more reliable, and more easily available to all Python
> users.

It can also take advantage of the native symlink support to minimise copying.

> Specification
> =============
>
> When the Python binary is executed, it attempts to determine its
> prefix (which it stores in ``sys.prefix``), which is then used to find
> the standard library and other key files, and by the ``site`` module
> to determine the location of the site-package directories.  Currently
> the prefix is found (assuming ``PYTHONHOME`` is not set) by first
> walking up the filesystem tree looking for a marker file (``os.py``)
> that signifies the presence of the standard library, and if none is
> found, falling back to the build-time prefix hardcoded in the binary.
>
> This PEP proposes to add a new first step to this search.  If an
> ``env.cfg`` file is found either adjacent to the Python executable, or
> one directory above it, this file is scanned for lines of the form
> ``key = value``. If a ``home`` key is found, this signifies that the
> Python binary belongs to a virtual environment, and the value of the
> ``home`` key is the directory containing the Python executable used to
> create this virtual environment.

Currently, the PEP uses a mish-mash of 'env', 'venv', 'pyvenv',
'pythonv' and 'site' to refer to different aspects of the proposed
feature. I suggest standardising on 'venv' wherever the Python
relationship is implied, and 'pyvenv' wherever the Python relationship
needs to be made explicit.

So I think the name of the configuration file should be "pyvenv.cfg"
(tangent: 'setup' lost its Python connection when it went from
'setup.py' to 'setup.cfg'. I wish the latter had been called
'pysetup.cfg' instead).
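
The lookup the PEP describes (check next to the executable, then one
directory above, and scan for ``key = value`` lines) can be sketched
roughly as follows. The function names are hypothetical, the file name
defaults to the "pyvenv.cfg" suggested here rather than the draft's
``env.cfg``, and silently skipping lines without an ``=`` is an
assumption, since the draft does not say how such lines are handled.

```python
import os


def read_venv_config(text):
    """Parse 'key = value' lines into a dict.

    Assumption: lines without '=' are ignored rather than rejected.
    """
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


def find_venv_config(executable, filename="pyvenv.cfg"):
    """Return the config path adjacent to the executable or one level up.

    Returns None when neither location has the file, in which case
    prefix-finding would continue as it does today.
    """
    exe_dir = os.path.dirname(os.path.abspath(executable))
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        candidate = os.path.join(candidate_dir, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A ``home`` key in the resulting dict would then mark the binary as
belonging to a virtual environment.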

> In this case, prefix-finding continues as normal using the value of
> the ``home`` key as the effective Python binary location, which
> results in ``sys.prefix`` being set to the system installation prefix,
> while ``sys.site_prefix`` is set to the directory containing
> ``env.cfg``.
>
> (If ``env.cfg`` is not found or does not contain the ``home`` key,
> prefix-finding continues normally, and ``sys.site_prefix`` will be
> equal to ``sys.prefix``.)

'site' is *way* too overloaded already, let's not make it worse. I
suggest "sys.venv_prefix".

> The ``site`` and ``sysconfig`` standard-library modules are modified
> such that site-package directories ("purelib" and "platlib", in
> ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while
> other directories (the standard library, include files) are still
> found relative to ``sys.prefix``.
>
> Thus, a Python virtual environment in its simplest form would consist
> of nothing more than a copy of the Python binary accompanied by an
> ``env.cfg`` file and a site-packages directory.  Since the ``env.cfg``
> file can be located one directory above the executable, a typical
> virtual environment layout, mimicking a system install layout, might
> be::
>
>    env.cfg
>    bin/python3
>    lib/python3.3/site-packages/

The builtin virtual environment mechanism should be specified to
symlink things by default, and only copy things if the user
specifically requests it. System administrators rightly fear the
proliferation of multiple copies of binaries, since it can cause major
hassles when it comes time to install security updates.

> Isolation from system site-packages
> -----------------------------------
>
> In a virtual environment, the ``site`` module will normally still add
> the system site directories to ``sys.path`` after the virtual
> environment site directories.  Thus system-installed packages will
> still be importable, but a package of the same name installed in the
> virtual environment will take precedence.
>
> If the ``env.cfg`` file also contains a key ``include-system-site``
> with a value of ``false`` (not case sensitive), the ``site`` module
> will omit the system site directories entirely. This allows the
> virtual environment to be entirely isolated from system site-packages.

"site" is ambiguous here - rather than abbreviating, I suggest making
the option "include-system-site-packages".
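
As a rough sketch of the behaviour being described, assuming the
longer option name: venv site directories come first, and the system
ones are appended unless the option is ``false`` in any letter case.
The function name and its arguments are hypothetical; this is not how
the ``site`` module is actually structured.

```python
def build_site_dirs(venv_site_dirs, system_site_dirs, config):
    """Return site directories in the order they would join sys.path.

    `config` is a dict of already-parsed 'key = value' settings.
    A missing key means the system directories are included.
    """
    value = config.get("include-system-site-packages", "true")
    include_system = value.strip().lower() != "false"
    paths = list(venv_site_dirs)  # venv entries take precedence
    if include_system:
        paths.extend(system_site_dirs)
    return paths
```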

> Creating virtual environments
> -----------------------------
>
> This PEP also proposes adding a new ``venv`` module to the standard
> library which implements the creation of virtual environments.  This
> module would typically be executed using the ``-m`` flag::
>
>    python3 -m venv /path/to/new/virtual/environment
>
> Running this command creates the target directory (creating any parent
> directories that don't exist already) and places an ``env.cfg`` file
> in it with a ``home`` key pointing to the Python installation the
> command was run from.  It also creates a ``bin/`` (or ``Scripts`` on
> Windows) subdirectory containing a copy of the ``python3`` executable,
> and the ``pysetup3`` script from the ``packaging`` standard library
> module (to facilitate easy installation of packages from PyPI into the
> new virtualenv).  And it creates an (initially empty)
> ``lib/pythonX.Y/site-packages`` subdirectory.

As noted above, those should be symlinks rather than copies, with
copying behaviour explicitly requested via a command line option.

Also, why "Scripts" rather than "bin" on Windows? The Python binary
isn't a script.

I'm actually not seeing the rationale for the obfuscated FHS-inspired
layout in the first place - why not dump the binaries adjacent to the
config file, with a simple "site-packages" directory immediately below
that? If there are reasons for a more complex default layout, they
need to be articulated in the PEP.

If the problem is wanting to allow cross-platform computation of
things like the site-packages directory location and other paths, then
the answer to that seems to lie in better helper methods (whether in
sysconfig, site, venv or elsewhere) rather than Linux-specific layouts
inside language-level virtual environments.
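
For illustration only, one shape such a helper could take is a thin
wrapper over ``sysconfig.get_path``, which already accepts a ``vars``
override. The wrapper name ``venv_site_packages`` is hypothetical, and
the exact subdirectory it returns depends on the install scheme of the
Python running it.

```python
import sysconfig


def venv_site_packages(venv_prefix):
    """Return the 'purelib' directory computed relative to venv_prefix.

    Hypothetical helper: overrides the 'base' and 'platbase' variables
    so the platform's own install scheme decides the layout below the
    given prefix.
    """
    return sysconfig.get_path(
        "purelib", vars={"base": venv_prefix, "platbase": venv_prefix}
    )
```

Callers would ask for the path rather than hard-coding
``lib/pythonX.Y/site-packages``, so the on-disk layout could change
without breaking tools.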

> The ``venv`` module will contain an ``EnvBuilder`` class which accepts
> the following keyword arguments on instantiation::
>
>   * ``nosite`` - A Boolean value indicating that isolation of the
>     environment from the system Python is required (defaults to
>     ``False``).

Yikes, double negatives in APIs are bad news (especially when the
corresponding config file option is expressed positively)

I suggest this parameter should be declared as "system_site_packages=True".

>   * ``clear`` - A Boolean value which, if True, will delete any
>     existing target directory instead of raising an exception
>     (defaults to ``False``).

In line with my above comments, I think there should be a third
parameter here declared as "use_symlinks=True".
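
Putting the suggested parameter names together, a constructor along
these lines is what I have in mind. This is a hypothetical sketch, not
the PEP's API: the class name, the ``create`` method and the choice of
``FileExistsError`` are assumptions, and placing the binary is left
out, so ``use_symlinks`` is only recorded here.

```python
import os
import shutil
import sys


class EnvBuilderSketch:
    """Hypothetical builder using positively-phrased keyword arguments."""

    def __init__(self, system_site_packages=True, clear=False,
                 use_symlinks=True):
        self.system_site_packages = system_site_packages
        self.clear = clear
        self.use_symlinks = use_symlinks  # recorded only; unused below

    def create(self, env_dir):
        """Create env_dir and write its config file; return env_dir."""
        if os.path.exists(env_dir):
            if not self.clear:
                raise FileExistsError(env_dir)
            shutil.rmtree(env_dir)
        os.makedirs(env_dir)  # also creates missing parent directories
        home = os.path.dirname(os.path.abspath(sys.executable))
        flag = "true" if self.system_site_packages else "false"
        with open(os.path.join(env_dir, "pyvenv.cfg"), "w") as f:
            f.write("home = %s\n" % home)
            f.write("include-system-site-packages = %s\n" % flag)
        return env_dir
```

Note that the keyword argument and the config file option now read the
same way, so nobody has to mentally invert a flag when moving between
the two.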

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


