[Python-ideas] Draft PEP for virtualenv in the stdlib

Carl Meyer carl at oddbird.net
Tue Oct 25 19:37:44 CEST 2011


Hi Nick,

Thanks for the feedback, replies below.

On 10/24/2011 04:11 PM, Nick Coghlan wrote:
> The repeated references to copying binaries and Python files
> throughout the PEP is annoying, and needs to be justified. Python 3.2+
> supports symlinking on Windows Vista and above as well as on *nix
> systems, so there needs to be a short section somewhere explaining why
> symlinks are not an adequate lightweight solution (pointing out the
> fact that creating symlinks on Windows often requires administrator
> privileges would be sufficient justification for me).

Do you mean pointing out why symlinks are not in themselves an adequate
solution for virtual Python environments, or why they aren't used in
this implementation?

I've updated the introductory paragraph you quoted to provide a bit more
detail on the first question.

I think the answer to the latter question is just that we were trying to
keep things simple and consistent, and make it work the same way on as
wide a variety of platforms as possible (e.g. Windows XP).

Also, in earlier discussions on distutils-sig some people considered it
a _feature_ to have the virtual environment's Python binary copied,
making the virtual environment more isolated from system changes.
Obviously, this is an area where programmers and sysadmins often don't
see eye to eye ;)

The technique in this PEP works just as well with a symlinked binary,
though, and I don't see much reason not to provide a symlink option.
Whether it is on by default where supported is something that may need
more discussion (I don't personally have a strong opinion either way).

The reason virtualenv copies rather than symlinks the binary has nothing
to do with a lack of symlink support in Python 2; it's because getpath.c
dereferences a symlinked binary when finding sys.prefix, so virtualenv's
isolation technique simply doesn't work at all with a symlinked binary.
Our version with the config file and Vinay's changes in getpath.c does
work with a symlinked binary.
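
To illustrate the dereferencing problem, here is a minimal Python sketch
(it is not the logic of getpath.c itself). It assumes a platform where
os.symlink works; the "system" and "venv" directory names and the
prefix_candidates helper are made up for the example.

```python
import os
import tempfile


def prefix_candidates(executable):
    """Return (dereferenced, as_invoked) prefixes for a bin/<python> path."""
    resolved = os.path.realpath(executable)    # follows symlinks
    as_invoked = os.path.abspath(executable)   # keeps the symlink's location

    def parent_of_bin(path):
        return os.path.dirname(os.path.dirname(path))

    return parent_of_bin(resolved), parent_of_bin(as_invoked)


def demo():
    """Build a fake system/venv pair and compare the two prefixes."""
    root = os.path.realpath(tempfile.mkdtemp())
    system_bin = os.path.join(root, "system", "bin")
    venv_bin = os.path.join(root, "venv", "bin")
    os.makedirs(system_bin)
    os.makedirs(venv_bin)

    real_python = os.path.join(system_bin, "python3")
    with open(real_python, "w") as f:
        f.write("")  # stand-in for the real binary

    link = os.path.join(venv_bin, "python3")
    os.symlink(real_python, link)
    return root, prefix_candidates(link)
```

The dereferenced prefix ends up under "system", so any isolation that
depends on the binary's own location is lost; the undereferenced one
stays under "venv".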

>> A virtual environment mechanism integrated with Python and drawing on
>> years of experience with existing third-party tools can be lower
>> maintenance, more reliable, and more easily available to all Python
>> users.
> 
> It can also take advantage of the native symlink support to minimise copying.

I don't think this is a significant enough difference to warrant mention
here. Existing virtualenv on Python 2 already can and does use symlinks
(for the bits of the stdlib it needs) on platforms that have os.symlink.

IOW, the details and supported platforms differ, but on both Python 2
and 3 the best you can do is try to use os.symlink where available, and
be prepared to fall back to a copy.
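
A minimal sketch of that "symlink where available, else copy" pattern,
assuming nothing beyond the standard library; the link_or_copy name is
hypothetical and not taken from virtualenv or the reference
implementation:

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Symlink dst -> src if possible, otherwise copy. Returns the method used."""
    symlink = getattr(os, "symlink", None)  # absent on some platforms/versions
    if symlink is not None:
        try:
            symlink(src, dst)
            return "symlink"
        except (OSError, NotImplementedError):
            # e.g. insufficient privileges or an unsupported filesystem
            pass
    shutil.copy2(src, dst)
    return "copy"
```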

> Currently, the PEP uses a mish-mash of 'env', 'venv', 'pyvenv' and
> 'pythonv' and 'site' to refer to different aspects of the proposed
> feature. I suggest standardising on 'venv' wherever the Python
> relationship is implied, and 'pyvenv' wherever the Python relationship
> needs to be made explicit.

Good point. We were already attempting to standardize just as you
suggest, but hadn't renamed "env.cfg", and I missed one remaining
instance of "pythonv" in the text (it's also used for legacy reasons in
the reference implementation bitbucket repo name, but that doesn't seem
worth changing).

> So I think the name of the configuration file should be "pyvenv.cfg"
> (tangent: 'setup' lost its Python connection when it went from
> 'setup.py' to 'setup.cfg'. I wish the latter has been called
> 'pysetup.cfg' instead)

This makes sense to me. I've updated the PEP accordingly and created an
issue to remind me to update the reference implementation as well.

> 'site' is *way* too overloaded already, let's not make it worse. I
> suggest "sys.venv_prefix".

My original thinking here was that sys.site_prefix is an attribute that
should always exist, and always point to "where stuff should be
installed to site-packages", whether or not you are in a venv (if you
are not, it would have the same value as sys.prefix). It's a little odd
to use an attribute named "sys.venv_prefix" in that way, even if your
code doesn't know or care whether it's actually in a venv (and in general
we should be encouraging code that doesn't know or care). (The attribute
doesn't currently always exist in the reference implementation, but I'd
like to change that.)
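
As a sketch of what "code that doesn't know or care" might look like:
the attribute name sys.site_prefix is the one proposed in this draft,
not something guaranteed to exist, so the example falls back to
sys.prefix when it is missing. The install_prefix helper is
hypothetical.

```python
import sys


def install_prefix():
    """Prefix under which site-packages installs should land."""
    # "site_prefix" is the draft's proposed attribute (an assumption here);
    # outside a venv it would simply equal sys.prefix.
    return getattr(sys, "site_prefix", sys.prefix)
```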

I agree that "site" is overloaded, though. Any ideas for a name that
doesn't further overload that term, but still communicates "this
attribute is a standard part of Python that always has the same meaning
whether or not you are currently in a venv"?

>>    env.cfg
>>    bin/python3
>>    lib/python3.3/site-packages/
> 
> The builtin virtual environment mechanism should be specified to
> symlink things by default, and only copy things if the user
> specifically requests it. 

To be clear, "things" here refers only to the Python binary itself. The
only other things that might be installed in a new environment are
scripts (e.g. pysetup3), and those must be created anew, neither
symlinked nor copied, as their shebang line needs to point to the venv's
Python (or, on Windows, they need more complicated chicanery with .exe
wrappers, unless we get PEP 397 in time).
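
A rough POSIX-oriented sketch of why such scripts are generated rather
than copied: the first line has to name the venv's own interpreter. The
write_script helper and its arguments are hypothetical, and the Windows
.exe wrapper case is not covered.

```python
import os
import stat
import tempfile


def write_script(bin_dir, name, venv_python, body):
    """Create an executable script whose shebang points at venv_python."""
    path = os.path.join(bin_dir, name)
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("#!" + venv_python + "\n")
        f.write(body)
    # Add the executable bits on top of whatever mode the file already has.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path
```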

> System administrators rightly fear the
> proliferation of multiple copies of binaries, since it can cause major
> hassles when it comes time to install security updates.

I think both options should be allowed, and I don't have a strong
feeling about the default.

>> Isolation from system site-packages
>> -----------------------------------
>>
>> In a virtual environment, the ``site`` module will normally still add
>> the system site directories to ``sys.path`` after the virtual
>> environment site directories.  Thus system-installed packages will
>> still be importable, but a package of the same name installed in the
>> virtual environment will take precedence.
>>
>> If the ``env.cfg`` file also contains a key ``include-system-site``
>> with a value of ``false`` (not case sensitive), the ``site`` module
>> will omit the system site directories entirely. This allows the
>> virtual environment to be entirely isolated from system site-packages.
> 
> "site" is ambiguous here - rather than abbreviating, I suggest making
> the option "include-system-site-packages".

Ok - updated the draft PEP, will update reference implementation.
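
A small sketch of how that option might be read, assuming the config
file consists of simple "key = value" lines (the file format is an
assumption here, as are the helper names):

```python
def read_cfg(text):
    """Parse 'key = value' lines into a dict; lines without '=' are ignored."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip().lower()] = value.strip()
    return settings


def include_system_site_packages(settings):
    """System site dirs are included unless the value is 'false' (any case)."""
    value = settings.get("include-system-site-packages", "true")
    return value.lower() != "false"
```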

> Also, why "Scripts" rather than "bin" on Windows? The Python binary
> isn't a script.

No, but in real-world usage, scripts from installed packages in the
virtualenv will be installed there. Putting the Python binary in the
same location as the destination for installed scripts is a pretty
important convenience, as it means you only need to add a single
directory to the beginning of your shell path to effectively "activate"
the venv.
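
In Python terms, that "activation" amounts to something like the
following sketch; the activated_env helper is hypothetical, and real
activation is normally done in the shell rather than from Python.

```python
import os


def activated_env(venv_bin, environ=None):
    """Return a copy of the environment with venv_bin first on PATH."""
    env = dict(os.environ if environ is None else environ)
    old_path = env.get("PATH", "")
    env["PATH"] = venv_bin + (os.pathsep + old_path if old_path else "")
    return env
```

The resulting mapping could be passed as env= to subprocess.run so that
both "python" and installed scripts resolve to the venv's copies.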

> I'm actually not seeing the rationale for the obfuscated FHS inspired
> layout in the first place - why not dump the binaries adjacent to the
> config file, with a simple "site-packages" directory immediately below
> that? If there are reasons for a more complex default layout, they
> need to be articulated in the PEP.

The historical reason is that it emulates the layout found under
sys.prefix in a regular Python installation (note that it's not actually
Linux-specific; in virtualenv it matches the appropriate platform, e.g.
on Windows it's "Lib\" rather than "lib/pythonX.X"). This was necessary
for virtualenv because it couldn't make changes in distutils/sysconfig.
I think there may be good reason to continue to follow this approach,
simply because it makes the necessary changes to distutils/sysconfig
less invasive, reducing the need for special-casing of the venv case.
But I do need to look into this a bit more and update the PEP with
further rationale in any case.
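
For concreteness, the platform-dependent layout described above can be
sketched as a small helper. The venv_layout function is hypothetical; it
only encodes the "bin" and "lib/pythonX.X" versus "Scripts" and "Lib"
distinction and does not consult distutils or sysconfig.

```python
from pathlib import PurePosixPath, PureWindowsPath


def venv_layout(root, version, windows):
    """Return (scripts_dir, site_packages_dir) for a venv rooted at root."""
    if windows:
        base = PureWindowsPath(root)
        return base / "Scripts", base / "Lib" / "site-packages"
    base = PurePosixPath(root)
    return base / "bin", base / "lib" / ("python" + version) / "site-packages"
```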

Regardless, I would not be in favor of dumping binaries directly next to
pyvenv.cfg. It feels cleaner to keep scripts and binaries in a directory
specifically named and intended for that purpose, which can be added to
the shell PATH.

I also think there is some value, all else being roughly equal, in
maintaining consistency with virtualenv's layout. This is not an
overriding concern, but it will make a big difference in how much
existing code that deals with virtual environments has to change.

> If the problem is wanting to allow cross platform computation of
> things like the site-packages directory location and other paths, then
> the answer to that seems to lie in better helper methods (whether in
> sysconfig, site, venv or elsewhere) rather than Linux specific layouts
> inside language level virtual environments.

>> The ``venv`` module will contain an ``EnvBuilder`` class which accepts
>> the following keyword arguments on instantiation::
>>
>>   * ``nosite`` - A Boolean value indicating that isolation of the
>>     environment from the system Python is required (defaults to
>>     ``False``).
> 
> Yikes, double negatives in APIs are bad news (especially when the
> corresponding config file option is expressed positively)
> 
> I suggest this parameter should be declared as "system_site_packages=True".

Fair enough, updated in draft PEP.

>>   * ``clear`` - A Boolean value which, if True, will delete any
>>     existing target directory instead of raising an exception
>>     (defaults to ``False``).
> 
> In line with my above comments, I think there should be a third
> parameter here declared as "use_symlinks=True".
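
For comparison, the venv module that later shipped in the standard
library has an EnvBuilder taking similar keyword arguments, though names
and defaults differ from this draft (for instance, the symlink option is
spelled "symlinks" and system_site_packages defaults to False). The
sketch below uses that released API rather than the draft's; the
build_env wrapper is hypothetical.

```python
import os
import tempfile
import venv


def build_env(target):
    """Create an isolated environment at target and return its pyvenv.cfg text."""
    builder = venv.EnvBuilder(
        system_site_packages=False,   # isolate from system site-packages
        clear=True,                   # wipe an existing target directory
        symlinks=(os.name != "nt"),   # symlink the binary where that is routine
        with_pip=False,               # keep the example free of ensurepip
    )
    builder.create(target)
    with open(os.path.join(target, "pyvenv.cfg"), encoding="utf-8") as f:
        return f.read()
```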

Thanks again for the review!

The updated draft is available on Bitbucket [1], as are the open issues
for the reference implementation (which should reflect outstanding
differences from the draft PEP) [2].

Carl

[1] https://bitbucket.org/carljm/pythonv-pep/src
[2] https://bitbucket.org/vinay.sajip/pythonv/issues?status=new&status=open


