Proposal for virtualenv functionality in Python
This is a proto-proposal for including some functionality from virtualenv in Python itself. I'm not entirely confident about what I'm proposing, so it's not really PEP-ready, but I wanted to get feedback...

First, a bit about how virtualenv works (this will use Linux conventions; Windows and some Mac installations are slightly different):

* Let's say you are creating an environment in ~/env/
* /usr/bin/python is *copied* to ~/env/bin/python
* This alone sets sys.prefix to ~/env/ (via existing code in Python)
* At this point things are broken because the standard library is not available
* virtualenv creates ~/env/lib/pythonX.Y/site.py, which adds the system standard library location (/usr/lib/pythonX.Y) to sys.path
* site.py itself requires several modules to work, and each of these modules (from a pre-determined list of modules) is symlinked over from the standard library into ~/env/lib/pythonX.Y/
* site.py may or may not add /usr/lib/pythonX.Y/site-packages to sys.path
* *Any* time you use ~/env/bin/python you'll get sys.prefix of ~/env/, and the appropriate path. No environment variable is required.
* No compiler is used; this is a fairly light tool

There are some tweaks to this that could be made, but I believe virtualenv basically does things The Right Way. By setting sys.prefix, All Tools Work (there are some virtualenv alternatives that do isolation without setting sys.prefix, but they typically break more often than virtualenv, or only support a limited number of workflows). Also, by using a distinct interpreter (~/env/bin/python) it works fairly consistently and reliably compared to techniques like an environment variable. The one serious alternative is what buildout (and virtualenv --relocatable) does, which is to use the system Python and change the path at the beginning of all scripts (it requires its own installer to accomplish this consistently).

But virtualenv is kind of a hack, and I believe with a little support from Python this could be avoided.
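To make the path manipulation in the virtualenv steps above concrete, here is a minimal sketch of the kind of work the generated site.py does. It is not virtualenv's actual code: the function names and the `real_prefix` parameter are hypothetical, and the layout assumes the Linux conventions used in this message.

```python
import os
import sys


def system_stdlib_dir(real_prefix, version_info=sys.version_info):
    """Return the stdlib directory under the real prefix,
    e.g. /usr/lib/python2.6 for real_prefix='/usr' and version (2, 6)."""
    version = "%d.%d" % (version_info[0], version_info[1])
    return os.path.join(real_prefix, "lib", "python" + version)


def extended_path(path, real_prefix, include_site_packages=False,
                  version_info=sys.version_info):
    """Return a copy of `path` with the system stdlib (and optionally the
    system site-packages) appended, skipping entries already present."""
    stdlib = system_stdlib_dir(real_prefix, version_info)
    extra = [stdlib]
    if include_site_packages:
        # The "may or may not" step: global site-packages is opt-in here.
        extra.append(os.path.join(stdlib, "site-packages"))
    result = list(path)
    for entry in extra:
        if entry not in result:
            result.append(entry)
    return result
```

A site.py along these lines would assign the result back to sys.path at startup; the environment's own directories stay first, so packages installed in ~/env/ shadow the system ones.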
virtualenv can continue to exist to support the equivalent workflows on earlier versions of Python, but it would not exist (or would become much, much simpler) on future Python versions.

The specific parts of virtualenv that are a hack that I would like to replace with built-in functionality:

* I'd rather ~/env/bin/python be a symlink instead of copying it.
* I'd rather not copy (or symlink) *any* of the standard library.
* I'd rather site.py support this functionality natively (and in turn that OS packagers support this when they make other modifications)
* Compiling extensions can be tricky because code may not find headers (because they are installed in /usr, not ~/env/). I think this can be handled better if virtualenv is slightly less intrusive, or distutils is patched, or generally tools are more aware of this layout.
* This gets more complicated with a Mac framework build of Python, and hopefully those hacks could go away too.

I am not sure what the best way to do this is, but I will offer at least one suggestion (other suggestions welcome):

In my (proto-)proposal, a new binary pythonv is created. This is slightly like pythonw.exe, which provides a Python interpreter on Windows which doesn't open a new window. This binary is primarily for creating new environments. It doesn't even need to be on $PATH, so it would be largely invisible to people unless they use it. If you symlink pythonv to a new location, it will affect sys.prefix (currently sys.prefix is calculated after dereferencing the symlink). Additionally, the binary will look for a configuration file. I'm not sure where this file should go; perhaps directly alongside the binary, or in some location based on sys.prefix.
The configuration file would be a simple set of assignments; some I might imagine:

* Maybe override sys.prefix
* Control if the global site-packages is placed on sys.path
* On some operating systems there are other locations for packages installed with the system packager; probably these should be possible to enable or disable
* Maybe control installations or point to a file like distutils.cfg

I got some feedback from the Debian/Ubuntu maintainer that he would like functionality that might be like this; for instance, if you have /usr/bin/python2.6 and /usr/bin/python2.6-dbg, he'd like them to work slightly differently (e.g., /usr/bin/python2.6-dbg would look in a different place for libraries). So the configuration file location should be based on sys.prefix *and* the name of the binary itself (e.g., /usr/lib/python2.6/python-config-dbg.conf). I have no strong opinion on the location of the file itself, only that it can be specific to the directory and name of the interpreter.

In addition to all this, I think sys would grow another prefixy value, e.g., sys.build_prefix, that points to the place where Python was actually built (virtualenv calls this sys.real_prefix, but that's not a very good name). Some code, especially in distutils, might need to be aware of this to compile extensions properly (we can be somewhat aware of these cases by looking at places where virtualenv already has problems compiling extensions).

Some people have argued for something like sys.prefixes, a list of locations you might look at, which would allow a kind of nesting of these environments (where sys.prefixes[-1] == sys.prefix; or maybe reversed). Personally this seems like it would be hard to keep mental track of this, but I can understand the purpose -- you could for instance create a kind of template prefix that has *most* of what you want installed in it, then create sub-environments that contain for instance an actual application, or a checkout (to test just one new piece of code).
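As an illustration of the configuration-file idea described above, the sketch below derives a file location from the prefix plus the binary name and parses simple assignments. Everything here is an assumption made for the example: the message fixes neither the file name scheme (beyond the python-config-dbg.conf example), nor the syntax, nor any setting names.

```python
import os


def config_path(prefix, executable, version="2.6"):
    """Guess a per-interpreter config path from the prefix and binary name.

    With prefix='/usr' and executable='/usr/bin/python2.6-dbg' this gives
    /usr/lib/python2.6/python-config-dbg.conf (the example in the message).
    The plain 'python-config.conf' fallback is an assumption.
    """
    name = os.path.basename(executable)      # e.g. "python2.6-dbg"
    base = "python" + version                # e.g. "python2.6"
    suffix = name[len(base):] if name.startswith(base) else ""
    return os.path.join(prefix, "lib", base, "python-config%s.conf" % suffix)


def parse_assignments(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments.
    Values are kept as strings; the key names are not defined anywhere."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not an assignment: %r" % line)
        settings[key.strip()] = value.strip()
    return settings
```

For example, `parse_assignments("prefix = /home/user/env\ninclude_site_packages = false")` yields a two-entry dict of strings; both key names are made up for the illustration.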
I'm not sure how this should best work on Windows (without symlinks, and where things generally work differently), but I would hope if this idea is more visible that someone more opinionated than I would propose the appropriate analog on Windows.

--
Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking
This sounds like a great idea (especially since I proposed something a little bit like it in yesterday's language summit :-). I have to admit I cannot remember what uses are made of sys.prefix; it would be good to explicitly enumerate these in the PEP when you write it.

Regarding the Windows question, does virtualenv work on Windows? If so, its approach might be adopted. If not, maybe we shouldn't care (in the first version anyway)?

--Guido

On Fri, Feb 19, 2010 at 1:49 PM, Ian Bicking <ianb@colorstudy.com> wrote:

--
--Guido van Rossum (python.org/~guido)
On Fri, 19 Feb 2010 13:49:23 -0500, Ian Bicking wrote:
* I'd rather ~/env/bin/python be a symlink instead of copying it.
How about simply adding a --prefix argument to the interpreter? Then virtualenv can create a "python" script that simply adds --prefix and forwards all the arguments to the real python executable. Or am I missing something?
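A minimal sketch of such a wrapper, to show how little it would need to do. Note that --prefix is the option being proposed in this message, not an existing interpreter flag, and the function name is hypothetical.

```python
def wrapper_command(real_python, prefix, args):
    """Build the argv for the real interpreter: the proposed (not yet
    existing) --prefix option first, then the caller's arguments unchanged."""
    return [real_python, "--prefix", prefix] + list(args)


# A generated ~/env/bin/python script would have the two paths baked in
# when the environment is created, and would then replace itself with the
# real interpreter, for instance:
#   command = wrapper_command("/usr/bin/python", "/home/user/env", sys.argv[1:])
#   os.execv(command[0], command)
```

One consequence of this design is that sys.executable would point at the real interpreter unless something corrects it, which matters for tools that re-invoke Python.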
Ian Bicking wrote:
I have a tool that also creates a virtualized Python environment, but doesn't solve the problem as thoroughly as virtualenv. I limp along by tweaking PYTHONUSERBASE and PYTHONPATH. I'm very interested in seeing something like this make it into Python. A few inline comments:
* I'd rather ~/env/bin/python be a symlink instead of copying it.
The thread discussing Windows suggests that we shouldn't use symlinks there. I'd say either copying or symlinking pythonv should be supported, and on Windows we recommend copying pythonv.exe.
Conversely, headers may be installed in ~/env and not /usr. The compiler should probably look in both places. But IIUC telling the compiler how to do that is only vaguely standardized--Microsoft's CL.EXE doesn't seem to support any environment variable containing an include /path/. I suspect solving this in a general way is out-of-band for pythonv, but I'm willing to have my mind changed. Certainly pythonv should add its prefix directory to LD_LIBRARY_PATH on Linux.
I'm unexcited by this; I think simpler is better. pythonv should virtualize environments layered on top of python, and should have one obvious predictable behavior. Certainly if it supports a configuration file pythonv should run without it and pick sensible defaults. What are the use cases where you need these things to be configurable?

Let me propose further about python and pythonv:

* As Antoine suggested, the CPython interpreter should sprout a new command-line switch, "--prefix", which adds a new prefix directory.
* pythonv's purpose in life is to infer your prefix directory and run "pythonX.X --prefix <prefixdir> [ all args it got ... ]".
* Should pythonv be tied to the specific Python executable? If you install pythonv as "python", should it look for "python" or explicitly look for the specific Python it shipped with, like "python3.2"? I suspect the latter though I'm no longer sure.

I'm one of those folks who'd like to see this be stackable. If we tweak the semantics just a bit I think it works:

* pythonv should inspect its --prefix arguments, as well as passing them on to the child python process it runs.
* When pythonv wants to run the next python process in line, it scans the path looking for the pythonX.X interpreter but /ignores/ all the interpreters that are in a --prefix bin directory it's already seen.
* python handles multiple --prefix options, and later ones take precedence over earlier ones.
* What should sys.interpreter be? Explicit is better than implicit: the first pythonv to run also adds a --interpreter <argv[0]> to the front of the command-line. Or they could all add it and python only uses the last one. This is one area where "python" vs "python3.2" makes things a little complicated.

I'm at PyCon and would be interested in debating / sprinting on this if there's interest.

/larry/
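The stacking rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: --prefix is the proposed switch, both function names are hypothetical, the argument scan naively looks at the whole command line, and a real implementation would also check that the candidate is executable.

```python
import os


def collect_prefixes(argv):
    """Return the value of every '--prefix <dir>' pair in argv, in order."""
    prefixes = []
    i = 0
    while i < len(argv):
        if argv[i] == "--prefix" and i + 1 < len(argv):
            prefixes.append(argv[i + 1])
            i += 2
        else:
            i += 1
    return prefixes


def next_interpreter(name, path_dirs, seen_prefixes):
    """Return the first file called `name` found in `path_dirs` that is not
    in the bin directory of an already-seen prefix, or None if none is left."""
    skip = set(
        os.path.normpath(os.path.join(prefix, "bin"))
        for prefix in seen_prefixes
    )
    for directory in path_dirs:
        if os.path.normpath(directory) in skip:
            continue  # this is a pythonv from an environment already stacked
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Each pythonv in the chain would call `next_interpreter` with the prefixes gathered by `collect_prefixes` plus its own, so the search always moves further down the path and ends at the real interpreter.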
On Sun, Feb 21, 2010 at 12:32 PM, Larry Hastings <larry@hastings.org> wrote:
Sure, on Windows this is clearly the case. I'm not sure if it's worth supporting elsewhere. One problem with copying is that (a) you don't know where it is copied from (requiring extra information somewhere) and (b) if there is a minor Python release then things break (because you've copied an old interpreter). Probably there should be a check to catch (b) and print an appropriate (helpful) error.
Yes, it might be possible to change distutils to be aware of this, and some things will work okay as a result. Some things will really require changes to the problematic project's setup.py to support this.
* Override sys.prefix: allow you to put the binary in someplace other than, say, ~/env/bin/python and still support an environment in ~/env/. Also the use case of looking for libraries in a location based on the interpreter name (not the containing directory), like supporting /usr/bin/python2.7 and /usr/bin/python2.7-dbg.
* Control global site-packages: people use this all the time with virtualenv.
* Other locations: well, since Ubuntu/Debian are using dist-packages and whatnot, to get *full* isolation you might want to avoid this. This is really handy when testing setup instructions.
* Control installations: right now distutils only really looks in /usr/lib/pythonX.Y/distutils/distutils.cfg for settings. virtualenv monkeypatches distutils to look in <sys.prefix>/lib/pythonX.Y/distutils/distutils.cfg in addition, and several people use this feature to control virtualenv-local installation.
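For the last use case, the lookup could be as simple as the sketch below. The function name is hypothetical, and the order (system file first, environment file second) is an assumption; the message does not say which of the two should win when both define the same setting.

```python
import os


def distutils_cfg_candidates(prefix, system_prefix, version):
    """List the distutils.cfg locations to consult: the system-wide file,
    then the environment-local one when the two prefixes differ."""
    def cfg(base):
        return os.path.join(base, "lib", "python" + version,
                            "distutils", "distutils.cfg")

    candidates = [cfg(system_prefix)]
    if os.path.normpath(prefix) != os.path.normpath(system_prefix):
        candidates.append(cfg(prefix))
    return candidates
```

With built-in support, distutils could read every existing file in this list instead of relying on a monkeypatch.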
OK; or at least, it seems fine that this would be equivalent.
* pythonv's purpose in life is to infer your prefix directory and run "pythonX.X --prefix <prefixdir> [ all args it got ... ]".
I don't see any reason to call the other Python binary, it might as well just act like it was changed. sys.executable *must* point to the originally called interpreter anyway.
Experience shows the latter, plus this would only really make sense if you really called the other interpreter (which I guess you could if you also added an --executable option or something to fix sys.executable).

If you did that, then maybe it would be possible to do with PEP 3147 (http://www.python.org/dev/peps/pep-3147/), since that makes it more feasible to support multiple Python versions with a single set of installed libraries. PEP 3147 is important here (as is backporting it to 2.7). I have to think about it a bit... but maybe with this it would be possible to move these environments around without breaking things. That would be compelling.
With a config file I'd just expect a list of prefixes being allowed; directly nesting feels unnecessarily awkward. You could use a : (or Windows-semicolon) list just like with PYTHONPATH.
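Parsing such a setting would be straightforward; a small sketch follows, with a hypothetical function name. The standard library's os.pathsep already provides the platform separator (":" on POSIX, ";" on Windows), the same one used for PYTHONPATH.

```python
import os


def parse_prefixes(value, sep=os.pathsep):
    """Split a PYTHONPATH-style string into a list of prefixes,
    dropping empty entries such as those left by a trailing separator."""
    return [item for item in value.split(sep) if item]
```

Which end of the resulting list takes precedence is left open here, just as the ordering of sys.prefixes is left open earlier in the thread.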
Ah, yes, the same problem I note above. It should definitely be the thing the person actually typed, or what is in the #! line.
I'm at PyCon and would be interested in debating / sprinting on this if there's interest.
Yeah, if you see me around, please catch me!

--
Ian Bicking | http://blog.ianbicking.org | http://twitter.com/ianbicking
Ian Bicking wrote:
I'm new to this: why would you want to change sys.prefix in the first place? Its documentation implies that it's where Python itself is installed. I see two uses in the standard library (trace and gettext) and they both look like they'd get confused if sys.prefix pointed at a virtualized directory.
Okey-doke, I defer to your experience. Obviously if this is going into Python we can do better than monkeypatching distutils.
If by this you mean pythonv should load the Python shared library / DLL directly, that would make it impossible to stack environments. Which I'm still angling for. /larry/
Larry Hastings:
The INCLUDE environment variable is a list of ';' separated paths:
http://msdn.microsoft.com/en-us/library/36k2cdd4%28VS.100%29.aspx

Neil
participants (6)
- Antoine Pitrou
- Guido van Rossum
- Ian Bicking
- Larry Hastings
- Neil Hodgson
- Philip Jenvey