So, coming back to the idea of a working environment, an isolated and more-or-less self-contained environment for holding installed packages. Sorry if this is a little scattered. I'm just summarizing my thoughts and the open issues I see, in no particular order (because I'm not sure what order to approach these things in).

I'm assuming such an environment will be encapsulated in a single directory, looking something like:

  env/
    bin/
    lib/python2.4/
    src/
    conf/

The conf/ directory doesn't really relate to much of this specifically, but in many situations it would be useful. Depending on the situation, other subdirectories may exist.

Each of the scripts in bin/ should know what their working environment is. This is slightly tricky, depending on what that means. If it is a totally isolated environment -- no site-packages on sys.path -- then I feel like the script wrappers have to be shell scripts, to invoke Python with -S (which is hard to do portably on the #! line). I don't know the details of doing the same thing on Windows, but I assume it is possible. The actual directory location should be portable -- all paths should be relative, and you should be able to move the directory around.

lib/python2.4/ is for packages. I'm almost inclined to say that --single-version-externally-managed makes sense on some level, with a record kept in some standard place (lib/python2.4/install-record.txt?) -- but I'm basically indifferent. I at least don't see a need for multiple simultaneous versions in this setup, and multiple versions do lead to confusion. Version metadata is still nice, of course.

src/ is for checkouts, where each package is installed with setup.py develop. These are naturally single-version, which is part of why I like the idea of only using single-version setups. I'm a little unsure of how src/ should be laid out. In practice I want all "my" packages to be installed in src/ as checkouts, either from tags or the trunk (or a branch or whatever). So I'm not sure if I should name the subdirectories after the package, or maybe even the package plus a tag name.

One of the things SwitchTower (now "Capistrano", I think) does in Rails land is it makes a dated checkout, then activates that checkout (it does it with a symlink, we'd do it with setup.py develop). It then rolls back by switching to an existing checkout. Of course svn switch + svn up does this in place, and with less checkout trash lying around, even if rollbacks aren't as fast as a result. So, I'm thinking just src/PackageName/.

There's an installation issue there -- it would be nice if I could say "these are the packages I want to install as editable" and easy_install would pick those up (maybe detecting based on what package index the package was found in) and install them in src/ as editable.

sys.path would contain /usr/lib/python2.4, optionally /usr/lib/python2.4/site-packages, and env/lib/python2.4/, and all the similar directories. Unfortunately figuring out what "similar" directories there are is hard. sys.path on my machine now has 63 entries normally and 12 with python -S. I guess I'd really like to start with 12 and build up, instead of starting with 63 and trying to strip them down.

Installation as a whole is an open issue. Putting in env/setup.cfg with the settings specific to that working environment works to a degree -- easy_install will pick it up if invoked from there. But that doesn't work with setup.py develop, or setup.py install, or some other scenarios. The system distutils.cfg doesn't really work, because the only expansion it knows how to do is of user directories, so there's little way to pass interesting information in (like a "this is my setup.cfg" environment variable or something). Maybe with PYTHONPATH to indicate the working environment, and a distutils monkeypatch put into lib/python2.4/distutils/__init__.py?

I played around with putting the path setup in sitecustomize, but that runs after site.py, and doesn't run at all if python -S is used, so it seems like it brings in too much before it can remove stuff.

Another option is a completely new python interpreter bound to the environment. Basically the virtual-python.py option (http://peak.telecommunity.com/DevCenter/EasyInstall#creating-a-virtual-pytho...). In this model using env/bin/python indicates the proper environment, and you'd have local installs of *everything* including easy_install. This fixes so many problems without crazy hacks that it strongly appeals to me, especially if we can make it somewhat lighter. I get this /usr/lib/python2.4.zip on my path, that doesn't usually exist (does it ever get created by default?); if we could create that somehow on demand and use such a big-bundle-zip, that seems lighter and faster and nicer. If we just put .pyc files in it, and those .pyc files refer back to the actual module source (in /usr/lib/python2.4/), then tracebacks should also still work, right? No actual symlinks either, so it should work on Windows.

I'm not entirely sure where I'm going with this, though. Sorry for the length. I've been stewing on this without a lot of progress since PyCon so I thought I'd just throw out my current thoughts. Maybe what I really want to do is hack on virtual-python.py some more and see where that gets me.

-- 
Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
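As a rough illustration of the "scripts in bin/ know their working environment" and "all paths should be relative" points, here is a minimal sketch in present-day Python. It assumes the env/bin/ and env/lib/pythonX.Y/ layout described above; the helper names (find_env_root, env_lib_dir, activate) are hypothetical and are not part of setuptools or working-env.py.

```python
import os
import sys


def find_env_root(script_path):
    """Return the environment root, assuming the script lives in <env>/bin/."""
    bin_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.dirname(bin_dir)


def env_lib_dir(env_root, version_info=sys.version_info):
    """Return <env>/lib/pythonX.Y for the given (major, minor, ...) version."""
    name = "python%d.%d" % (version_info[0], version_info[1])
    return os.path.join(env_root, "lib", name)


def activate(script_path, path=None):
    """Put the environment's lib directory first on the path list (sys.path by default)."""
    if path is None:
        path = sys.path
    lib_dir = env_lib_dir(find_env_root(script_path))
    if lib_dir not in path:
        path.insert(0, lib_dir)
    return lib_dir
```

Because the root is derived from the script's own location, moving the env/ directory does not require rewriting anything. This does not address the harder isolation question (running without site-packages), which still depends on how the interpreter is started.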
Ian Bicking <ianb@colorstudy.com> writes:
Another option is a completely new python interpreter bound to the environment. Basically the virtual-python.py option (http://peak.telecommunity.com/DevCenter/EasyInstall#creating-a-virtual-pytho...). In this model using env/bin/python indicates the proper environment, and you'd have local installs of *everything* including easy_install. This fixes so many problems without crazy hacks that it strongly appeals to me, especially if we can make it somewhat lighter. I get this /usr/lib/python2.4.zip on my path, that doesn't usually exist (does it ever get created by default?); if we could create that somehow on demand and use such a big-bundle-zip, that seems lighter and faster and nicer. If we just put .pyc files in it, and those .pyc files refer back to the actual module source (in /usr/lib/python2.4/), then tracebacks should also still work, right? No actual symlinks either, so it should work on Windows. I'm not entirely sure where I'm going with this, though.
Just a thought... This sounds like a clean and powerful approach. What if, for example, you work on a machine with python 2.4 installed, but you want an environment with python 2.5? With what you're proposing here you could go totally heavyweight and have the whole python distribution installed in your working environment only. Or distros could have a python-dev type package which installs in a location other than their official python installation, so that you can still do your more lightweight thing.

Reminds me of a database 'installation' with separate 'instances' of it on the same machine, each with its own data. I'm actually thinking that this is useful especially in the case where you're working on a distribution which is tightly bound to an older version of python, but you want to develop on a newer version.

-i
On Wednesday 08 March 2006 18:15, Iwan Vosloo wrote:
Just a thought... This sounds like a clean and powerful approach. What if, for example, you work on a machine with python 2.4 installed, but you want an environment with python 2.5? With what you're proposing here you could go totally heavyweight and have the whole python distribution installed in your working environment only.
Ick. I'm sure I'm not the only one who remembers telling Perl's CPAN module to auto-build all dependencies, only to come back and find it building a new version of Perl to install in a local directory. I really don't like this idea at all.

I should point out that I don't mind the working environment idea, just not the idea of a version of Python being built in there. You can already install multiple versions of Python with "make altinstall" (or your distro's packaging system).

Anthony
-- 
Anthony Baxter <anthony@interlink.com.au>
It's never too late to have a happy childhood.
Ian Bicking wrote:
So, coming back to the idea of a working environment, an isolated and more-or-less self-contained environment for holding installed packages. Sorry if this is a little scattered. I'm just summarizing my thoughts and the open issues I see, in no particular order (because I'm not sure what order to approach these things in).
I'm assuming such an environment will be encapsulated in a single directory, looking something like:
  env/
    bin/
    lib/python2.4/
    src/
    conf/
The conf/ directory doesn't really relate to much of this specifically, but in many situations it would be useful. Depending on the situation, other subdirectories may exist.
OK, I'll pretend that it doesn't exist. ;) Actually, I'll have a bit to say about configuration below.
Each of the scripts in bin/ should know what their working environment is. This is slightly tricky, depending on what that means. If it is a totally isolated environment -- no site-packages on sys.path -- then I feel like the script wrappers have to be shell scripts, to invoke Python with -S (which is hard to do portably on the #! line).
I'll note that this isn't important to me at all. I'm not opposed to allowing it, but I don't need it. I never use the python that comes with my system. I always build Python from source and I can therefore keep that install clean, complete, and relatively predictable. (I am often annoyed by Python's *implicit* decision of which extension modules to install. Hopefully, someday, the presence of a packaging system will enable saner approaches to optional extensions.)

In fact, I usually have multiple versions of Python available. I then will have many working environments that use these custom Pythons. (The custom Pythons are not part of any working environment.) Similarly, in production environments, we install custom builds of Python that our applications use. ...
lib/python2.4/ is for packages.
Minor note: this needs to be flexible. I'd be more inclined to go with something shallower and simpler, like just "lib".
I'm almost inclined to say that --single-version-externally-managed makes sense on some level, with a record kept in some standard place (lib/python2.4/install-record.txt?) -- but I'm basically indifferent. I at least don't see a need for multiple simultaneous versions in this setup, and multiple versions do lead to confusion. Version metadata is still nice, of course.
Increasingly, projects I work on involve multiple Python applications with each application requiring different sets of packages and, sometimes, different package versions. Basically, I want a single version for each application, but multiple versions in the environment. What appeals to me is a compromise between the single-version and multi-version install.

With a single-version install, you end up with an easy-install.pth that names the packages (and package versions) being used. This has the advantage of being deterministic. You always get the packages listed there. This affects all applications running in the environment.

In a multi-version install, multiple versions are stored in the environment. Different applications can use different versions. When an application is run, a root package is specified in a script (the package containing the entry point) and pkg_resources tries to find a suitable combination of packages that satisfy the closure of the root package's dependencies. This process is somewhat non-deterministic. Installation of a new package could change the set of packages used beyond that individual package, causing the implicit and unintended "upgrades" that you referred to in your talk at PyCon.

What I want (or think I want :) is per-application (per-script) .pth files. When I install a script, I want the packages needed by the script to be determined at that time and written to a script-specific .pth file. So, for example, if I install a script "foo" (foo.exe, whatever), I also get foo.pth and have the generated foo script prepend the contents of that file to sys.path on startup. I don't mind if that file is required to be in the same directory as the script. I also wouldn't mind if the path entries were simply embedded in the script.

If I want to upgrade the packages being used, I simply update the scripts. There are various ways that this could be made easy. Even better, the upgrade process could optionally make backups of the .pth files or scripts, allowing easy rollbacks of upgrades.

I think that this approach combines the benefits of single- and multi-version install. Different applications in an environment can require different package versions. The choice of packages for an application is determined at discrete explicit install times, rather than at run time. ...
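The per-script .pth idea can be sketched roughly as follows. This is only an illustration of what a generated "foo" script might do on startup, assuming a plain one-entry-per-line foo.pth sitting next to the script; the function names are hypothetical and this is not how easy_install currently generates scripts.

```python
import os
import sys


def read_script_pth(pth_file):
    """Return the non-empty, non-comment lines of a script-specific .pth file."""
    entries = []
    with open(pth_file) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                entries.append(line)
    return entries


def prepend_script_path(script_file, path=None):
    """Prepend the entries from <script>.pth (next to the script) to the path list."""
    if path is None:
        path = sys.path
    # foo or foo.exe both map to foo.pth in the same directory.
    pth_file = os.path.splitext(script_file)[0] + ".pth"
    entries = read_script_pth(pth_file)
    # Slice assignment keeps the frozen entries first, in file order.
    path[0:0] = entries
    return entries
```

With this shape, an upgrade amounts to rewriting foo.pth, and a rollback amounts to restoring a backup copy of it.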
Installation as a whole is an open issue. Putting in env/setup.cfg with the setting specific to that working environment works to a degree -- easy_install will pick it up if invoked from there. But that doesn't work with setup.py develop, or setup.py install, or some other scenarios.
I don't follow this. It seems to work for us, at least for setup.py develop. The main lamosity is depending on the current working directory.

Personally, I'd like (at least optionally) the generated easy_install script to include information about the environment that was present when it was installed. That is, I'd be happy to see setup.cfg consulted when easy_install is installed and then ignored thereafter. If you wanted to change the environment definition, then simply update setup.cfg and reinstall easy_install. Or, better yet, when installing easy_install, have the generated script remember the full path of the setup.cfg file used and reread that file when run, regardless of where easy_install is run from.

This brings me to the topic of configuration. Today, I write wrapper scripts "by hand". I may have some application like Zope, or ZEO, or our test runner that is implemented by an entry point in a module. Then there's a wrapper script that imports the module and calls the entry point. The wrapper script is written (manually or with some custom installation script) to include the path to be used and configuration data, which may be the location of a configuration file. I really like the fact that easy_install will generate wrapper scripts for me, but I really need more control over how these scripts are generated to include *both* path and configuration information. ...
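To make the "path and configuration baked into the wrapper" idea concrete, here is a minimal sketch of a generator for such a wrapper. The template, the make_wrapper function, and the convention that the entry point takes the configuration file location as its only argument are all assumptions made for illustration; they are not easy_install's actual script template.

```python
WRAPPER_TEMPLATE = """\
#!%(python)s
import sys
# Path entries frozen when the script was installed.
sys.path[0:0] = %(path)r
import %(module)s
# Configuration location recorded at install time.
sys.exit(%(module)s.%(func)s(%(config)r))
"""


def make_wrapper(python, path, module, func, config):
    """Return the text of a wrapper script with path and config baked in."""
    return WRAPPER_TEMPLATE % {
        "python": python,
        "path": list(path),
        "module": module,
        "func": func,
        "config": config,
    }
```

For example, make_wrapper("/opt/python/bin/python", ["/env/lib/A.egg"], "myapp.main", "run", "/env/conf/myapp.conf") yields a script that prepends the frozen path and then calls myapp.main.run with the recorded configuration location (all of these names are placeholders).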
Another option is a completely new python interpreter bound to the environment. Basically the virtual-python.py option (http://peak.telecommunity.com/DevCenter/EasyInstall#creating-a-virtual-pytho...). In this model using env/bin/python indicates the proper environment, and you'd have local installs of *everything* including easy_install. This fixes so many problems without crazy hacks that it strongly appeals to me, especially if we can make it somewhat lighter. I get this /usr/lib/python2.4.zip on my path, that doesn't usually exist (does it ever get created by default?); if we could create that somehow on demand and use such a big-bundle-zip, that seems lighter and faster and nicer. If we just put .pyc files in it, and those .pyc files refer back to the actual module source (in /usr/lib/python2.4/), then tracebacks should also still work, right? No actual symlinks either, so it should work on Windows. I'm not entirely sure where I'm going with this, though.
I think that something much simpler can be made to work. I think a little more control over how scripts get generated would go a long way. (There is still the question of the interactive interpreter...) Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
Jim Fulton wrote:
Each of the scripts in bin/ should know what their working environment is. This is slightly tricky, depending on what that means. If it is a totally isolated environment -- no site-packages on sys.path -- then I feel like the script wrappers have to be shell scripts, to invoke Python with -S (which is hard to do portably on the #! line).
I'll note that this isn't important to me at all. I'm not opposed to allowing it, but I don't need it.
I can go both ways. I think this should be configurable when you set up the working environment, hopefully someplace where it is easy to change.
lib/python2.4/ is for packages.
Minor note: this needs to be flexible. I'd be more inclined to go with something shallower and simpler, like just "lib".
Why? Top-level packages aren't portable, since .pyc files aren't portable. Eggs are portable, since they contain the Python version.
I'm almost inclined to say that --single-version-externally-managed makes sense on some level, with a record kept in some standard place (lib/python2.4/install-record.txt?) -- but I'm basically indifferent. I at least don't see a need for multiple simultaneous versions in this setup, and multiple versions do lead to confusion. Version metadata is still nice, of course.
Increasingly, projects I work on involve multiple Python applications with each application requiring different sets of packages and, sometimes, different package versions. Basically, I want a single version for each application, but multiple versions in the environment. What appeals to me is a compromise between the single-version and multi-version install.
It's probably not important to get rid of an existing feature of setuptools. The problems I have are better served with better tool support. Mostly I get a huge number of old and expired eggs sitting around, and there needs to be a collection process, probably a process that is run on every installation. I guess the collection would start from all the packages listed in .pth files, and maybe a collection of scripts, and then anything those packages require. And anything left over is garbage. In part I'm personally moving to using setup.py develop installs for more software, and that's more naturally single-version (though I suppose it doesn't have to be).
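The collection process described here is essentially mark-and-sweep over the dependency graph. A minimal sketch, assuming the installed eggs, the roots (names found in .pth files or needed by scripts), and the requirements have already been gathered into plain Python data; the function name and the data shapes are hypothetical, and reading real egg metadata is left out.

```python
def collect_garbage(installed, roots, requires):
    """Return the installed distributions not reachable from the roots.

    installed: set of distribution names present in the environment
    roots:     names listed in .pth files or needed by installed scripts
    requires:  dict mapping a name to the names it depends on
    """
    live = set()
    stack = [name for name in roots if name in installed]
    while stack:
        name = stack.pop()
        if name in live:
            continue
        live.add(name)  # mark
        for dep in requires.get(name, ()):
            if dep in installed and dep not in live:
                stack.append(dep)
    return installed - live  # sweep: whatever was never marked
```

Anything returned is a candidate for removal; a cautious tool would probably report it first rather than delete it outright.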
Installation as a whole is an open issue. Putting in env/setup.cfg with the setting specific to that working environment works to a degree -- easy_install will pick it up if invoked from there. But that doesn't work with setup.py develop, or setup.py install, or some other scenarios.
I don't follow this. It seems to work for us, at least for setup.py develop. The main lamosity is depending on the current working directory.
I don't want users to have to give particular options to manage the installation. I want them to activate a specific environment, and for everything to just work. working-env.py mostly works like this.
This brings me to the topic of configuration. Today, I write wrapper scripts "by hand". I may have some application like Zope, or ZEO, or our test runner that is implemented by an entry point in a module. Then there's a wrapper script that imports the module and calls the entry point. The wrapper script is written (manually or with some custom installation script) to include the path to be used and configuration data, which may be the location of a configuration file. I really like the fact that easy_install will generate wrapper scripts for me, but I really need more control over how these scripts are generated to include *both* path and configuration information.
I'm not sure what to think of this. I don't think of it as a script. It's like a specific invocation of the script. A shell script. Maybe we can improve on shell scripts, but I think it's a different idea than the script alone. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org
Ian Bicking wrote:
Jim Fulton wrote:
...
lib/python2.4/ is for packages.
Minor note: this needs to be flexible. I'd be more inclined to go with something shallower and simpler, like just "lib".
Why? Top-level packages aren't portable, since .pyc files aren't portable. Eggs are portable, since they contain the Python version.
I have no idea what you are saying or how it relates to whether or not packages go in lib/python2.4 or lib. ...
This brings me to the topic of configuration. Today, I write wrapper scripts "by hand". I may have some application like Zope, or ZEO, or our test runner that is implemented by an entry point in a module. Then there's a wrapper script that imports the module and calls the entry point. The wrapper script is written (manually or with some custom installation script) to include the path to be used and configuration data, which may be the location of a configuration file. I really like the fact that easy_install will generate wrapper scripts for me, but I really need more control over how these scripts are generated to include *both* path and configuration information.
I'm not sure what to think of this. I don't think of it as a script. It's like a specific invocation of the script. A shell script. Maybe we can improve on shell scripts, but I think it's a different idea than the script alone.
What "it" are you talking about? Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
Jim Fulton wrote:
Ian Bicking wrote:
Jim Fulton wrote:
...
lib/python2.4/ is for packages.
Minor note: this needs to be flexible. I'd be more inclined to go with something shallower and simpler, like just "lib".
Why? Top-level packages aren't portable, since .pyc files aren't portable. Eggs are portable, since they contain the Python version.
I have no idea what you are saying or how it relates to whether or not packages go in lib/python2.4 or lib.
lib/foo/__init__.pyc is a file that is specific to a version of Python. lib/python2.4/foo/__init__.pyc removes any possibility of conflict. Though I suppose it is arguable that a working environment should only support one major version of Python.
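The version-specific nature of .pyc files can be checked directly: a compiled file starts with a four-byte magic number that changes between Python releases. A minimal sketch using present-day Python, where the running interpreter's value is exposed as importlib.util.MAGIC_NUMBER; the helper name is hypothetical.

```python
import importlib.util


def pyc_matches_interpreter(pyc_bytes):
    """True if the first four bytes of a .pyc match this interpreter's magic number."""
    return pyc_bytes[:4] == importlib.util.MAGIC_NUMBER
```

A file that fails this check was compiled by a different Python version, which is the kind of conflict a per-version directory such as lib/python2.4/ avoids.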
This brings me to the topic of configuration. Today, I write wrapper scripts "by hand". I may have some application like Zope, or ZEO, or our test runner that is implemented by an entry point in a module. Then there's a wrapper script that imports the module and calls the entry point. The wrapper script is written (manually or with some custom installation script) to include the path to be used and configuration data, which may be the location of a configuration file. I really like the fact that easy_install will generate wrapper scripts for me, but I really need more control over how these scripts are generated to include *both* path and configuration information.
I'm not sure what to think of this. I don't think of it as a script. It's like a specific invocation of the script. A shell script. Maybe we can improve on shell scripts, but I think it's a different idea than the script alone.
What "it" are you talking about?
This script+config invocation. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org
Ian Bicking wrote:
Jim Fulton wrote:
Ian Bicking wrote:
Jim Fulton wrote:
...
lib/python2.4/ is for packages.
Minor note: this needs to be flexible. I'd be more inclined to go with something shallower and simpler, like just "lib".
Why? Top-level packages aren't portable, since .pyc files aren't portable. Eggs are portable, since they contain the Python version.
I have no idea what you are saying or how it relates to whether or not packages go in lib/python2.4 or lib.
lib/foo/__init__.pyc is a file that is specific to a version of Python. lib/python2.4/foo/__init__.pyc removes any possibility of conflict. Though I suppose it is arguable that a working environment should only support one major version of Python.
Yup, at least most of the time.
This brings me to the topic of configuration. Today, I write wrapper scripts "by hand". I may have some application like Zope, or ZEO, or our test runner that is implemented by an entry point in a module. Then there's a wrapper script that imports the module and calls the entry point. The wrapper script is written (manually or with some custom installation script) to include the path to be used and configuration data, which may be the location of a configuration file. I really like the fact that easy_install will generate wrapper scripts for me, but I really need more control over how these scripts are generated to include *both* path and configuration information.
I'm not sure what to think of this. I don't think of it as a script. It's like a specific invocation of the script. A shell script. Maybe we can improve on shell scripts, but I think it's a different idea than the script alone.
What "it" are you talking about?
This script+config invocation.
OK. Do you realize that for us, these are already combined? That is, if you install Zope and make an instance, the scripts that are installed have paths and configuration baked into them. For us, these are just installed scripts. There's nothing particularly exciting going on here. I want the same thing with the scripts generated by easy_install (or something like it that we build, if necessary). Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
Jim Fulton wrote:
I'm not sure what to think of this. I don't think of it as a script. It's like a specific invocation of the script. A shell script. Maybe we can improve on shell scripts, but I think it's a different idea than the script alone.
What "it" are you talking about?
This script+config invocation.
OK. Do you realize that for us, these are already combined? That is, if you install Zope and make an instance, the scripts that are installed have paths and configuration baked into them. For us, these are just installed scripts. There's nothing particularly exciting going on here. I want the same thing with the scripts generated by easy_install (or something like it that we build, if necessary).
Well, I can certainly see that it would be useful to give the script access to the location of the working environment, and from there it can load default configuration files. That the script can detect intelligent default locations for files -- entirely reasonable -- isn't the same as coding the configuration file in the script. That might be okay too, but config+script isn't the same as a script, it's something else. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org
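The distinction drawn here, a script that merely knows where its environment is versus a script with a specific configuration file coded into it, might look like this in practice. A minimal sketch: the conf/<script>.conf naming is an assumption made for illustration (the thread only establishes that a conf/ directory exists), and default_config is a hypothetical helper.

```python
import os


def default_config(env_root, script_name, override=None):
    """Pick a config file: an explicit override wins, else <env>/conf/<script>.conf."""
    if override is not None:
        return override
    return os.path.join(env_root, "conf", script_name + ".conf")
```

The script itself carries no configuration. It derives a default from the environment location and still accepts an explicit choice, for example from a command-line option.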
Ian Bicking wrote:
Jim Fulton wrote:
I'm not sure what to think of this. I don't think of it as a script. It's like a specific invocation of the script. A shell script. Maybe we can improve on shell scripts, but I think it's a different idea than the script alone.
What "it" are you talking about?
This script+config invocation.
OK. Do you realize that for us, these are already combined? That is, if you install Zope and make an instance, the scripts that are installed have paths and configuration baked into them. For us, these are just installed scripts. There's nothing particularly exciting going on here. I want the same thing with the scripts generated by easy_install (or something like it that we build, if necessary).
Well, I can certainly see that it would be useful to give the script access to the location of the working environment, and from there it can load default configuration files.
That could be a start.
That the script can detect intelligent default locations for files -- entirely reasonable --
But probably too implicit, especially as a general approach.
isn't the same as coding the configuration file in the script. That might be okay too, but config+script isn't the same as a script, it's something else.
OK, we disagree. People encode this sort of information in scripts now. Saying it is something else doesn't make it so. If no one else is interested in this, we'll try to figure something out and share what we've learned. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org
At 09:56 AM 3/12/2006 -0500, Jim Fulton wrote:
OK, we disagree. People encode this sort of information in scripts now. Saying it is something else doesn't make it so. If no one else is interested in this, we'll try to figure something out and share what we've learned.
FWIW, I'm probably going to generalize script writing to be entry-point based. Right now, the various entry point groups for scripts are hard-coded, but I want to make it extensible. That way, you'll be able to effectively have custom install-time hooks, within certain restrictions that I haven't yet figured out. (Apart from obvious restrictions like not being interactive, not writing stuff to arbitrary locations, etc.) The complex thing about doing this is that when an egg's scripts are written, its dependencies may not yet be installed. (Especially in the system package manager case.) So, I may have to refactor the whole installation process to queue scripts for installation only at the end of installing all dependencies. Not-so-coincidentally, a similar refactoring may be needed to implement path freezing for scripts. :) (Because until the dependencies are installed, you don't know what paths to freeze.)
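The refactoring described above, queueing scripts until every dependency is installed so that paths can be frozen, can be sketched with a toy installer. This is only an illustration of the ordering problem under simplified assumptions (names instead of real distributions, a dict instead of real requirement metadata); the Installer class is hypothetical and does not reflect setuptools internals.

```python
class Installer:
    """Toy installer that defers script generation until all eggs are installed."""

    def __init__(self, requires):
        self.requires = requires    # name -> list of dependency names
        self.installed = []         # distributions, in install order
        self.pending_scripts = []   # distributions whose scripts are queued
        self.scripts = {}           # name -> path frozen into its script

    def install(self, name):
        """Install a distribution, then its dependencies; only queue its scripts."""
        if name in self.installed:
            return
        self.installed.append(name)
        self.pending_scripts.append(name)
        for dep in self.requires.get(name, ()):
            self.install(dep)

    def closure(self, name):
        """Return name plus everything it transitively requires, in discovery order."""
        seen = []
        stack = [name]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.append(current)
            # Reverse so dependencies are visited in their declared order.
            stack.extend(reversed(self.requires.get(current, ())))
        return seen

    def finish(self):
        """Write the queued scripts now that every dependency is known."""
        for name in self.pending_scripts:
            self.scripts[name] = self.closure(name)
        self.pending_scripts = []
```

Nothing is frozen until finish() runs, which mirrors the point that the paths to freeze are unknown until the dependencies are in place.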
Jim Fulton wrote:
isn't the same as coding the configuration file in the script. That might be okay too, but config+script isn't the same as a script, it's something else.
OK, we disagree. People encode this sort of information in scripts now. Saying it is something else doesn't make it so. If no one else is interested in this, we'll try to figure something out and share what we've learned.
Maybe I should clarify what I mean. The script that setuptools generates, as enumerated in the setup.py file, isn't something that would naturally be bound to any configuration file. There's not a one-to-one mapping between those scripts, an installation, and a single configuration file. That *a* script might be bound to a configuration file, sure. Such a binding might also include any number of environment setups; the typical items in a shell script. Unless you are thinking about a different kind of configuration than I am. I suppose configuration like an enumeration of activated plugins would seem to fit more closely with a working environment. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org
participants (5)
- Anthony Baxter
- Ian Bicking
- Iwan Vosloo
- Jim Fulton
- Phillip J. Eby