setuptools: package management and explicit version numbers
Has there been any progress on some of the package management tools people were talking about? I.e., list and delete packages? (At the moment no other features come to mind, but I'm definitely seeing a purpose for these functions) Someone said they were going to give it a go, but I can't remember who.
Another option that I think would be useful is to modify requirements in a package to make version numbers explicit. So I might have something in development and it requires a bunch of packages. It just requires the Most Recent version of those packages; I haven't tested anything but that, so I don't know if older versions would work, and I can't know if newer versions will work.
When I get ready to release, I want to be conservative. Which means that I know that the exact versions I am using work well, but nothing else; I don't know that past or future versions of the packages I depend on will work. So I'd like to change my requirements so that they specify the exact versions I'm using at the time of release.
As I think about it, it sounds pretty easy really. I just find all the requirements, see what version is installed, and rewrite requires.txt with those versions. At first I thought it should be a package management function, but should this be part of setuptools? A new command, or maybe an option to egg_info? It seems like it fits into egg_info well. This means requirements should be specified only in requires.txt, not in setup().
There's some other features I can think of specifically using subversion together with setuptools, but maybe I'll mull on those a bit longer. Nothing fancy, but codifying a specific pattern, something like:
    easy_install.py -e -b dev package_name
    cd dev/package_name
    # I would like "in-development" packages to have some specific version
    # but I don't know what that version should be...?
    sudo python setup.py develop -m
    # time passes, development is done, release is ready...
    python setup.py svntag 0.5
    # changes PKG-INFO with new version, copies trunk to tags/0.5, does an
    # svn switch
huh... maybe I can add such a command with an entry point?
-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
At 01:00 PM 8/10/2005 -0500, Ian Bicking wrote:
Has there been any progress on some of the package management tools people were talking about? I.e., list and delete packages? (At the moment no other features come to mind, but I'm definitely seeing a purpose for these functions) Someone said they were going to give it a go, but I can't remember who.
It was Paul Moore. I recently had a look at Ruby's "gems" system, which is surprisingly similar to setuptools and Python Eggs, except that they also have the list and delete functions, functions to verify signatures and check installation integrity, and a command to run a package index server off the information in the gems you have installed! I'm thinking that in the long run (i.e. after I get the pkg_resources manual finished), I'll make a 'nest' script along the lines of their 'gem' program, and use that as the place for adding the list/delete/upgrade stuff. One interesting benefit to the server concept is that you could use it to create a browser-based interface to browse through eggs' contents. I suppose you could also include options to install/upgrade/delete eggs as well, at least if you have an appropriate way to restrict access. It would let us implement a cross-platform GUI for setuptools. Of course, I doubt I'd bundle it directly with setuptools. Instead, I'd just define an entry point for the server command in the setuptools egg, with an "extra" set to download the egg with the actual server functionality. In fact, everything except the "nest" script itself and the "install" command (which would be an alias for easy_install, such that 'nest install foo' == 'easy_install foo') could be in separate "nest eggs". :)
Another option that I think would be useful is to modify requirements in a package to make version numbers explicit. So I might have something in development and it requires a bunch of packages. It just requires the Most Recent version of those packages; I haven't tested anything but that, so I don't know if older versions would work, and I can't know if newer versions will work.
When I get ready to release, I want to be conservative. Which means that I know that the exact versions I am using work well, but nothing else; I don't know that past or future versions of the packages I depend on will work. So I'd like to change my requirements so that they specify the exact versions I'm using at the time of release.
As I think about it, it sounds pretty easy really. I just find all the requirements, see what version is installed, and rewrite requires.txt with those versions. At first I thought it should be a package management function, but should this be part of setuptools? A new command, or maybe an option to egg_info? It seems like it fits into egg_info well. This means requirements should be specified only in requires.txt, not in setup().
Ugh. Well, you could do that, but you're going to get a warning about requires.txt being defined but no requirements specified in setup(). It probably makes more sense to have a function that takes a set of requirement strings and returns new requirement strings, and put that in setup().
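A minimal sketch of the kind of function suggested here: it takes requirement strings and returns new ones pinned to exact versions. The name `pin_requirements` and the `installed` mapping (lower-cased project name to version string) are hypothetical; how that mapping gets filled in from the installed packages is deliberately left open, and the name parsing is a simplification rather than setuptools' own parser.

```python
import re

# Matches the project name and optional extras at the start of a requirement
# string such as "Foo", "Foo>=1.0" or "Foo[bar]>=1.0,<2.0".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(\[[^\]]*\])?")


def pin_requirements(requirements, installed):
    """Return new requirement strings pinned to the installed versions.

    `installed` is a hypothetical mapping of lower-cased project name to
    version string; how it gets populated is left open here.
    """
    pinned = []
    for req in requirements:
        match = _NAME_RE.match(req)
        if match is None:
            raise ValueError("cannot parse requirement: %r" % req)
        name, extras = match.group(1), match.group(2) or ""
        version = installed.get(name.lower())
        if version is None:
            # Not installed: leave the requirement as the author wrote it.
            pinned.append(req)
        else:
            pinned.append("%s%s==%s" % (name, extras, version))
    return pinned
```

In a setup script this might be used as `setup(..., install_requires=pin_requirements(["Foo", "Bar>=1.0"], installed))`, so the loose requirements stay in the source and the pinning happens only at release time.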
There's some other features I can think of specifically using subversion together with setuptools, but maybe I'll mull on those a bit longer. Nothing fancy, but codifying a specific pattern, something like:
    easy_install.py -e -b dev package_name
    cd dev/package_name
    # I would like "in-development" packages to have some specific version
    # but I don't know what that version should be...?
    sudo python setup.py develop -m
    # time passes, development is done, release is ready...
    python setup.py svntag 0.5
    # changes PKG-INFO with new version, copies trunk to tags/0.5, does an
    # svn switch
huh... maybe I can add such a command with an entry point?
You can add whatever commands you want with entry points. Further, you can define aliases (per project, user, or Python installation) that group related commands and options into a single virtual command, using the 'setup.py alias' command.
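As a rough illustration of what the proposed `svntag` step would have to do, here is a sketch that only builds the Subversion command lines for copying trunk to a tag and switching to it. The function name `svntag_commands`, the `repo_url` parameter, and the `trunk`/`tags` layout are assumptions for illustration; updating PKG-INFO and registering this as a real setup.py command through an entry point are not shown.

```python
def svntag_commands(repo_url, version, message=None):
    """Build the svn invocations for tagging a release.

    Assumes the conventional layout with `trunk` and `tags` directories
    under `repo_url`; nothing is executed here.
    """
    base = repo_url.rstrip("/")
    tag_url = "%s/tags/%s" % (base, version)
    if message is None:
        message = "Tagging release %s" % version
    return [
        # Server-side copy of trunk to the new tag.
        ["svn", "copy", base + "/trunk", tag_url, "-m", message],
        # Point the working copy at the tag.
        ["svn", "switch", tag_url],
    ]
```

Returning the argument lists instead of running them keeps the sketch easy to inspect; a real command could pass each list to `subprocess.check_call`.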
On 8/10/05, Phillip J. Eby wrote:
At 01:00 PM 8/10/2005 -0500, Ian Bicking wrote:
Has there been any progress on some of the package management tools people were talking about? I.e., list and delete packages? (At the moment no other features come to mind, but I'm definitely seeing a purpose for these functions) Someone said they were going to give it a go, but I can't remember who.
It was Paul Moore.
It was indeed. I haven't forgotten, but a number of things have kept me from making progress yet. The first is the usual lack of time, but that's not a limiting factor here, really. I've also been waiting for the reorganisation of the API that Phillip promised, but again that's not a real problem - I can use CVS if I need to.
My main issue is one of perspective, I guess. I'm not a package maintainer, so much of the setuptools stuff (resource APIs, entry points etc, etc) is not directly relevant to me. I'm a package *user*, and so my viewpoint is of eggs as an alternative to Windows bdist_wininst installers.
Hence my interest in management tools - installers come with a number of benefits "free", as part of the Windows installer architecture - you get an uninstall "command", you can see a listing of installed packages, and you can see the version numbers of those packages. Until I can do the same with eggs, they are a step backwards for me. So that's why I want to build these features.
OK, so given that I have to write some code to bring eggs onto a par with installers, where's the benefit of eggs, *to me*, which would make me want to switch to them in the first place? And that's where I get stalled. I can see lots of benefits to eggs - for example Trac's new feature allowing you to just drop an egg containing a plugin into a predefined directory is really exciting - but they don't apply to my simple requirement to install packages like cx_Oracle, pywin32, ctypes, Cheetah, into my standard Python installation.
So - a question for Phillip. Do you see eggs supplanting bdist_wininst installers on Windows as the standard way of distributing Python packages? If so, what do you see as the benefit to the end user, which would prompt that switch? *Not* to the package developer, please note - ultimately a developer can just as easily run python setup.py bdist_wininst as python setup.py bdist_egg, and I doubt that users will switch enthusiastically to using eggs simply because developers switch to only offering them... (Switch maybe, but I doubt the enthusiasm :-))
This is more than just a philosophical issue, in that a lot of the egg infrastructure (development installs, for example) don't seem to me to be relevant to the end user scenario. So, from my perspective, I don't understand the imperatives which make them useful, and consequently I can't do a good job of designing list/uninstall commands around them. And I have no need for the facilities, so no incentive to cater for them (beyond the desire to do a good job...)
As I say, I still intend to do this, but I'm waiting to get a clearer picture of how eggs fit into the end-user environment (as opposed to the developer environment) before I do so.
Ian - how this fits in with your requirements, I'm not sure. I really don't follow your comments at all well, as I don't have the relevant background (at the level I develop stuff, the prerequisite version compatibility issues you describe just don't come into it - I only distribute to a very limited set of targets, all of which are closely controlled in terms of what is installed, so I have the luxury of expecting things to "just work"). It's on my list of things to do, to go back over the various recent postings and try to understand the developer perspective better. But I still view the end-user side of things as the first priority.
Hope this helps, Paul.
At 11:26 PM 8/10/2005 +0100, Paul Moore wrote:
Hence my interest in management tools - installers come with a number of benefits "free", as part of the Windows installer architecture - you get an uninstall "command", you can see a listing of installed packages, and you can see the version numbers of those packages. Until I can do the same with eggs, they are a step backwards for me. So that's why I want to build these features.
OK, so given that I have to write some code to bring eggs onto a par with installers, where's the benefit of eggs, *to me*, which would make me want to switch to them in the first place? And that's where I get stalled. I can see lots of benefits to eggs - for example Trac's new feature allowing you to just drop an egg containing a plugin into a predefined directory is really exciting - but they don't apply to my simple requirement to install packages like cx_Oracle, pywin32, ctypes, Cheetah, into my standard Python installation.
So - a question for Phillip. Do you see eggs supplanting bdist_wininst installers on Windows as the standard way of distributing Python packages?
Guardedly, I'd say yes -- once the tools mature, I think they'll become a popular distribution mechanism. (OTOH, EasyInstall already converts bdist_wininst installers to almost-perfect eggs, so this is sort of moot.) However, the aim is somewhat more at providing a better way for application developers to ship applications, plugins, and needed libraries.
If so, what do you see as the benefit to the end user, which would prompt that switch? *Not* to the package developer, please note
Better applications and packages, because with eggs you can specify dependencies. This means that people will develop smaller more reusable packages rather than reinventing wheels, and the overall ecosystem will improve in software quality. Yeah, I know that's not what you meant, but ultimately it is the real benefit to doing all this. :)
- ultimately a developer can just as easily run python setup.py bdist_wininst as python setup.py bdist_egg, and I doubt that users will switch enthusiastically to using eggs simply because developers switch to only offering them... (Switch maybe, but I doubt the enthusiasm :-))
This is more than just a philosophical issue, in that a lot of the egg infrastructure (development installs, for example) don't seem to me to be relevant to the end user scenario.
Who is an "end user" here? To me, if you're installing stuff into your Python installation, you're obviously a developer. Eggs are especially meant to help install stuff for applications whose users don't even know what Python *is* - i.e., "real" end-users in my book. :) Now, if by "end user" you mean a developer who isn't distributing anything of their own, then yeah, having some egg management tools besides Windows Explorer might be nice. :) For that matter, even if you are doing more sophisticated things with eggs, better management tools would be nice.
So, from my perspective, I don't understand the imperatives which make them useful,
If you look at it from the POV of a non-distributing developer using Windows, then the main benefit is going to be having multiple versions of something installed at the same time. A second benefit is dependency resolution via EasyInstall; i.e. that you can get all the packages something needs to work, without having to track them down one by one. Future additional benefits may include signature checking and the like (as soon as we come up with a signature format).
Paul Moore wrote:
On 8/10/05, Phillip J. Eby wrote:
At 01:00 PM 8/10/2005 -0500, Ian Bicking wrote:
Has there been any progress on some of the package management tools people were talking about? I.e., list and delete packages? (At the moment no other features come to mind, but I'm definitely seeing a purpose for these functions) Someone said they were going to give it a go, but I can't remember who.
It was Paul Moore.
It was indeed. I haven't forgotten, but a number of things have kept me from making progress yet. The first is the usual lack of time, but that's not a limiting factor here, really. I've also been waiting for the reorganisation of the API that Phillip promised, but again that's not a real problem - I can use CVS if I need to.
My main issue is one of perspective, I guess. I'm not a package maintainer, so much of the setuptools stuff (resource APIs, entry points etc, etc) is not directly relevant to me. I'm a package *user*, and so my viewpoint is of eggs as an alternative to Windows bdist_wininst installers.
Hence my interest in management tools - installers come with a number of benefits "free", as part of the Windows installer architecture - you get an uninstall "command", you can see a listing of installed packages, and you can see the version numbers of those packages. Until I can do the same with eggs, they are a step backwards for me. So that's why I want to build these features.
I suspect the management tool has two parts -- one is extending pkg_resources so these operations are easy (list, delete). Then you want to integrate that into Windows' installer architecture -- registering the packages and uninstallers with Windows, maybe making a GUI frontend (though you could almost get away without it). But since I'm not using Windows, the frontend I imagine is obviously very different. My frontend is probably really easy, but I have no idea what a GUI frontend would entail.
OK, so given that I have to write some code to bring eggs onto a par with installers, where's the benefit of eggs, *to me*, which would make me want to switch to them in the first place? And that's where I get stalled. I can see lots of benefits to eggs - for example Trac's new feature allowing you to just drop an egg containing a plugin into a predefined directory is really exciting - but they don't apply to my simple requirement to install packages like cx_Oracle, pywin32, ctypes, Cheetah, into my standard Python installation.
I think eggs are sometimes a distraction when considering features. Or at least they are for me. I know there's some eggs produced when I install stuff with easy_install, but I don't really touch them or care about them much. Right now I don't have any incentive to distribute anything but source packages (from sdist); easy_install and setuptools do a fine job of using these, and for the users I interact with (other programmers) the source package is more usable.
But anyway, the feature for me is the installation of different package versions. This makes me less worried about installing things globally. This in turn makes management a lot easier.
I suspect Windows usage patterns (or just your patterns) aren't the same, so you aren't as concerned about this.
Another issue is installation of dependencies. This is still just potential for me; I haven't really had the satisfying experience of getting a package to install lots of dependencies for me. But I think I'm getting closer. This is something I'm used to with software installation, but not Python. Windows doesn't do this anyway, so you probably aren't clamouring for it. I think it changes the larger ecosystem and encourages sharing; but the payoff isn't as immediate because there isn't as much sharing as there could be.
And last is formal metadata on a package, which is what makes plugins workable. You'll only see this as people actually start using that metadata, so there's not much appeal there yet.
This is more than just a philosophical issue, in that a lot of the egg infrastructure (development installs, for example) don't seem to me to be relevant to the end user scenario. So, from my perspective, I don't understand the imperatives which make them useful, and consequently I can't do a good job of designing list/uninstall commands around them. And I have no need for the facilities, so no incentive to cater for them (beyond the desire to do a good job...)
The development installs are there for developers. Software doesn't appear out of thin air!
Ian - how this fits in with your requirements, I'm not sure. I really don't follow your comments at all well, as I don't have the relevant background (at the level I develop stuff, the prerequisite version compatibility issues you describe just don't come into it - I only distribute to a very limited set of targets, all of which are closely controlled in terms of what is installed, so I have the luxury of expecting things to "just work"). It's on my list of things to do, to go back over the various recent postings and try to understand the developer perspective better. But I still view the end-user side of things as the first priority.
I'm not surprised most of my comments haven't made sense to you -- I'm just plodding along and trying to figure details out. I'm hoping when I get all the details figured out that the larger picture will make more sense to me, but I haven't quite gotten there yet. Mostly in terms of how this stuff *should* be used, as opposed to how it *can* be used. Plus you only see my questions so far, anything I understand I've kept to myself ;)
I'm doing a presentation on this stuff tomorrow, though, so hopefully I'll get closer today. It'll feature lots of command-line transcripts you probably will find rather unexciting ;)
-- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
On 8/10/05, Ian Bicking wrote:
Paul Moore wrote:
Hence my interest in management tools - installers come with a number of benefits "free", as part of the Windows installer architecture - you get an uninstall "command", you can see a listing of installed packages, and you can see the version numbers of those packages. Until I can do the same with eggs, they are a step backwards for me. So that's why I want to build these features.
I suspect the management tool has two parts -- one is extending pkg_resources so these operations are easy (list, delete). Then you want to integrate that into Windows' installer architecture -- registering the packages and uninstallers with Windows, maybe making a GUI frontend (though you could almost get away without it). But since I'm not using Windows, the frontend I imagine is obviously very different. My frontend is probably really easy, but I have no idea what a GUI frontend would entail.
I'm not interested in integrating with the Windows installer stuff per se. My key concern here is that C:\Python24 (or wherever Python is installed) is owned by Python - I want no reason or need to use OS commands directly on its contents. As long as I have a way of getting the information I'm after, and performing the actions I need to, I'm happy. On that basis, something like easy_install.py list (with command line output) is perfectly acceptable. Indeed, it's probably an improvement, to the extent that it can be used in code, not just read by the user. BUT (and it's a big but) when easy_install.py list doesn't show me everything installed in Python's directory, I have a problem. I exclude stuff installed via python setup.py install (I never do that, because it's unmanaged) or "just dropped in there" (same reason). But there's a transitional problem with bdist_wininst installers, which needs some thinking about (and no, easy_install's "convert a wininst installer to an egg" feature isn't an answer - it loses things like cx_Oracle's documentation, and I suspect that it breaks horribly for pywin32, which uses a postinstall script). As this issue is transitional only, it's not the end of the world, as long as eggs really do become the package distribution medium of choice for Python. I'd like to see some sign that eggs are making inroads against bdist_wininst, first, though.
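To make the "usable from code" point concrete, here is a sketch of a listing function. It uses the standard library's `importlib.metadata` as a modern stand-in, not easy_install or the pkg_resources API discussed in this thread; the function name `list_installed` is hypothetical.

```python
from importlib import metadata


def list_installed(path=None):
    """Return sorted (name, version) pairs for distributions found on `path`.

    `path` is a list of directories to scan; None means the current sys.path.
    """
    kwargs = {} if path is None else {"path": list(path)}
    found = set()
    for dist in metadata.distributions(**kwargs):
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip entries with broken or missing metadata
            found.add((name, meta.get("Version")))
    return sorted(found, key=lambda item: (item[0].lower(), str(item[1])))
```

Only distributions that carry metadata show up, so anything "just dropped in there" without metadata stays invisible to a listing like this, which is the limitation raised above.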
I think eggs are sometimes a distraction when considering features. Or at least they are for me. I know there's some eggs produced when I install stuff with easy_install, but I don't really touch them or care about them much. Right now I don't have any incentive to distribute anything but source packages (from sdist); easy_install and setuptools do a fine job of using these, and for the users I interact with (other programmers) the source package is more usable.
Source distributions, I currently immediately build into Windows installers (python setup.py bdist_wininst) and install. If they contain C extensions that I can't build myself (for whatever reason) I bleat pathetically at the author in the hope that he'll take pity on me and provide Windows builds, and if that doesn't work, I generally (have to) give up. So unless you are building packages with C extensions having complex build requirements, you'll get no incentives from me - the source package is fine. But I'll use it to build Windows installers, and never bother with eggs. (Is there ever going to be a situation where code *won't work* unless run from an egg? I can see that being an issue, unless there's a way available to wrap eggs in Windows installers, RPMs, debs, etc - there will always be people who value "integration with platform standards" over other benefits).
But anyway, the feature for me is the installation of different package versions. This makes me less worried about installing things globally. This in turn makes management a lot easier.
I suspect Windows usage patterns (or just your patterns) aren't the same, so you aren't as concerned about this.
Correct. I don't worry about package versions much, beyond getting the latest one. I tend to treat applications which don't work with "the latest" versions of dependent packages as broken, and either attempt to fix them, or avoid using them, depending on time and the importance of the package to me.
Another issue is installation of dependencies. This is still just potential for me; I haven't really had the satisfying experience of getting a package to install lots of dependencies for me. But I think I'm getting closer. This is something I'm used to with software installation, but not Python. Windows doesn't do this anyway, so you probably aren't clamouring for it. I think it changes the larger ecosystem and encourages sharing; but the payoff isn't as immediate because there isn't as much sharing as there could be.
As I've hinted in previous messages, I have no interest whatsoever in automatic location and download of packages - both for environmental (firewall) reasons, and personal ones (security concerns, plus a dislike of giving up control...) So, for me, getting a package to list its dependencies is all I need - I'm still going to impose on myself the requirement to manually download and install those dependencies. This self-imposed manual process doesn't handle packages with conflicting dependencies well, but that's never been a practical issue for me so far (keeping my fingers crossed...)
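Listing a package's declared dependencies without fetching anything could look roughly like the sketch below. It again uses the standard library's `importlib.metadata` as a stand-in for the tooling of this thread, and the helper name `declared_dependencies` is hypothetical.

```python
from importlib import metadata


def declared_dependencies(dist):
    """Return the requirement strings a distribution declares, without
    downloading or installing anything."""
    return list(dist.requires or [])
```

For an installed project this might be called as `declared_dependencies(metadata.distribution("SomeProject"))`; the result is a plain list of requirement strings that can then be tracked down and installed by hand.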
And last is formal metadata on a package, which is what makes plugins workable. You'll only see this as people actually start using that metadata, so there's not much appeal there yet.
And that's the real killer benefit, to me, but it affects plugins much more than standalone packages, so it isn't clear to me how relevant it is to these discussions.
The development installs are there for developers. Software doesn't appear out of thin air!
Agreed, and I understand the benefits. But I don't have the experience to understand the trade-offs and/or risks, so I have to be very cautious in what I promise. (For example, if I code an "uninstall" command, and someone runs it on a development install, does that delete all of that person's working code? How can I make sure that doesn't happen without breaking the "normal" uninstall functionality? Given that I can't really test this, am I better just saying "not supported if run against development installs"?)
I'm doing a presentation on this stuff tomorrow, though, so hopefully I'll get closer today. It'll feature lots of command-line transcripts you probably will find rather unexciting ;)
That comment would amuse my colleagues in the office - I'm the "command line geek" round here :-) Paul.
At 10:40 AM 8/11/2005 +0100, Paul Moore wrote:
there's a transitional problem with bdist_wininst installers, which needs some thinking about (and no, easy_install's "convert a wininst installer to an egg" feature isn't an answer - it loses things like cx_Oracle's documentation,
FYI, if there's a source distribution, the new --editable option (in CVS) allows you to download and extract the source for editing, without building it or anything.
and I suspect that it breaks horribly for pywin32, which uses a postinstall script).
Hm. It's true that I haven't done anything to handle postinstall scripts; but I was under the impression that pywin32 self-registers when you try to use it. I'll have to look into that. I could probably actually add postinstall hooks to EasyInstall, except that it sort of goes against the concept of eggs being a "zero install" format. It's worth thinking about/investigating though.
As this issue is transitional only, it's not the end of the world, as long as eggs really do become the package distribution medium of choice for Python. I'd like to see some sign that eggs are making inroads against bdist_wininst, first, though.
That's not going to happen real soon; only a relatively tiny number of people even know eggs exist, and as long as they have a reasonably-usable bdist_wininst available then it's certainly a valid choice to just distribute that, thereby pleasing EasyInstall users and non-users alike.
So unless you are building packages with C extensions having complex build requirements, you'll get no incentives from me - the source package is fine. But I'll use it to build Windows installers, and never bother with eggs.
Some packages of course may eventually only be distributed as eggs. For example, I'm switching all of my win32 binary distributions to eggs, which means you'll have to compile from source if you want a bdist_wininst. But I'm likely to be in the minority for some time to come. Transitions like this don't happen overnight, especially not based on an 0.6 alpha infrastructure. :)
(Is there ever going to be a situation where code *won't work* unless run from an egg?
Plugins, definitely. And over time, the definition of what constitutes a "plugin" is likely to be ever-expanding. For example, setuptools now supports plugins to add distutils commands, setup() arguments, and so on. The *only* way to leverage these features is with an egg, even if it's a "development-mode" egg.
I can see that being an issue, unless there's a way available to wrap eggs in Windows installers, RPMs, debs, etc
There is in principle, but the respective bdist commands would need some updating in order to work that way. Currently, you'd have to run EasyInstall first, and then package the resulting egg file or directory tree. This is an area that needs some actual tool development, to produce some scripts like 'egg2rpm', 'egg2wininst', etc. Or better yet, setuptools should probably grow replacement bdist commands, although these could also be distributed as extension eggs at first.
Agreed, and I understand the benefits. But I don't have the experience to understand the trade-offs and/or risks, so I have to be very cautious in what I promise. (For example, if I code an "uninstall" command, and someone runs it on a development install, does that delete all of that person's working code? How can I make sure that doesn't happen without breaking the "normal" uninstall functionality?
Don't delete anything that's not in the directory or directories your tool is managing. Development installations are outside the normal site-packages area, and deleting the '.egg-link' file from the site-packages directory "uninstalls" it.
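The rule above can be sketched as follows: touch only entries that sit directly in the managed directory, and treat a development install as nothing more than its '.egg-link' file. The function name `uninstall` and the `Name-*.egg` glob are assumptions for illustration; cleaning up any .pth entries is not covered.

```python
import shutil
from pathlib import Path


def uninstall(managed_dir, project):
    """Remove `project` from `managed_dir` and return the paths removed.

    Only entries directly inside `managed_dir` are touched. A development
    install is represented there by `<project>.egg-link`, so removing that
    file leaves the developer's working tree alone.
    """
    managed = Path(managed_dir)
    removed = []
    link = managed / (project + ".egg-link")
    if link.is_file():
        link.unlink()
        removed.append(link)
    for egg in sorted(managed.glob(project + "-*.egg")):
        if egg.is_dir() and not egg.is_symlink():
            shutil.rmtree(egg)  # unpacked egg directory
        else:
            egg.unlink()  # zipped egg file
        removed.append(egg)
    return removed
```

Because the working tree of a development install lives outside the managed directory, nothing in this function can reach it; only the link is deleted.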
On 8/11/05, Phillip J. Eby wrote:
At 10:40 AM 8/11/2005 +0100, Paul Moore wrote:
there's a transitional problem with bdist_wininst installers, which needs some thinking about (and no, easy_install's "convert a wininst installer to an egg" feature isn't an answer - it loses things like cx_Oracle's documentation,
FYI, if there's a source distribution, the new --editable option (in CVS) allows you to download and extract the source for editing, without building it or anything.
I'm not sure how that helps. If there's a source, I can just run setup.py bdist_wininst anyway, so I don't see the difference.
I could probably actually add postinstall hooks to EasyInstall, except that it sort of goes against the concept of eggs being a "zero install" format. It's worth thinking about/investigating though.
Given that the trend seems to be to install eggs via easy_install, I'm not sure "zero install" still applies. But if I can still just drop eggs into sys.path, maybe it does. Must review this stuff again.
That's not going to happen real soon; only a relatively tiny number of people even know eggs exist, and as long as they have a reasonably-usable bdist_wininst available then it's certainly a valid choice to just distribute that, thereby pleasing EasyInstall users and non-users alike.
That's what I thought. And it makes me reluctant to bother with eggs for standard packages at all, sadly. (Plugins etc are a completely different matter - for them, I think it's a wonderful technology!)
Some packages of course may eventually only be distributed as eggs. For example, I'm switching all of my win32 binary distributions to eggs, which means you'll have to compile from source if you want a bdist_wininst.
Which is probably what I'll do. But I thought setuptools no longer works if it's installed via bdist_wininst? Sorry, I have to run now (shouldn't have started this email...) I'll comment more later. Paul.
Paul Moore wrote:
FYI, if there's a source distribution, the new --editable option (in CVS) allows you to download and extract the source for editing, without building it or anything.
I'm not sure how that helps. If there's a source, I can just run setup.py bdist_wininst anyway, so I don't see the difference.
The only real reason to use --editable is to get access to easy_install's ability to find, download, and unpack packages. If you aren't interested in that, then it isn't really important. I imagine at some time in the future easy_install will also read and confirm signatures, and may have things like GUI frontends. But I think that's a ways off, and some things require a larger discussion (like signatures).
I could probably actually add postinstall hooks to EasyInstall, except that it sort of goes against the concept of eggs being a "zero install" format. It's worth thinking about/investigating though.
Given that the trend seems to be to install eggs via easy_install, I'm not sure "zero install" still applies. But if I can still just drop eggs into sys.path, maybe it does. Must review this stuff again.
If you drop an egg in sys.path, you have to use pkg_resources.require('PackageName') to actually load it. easy_install also manipulates a .pth file, so require() isn't needed (but won't hurt). For egg-aware applications this isn't an issue -- which is almost the same as saying that for apps using eggs as plugins it isn't an issue.
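To make the .pth point concrete, here is a minimal sketch using only the standard library. It builds a throwaway directory, puts a fake unpacked egg in it, and lists that egg in a .pth file, then asks `site.addsitedir()` to process the directory the way Python processes site-packages at startup. The egg name and module name are made up for the illustration, and this is not easy_install's own code.

```python
import importlib
import os
import site
import tempfile

def demo_pth_activation():
    """Show that a .pth entry is enough to make an egg directory importable."""
    site_dir = tempfile.mkdtemp()

    # Hypothetical unpacked egg containing one tiny module.
    egg_dir = os.path.join(site_dir, "Example-1.0-py2.4.egg")
    os.mkdir(egg_dir)
    with open(os.path.join(egg_dir, "pth_demo_mod.py"), "w") as f:
        f.write("VALUE = 42\n")

    # One line per path, relative to the directory holding the .pth file.
    with open(os.path.join(site_dir, "easy-install.pth"), "w") as f:
        f.write("./Example-1.0-py2.4.egg\n")

    # Process the .pth file; each existing path it lists lands on sys.path.
    site.addsitedir(site_dir)

    module = importlib.import_module("pth_demo_mod")
    return module.VALUE
```

Without the .pth entry the egg directory would merely sit next to sys.path, which is the case where require() is needed.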
That's not going to happen real soon; only a relatively tiny number of people even know eggs exist, and as long as they have a reasonably-usable bdist_wininst available then it's certainly a valid choice to just distribute that, thereby pleasing EasyInstall users and non-users alike.
That's what I thought. And it makes me reluctant to bother with eggs for standard packages at all, sadly. (Plugins etc are a completely different matter - for them, I think it's a wonderful technology!)
Sure; as a developer setuptools has useful features besides eggs, and is a superset of distutils. I don't think there's any real reason *not* to use setuptools; but it's also fine to ignore setuptools' extra features. -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
At 12:10 PM 8/11/2005 -0500, Ian Bicking wrote:
I imagine at some time in the future easy_install will also read and confirm signatures, and may have things like GUI frontends. But I think that's a ways off, and some things require a larger discussion (like signatures).
It's certainly possible for people to sign eggs now with the setuptools 'upload' command (the --sign option invokes GPG), it's just that easy_install doesn't do any signature verification yet. I have no real idea as to how that should work with respect to setting up policies or trust chains or any of that stuff. Also, I'm not sure as yet how to retrieve signature info from PyPI, because I've never used it. However, if somebody wants to sign their eggs and send them to PyPI using "upload --sign", and can then also suggest what should be done to verify the signatures (preferably including what GPG commands to run to do the verification!), then I'll certainly take a look at it. Ideally, if this were done right it would work for source distributions and bdist_wininst installers as well as eggs, as long as EasyInstall can find the associated signature.
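As one possible answer to the "what GPG commands to run" question, here is a hedged sketch. It assumes the signature is a detached, ASCII-armored file stored next to the distribution with an `.asc` suffix; the function names are hypothetical and nothing like this exists in easy_install. The only external command used is `gpg --verify SIGNATURE FILE`.

```python
import shutil
import subprocess

def gpg_verify_command(dist_path, sig_path=None):
    """Build the gpg command line that checks a detached signature."""
    if sig_path is None:
        # Assumption: the signature sits beside the file as <name>.asc
        sig_path = dist_path + ".asc"
    return ["gpg", "--verify", sig_path, dist_path]

def verify_signature(dist_path, sig_path=None):
    """Return True if gpg accepts the signature, False if it rejects it."""
    if shutil.which("gpg") is None:
        raise RuntimeError("gpg not found on PATH")
    result = subprocess.run(
        gpg_verify_command(dist_path, sig_path),
        capture_output=True,
    )
    return result.returncode == 0
```

A zero exit status only says the signature matches a key in the local keyring. Whether that key really belongs to the author is the policy and trust-chain question, and this sketch does nothing to settle it.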
Sorry, I have to run now (shouldn't have started this email...) I'll comment more later.
After a lot of thinking, and some experimentation, and an abortive attempt at an email which ended up being *far* more negative than I want, I have decided not to comment any more at this stage. I'll go back to lurking until I have a better feel for eggs, their benefits, and how they fit into the overall package distribution equation. Sorry, but you guys are doing good work, and I don't want to spend my time moaning. Paul. PS I will make one comment - I really do think that ez_setup should have an option to disable downloads. When I tried installing the PyProtocols egg, ez_setup happily grabbed the setuptools egg off the web, installed it, and ran code from it. I know it needs to, but that's a huge security risk - I'm not particularly obsessed by security, but even I found that a bit scary. Arguably, no-download should be the default, and auto-download should be the optional behaviour.
At 09:39 PM 8/11/2005 +0100, Paul Moore wrote:
PS I will make one comment - I really do think that ez_setup should have an option to disable downloads. When I tried installing the PyProtocols egg, ez_setup happily grabbed the setuptools egg off the web, installed it, and ran code from it. I know it needs to, but that's a huge security risk - I'm not particularly obsessed by security, but even I found that a bit scary.
Note that you can trivially prevent this behavior by ensuring that you always have the latest setuptools egg on your machine ahead of time. ;) Also note that when you run *any* setup script, it already has the opportunity to execute arbitrary code on your machine, and could download whatever else it wanted to anyway, so I consider this mostly a non-issue, securitywise. Even reading the setup script source doesn't help; a malicious author (or tamperer) could easily bury the evil code in a seemingly innocuous import deep in the code being distributed. Anyway, what I could possibly do for the setuptools egg itself is start building up a table of MD5 hashes in ez_setup.py to allow verification of the downloaded egg. ez_setup isn't part of the egg, so I could maybe do that with an external tool of some kind. Of course, the list of hashes would get ever-longer with each release, and it wouldn't work for installing new versions of setuptools with an older version of ez_setup.
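The hash-table idea could look roughly like the sketch below. The table, the filename, and the function name are invented for the illustration, and the single entry is the digest of a demo byte string rather than of any real setuptools release.

```python
import hashlib

# Stand-in for the bytes of a downloaded egg.
DEMO_PAYLOAD = b"pretend these are the bytes of the egg"

# Hypothetical table: egg filename -> expected MD5 hex digest.
KNOWN_MD5 = {
    "setuptools-0.0-demo.egg": hashlib.md5(DEMO_PAYLOAD).hexdigest(),
}

def verify_egg(filename, data, table=KNOWN_MD5):
    """Return data unchanged if its MD5 matches the table; raise otherwise."""
    expected = table.get(filename)
    if expected is None:
        # This is the "newer release, older ez_setup" case: no entry yet.
        raise LookupError("no known hash for %s" % filename)
    actual = hashlib.md5(data).hexdigest()
    if actual != expected:
        raise ValueError("MD5 mismatch for %s" % filename)
    return data
```

The LookupError branch is the limitation already mentioned: an old table cannot vouch for a release it has never heard of. MD5 is used here because that is what the message proposes; a stronger hash such as SHA-256 drops in through `hashlib.sha256` with no other change.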
Arguably, no-download should be the default, and auto-download should be the optional behaviour.
Not really. The entire point of ez_setup is to make the process hands-free. I'm fine with trying to make it more tamper-proof in case of a compromise of python.org or your personal network connection, but if one of those was compromised, who's to say that ez_setup itself wasn't compromised too? Or the code you're installing? At some point, you're just plain screwed, and installing things by hand isn't actually an improvement unless you're going to personally vet every single line of code or have a signature whose certificate chain you trust. And once we have such signature chains available, then easy_install can become just as capable of validating them as you are. But even then, it will likely be a false sense of security, because I expect that 1) most authors won't bother to sign their packages to start with, and 2) of those that do, most will likely use self-signed certificates, so knowing what signatures are "valid" will be difficult. If you have no way to know if a particular cert is really the author's, then you are right back to square zero. That is, if you trust the source of the package enough to run its setup script in the first place, you're trusting them to run arbitrary code on your computer. So, if the package author wants to download and install the package's dependencies, at that point you're just quibbling over details; you already gave them permission to do as they wish with your machine. At least when you install their code as a pre-built egg you don't have to run any code you don't want to, and easy_install runs source package setup scripts in a sandbox to prevent ill-mannered (but not malicious) setup scripts from throwing crap on your system wherever they want to. (And yes, such setup scripts *do* exist, which is what prompted me to add the sandboxing code.) I think that in the long-enough-run, we'll solve the cert chain problems through social means, and eventually signing PyPI uploads will be de rigueur.
But at the moment, most package authors can't be bothered to fill out all their PyPI metadata and links or upload their packages to PyPI, so it's not realistic to expect there's going to be much signature support out there any time soon. (Ignoring also issues like projects already married to SourceForge or other download systems that don't have an easy way to distribute the signatures.) And without those signatures, your hand-installation procedure provides you with *zero* additional security unless you're personally inspecting every single line of code you install. Heck, you're running downloaded .exe files with unsigned code, for heaven's sake! And you're worried because ez_setup downloads the setuptools egg? Crikey. :) (Of course, as with anything else, I could be completely off-base here, and I'm sure someone here will straighten me out if I am.)
Phillip J. Eby wrote:
Arguably, no-download should be the default, and auto-download should be the optional behaviour.
Not really. The entire point of ez_setup is to make the process hands-free. I'm fine with trying to make it more tamper-proof in case of a compromise of python.org or your personal network connection, but if one of those was compromised, who's to say that ez_setup itself wasn't compromised too? Or the code you're installing? At some point, you're just plain screwed, and installing things by hand isn't actually an improvement unless you're going to personally vet every single line of code or have a signature whose certificate chain you trust. And once we have such signature chains available, then easy_install can become just as capable of validating them as you are.
I think from a make-people-feel-comfortable perspective, it might be better if ez_setup informed the user of what it's doing (installing a build dependency) and get a confirmation. For instance, it can be disconcerting to do something that shouldn't require any privilege (e.g., setup.py --help-commands) and end up triggering something that does require privilege (global installation of a package). Just an "I'm going to do this; OK?" question would be reassuring.
And without those signatures, your hand-installation procedure provides you with *zero* additional security unless you're personally inspecting every single line of code you install. Heck, you're running downloaded .exe files with unsigned code, for heaven's sake! And you're worried because ez_setup downloads the setuptools egg? Crikey. :)
Realistically it's very hard to do coordinated attacks; it's as things get automated that larger holes can become exploitable and dangerous. One scary one is if someone uses a Wiki page as a package index, and other people reference that without understanding the (considerable) danger. I *almost* did that myself, then I thought again and realized what a bad idea it would be. But anyway, I think lots of attacks can be foiled by checking consistency. E.g., check the file you downloaded against an MD5 checksum stored elsewhere. This doesn't prevent someone from uploading a completely bogus file, if they are able to get access to someone's PyPI account, for instance. But other layers of consistency are possible. For instance, for a package to be "trusted" by PyPI (on some level), maybe an email confirmation of substantive package updates would be required (like new releases, new versions of files, etc). This is just another consistency check -- make sure that the person on the other end of the registered email address approves what the person with the login account is doing (of course usually those are the same person). Anyway, those are just a few ideas off the top of my head. I find cryptographic standards of security can be a little misguided at times. When I read this article I found it to be a quite refreshing way to think about security: http://iang.org/ssl/wytm.html -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
At 06:09 PM 8/11/2005 -0500, Ian Bicking wrote:
I think from a make-people-feel-comfortable perspective, it might be better if ez_setup informed the user of what it's doing (installing a build dependency) and get a confirmation. For instance, it can be disconcerting to do something that shouldn't require any privilege (e.g., setup.py --help-commands) and end up triggering something that does require privilege (global installation of a package). Just an "I'm going to do this; OK?" question would be reassuring.
But then, how do you do that in such a way that an automated installation process (other than EasyInstall) won't hang? I suppose I could have the download function display a message followed by a countdown timer that would allow you to abort by hitting ^C. That way, an unattended process or lazy user (or slow reader :) could just proceed without needing to do anything. The only problem I see with that is that drawing the user's attention to something that 99% of the time is going to be okay seems like a bad idea. It's like "WARNING: I'm about to do something exactly like what you'd do yourself by hand!" I'll have to find a suitable way to spin the message, something like: """Hello! ez_setup has detected that you don't have a recent-enough version of setuptools on your computer to be able to run this script. I'll be happy to download and install it for you (along with any other packages this script might need), but some firewalls may not allow programs like me to download software from the Internet. So I'll pause for a few seconds before starting the first download, to give you a chance to read this message, so you'll know you need to grant me access if something pops up asking if I should be allowed to connect to python.org. Thanks! Beginning download in 20... 19... 18... """ Hopefully, something like that could be made friendly enough so that most people just ignore it.
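The countdown-with-^C idea might be sketched like this. The function name and its parameters are hypothetical; `sleep` and `out` are injectable only so the behaviour can be exercised without actually waiting.

```python
import sys
import time

def countdown_then_proceed(seconds, sleep=time.sleep, out=sys.stderr):
    """Count down before a download; return False if the user hits ^C."""
    out.write("Beginning download in ")
    try:
        for remaining in range(seconds, 0, -1):
            out.write("%d... " % remaining)
            out.flush()
            sleep(1)
    except KeyboardInterrupt:
        # ^C during the pause means "don't download".
        out.write("\nDownload cancelled.\n")
        return False
    out.write("\n")
    return True
```

An unattended process never sends ^C, so it simply falls through after the pause and the download proceeds, which is the property that keeps automated installs from hanging.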
One scary one is if someone uses a Wiki page as a package index, and other people reference that without understanding the (considerable) danger. I *almost* did that myself, then I thought again and realized what a bad idea it would be.
That's why the best thing is to publish to PyPI if you can; source checkout links can always go in URLs embedded in 'long_description', and easy_install will still find them.
But other layers of consistency are possible. For instance, for a package to be "trusted" by PyPI (on some level), maybe an email confirmation of substantive package updates would be required (like new releases, new versions of files, etc). This is just another consistency check -- make sure that the person on the other end of the registered email address approves what the person with the login account is doing (of course usually those are the same person).
At the very least, sending them emails about stuff that's happening would ensure they find out their account has been hacked. Assuming the address is still valid, of course, which isn't always the case. :(
Phillip J. Eby wrote:
At 06:09 PM 8/11/2005 -0500, Ian Bicking wrote:
I think from a make-people-feel-comfortable perspective, it might be better if ez_setup informed the user of what it's doing (installing a build dependency) and get a confirmation. For instance, it can be disconcerting to do something that shouldn't require any privilege (e.g., setup.py --help-commands) and end up triggering something that does require privilege (global installation of a package). Just an "I'm going to do this; OK?" question would be reassuring.
But then, how do you do that in such a way that an automated installation process (other than EasyInstall) won't hang?
I suppose I could have the download function display a message followed by a countdown timer that would allow you to abort by hitting ^C. That way, an unattended process or lazy user (or slow reader :) could just proceed without needing to do anything.
The only problem I see with that is that drawing the user's attention to something that 99% of the time is going to be okay seems like a bad idea. It's like "WARNING: I'm about to do something exactly like what you'd do yourself by hand!"
Hopefully setuptools won't get installed 99% of the time, just once or twice per machine. Because setuptools installation can happen even when nothing installation-related is being requested, it's a bit out of the norm. Hence the confirmation, or at least prominent notification. I also, like most unix users, don't usually start by running a command as root, so ez_setup will fail in that situation. At least by putting up the interactive message it's not going to be as surprising when that happens.
But other layers of consistency are possible. For instance, for a package to be "trusted" by PyPI (on some level), maybe an email confirmation of substantive package updates would be required (like new releases, new versions of files, etc). This is just another consistency check -- make sure that the person on the other end of the registered email address approves what the person with the login account is doing (of course usually those are the same person).
At the very least, sending them emails about stuff that's happening would ensure they find out their account has been hacked. Assuming the address is still valid, of course, which isn't always the case. :(
Until you start getting phishing emails trying to pretend that your account is hacked. Ah, life on the internet... ): -- Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
At 11:32 PM 8/11/2005 -0500, Ian Bicking wrote:
Hopefully setuptools won't get installed 99% of the time, just once or twice per machine. Because setuptools installation can happen even when nothing installation-related is being requested, it's a bit out of the norm. Hence the confirmation, or at least prominent notification.
Technically, only the download can happen when nothing installation-related is being requested.
I also, like most unix users, don't usually start by running a command as root, so ez_setup will fail in that situation. At least by putting up the interactive message it's not going to be as surprising when that happens.
You only let root connect to the internet? :) Seriously, if you run setup.py --help-commands or some such, all that's going to happen is that setuptools.egg gets downloaded into the current directory and stuck on sys.path for the duration of the script. The piggyback installation only takes place as part of a "setup.py install". So, you shouldn't need root just to run the setup script if you're not installing. Similarly, if the setup script has any 'setup_requires' eggs, those eggs just get downloaded to the current directory and put on sys.path as well - and they do not do a piggyback install, because you might need them only once. In any case, piggyback installation only happens when you "setup.py install", not for any other operation whatsoever. Download of setup-time dependencies, however, occurs on the first run of setup, no matter what the command. I suppose I need to add an explanation of the setup_requires downloads, too.
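The rules above can be summarised as a small decision function. This is purely an illustration of the behaviour as described in this message, not ez_setup's or setuptools' actual code; the function name and the tuple vocabulary are made up, and it assumes a setup-time dependency is only fetched when it is not already available.

```python
def first_run_actions(command, have_setuptools, missing_setup_requires=()):
    """List what a setup script run would do, per the rules described above."""
    actions = []
    if not have_setuptools:
        # Happens for any command, even --help-commands; needs no root,
        # since the egg only goes into the current directory.
        actions.append(("download-to-cwd", "setuptools"))
    for requirement in missing_setup_requires:
        # setup_requires eggs are downloaded but never piggyback-installed.
        actions.append(("download-to-cwd", requirement))
    if command == "install" and not have_setuptools:
        # Only "setup.py install" triggers the piggyback installation.
        actions.append(("piggyback-install", "setuptools"))
    return actions
```

For example, `first_run_actions("--help-commands", False)` yields a single download step and no installation.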
On 8/11/05, Phillip J. Eby wrote:
And without those signatures, your hand-installation procedure provides you with *zero* additional security unless you're personally inspecting every single line of code you install. Heck, you're running downloaded .exe files with unsigned code, for heaven's sake! And you're worried because ez_setup downloads the setuptools egg? Crikey. :)
Told you I'm not security-conscious (hey, I'm not conscious most of the time! :-)) I'm a naive user who knows the Internet's a scary place, but doesn't really think people are going to bother mocking up a website just to pick on users of Python's PIL module. So if I go to the website and *see* that it looks OK, I trust it. But ez_setup just went off and got something, from somewhere. I never saw the page with the link on it, so what if the link ez_setup used was wrong? I never got to see a nice reassuring webpage with Fredrik's name on it, so how can I be sure I got the right place? I'm not *actually* that naive, but I do tend to prefer to be very "manual" when I interact with the internet, just because I trust myself (probably incorrectly!) more than I trust an automated program... OK, I retract the suggestion that no download be the default, but I'd still like a "manual download" option, which doesn't grab stuff automatically. After all, ez_setup has the option to go to a local cache (I can't recall how it works, but I know you mentioned it before). Why can't I say that I trust the cache (it's been vetted, virus scanned, whatever) so use that, but *don't* go elsewhere? Then I download what I think I need, do the install, and get messages reporting any eggs I missed. I grab those, vet them, and try again. Repeat as needed.... Paul.
At 12:02 PM 8/12/2005 +0100, Paul Moore wrote:
OK, I retract the suggestion that no download be the default, but I'd still like a "manual download" option, which doesn't grab stuff automatically.
I can't really do this for ez_setup (which doesn't have access to command line parameters or distutils config settings), but it should be possible for easy_install. I could maybe have a --local-only option that refuses to do downloads from any URL other than file: URLs. However, for it to take effect when you're running another package's "setup.py install", you'd need to set it in your per-user or sitewide distutils config file, because it won't be usable on the command line.
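A check of the kind a --local-only option would need could be as small as the sketch below. The option does not exist at this point, so the function names are hypothetical. Treating a bare filesystem path as local is an added assumption beyond the "file: URLs only" rule stated above.

```python
from urllib.parse import urlparse

def allowed_in_local_only_mode(url):
    """Return True for file: URLs and bare paths, False for anything remote."""
    scheme = urlparse(url).scheme
    # "" means no scheme at all, e.g. /some/directory (assumed to be local).
    return scheme in ("", "file")

def filter_local(urls):
    """Keep only the locations a local-only run would be willing to read."""
    return [url for url in urls if allowed_in_local_only_mode(url)]
```

One caveat with this naive test: a Windows path such as `C:\eggs` parses with scheme `c` and would be rejected, so a real implementation would need to special-case drive letters.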
After all, ez_setup has the option to go to a local cache (I can't recall how it works, but I know you mentioned it before).
--find-links=/some/directory But that's an easy_install option, not an ez_setup option. ez_setup is all about downloading setuptools itself, and the only "local caches" it recognizes are the current directory, and an installed egg on sys.path. So really, your vetting process for installing a package would be to read its setup script to see what version of setuptools it uses, so you can download and install the setuptools egg before proceeding. You can also read the setup script to find out what dependencies the package has, keeping in mind of course that any package that uses entry points, require(), etc. is not going to be happy if you install its dependencies in non-egg form.
participants (3): Ian Bicking, Paul Moore, Phillip J. Eby