
Something that's come up recently in the Debian Python mailing list is setuptools/distribute's habit of downloading *_requires packages (e.g. install_requires) when they are not available locally.
This causes us problems because dependencies are defined in two places. They are defined in setup.py by the upstream package author, and in the debian/control file by the OS packager. Generally, this is okay because we can generate debian/control from setup.py -- though it does take some manual intervention to keep things in sync.
This came up in the context of always enabling tests when we build the OS package. The problem arises if the two dependency lists are out of sync. For example, your setup.py depends on 'foo' but the Debian 'python-foo' package is not installed. In this case, during the build process, 'foo' would get downloaded from the Cheeseshop and this would mask a bug in the debian/control file (since any dependency listed in debian/control would get installed from the archive and thus be available by the time setuptools/distribute runs).
The question is: what's the best way for us Debian packagers to absolutely prevent download from Cheeseshop? We would much rather have setuptools/distribute spew an error and stop, because then we'd fix debian/control and ensure that all the package's dependencies came from the OS archive instead of external resources.
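One way to get the "error and stop" behaviour described above is to check, before the build runs, that every requirement is already installed locally. The following is only a minimal sketch under stated assumptions: the helper name `missing_requirements` is hypothetical, it uses the standard-library `importlib.metadata` module (Python 3.8+, which did not exist when this thread was written), and it takes bare distribution names, not version specifiers.

```python
from importlib import metadata


def missing_requirements(names):
    """Return the distribution names that have no locally installed metadata.

    Hypothetical helper: `names` are bare distribution names such as "foo",
    without version specifiers or extras.
    """
    missing = []
    for name in names:
        try:
            # Raises PackageNotFoundError when no installed distribution matches.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

A build script could call this with the names from install_requires and tests_require and abort if the returned list is non-empty, which would point at a missing entry in debian/control instead of silently fetching from the network.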
One way that seems to work is to add this to setup.cfg:
    [easy_install]
    allow_hosts: www.example.com
This will break the download by limiting acceptable hosts to bogus ones that can't possibly satisfy the requirement. But it's unsatisfying for several reasons:
* It's obscure and doesn't really describe what we're trying to do ('fixable' I suppose by a comment)
* Requires the Debian packager to add a setup.cfg or modify an existing one in the upstream package.
Note that I thought this might also work, but it does not afaict:
    [easy_install]
    no_deps: true
So, do you have any suggestions for a better way to say "never download dependencies" for a particular package, or class of packages? Ideally, there'd be one file we could modify, e.g. /etc/distribute.cfg, that would allow us to prevent downloading globally for all system-provided packages, but would still allow downloading for local development packages, e.g. via virtualenv.
Thoughts are welcome, but also perhaps we can discuss further at the Pycon sprint.
Cheers, -Barry

On Wed, Feb 23, 2011 at 9:46 PM, Barry Warsaw barry@python.org wrote:
Something that's come up recently in the Debian Python mailing list is setuptools/distribute's habit of downloading *_requires packages (e.g. install_requires) when they are not available locally.
This causes us problems because dependencies are defined in two places. They are defined in setup.py by the upstream package author, and in the debian/control file by the OS packager. Generally, this is okay because we can generate debian/control from setup.py -- though it does take some manual intervention to keep things in sync.
This came up in the context of always enabling tests when we build the OS package. The problem arises if the two dependency lists are out of sync. For example, your setup.py depends on 'foo' but the Debian 'python-foo' package is not installed. In this case, during the build process, 'foo' would get downloaded from the Cheeseshop and this would mask a bug in the debian/control file (since any dependency listed in debian/control would get installed from the archive and thus be available by the time setuptools/distribute runs).
The question is: what's the best way for us Debian packagers to absolutely prevent download from Cheeseshop? We would much rather have setuptools/distribute spew an error and stop, because then we'd fix debian/control and ensure that all the package's dependencies came from the OS archive instead of external resources.
One way that seems to work is to add this to setup.cfg:
    [easy_install]
    allow_hosts: www.example.com
This will break the download by limiting acceptable hosts to bogus ones that can't possibly satisfy the requirement. But it's unsatisfying for several reasons:
- It's obscure and doesn't really describe what we're trying to do ('fixable'
I suppose by a comment)
- Requires the Debian packager to add a setup.cfg or modify an existing one in
the upstream package.
Note that I thought this might also work, but it does not afaict:
    [easy_install]
    no_deps: true
Well, if you want to handle all the dependencies for a project yourself, you can short-circuit distribute or setuptools by using the --single-version-externally-managed option.
When using this option, the project will be installed by the vanilla distutils install command.
Then it's up to you to handle dependencies. That's how pip does it, and Fedora too, IIRC.
HTH
Cheers Tarek

On Wed, Feb 23, 2011 at 10:04:24PM +0100, Tarek Ziadé wrote:
One way that seems to work is to add this to setup.cfg:
    [easy_install]
    allow_hosts: www.example.com
This will break the download by limiting acceptable hosts to bogus ones that can't possibly satisfy the requirement. But it's unsatisfying for several reasons:
- It's obscure and doesn't really describe what we're trying to do ('fixable'
I suppose by a comment)
- Requires the Debian packager to add a setup.cfg or modify an existing one in
the upstream package.
Note that I thought this might also work, but it does not afaict:
    [easy_install]
    no_deps: true
Well, if you want to handle all the dependencies for a project yourself, you can short-circuit distribute or setuptools by using the --single-version-externally-managed option.
When using this option, the project will be installed by the vanilla distutils install command.
Then it's up to you to handle dependencies. That's how pip does it, and Fedora too, IIRC.
What Barry's talking about is slightly different I think. When running python setup.py test, setup.py may download additional modules that should have been specified in the system package (thus the download should never be tried). This occurs before the software is installed anywhere.
For Fedora we deal with this by preventing processes related to the build from making any non-localhost network connections. That doesn't catch things when a packager is building on their local machine, but it does catch things when the package is built on the builders.
There are two pieces that work on that:
1) The build hosts themselves are configured with a firewall that prevents a lot of packets from leaving the box, and prevents any packets from going to a non-local network.
2) We build in a chroot, and part of chroot construction is to create an empty resolv.conf. This prevents DNS lookups from succeeding and controls the automatic downloading, among other things.
Neither of these are especially well adapted to being run by a casual packager but the second (a chroot with empty resolv.conf) could be done without too much trouble (we have a tool called mock that creates chroots, it was based on a tool called mach which can use apt and might be better for a Debian usage). Both 1 and 2 could be performed on a VM if you can get your packagers to go that far or are dealing with a build system rather than individual packagers.
-Toshio

On Feb 23, 2011, at 02:25 PM, Toshio Kuratomi wrote:
What Barry's talking about is slightly different I think. When running python setup.py test, setup.py may download additional modules that should have been specified in the system package (thus the download should never be tried). This occurs before the software is installed anywhere.
Right on, Toshio.
For Fedora we deal with this by preventing processes related to the build from making any non-localhost network connections. That doesn't catch things when a packager is building on their local machine, but it does catch things when the package is built on the builders.
There are two pieces that work on that:
1) The build hosts themselves are configured with a firewall that prevents a lot of packets from leaving the box, and prevents any packets from going to a non-local network.
2) We build in a chroot, and part of chroot construction is to create an empty resolv.conf. This prevents DNS lookups from succeeding and controls the automatic downloading, among other things.
Neither of these are especially well adapted to being run by a casual packager but the second (a chroot with empty resolv.conf) could be done without too much trouble (we have a tool called mock that creates chroots, it was based on a tool called mach which can use apt and might be better for a Debian usage). Both 1 and 2 could be performed on a VM if you can get your packagers to go that far or are dealing with a build system rather than individual packagers.
I believe our builders prevent external connections too. I'm not positive about it but it wouldn't be too difficult to test. Still, as you point out, it's more difficult to enforce with local builders, and that's where packagers are going to be more able to quickly fix any such problems. One difficulty for Debian/Ubuntu local build environments (aside from the fact that there are several ways people do it ;) is that at least with some of the local builders, they *have* to do external connections, e.g. to download build dependencies into the chroot the build is done from. You could of course tightly control that, but given the geographical archive mirroring, it just makes things more complicated.
-Barry

Barry Warsaw wrote:
So, do you have any suggestions for a better way to say "never download dependencies" for a particular package, or class of packages.
easy_install has (at least in distribute) a documented feature that makes it use only locally stored files.
http://packages.python.org/distribute/easy_install.html#installing-on-un-net...
If it works (I never tried it), that is a nice feature for configuration management; it can't possibly get the wrong version if it can only use what you have provided.
In your case, anything that is not already installed (from another debian package) is "wrong". You might provide a directory containing only the single package that you want to install. Anything else would be not found, and that would be your clue that something was wrong. You either need to install another debian package to provide the missing python package, or you need another python package as part of the debian package that you are presently creating.
Ideally, there'd be one file we could modify, e.g. /etc/distribute.cfg that would allow us to prevent downloading globally for all system provided packages, but would still allow downloading for local development packages, e.g. via virtualenv.
Here, it sounds like you want it to go to the network for some packages, but not others. I'm not real clear on what you have in mind, though. Does it work to pass the parameters on the command line to indicate that you do/don't want to go to the network during this run of easy_install?
Mark

On Feb 25, 2011, at 10:27 AM, Mark Sienkiewicz wrote:
Barry Warsaw wrote:
So, do you have any suggestions for a better way to say "never download dependencies" for a particular package, or class of packages.
easy_install has (at least in distribute) a documented feature that makes it use only locally stored files.
http://packages.python.org/distribute/easy_install.html#installing-on-un-net...
If it works (I never tried it), that is a nice feature for configuration management; it can't possibly get the wrong version if it can only use what you have provided.
In your case, anything that is not already installed (from another debian package) is "wrong". You might provide a directory containing only the single package that you want to install. Anything else would be not found, and that would be your clue that something was wrong. You either need to install another debian package to provide the missing python package, or you need another python package as part of the debian package that you are presently creating.
Ah, this link just turned on the light. I can add this to setup.cfg:
    [easy_install]
    allow_hosts: None
It's moderately better than my previous take which was to set allow_hosts to www.example.com. Sounds like the best way forward is for us to recommend adding this entry to prevent unwanted downloads.
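Since this entry would have to be added to many packages, it could be scripted. Below is a minimal sketch, assuming the standard-library `configparser` module; the helper name `forbid_downloads` is hypothetical. Note that `configparser` writes the option as `allow_hosts = None` (both `=` and `:` are accepted delimiters when the file is read back) and discards any comments in an existing setup.cfg.

```python
import configparser
from pathlib import Path


def forbid_downloads(setup_cfg_path):
    """Set [easy_install] allow_hosts = None in the given setup.cfg.

    Hypothetical helper: creates the file and section if they do not exist,
    and keeps any other sections and options that are already present.
    """
    path = Path(setup_cfg_path)
    parser = configparser.RawConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_section("easy_install"):
        parser.add_section("easy_install")
    parser.set("easy_install", "allow_hosts", "None")
    with path.open("w") as handle:
        parser.write(handle)
```

For example, `forbid_downloads("setup.cfg")` run from the unpacked source tree would add the entry before the build starts.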
Cheers, -Barry