Re: [Distutils] "just use debian"

Nicolas Chauvat wrote:
Also some of the Debian Python packages are broken or grossly out-of-date.
File a bug report :)
Yes, because that automatically frees up the packager's time to work on these issues...
My problem with setuptools is that it comes from Windows
I don't think that's the case at all.
where there is (almost) no package management system. The consequence is that its author reinvented the wheel, but limited it to Python, then moved to eggs and made things worse.
Yes, the egg format is annoying. No, I don't think having a package management system that targets only Python packages is a bad idea.
My main tool is Python, but I have many other tools on my system. I do not want to have as many package management utilities as "subsystems".
Then I suggest you volunteer to maintain the Debian packages for every single Python package.
If I have one tool for Python, one for Java, one for C, one for Fortran, one for C libraries, one for Gnome, etc. integration becomes a nightmare.
If you have projects this large, then you likely want to roll your own OS packages anyway.
[Please note that for an experienced Debian developer, making the initial package of a Python module can be a matter of half an hour to a couple hours and releasing a new version a matter of minutes.]
...and for someone not using Debian or not an experienced Debian developer? Despite being a fan of Debian, I'm well aware of just how "friendly" a community it can be to the new user... cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

I'm interested in hearing what you find so annoying about the egg format, because for me it's the one part of the setuptools system that I would keep. If I had my way we'd separate out eggs from distutils/setuptools and all the automagical package installation of easy_install, and focus on making eggs work as plugins.

Eggs are a best-effort attempt to make a jar-like system for Python and in many cases they work well (cf. Trac plugins). Sure, there are some workarounds needed for C extensions and packages that access __file__, etc., but these problems seem more surmountable than the deeply ingrained problems with distutils.

Cheers, Stephen.

On Tue, 2008-09-23 at 11:59 +0100, Stephen Pascoe wrote:
I'm interested in hearing what you find so annoying about the egg format because for me it's the one part of the setuptools system that I would keep.
If I had my way we'd separate out eggs from distutils/setuptools and all the automagical package installation of easy_install and focus on making eggs work as plugins. Eggs are a best effort attempt to make a jar-like system for Python and in many cases they work well (c.f. Trac plugins). Sure, there are some workarounds needed for C-extensions and packages that access __file__, etc. but these problems seem more surmountable than the deeply ingrained problems with distutils.
Remember though that jars carry no dependency information, hence Maven, Ivy, OSGi, JSR277, etc. Avoiding the Python equivalent of Jar Hell surely has to be a prime directive. I guess the question in my mind is if the Ruby community have Ruby Gems, what is the Python equivalent, and why doesn't it work? -- Russel. ==================================================== Dr Russel Winder -- 41 Buckmaster Road, London SW11 1EN, UK -- t: +44 20 7585 2200, m: +44 7770 465 077, w: http://www.russel.org.uk/

On Sep 23, 2008, at 5:53 AM, Russel Winder wrote:
I guess the question in my mind is if the Ruby community have Ruby Gems, what is the Python equivalent, and why doesn't it work?
I'm fairly satisfied with distutils/eggs/setuptools/easy_install. It isn't perfect, but it's good enough, and it is improving thanks to Phillip Eby, Chris Galvan, Philip Jenvey, Tarek Ziadé, and other contributors. Also, even though there is a sizable fraction of Python programmers who don't like it, there is no other tool that has anywhere near the vast universe of compatible Python code, partly because it is compatible with distutils and partly because it is widely accepted itself.

There are currently 4,800 packages listed on http://pypi.python.org, in addition to which there are an uncounted number of publicly available Python packages that are not listed there. (By the way, there are 783 packages in the Debian unstable Python section -- http://packages.debian.org/unstable/python -- and 712 packages in the Ubuntu Hardy Python section -- http://packages.ubuntu.com/hardy/python .) There are probably at least 4,800 different programmers responsible for writing and maintaining the world's publicly re-usable Python packages. Almost all of these thousands and thousands of packages are seamlessly re-usable by setuptools, and if you use distutils or setuptools to package and distribute your Python code, then your code will be re-usable by those folks (whether they use distutils or setuptools themselves).

I get great value from being able to re-use almost any other Python package in my code without having to ask my users to manually deal with more dependencies, and without having to spend time to create platform-specific packages, e.g. Debian packages, Windows installers, etc. But note also that setuptools does not *prevent* me from creating such packages. Currently the standard Tahoe install instructions are generic and apply to all supported platforms, but Tahoe also gets packaged up as .debs and as a Windows app for specific customers. See also the stdeb tool, which is a handy way to produce .debs automatically from your Python source code, and bbfreeze, which, I am told, is a good way to produce Windows packages:

http://stdeb.python-hosting.com
http://pypi.python.org/pypi/bbfreeze

Also, I really like setuptools plugins as a way to make build tools be separately maintained and packaged instead of piling up in your setup.py. Here are the first few packages which use the new classifier "Framework :: Setuptools Plugin":

http://pypi.python.org/pypi?:action=browse&c=524

Hopefully in the future more of these packages will get classified as being setuptools plugins that are useful for development:

http://pypi.python.org/pypi?%3Aaction=search&term=setuptools&submit=search

My major project, Tahoe, has been using setuptools for more than a year now. Here are the installation instructions for Tahoe. Note that these instructions are the same for all supported platforms, which includes Windows, Cygwin, Mac, Linux, and Solaris.

http://allmydata.org/source/tahoe/trunk/docs/install.html

Here is the list of Python packages that Tahoe needs (not including the packages that *those* packages need, such as pyOpenSSL and pyutil):

http://allmydata.org/trac/tahoe/browser/_auto_deps.py

And here is the list of open tickets about how we would like to improve Tahoe packaging:

http://allmydata.org/trac/tahoe/report/10

Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month

I guess the question in my mind is if the Ruby community have Ruby Gems, what is the Python equivalent, and why doesn't it work?
Python Eggs == Ruby Gems, and they both work more or less equally well as a packaging format. Maybe people just hear "Ruby Gems is awesome and Python don't got squat" because Python people are grumpier than Ruby people? Maybe it's because the Python Eggs page sports a PEAK logo that looks like it was made with Corel Draw 5, and this brings back unpleasant memories of when they got paid six bucks an hour making coupons in Corel Draw 5, whereas the Ruby Gems page has an eye-pleasing RubyGems logo? Perhaps we should get Tony Robbins to present "The Eggscellent Power of Positive Thinking" as a keynote at the next PyCon? And from our community's newfound exuberant positivity, talented graphic designers will be lining up to participate.

OK, joking aside, it's differences in how eggs and gems are distributed and consumed that I think make people claim that one works better than the other. Ruby Gems are installed with the 'gem' script, which installs gems in a versioned cache location, whereas 'easy_install' by default installs Python eggs into a global, versionless location. The documentation for Ruby Gems is also on the whole more approachable than the Python Egg documentation, so when new people are learning the tool and they get stuck, with gems they tend to find their answer and go away happy and evangelical, whereas with eggs they might go away bitter and grumpy.

Eggs have a small "Z-shaped" learning curve in that a new developer learns "sudo easy_install some_package" and it works and they say, "yay!". Later on, though, they want to use two versions of the same package and they realize that they have to learn how to do things differently *and* they're presented with TIMTOWTDI: either manual management (symlinks or hand-munged .pth files) or setuptools or multiple Python installs or VirtualEnv or Buildout or some combination of approaches. To a certain extent, TIMTOWTDI is necessary with package management, since there are so many different use cases, but it would be very nice if there was an approachable documentation resource to help people explore these different tools and techniques more easily.

Or they can just use debian! Any debian developers out there up for the task of packaging up the 1500+ odd packages released from the Zope community?

Kevin Teague wrote:
Python Eggs == Ruby Gems, and they both work more or less equally well as a packaging format.
I don't agree. Sure, they are the same idea, but the implementation is vastly different, and that's what matters IMHO, or at least is one big problem.

If you look at the Ruby Gems page, you have one link for a specification; I have not done it, and maybe I would realize I was wrong by trying it, but I got the impression I could generate gems myself from the specification. Can I do that with eggs?

Also, gems and rake/rant are different projects (maybe by different people?). In practice, it is nice to have everything integrated (and Ruby Gems certainly feel as integrated as Python eggs), but having different packages for different tasks forces proper behavior, not a behavior which works in some cases and breaks in others. And rant is a proper build system, whereas distutils isn't.

There is also the problem that by making some things easy but effectively "magic", when it breaks, you don't know how to fix it. Those two problems (everything intermixed and magic) are linked. If several tasks were separated, there would have been a clear specification/API, and less magic. Of course, basing setuptools on top of distutils makes this task nearly impossible (but I understand it was the best if not the only choice given Phillip's requirements when he started setuptools).

You have people who ignore the problem eggs are trying to solve (the "install debian and solve real problems" crowd), but I don't think the majority of people object to eggs in principle. They object to implementation problems.

cheers, David

On Sep 24, 2008, at 0:27 AM, David Cournapeau wrote:
If you look at the Ruby Gems page, you have one link for a specification; I have not done it, and maybe I would realize I was wrong by trying it, but I got the impression I could generate gems myself from the specification. Can I do that with eggs?
How about this: http://peak.telecommunity.com/DevCenter/EggFormats
There is also the problem that by making some things easy but effectively "magic", when it breaks, you don't know how to fix it.
I agree that this is a problem. People interested in improving it should read Phillip J. Eby's post "setuptools: past, present, future" from 2006:

http://mail.python.org/pipermail/python-dev/2006-April/064145.html

Since then we've made a great step forward by having the distutils in Python 2.5 and newer automatically produce .egg-info files. Then, we made another step forward when we persuaded Linux distributions like Debian and Red Hat to stop deleting those .egg-info files. ;-)

In my opinion the next step forward at this layer of basic compatibility is to formalize setuptools's "requirements" syntax (mainly install_requires, but also setup_requires, tests_require, and extras_require) as a standard part of Python. Note that I am not saying anything about the implementation of how requirements get satisfied, which we've already failed to agree on for a Python standard, only that if developers want to write down "My package depends on package XYZ" in their package's meta-data, they can do so in a single, standard syntax so that all of the new crop of packaging tools can read what they wrote. Of course, this standard syntax should be compatible with the most widely-used current implementation -- setuptools/easy_install. This is an opportunity to standardize some basic metadata, not to innovate and not to standardize anything harder and more implementation-specific than simple dependency declaration.

Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month
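
For readers less familiar with these keywords, here is a minimal, hypothetical setup.py sketch showing where each of the requirement declarations mentioned above would live. The project and dependency names are made up; this is illustrative only, not a proposed standard:

    from setuptools import setup, find_packages

    setup(
        name='example-package',                  # hypothetical project name
        version='1.0',
        packages=find_packages(),
        install_requires=['zope.interface', 'simplejson >= 1.7'],  # needed at run time
        setup_requires=['setuptools_darcs'],     # needed only to run setup.py itself
        tests_require=['nose'],                  # needed only to run the test suite
        extras_require={'docs': ['docutils']},   # optional, named feature sets
    )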

zooko wrote:
On Sep 24, 2008, at 0:27 AM, David Cournapeau wrote:
How about this: http://peak.telecommunity.com/DevCenter/EggFormats
I guess we don't have the same meaning when speaking about a specification. A reference which states at the beginning "Note, however, that these are all internal implementation details and are therefore subject to change; stick to the published API if you don't want to be responsible for keeping your code from breaking when setuptools changes. You have been warned." is not a specification in my mind.

This is pervasive in the distutils/setuptools world, BTW: everything is defined by implementation, and you don't know what is an implementation detail and what is the API. (My first contact with distutils was to plug a scons command to use scons within distutils for the numpy project, and I was horrified by the code of distutils. The only way to understand what's going on is to run it. I can't say setuptools is much better in that department, but since a setuptools requirement was to extend distutils, I don't blame setuptools for this.)
I agree that this is a problem. People interested in improving it should read Philip J. Eby's post "setuptools: past, present, future" from 2006:
http://mail.python.org/pipermail/python-dev/2006-April/064145.html
Since then we've made a great step forward by having the distutils in Python 2.5 and newer automatically produce .egg-info files.
Then, we made another step forward when we persuaded Linux distributions like Debian and Red Hat to stop deleting those .egg-info files. ;-)
I hope that with the fork, we will be able to deal better with the situation. To me, the biggest problem of setuptools is that it is a big ball of mud, mixing things which are vastly different in purpose. There should be a submodule for building eggs, a submodule to deal with dependencies, a submodule for the distutils extensions, etc. I want to be able to import setuptools' "build an egg" module without using setuptools at all otherwise, so that I can build an egg in e.g. scons. I want to be able to use setuptools' dependency-handling capabilities to get a list of all the dependencies without using setuptools at all otherwise.

cheers, David
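
For contrast with what David is asking for, here is a rough sketch of what building an egg programmatically looks like today: you still go through setuptools' setup() and the bdist_egg command rather than a standalone egg-building module. The project name and module are made up:

    from setuptools import setup

    # Equivalent to running "python setup.py bdist_egg" on the command line;
    # assumes a module file example.py exists next to this script.
    setup(
        name='example',
        version='0.1',
        py_modules=['example'],
        script_args=['bdist_egg'],
    )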

(following up to my own proposal with a case study)

The Twisted Matrix project is a very big, widely used, well-engineered Python project (http://twistedmatrix.com). It requires zope.interface to function. The Twisted hackers are mostly setuptools-haters, and are certainly not going to start using and depending on it, but they are willing to declare Twisted's dependency on zope.interface in a machine-readable way in order to facilitate correct installation of Twisted. The way they have currently accomplished this is by importing setuptools but attempting not to use it, and then, if setuptools is present in sys.modules, adding the flag "install_requires=['zope.interface']" to setup():

http://twistedmatrix.com/trac/browser/trunk/setup.py

The goal of all this from the perspective of Twisted developers is simply to add "This package requires zope.interface." to their metadata in a way that setuptools, easy_install, pyinstall, distribute, stdeb, bbfreeze, vanguardistas.pydebdep, virtualenv, and other tools will understand. They do not want to use or depend on setuptools, nor do they want setuptools to have any other effects on their project than to emit that one simple fact of dependency metadata.

What I am proposing is that in the next release of Python, all that Twisted developers need to do is put "install_requires=['zope.interface']" into their invocation of distutils.setup(), and the appropriate metadata will be included in the resulting .egg-info in a way that all of the aforementioned tools will understand. This is a modest proposal -- it is backwards compatible, it is likely to be forwards-compatible with future Python packaging tools, and it does not, I hope, cause any problems for people who prefer not to use it.

Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month
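
A minimal sketch of the conditional pattern described above (this is not Twisted's actual setup.py; the project name and values other than the sys.modules check and the install_requires flag are placeholders):

    import sys

    try:
        # Importing setuptools (if available) patches distutils so that
        # setup() below understands the install_requires keyword.
        import setuptools
    except ImportError:
        pass

    from distutils.core import setup

    extra_args = {}
    if 'setuptools' in sys.modules:
        # Only declare the dependency when setuptools is actually in play;
        # plain distutils would not know what to do with it.
        extra_args['install_requires'] = ['zope.interface']

    setup(
        name='example-project',   # placeholder
        version='0.1',
        packages=['example'],     # placeholder
        **extra_args
    )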

On Tue, Sep 23, 2008 at 09:24:00PM -0700, Kevin Teague wrote:
Or they can just use debian! Any debian developers out there up for the task of packaging up the 1500+ odd packages released from the Zope community?
The SchoolTool guys made a tool and built .debs for all of Zope 3 that SchoolTool needs. The resulting packages are here: https://launchpad.net/~schooltool-owners/+archive Marius Gedminas -- Bumper sticker: Alcohol and calculus don't mix. Never drink and derive.

On Sep 24, 2008, at 14:47 PM, Marius Gedminas wrote:
On Tue, Sep 23, 2008 at 09:24:00PM -0700, Kevin Teague wrote:
Or they can just use debian! Any debian developers out there up for the task of packaging up the 1500+ odd packages released from the Zope community?
The SchoolTool guys made a tool and built .debs for all of Zope 3 that SchoolTool needs. The resulting packages are here: https://launchpad.net/~schooltool-owners/+archive
I used the stdeb tool on several Python packages that I maintain and it worked to produce .deb's from Python source distributions. http://stdeb.python-hosting.com Regards, Zooko --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month

zooko writes:
On Sep 24, 2008, at 14:47 PM, Marius Gedminas wrote:
On Tue, Sep 23, 2008 at 09:24:00PM -0700, Kevin Teague wrote:
Or they can just use debian! Any debian developers out there up for the task of packaging up the 1500+ odd packages released from the Zope community?
The SchoolTool guys made a tool and built .debs for all of Zope 3 that SchoolTool needs. The resulting packages are here: https://launchpad.net/~schooltool-owners/+archive
Yes, and we are evaluating how maintainable this is from a packaging point of view.
I used the stdeb tool on several Python packages that I maintain and it worked to produce .deb's from Python source distributions.
See the todo list for why this cannot yet be used for packages that we want to upload to Debian/Ubuntu.

Marius Gedminas wrote:
On Tue, Sep 23, 2008 at 09:24:00PM -0700, Kevin Teague wrote:
Or they can just use debian! Any debian developers out there up for the task of packaging up the 1500+ odd packages released from the Zope community?
The SchoolTool guys made a tool and built .debs for all of Zope 3 that SchoolTool needs. The resulting packages are here: https://launchpad.net/~schooltool-owners/+archive
Yes, but how many of these have made it into an official Debian release? Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

Hi Stephen, On Tue, Sep 23, 2008 at 11:59:18AM +0100, Stephen Pascoe wrote:
I'm interested in hearing what you find so annoying about the egg format because for me it's the one part of the setuptools system that I would keep.
I suppose the best answer is to point you to this thread: http://teams.debian.net/lurker/message/20070904.152810.4f84c924.fr.html

There is also useful information at http://wiki.debian.org/DebianPythonFAQ and http://www.debian.org/doc/packaging-manuals/python-policy/ but I would not bet on the latter being up to date.

The bottom line is "no problem with providing egg-info metadata, but pretty please Python developers, do not code *for* distutils/setuptools/etc. Just find a way to provide useful dependency/meta information, then let your users choose how they install your code on *their* system".

-- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

Nicolas Chauvat wrote:
Baseline is "no problem with providing egg-info metadata, but pretty please Python developers, do not code *for* distutils/setuptools/etc. Just find a way to provide useful dependency/meta information then let your users choose how they install your code on *their* system".
Right, now this I agree with, and it seems a lot of other people do too... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

Hi, [sorry for slowly drifting away from pure distutils-related topics] On Tue, Sep 23, 2008 at 11:47:03AM +0100, Chris Withers wrote:
My main tool is Python, but I have many other tools on my system. I do not want to have as many package management utilities as "subsystems".
Then I suggest you volunteer to maintain the debian packages for every single python package.
Do you really think every single Python package in PyPI deserves to be packaged for every distribution? I don't. How do I make a difference? When I need something I download it. When I find it really useful and plan on using it I package it. Many others are behaving in the same way and the result is "apt-cache search python".
If you have projects this large, then you likely want to roll your own OS packages anyway.
I am not sure what you mean by "OS packages". Do you mean "roll your own distribution" as in "roll your own thing based on Debian and adapt packages to your needs as Ubuntu is doing"?
[Please note that for an experienced Debian developer, making the initial package of a Python module can be a matter of half an hour to a couple hours and releasing a new version a matter of minutes.]
...and for someone not using Debian or not an experienced Debian developer? Despite being a fan of Debian, I'm well aware of just how "friendly" a community it can be to the new user...
Do you expect someone who is not proficient at programming to quickly design and develop a piece of software? What makes you think it would be different for integrating pieces of software into a consistent system? As I said on the pyconuk list, packaging software requires some work and currently there is no way around it. Tools get better over time, but automation is out of reach. As usual, "user != developer".

For someone not using Debian: just be happy with whatever tool you choose to use. For someone not an experienced Debian developer: just wait for someone to do the work you want to benefit from, or learn to do it yourself and get it done.

In my company's case, we worked for years on setting up an efficient environment for software development and system administration, and we are now able to move Python code from a mercurial repository to a production system running Debian in a matter of minutes. The tools we built to support this process have been free software from the very beginning and can be found on our website: http://www.logilab.org/project/logilab-devtools

-- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

Nicolas Chauvat wrote:
Do you really think every single Python package in PyPI deserves to be packaged for every distribution? I don't. How do I make a difference? When I need something I download it. When I find it really useful and plan on using it I package it. Many others are behaving in the same way and the result is "apt-cache search python".
This is narrow-minded. I understand your POV, I really do (I use a Debian-based system myself, and hate the way software installation works on any other system), but what you are saying cannot work in a general manner. For example, installing from sources on most other systems is frowned upon, and rightfully so, because it is even more complicated to do than on a decent Linux system. It is painful because either the OS makes it terribly difficult (Windows), or because you have antique/poorly supported toolsets (old Solaris, etc.). If you mainly use only one OS, you just can't understand the pain.

I am not saying that Python plugins must be THE deployment system, but that it has to be one system, because plugin systems are as pervasive on other OSes as .debs are on Debian. So we should think about what kind of things Python core can provide to help other tools either build "native" packages or eggs, and not have a big pile of code which mixes everything. As Matthias Klose mentioned earlier, a lot of those formats share common requirements. We should talk about those instead of saying my package is bigger than yours.
As usual "user != developer". For someone not using Debian: just be happy with whatever tool you choose to use. For someone not an experienced Debian developer: just wait for someone to do the work you want to benefit from, or learn to do it yourself and get it done.
And how do you distribute new versions of your package? You wait for Debian to package it correctly? For fast-moving packages, Debian is not the ultimate solution, far from it. I mean, it is not as if the OpenSUSE Build Service or the Ubuntu PPA system came from nowhere. There is a need for software developers to distribute newer versions of their software themselves, and in that case the native system (at least used "officially") simply is not appropriate.

cheers, David

2008/9/27 David Cournapeau <david@ar.media.kyoto-u.ac.jp>
Nicolas Chauvat wrote:
Do you really think every single Python package in PyPI deserves to be packaged for every distribution? I don't. How do I make a difference? When I need something I download it. When I find it really useful and plan on using it I package it. Many others are behaving in the same way and the result is "apt-cache search python".
This is narrow-minded. I understand your POV, I really do (I use a debian-based system myself, and hate the way software installation works on any other system), but what you are saying cannot work in a general manner.
IMHO we could have a command-line search feature à la CPAN at PyPI, like what apt provides:

$ pypi-search "foo"

Tarek -- Tarek Ziadé - Directeur Technique INGENIWEB (TM) - SAS 50000 Euros - RC B 438 725 632 Bureaux de la Colline - 1 rue Royale - Bâtiment D - 9ème étage 92210 Saint Cloud - France Phone : 01.78.15.24.00 / Fax : 01 46 02 44 04 http://www.ingeniweb.com - une société du groupe Alter Way

2008/9/27 Tarek Ziade <tarek.ziade@ingeniweb.com>
2008/9/27 David Cournapeau <david@ar.media.kyoto-u.ac.jp>
Nicolas Chauvat wrote:
Do you really think every single Python package in PyPI deserves to be packaged for every distribution? I don't. How do I make a difference? When I need something I download it. When I find it really useful and plan on using it I package it. Many others are behaving in the same way and the result is "apt-cache search python".
This is narrow-minded. I understand your POV, I really do (I use a debian-based system myself, and hate the way software installation works on any other system), but what you are saying cannot work in a general manner.
IMHO we could have a command-line search feature à la CPAN at PyPI like what apt provides.
$ pypi-search "foo"
It exists, it's called yolk, and it rocks :D
Tarek
-- Tarek Ziadé - Directeur Technique INGENIWEB (TM) - SAS 50000 Euros - RC B 438 725 632 Bureaux de la Colline - 1 rue Royale - Bâtiment D - 9ème étage 92210 Saint Cloud - France Phone : 01.78.15.24.00 / Fax : 01 46 02 44 04 http://www.ingeniweb.com - une société du groupe Alter Way
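
For anyone curious what such a search looks like under the hood, here is a rough, hypothetical sketch of the kind of query a tool like yolk or a "pypi-search" script could make against PyPI's XML-RPC interface (Python 2 style to match the era; the exact result fields shown are assumptions):

    import xmlrpclib

    # PyPI exposes an XML-RPC endpoint; search() takes a spec dict and
    # returns a list of matching releases as dictionaries.
    client = xmlrpclib.ServerProxy('http://pypi.python.org/pypi')
    for hit in client.search({'name': 'foo'}):
        print '%(name)s %(version)s - %(summary)s' % hit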

Hi, On Sat, Sep 27, 2008 at 05:18:36PM +0900, David Cournapeau wrote:
Do you really think every single Python package in PyPI deserves to be packaged for every distribution? I don't. How do I make a difference? When I need something I download it. When I find it really useful and plan on using it I package it. Many others are behaving in the same way and the result is "apt-cache search python".
For example, installing from sources on most other systems is frowned upon, and rightfully so, because it is even more complicated to do than on a decent Linux system. It is painful because either the OS makes it terribly difficult (Windows), or because you have antique/poorly supported toolsets (old Solaris, etc.). If you mainly use only one OS, you just can't understand the pain.
I have used many operating systems and I still do from time to time. I frown upon anything that has to be done more than once by hand, including installing things from source.
I am not saying that Python plugins must be THE deployment system, but that it has to be one system, because plugin systems are as pervasive on other OSes as .debs are on Debian. So we should think about what kind of things Python core can provide to help other tools either build "native" packages or eggs, and not have a big pile of code which mixes everything. As Matthias Klose mentioned earlier, a lot of those formats share common requirements. We should talk about those instead of saying my package is bigger than yours.
Sure, the package system I use is bigger than yours (if you are not using Debian), but that's not my main point and insisting on it would turn into an endless flame war. Can we focus on something else?

You call me narrow-minded, but I claim to understand why people came up with distutils/setuptools/eggs etc. I have been there. I felt the need for a tool to easily manage systems and install dependencies. I started writing one myself. Then I discovered Debian and stopped using the other tools I had. Problem solved. For me.

Now that more and more people are using Python and computers, more and more people feel the same need. Not everyone can solve his problem by dropping what he has and adopting Debian. I am well aware of that. I am not trying to convince people to adopt Debian, I am trying to explain to people who probably have not used it or not developed a lot of packages for a large system:

* how I do it very efficiently,
* why it suits my needs,
* why other people trying to make "Python plugin systems" are making my work more difficult when such a system becomes the only/main distribution channel.

I repeat. I am not trying to force other people to use Debian, I am trying to get other people not to force me to use tools I do not need (distutils, etc) for I have good ones already (debian packages).
As usual "user != developer". For someone not using Debian: just be happy with whatever tool you choose to use. For someone not an experienced Debian developer: just wait for someone to do the work you want to benefit from, or learn to do it yourself and get it done.
And how do you distribute new versions of your package? You wait for Debian to package it correctly? For fast-moving packages, Debian is not the ultimate solution, far from it.
As I said in another email, I now get Python code from a mercurial repository to a Debian repository and then to a production system running Debian in a matter of minutes. That's fast enough for me.
I mean, it is not as if the OpenSUSE Build Service or the Ubuntu PPA system came from nowhere. There is a need for software developers to distribute newer versions of their software themselves, and in that case the native system (at least used "officially") simply is not appropriate.
http://ftp.logilab.org/debian/sid/ is not official. Anyone wanting to make their own repository can do it. Anyone wanting to use some "non-official" repository as a source of packages for their system can do it.

I have talked a lot. I think I'll stop flooding this list with my messages. Here is my conclusion: people can do whatever they think is good for them! Just do not use easy_install/whatever as the only distribution channel. If you use it, put a standard tarball with an up-to-date README detailing installation and dependencies on the same download page. Please.

-- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

Nicolas Chauvat wrote:
Sure, the package system I use is bigger than yours (if you are not using Debian), but that's not my main point and insisting on it would turn into an endless flame war. Can we focus on something else?
Sure, that's what I am interested in :)
You call me narrow-minded, but I claim to understand why people came up with distutils/setuptools/eggs etc. I have been there. I felt the need for a tool to easily manage systems and install dependencies. I started writing one myself. Then I discovered Debian and stopped using the other tools I had. Problem solved. For me.
The problem is that Debian packages are not always the solution (even on Debian systems). Two big problems are:

- installation as a non-root user
- developers deploying their own software on a custom Debian repository, which does not scale at all

We have to think about those use cases.
I repeat. I am not trying to force other people to use Debian, I am trying to get other people not to force me to use tools I do not need (distutils, etc) for I have good ones already (debian packages).
This part I don't understand: distutils and Debian dpkg-related tools do have some overlap, but they are not replacements for each other. In particular, distutils manages the details of building Python C extensions, etc.

Where distutils failed big time, IMHO, is that it made things more difficult for you (or for me, for that matter), not easier. Autotools did help packagers; a distutils successor should be able to help without getting in the way, for example by providing simple, discoverable metadata. Wouldn't it help a Debian packager to have a simple description of the metadata for the dependencies? Wouldn't it help if it were easy to set data_dir, doc_dir, etc. according to the FHS? Autotools "packages" are relatively easy to package; I don't see why we could not achieve the same for Python packages.

cheers, David
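
One concrete piece of "simple, discoverable metadata" that already exists today: when a project declares install_requires, setuptools writes a plain-text requires.txt into the project's .egg-info, which a packaging tool could read without ever running setup.py. A rough sketch of doing so follows; the path is just an example, and it assumes the .egg-info is a directory rather than a single file:

    import os

    def read_requires(egg_info_dir):
        """Return the top-level requirements listed in an .egg-info directory."""
        path = os.path.join(egg_info_dir, 'requires.txt')
        if not os.path.exists(path):
            return []
        requires = []
        for line in open(path):
            line = line.strip()
            if line.startswith('['):
                break          # an [extra] section starts; top-level deps are done
            if line:
                requires.append(line)
        return requires

    # Example (hypothetical path):
    print read_requires('/usr/lib/python2.5/site-packages/Example-1.0.egg-info')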

On Mon, Sep 29, 2008 at 1:46 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Nicolas Chauvat wrote:
Sure, the package system I use is bigger than yours (if you are not using Debian), but that's not my main point and insisting on it would turn into an endless flame war. Can we focus on something else?
Sure, that's what I am interested in :)
You call me narrow-minded, but I claim to understand why people came up with distutils/setuptools/eggs etc. I have been there. I felt the need for a tool to easily manage systems and install dependencies. I started writing one myself. Then I discovered Debian and stopped using the other tools I had. Problem solved. For me.
The problem is that Debian packages are not always the solution (even on Debian systems). Two big problems are:

- installation as a non-root user
- developers deploying their own software on a custom Debian repository, which does not scale at all

We have to think about those use cases.
I repeat. I am not trying to force other people to use Debian, I am trying to get other people not to force me to use tools I do not need (distutils, etc) for I have good ones already (debian packages).
This part I don't understand: distutils and Debian dpkg-related tools do have some overlap, but they are not replacements for each other. In particular, distutils manages the details of building Python C extensions, etc.
Where distutils failed big time, IMHO, is that it made things more difficult for you (or for me, for that matter), not easier. Autotools did help packagers; a distutils successor should be able to help without getting in the way, for example by providing simple, discoverable metadata. Wouldn't it help a Debian packager to have a simple description of the metadata for the dependencies? Wouldn't it help if it were easy to set data_dir, doc_dir, etc. according to the FHS? Autotools "packages" are relatively easy to package; I don't see why we could not achieve the same for Python packages.
That is exactly what was brought up in the other thread on distutils-SIG: providing the package metadata in a simple way for OS vendors, without having to deal with things like setup.py, and then having third-party applications that know how to use that metadata to install things on Debian or whatever the system is.

Now, the question is, what would Debian be missing here to install:

http://www.python.org/dev/peps/pep-0345/

If you can come up with a list of missing elements, we could probably start to work on a PEP together.

Tarek
cheers,
David
-- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/

Hi, On Mon, Sep 29, 2008 at 02:09:15PM +0200, Tarek Ziadé wrote:
That is exactly what was brought up in the other thread on distutils-SIG: providing the package metadata in a simple way for OS vendors, without having to deal with things like setup.py,
and then having third-party applications that know how to use that metadata to install things on Debian or whatever the system is.
Now, the question is, what would Debian be missing here to install:
http://www.python.org/dev/peps/pep-0345/
If you can come up with a list of missing elements, we could probably start to work on a PEP together.
I started a thread on debian-python to ask for help: http://lists.debian.org/debian-python/2008/09/msg00025.html

Here is the answer from Josselin Mouette, the author of python-support, a tool that dramatically eases the packaging of Python code for Debian.

---------------------------------------------------------------------

On Monday, 29 September 2008 at 15:12 +0200, Nicolas Chauvat wrote:
Here is where we stand today: http://mail.python.org/pipermail/distutils-sig/2008-September/010126.html
This looks like a step in the right direction if we want to generate inter-module dependencies. Most things defined in the PEP will not be useful for packaging, except for making something like a dh_make_python almost trivial to write. The one thing we'd almost certainly use is the Requires and Provides fields.

However, you should be careful with the notion of version. It is nice to have a lot of flexibility in specifying versioned dependencies, but the more the standard allows, the more complicated it will be to translate this into inter-package dependencies. For example, if you require a minimal version of 1.4, you can translate this to a package version of 1.4; it is a bit hackish but will work if you handle epochs correctly. But if the package you depend on has a Provides: blah (1.4), you have no way to map that to a dependency, because you can't know what other versions of the package will provide.

In all cases, it will be necessary to manually add shlibs-like information to the packages; they could be partly autogenerated like symbol files, but you need a mapping between provided modules and the first version of the package that provides it.

---------------------------------------------------------------------

Good to see this is moving forward :)

-- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

On Tue, Sep 30, 2008 at 10:42 AM, Nicolas Chauvat <nicolas.chauvat@logilab.fr> wrote:
For example, if you require a minimal version of 1.4, you can translate this to a package version of 1.4; it is a bit hackish but will work if you handle epochs correctly. But if the package you depend on has a Provides: blah (1.4), you have no way to map that to a dependency, because you can't know what other versions of the package will provide.
I am not sure I fully understand; could you provide a real-world example?
In all cases, it will be necessary to manually add shlibs-like information to the packages; they could be partly autogenerated like symbol files, but you need a mapping between provided modules and the first version of the package that provides it.
Is this related? http://lists.debian.org/debian-dpkg/2006/01/msg00118.html

On Tuesday, 30 September 2008 at 14:05 +0200, Tarek Ziadé wrote:
On Tue, Sep 30, 2008 at 10:42 AM, Nicolas Chauvat <nicolas.chauvat@logilab.fr> wrote:
For example, if you require a minimal version of 1.4, you can translate this to a package version of 1.4; it is a bit hackish but will work if you handle epochs correctly. But if the package you depend on has a Provides: blah (1.4), you have no way to map that to a dependency, because you can't know what other versions of the package will provide.
I am not sure I fully understand; could you provide a real-world example?
Let's say you have module bar, contained in the package python-bar. The last version is 1.4. After that version, it is decided to distribute it in the same tarball as module foo. It is therefore moved to the package python-foo, which is at version 1.2. In this case, you can specify in the metadata:

Provides: foo
Provides: bar (1.4)

This is the typical use case for versioned provides. Let's say application baz requires module bar with minimal version 1.3; it will have as dependency:

Requires: bar >= 1.3

This way it will be happy to find the versioned provides if module foo is installed, and everyone is happy. Well, except that, if you try to build a package of baz, there is no way to express correctly that you depend on python-bar (>= 1.3) or python-foo (>= 1.2). This is why I'd prefer to have versioned provides simply not part of the specification.

Another thing that can cause issues is exact dependencies. If you require strictly version 1.1 of foo, there is no good way of translating it into a package dependency. All the following will have serious drawbacks when facing the real world:

python-foo (>= 1.1), python-foo (<< 1.1.~)
python-foo (>= 1.1), python-foo (<< 1.2)
python-foo (= 1.1-1)

If you allow requires and provides to be specified in a sophisticated way, people will use it and we will run into unmanageable situations when converting them to packages. If a module provides an API at version 1.2, it will have to still provide it at version 1.3; otherwise the module should remain private and never be installed in a public Python module directory. Just like we rename C libraries when their ABI changes, we need to reach a situation where we can make the same assumptions about Python modules.
In all cases, it will be necessary to manually add shlibs-like information to the packages; they could be partly autogenerated like symbol files, but you need a mapping between provided modules and the first version of the package that provides it.
Is this related? http://lists.debian.org/debian-dpkg/2006/01/msg00118.html
Yes. The thread you point to did not lead to anything actually being implemented, because at that moment we lacked the necessary metadata. Since then, setuptools appeared, but it does not provide it in a sane way and it is not universal. Which is why I'm interested in the metadata format that's discussed here.

From this metadata, we will be able to generate some files that express what is provided and the required version. Something like:

foo 1.0-1
bar 1.2~beta3

This way, if another package requires foo (> 1.1) and bar (without a version requirement), we can convert this dependency into: python-foo (>= 1.1), python-foo (>= 1.2~beta3) which can then be factorized, of course.

Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.
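
To make the mapping Josselin describes a bit more concrete, here is a toy sketch (made-up names and versions, not a real tool) that turns Python-level requirements into Debian-style dependency strings, given a table recording which Debian package first provided each module:

    # Map each Python module to (debian_package, first_package_version_providing_it).
    provided_by = {
        'foo': ('python-foo', '1.0-1'),
        'bar': ('python-foo', '1.2~beta3'),   # bar was folded into python-foo's tarball
    }

    def debian_depends(module, constraint=None):
        """Translate a requirement on a module into a Debian dependency string."""
        package, first_version = provided_by[module]
        if constraint:
            # Only safe when the module's version tracks the package's version,
            # which is exactly the assumption Josselin warns about.
            return '%s (%s)' % (package, constraint)
        return '%s (>= %s)' % (package, first_version)

    print debian_depends('foo', '>= 1.1')   # python-foo (>= 1.1)
    print debian_depends('bar')             # python-foo (>= 1.2~beta3)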

On Tue, Sep 30, 2008 at 3:17 PM, Josselin Mouette <joss@debian.org> wrote:
On Tuesday, 30 September 2008 at 14:05 +0200, Tarek Ziadé wrote:
On Tue, Sep 30, 2008 at 10:42 AM, Nicolas Chauvat <nicolas.chauvat@logilab.fr> wrote:
For example, if you require a minimal version of 1.4, you can translate this to a package version of 1.4; it is a bit hackish but will work if you handle epochs correctly. But if the package you depend on has a Provides: blah (1.4), you have no way to map that to a dependency, because you can't know what other versions of the package will provide.
I am not sure to fully understand, could you provide a real-word example ?
Let's say you have module bar, contained in the package python-bar. The last version is 1.4. After that version, it is decided to distribute it in the same tarball as module foo. It is therefore moved to the package python-foo, which is at version 1.2. In this case, you can specify in the metadata:

Provides: foo
Provides: bar (1.4)

This is the typical use case for versioned provides.
Let's say application baz requires module bar with minimal version 1.3, it will have as dependency: Requires: bar >= 1.3 This way it will be happy to find the versioned provides if module foo is installed, and everyone is happy. Well, except that, if you try to build a package of baz, there is no way to express correctly that you depend on python-bar (>= 1.3) or python-foo (>= 1.2).
This is why I'd prefer to have versioned provides simply not part of the specification.
The "Obsoletes" info could be used maybe. But the main problem I can see is that in any case several versions of the same module can be needed to build one application. That is what tools like zc.buildout or virtualenv exists : they are building an isolated environment where they install the packages so a given Python application can run. In other words the problem we have today with an OS-based installation is that you cannot really have two versions of the same package installed, that would make happy two Python applications. The setuptools project has partly improved this by providing a way to install several version of the same package in Python and give a way to select which one is active.
From your point of view, how could we solve it at the Debian level? By somehow isolating a group of packages that fit the needs of one given application?
(btw, a recent change in Python has allowed us to define per-user site-packages: http://mail.python.org/pipermail/python-dev/2008-January/076108.html)
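
For readers who have not seen it, the setuptools multi-version mechanism Tarek refers to looks roughly like the sketch below: packages installed with easy_install --multi-version sit inactive on disk, and an application activates the version it wants at startup via pkg_resources (the package name and version range here are hypothetical):

    import pkg_resources

    # Activate a specific version range of a (hypothetical) package for this
    # process only; raises an error if the requirement cannot be satisfied.
    pkg_resources.require('foo >= 1.0, < 2.0')

    import foo   # now resolves to the activated distribution
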
Is this related? http://lists.debian.org/debian-dpkg/2006/01/msg00118.html
Yes. The thread you point to did not lead to anything actually being implemented, because at that moment we lacked the necessary metadata. Since then, setuptools appeared, but it does not provide it in a sane way and it is not universal. Which is why I'm interested in the metadata format that's discussed here.
From this metadata, we will be able to generate some files that express what is provided and the required version. Something like:

foo 1.0-1
bar 1.2~beta3
This way, if another package requires foo (> 1.1) and bar (without a version requirement), we can convert this dependency into: python-foo (>= 1.1), python-foo (>= 1.2~beta3) which can then be factorized, of course.
Interesting... That would mean you would do version conflict resolution at the OS level. That makes me think about the previous point: how can two applications that use conflicting versions, which are not compatible with each other (you have to choose one of them), cohabit? Cheers
Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.
-- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/

On Tuesday, 30 September 2008 at 15:49 +0200, Tarek Ziadé wrote:
The "Obsoletes" info could be used maybe. But the main problem I can see is that in any case several versions of the same module can be needed to build one application.
This is indeed a problem, and when it happens, it needs fixing instead of trying to work with it.
That is why tools like zc.buildout or virtualenv exist: they build an isolated environment where they install the packages so a given Python application can run.
In other words the problem we have today with an OS-based installation is that you cannot really have two versions of the same package installed, that would make happy two Python applications.
And this is not a problem, but something that is desired. No, the problem we have today is that some developers are providing modules without API stability, which means you cannot simply depend on a module, you need a specific version. Again, when a C library changes its ABI, we do not allow it to keep the same name. It’s as simple as that.
The setuptools project has partly improved this by providing a way to install several version of the same package in Python and give a way to select which one is active.
This is not an improvement, it is a nightmare for the sysadmin. You cannot install things as simple (and as critical) as security updates if you allow several versions to be installed together.
From your point of view, how could we solve it at the Debian level? By somehow isolating a group of packages that fit the needs of one given application?
I think we need to enforce even more strongly the habit of moving unstable and private-use modules to private directories. It is not viable to add them to public directories. This is something that is done on a case-by-case basis in some Debian packages, but it should become mandatory for all cases where there is no API stability. A tool that eases installation and use of modules in private directories would certainly encourage developers to do so and improve the situation in this matter.
(btw, a recent change in Python has allowed us to define per-user site-packages: http://mail.python.org/pipermail/python-dev/2008-January/076108.html)
This is definitely a nice improvement for those on multi-user systems without administrative rights, and for those who wish to install a more recent version of a specific module. However, I don’t think we should rely on it as the normal way of installing python modules. And especially, we should not rely on on-demand download/installation of modules like setuptools does.
Interesting... That would mean you would do version conflict resolution at the OS level. That makes me think about the previous point: how can two applications that use conflicting versions, which are not compatible with each other (you have to choose one of them), cohabit?
Two conflicting versions must not use the same module namespace. The real, fundamental issue, that generates even more brokenness when you accept it and work around it, is here. It is a nightmare for the developer (who can’t rely on a defined API after "import foo"), a nightmare for the distributor (who has to use broken-by-design selection methods), and a nightmare for the system administrator (who cannot easily track what is installed on the system). Forbid that strictly, and you’ll see that methods that work today for a Linux distribution (where we already forbid it) will work just as nicely for all other distribution mechanisms. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

On Tue, Sep 30, 2008 at 4:27 PM, Josselin Mouette <joss@debian.org> wrote:
In other words the problem we have today with an OS-based installation is that you cannot really have two versions of the same package installed, that would make happy two Python applications.
And this is not a problem, but something that is desired.
No, the problem we have today is that some developers are providing modules without API stability, which means you cannot simply depend on a module, you need a specific version.
Again, when a C library changes its ABI, we do not allow it to keep the same name. It's as simple as that.
I see, so there's no deprecation process for a package? I mean, if you change a public API of your package, you *have* to change its name?

My convention is to (see the sketch below):

- keep the old API and the new API in the new version, let's say "2.0"
- mark the old API as deprecated (we have the "warnings" module in Python to do so)
- remove the old API in the next release, like "2.1"

But I don't want to change the package name. And the development cycles of a Python package are really short compared to OS release cycles; in fact we can have quite a few releases before a package is really stable.
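
A small sketch of that convention using the standard "warnings" module (the function names are made up):

    import warnings

    def old_api(x):
        # Still works in "2.0", but tells callers to migrate before "2.1".
        warnings.warn("old_api() is deprecated; use new_api() instead",
                      DeprecationWarning, stacklevel=2)
        return new_api(x)

    def new_api(x):
        return x * 2
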
The setuptools project has partly improved this by providing a way to install several version of the same package in Python and give a way to select which one is active.
This is not an improvement, it is a nightmare for the sysadmin. You cannot install things as simple (and as critical) as security updates if you allow several versions to be installed together.
mmm... unless the version is "part of the name" in a way.... [cut]
Interesting... That would mean you would do version conflict resolution at the OS level. That makes me think about the previous point: how can two applications that use conflicting versions, which are not compatible with each other (you have to choose one of them), cohabit?
Two conflicting versions must not use the same module namespace. The real, fundamental issue, that generates even more brokenness when you accept it and work around it, is here. It is a nightmare for the developer (who can't rely on a defined API after "import foo"), a nightmare for the distributor (who has to use broken-by-design selection methods), and a nightmare for the system administrator (who cannot easily track what is installed on the system). Forbid that strictly, and you'll see that methods that work today for a Linux distribution (where we already forbid it) will work just as nicely for all other distribution mechanisms.
I have an idea: what about having a "known good set" (KGS), like what Zope has built on its side? A known good set is a set of Python package versions that are known to provide a good execution context for a given version of Python.

Maybe the Python community could maintain a known good set of Python packages at PyPI, with real work on its integrity, like any OS vendor does, I believe. And maybe this KGS could be used by Debian as the reference for package versions: if a package is listed in this KGS, that defines its version for a given version of Python.

Then application developers in Python could work against this KGS in their code. And if they can't get a package added to the official KGS, they will have to be on their own, inside their application, and maintain their own modules.
Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.
-- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/

Tarek Ziadé wrote:
In other words, the problem we have today with an OS-based installation is that you cannot really have two versions of the same package installed, which would make two Python applications happy.
Right, which is why dependencies can often be best matched by a project-based tool like buildout rather than having to have one python setup support all use cases.
No, the problem we have today is that some developers are providing modules without API stability, which means you cannot simply depend on a module, you need a specific version.
This problem is never going away, it's the nature of software.
Again, when a C library changes its ABI, we do not allow it to keep the same name. It's as simple as that.
That's insane, and I bet without trying too hard, I could find examples of violation of this supposed practice.
My convention is to:
- keep the old API and the new API in the new version, let's say "2.0"
- mark the old API as deprecated (we have the "warnings" module in Python to do so)
- remove the old API in the next release, like "2.1"
Right.
But I don't want to change the package name.
Right.
The setuptools project has partly improved this by providing a way to install several versions of the same package in Python, and a way to select which one is active. This is not an improvement, it is a nightmare for the sysadmin.
Absolutely. This multi-version rubbish is totally and utterly insanely wrong.
I have an idea: what about having a "known good set" (KGS) like what Zope has built on its side.
A Known Good Set is a set of Python package versions that are known to provide a good execution context for a given version of Python.
Given how poorly maintained Zope's "KGS" is, I think this is a pipe dream. Besides, accurately specified dependency information, including versions, within a package should suffice. It would be handy if you could also specify python version compatibility in this, something that setuptools does not currently support AFAIK. cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

2008/9/30 Chris Withers <chris@simplistix.co.uk>
Tarek Ziadé wrote:
In other words the problem we have today with an OS-based installation is
that you cannot really have two versions of the same package installed, which would make two Python applications happy.
Right, which is why dependencies can often be best matched by a project-based tool like buildout rather than having to have one python setup support all use cases.
No, the problem we have today is that some developers are providing
modules without API stability, which means you cannot simply depend on a module, you need a specific version.
This problem is never going away, it's the nature of software.
Again, when a C library changes its ABI, we do not allow it to keep the
same name. It's as simple as that.
That's insane, and I bet without trying too hard, I could find examples of violation of this supposed practice.
My convention is to:
- keep the old API and the new API in the new version, let's say "2.0"
- mark the old API as deprecated (we have the "warnings" module in Python to do so)
- remove the old API in the next release, like "2.1"
Right.
But I don't want to change the package name.
Right.
The setuptools project has partly improved this by providing a way to
install several versions of the same package in Python, and a way to select which one is active.
This is not an improvement, it is a nightmare for the sysadmin.
Absolutely. This multi-version rubbish is totally and utterly insanely wrong.
I have an idea: what about having a "known good set" (KGS) like what
Zope has built on its side.
A Known Good Set is a set of Python package versions that are known to provide a good execution context for a given version of Python.
Given how poorly maintained Zope's "KGS" is, I think this is a pipe dream.
Besides, accurately specified dependency information, including versions, within a package should suffice. It would be handy if you could also specify python version compatibility in this, something that setuptools does not currently support AFAIK.
You can use the Requires-Python metadata though. For the KGS I agree that this is a big piece of work, but there's a need to work at a higher level than in your package. That is what zc.buildout brought in a way, but at the application level, and with no respect to the OS level in a way. So we should find a way to generalize this at the Python level imho: being able to develop your package in a known environment, and being able to give that info to the OS. Python frameworks are exploding into a myriad of packages: a Python installation needs to handle up to a hundred public packages now to run a Plone site, for example
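For reference, Requires-Python is one of the fields described by PEP 345 (Metadata-Version 1.2); a package declaring it would carry something like the following in its PKG-INFO, where the project name and version bounds are invented for illustration:

    Metadata-Version: 1.2
    Name: ExampleProject
    Version: 1.0
    Requires-Python: >=2.4, <3.0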
cheers,
Chris
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk _______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
-- Tarek Ziadé - Directeur Technique INGENIWEB (TM) - SAS 50000 Euros - RC B 438 725 632 Bureaux de la Colline - 1 rue Royale - Bâtiment D - 9ème étage 92210 Saint Cloud - France Phone : 01.78.15.24.00 / Fax : 01 46 02 44 04 http://www.ingeniweb.com - une société du groupe Alter Way

Tarek Ziade wrote:
For the KGS I agree that this is a big piece of work, but there's a need to work at a higher level than in your package
Why? You really need to explain to me why the dependency information in each of the packages isn't enough?
Python frameworks are exploding into a myriad of packages: a Python installation needs to handle up to a hundred public packages now to run a Plone site, for example
Yes, Plone and Zope both got the wrong end of the stick by making myriads of eggs rather than a few big ones... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

On Tue, Sep 30, 2008 at 5:55 PM, Chris Withers <chris@simplistix.co.uk> wrote:
Tarek Ziade wrote:
For the KGS I agree that this is a big piece of work, but there's a need to work at a higher level than in your package
Why? You really need to explain to me why the dependency information in each of the packages isn't enough?
Because you can't keep up with the dependencies changed, removed, or introduced by a package you depend on. How do you decide that version 1.2 of bar is the one you should use, when the foo package you use can work with any version of bar? You can define the version of foo, but you can't describe all the versions of the packages foo uses. You'd end up building your own KGS in a way... So a general list of versions can help
Python frameworks are exploding in a myriad of packags : a Python instalation needs to handle up to a hundreds of public packages now to run a plone site for example
Yes, Plone and Zope both got the wrong end of the stick by making myriads of eggs rather than a few big ones...
I think it is a good opportunity to reuse things. Right now I can work on projects that use packages from Pylons AND Plone AND Zope 3. Bigger eggs wouldn't let you reuse things like you can now imho
Chris
-- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
-- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/

Tarek Ziadé wrote:
Tarek Ziade wrote:
For the KGS I agree that this is a big piece of work, but there's a need to work at a higher level than in your package
Why? You really need to explain to me why the dependency information in each of the packages isn't enough?
Because you can't keep up with the dependencies changed, removed, or introduced by a package you depend on.
Why can this not be expressed in the dependency information in the package?
How do you decide that version 1.2 of bar is the one you should use, when the foo package you use can work with any version of bar?
If you are using no other packages that have a dependency on bar that has a specific version requirement, then the answer is that you can use any version of bar you desire.
You can define the version of foo, but can't describe all the versions of the packages foo uses.
Why?
You'd end up building your own KGS in a way..
...you mean like the [versions] section with buildout? Yes, I agree us paranoid people may want to do that, but we really shouldn't need to, provided the packages each correctly define their dependencies...
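For anyone who hasn't seen it, a buildout [versions] pin list looks roughly like this; the project names and numbers below are invented purely as an illustration:

    [buildout]
    versions = versions

    [versions]
    somepackage = 1.0
    anotherpackage = 2.3.1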
So a general list of versions can help
I would be happy to wager that this would never successfully be maintained.
Bigger eggs wouldn't let you reuse things like you can now imho
That doesn't explain the majority of eggs that end up dragging a whole load of other eggs in with them. In this case, they should all be packaged as one egg... cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

Chris Withers wrote:
Tarek Ziadé wrote:
Tarek Ziade wrote:
For the KGS I agree that this is a big piece of work, but there's a need to work at a higher level than in your package
Why? You really need to explain to me why the dependency information in each of the packages isn't enough?
Because you can't keep up with the dependencies changed, removed, or introduced by a package you depend on.
Why can this not be expressed in the dependency information in the package?
I tried this briefly for a while when Setuptools first came out, and I found it completely unmaintainable.

Say I have a package that represents an application. We'll call it FooBlog. I release version 1.0. It uses the Turplango web framework (1.5 at the time of release) and the Storchalmy ORM (0.4), and Turplango uses HardJSON (1.2.1). I want my version 1.0 to keep working. So, I figure I'll add the dependencies:

Turplango==1.5
Storchalmy==0.4

Then HardJSON 2.0 is released, and Turplango only required HardJSON>=1.2, so new installations start installing HardJSON 2.0. But my app happens not to be compatible with that library, and so it's broken. OK... so, I could add HardJSON==1.2.1 in my requirements. But then a small bug fix, HardJSON 1.2.2, comes out, that fixes a security bug. Turplango releases version 1.5.1 that requires HardJSON>=1.2.2. I now have to update FooBlog to require both Turplango==1.5.1 and HardJSON==1.2.2.

Later on, I decide that Turplango 1.6 fixes some important bugs, and I want to try it with my app. I can install Turplango 1.6, but I can't start my app because I'll get a version conflict. So to even experiment with a new version of the app, I have to check out FooBlog, update setup.py, reinstall (setup.py develop) the package, and then I can start using it. But if I've made other hard requirements of packages like HardJSON, I'll have to update all those too.

So... that's the kind of thing I encountered with just a couple dependencies, but in practice it was much worse because there were a lot more than 3 libraries involved. I now think it is best to only use version requirements to express known conflicts. For future versions of packages you can't really know if they will cause conflicts until they are released. -- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org
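To make the scenario concrete, here is a minimal sketch of what a setup.py with those hard pins would look like; all project names are the hypothetical ones from the example above:

    # Minimal sketch of FooBlog 1.0's setup.py with hard pins.
    # All project names (FooBlog, Turplango, Storchalmy, HardJSON) are hypothetical.
    from setuptools import setup, find_packages

    setup(
        name="FooBlog",
        version="1.0",
        packages=find_packages(),
        install_requires=[
            "Turplango==1.5",   # web framework, pinned exactly
            "Storchalmy==0.4",  # ORM, pinned exactly
            # HardJSON is only pulled in indirectly (Turplango requires >=1.2),
            # so a new HardJSON release can still break the application.
        ],
    )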

On Tue, Sep 30, 2008 at 6:37 PM, Ian Bicking <ianb@colorstudy.com> wrote:
Chris Withers wrote:
Tarek Ziadé wrote:
Tarek Ziade wrote:
For the KGS I agree that this is a big piece of work, but there's a need to work at a higher level than in your package
Why? You really need to explain to me why the dependency information in each of the packages isn't enough?
Because you can't keep up with the dependencies changed, removed, or introduced by a package you depend on.
Why can this not be expressed in the dependency information in the package?
I tried this briefly for a while when Setuptools first came out, and I found it completely unmaintainable.
Say I have a package that represents an application. We'll call it FooBlog. I release version 1.0. It uses the Turplango web framework (1.5 at the time of release) and the Storchalmy ORM (0.4), and Turplango uses HardJSON (1.2.1).
I want my version 1.0 to keep working. So, I figure I'll add the dependencies:
Turplango==1.5 Storchalmy==0.4
Then HardJSON 2.0 is released, and Turplango only required HardJSON>=1.2, so new installations start installing HardJSON 2.0. But my app happens not to be compatible with that library, and so it's broken. OK... so, I could add HardJSON==1.2.1 in my requirements.
But then a small bug fix, HardJSON 1.2.2, comes out, that fixes a security bug. Turplango releases version 1.5.1 that requires HardJSON>=1.2.2. I now have to update FooBlog to require both Turplango==1.5.1 and HardJSON==1.2.2.
Later on, I decide that Turplango 1.6 fixes some important bugs, and I want to try it with my app. I can install Turplango 1.6, but I can't start my app because I'll get a version conflict. So to even experiment with a new version of the app, I have to check out FooBlog, update setup.py, reinstall (setup.py develop) the package, and then I can start using it. But if I've made other hard requirements of packages like HardJSON, I'll have to update all those too.
So... that's the kind of thing I encountered with just a couple dependencies, but in practice it was much worse because there were a lot more than 3 libraries involved. I now think it is best to only use version requirements to express known conflicts. For future versions of packages you can't really know if they will cause conflicts until they are released.
Exactly, you can't control everything from your package unless you work in an isolated environment like virtualenv or zc.buildout provides, so I can't see any solution unless someone is taking care of it at a higher level :( Maybe PyPI, though, can automate this when a package is uploaded, by browsing all dependencies and finding relevant conflicts? PyPI "knows" all the packages out there. At least display those conflicts somehow? Or warn about them. (I am pushing this to catalog-sig as well, sorry for the cross-post. I do think, though, that these mailing lists should merge)
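As a rough illustration of the kind of check PyPI (or a client) could run over requirement strings collected from uploaded packages, here is a sketch using pkg_resources; the projects and pins below are made up, reusing the hypothetical names from earlier in the thread:

    # Hypothetical conflict check over collected requirement strings.
    # The package names and pins are invented for illustration only.
    import pkg_resources

    collected = {
        "FooBlog 1.0": "HardJSON==1.2.1",
        "Turplango 1.5.1": "HardJSON>=1.2.2",
    }

    reqs = dict((owner, pkg_resources.Requirement.parse(spec))
                for owner, spec in collected.items())

    for owner, req in reqs.items():
        pins = [version for op, version in req.specs if op == "=="]
        for other_owner, other in reqs.items():
            if other_owner == owner or other.project_name != req.project_name:
                continue
            for version in pins:
                if version not in other:
                    print("%s pins %s %s, which conflicts with %s (%s)"
                          % (owner, req.project_name, version, other_owner, other))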
-- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org
-- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/

Tarek Ziadé wrote:
So... that's the kind of thing I encountered with just a couple dependencies, but in practice it was much worse because there were a lot more than 3 libraries involved. I now think it is best to only use version requirements to express known conflicts. For future versions of packages you can't really know if they will cause conflicts until they are released.
Exactly, you can't control everything from your package unless you work in an isolated environment like virtualenv or zc.buildout provides, so I can't see any solution unless someone is taking care of it at a higher level :(
Maybe PyPI, though, can automate this when a package is uploaded, by browsing all dependencies and finding relevant conflicts? PyPI "knows" all the packages out there.
At least display those conflicts somehow ? or warn about them.
Yes, keeping this version information separate from packages would help, I think. If you find out more information about a conflict it shouldn't require a new release -- new releases take a while to do, and have cascading effects. This kind of metadata isn't so much about the package, as about how the package relates to other packages. If we could somewhat safely have collaborative conflict information that would be nice, though there's different kinds of conflicts so it might be infeasible. It's all too common for a person to just poke around with version stuff until something works, but in a way that is only accurate for the context of their application, and if they submit that information upstream they could easily break other people's setups unnecessarily. -- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org

Le mardi 30 septembre 2008 à 11:37 -0500, Ian Bicking a écrit :
Say I have a package that represents an application. [snip]
Then HardJSON 2.0 is released, and Turplango only required HardJSON>=1.2, so new installations start installing HardJSON 2.0. But my app happens not to be compatible with that library, and so it's broken. OK...
No, please stop here. That’s not OK. If a new version of HardJSON breaks your application, it is friggin’ broken. If that new version is not compatible, it should be called HardJSON2, and nothing will break. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
Le mardi 30 septembre 2008 à 11:37 -0500, Ian Bicking a écrit :
Say I have a package that represents an application.
[snip]
Then HardJSON 2.0 is released, and Turplango only required HardJSON>=1.2, so new installations start installing HardJSON 2.0. But my app happens not to be compatible with that library, and so it's broken. OK...
No, please stop here. That’s not OK. If a new version of HardJSON breaks your application, it is friggin’ broken. If that new version is not compatible, it should be called HardJSON2, and nothing will break.
I disagree with your assertion that the name HAS to imply API compatibility. There ought to be something that specifies API / ABI compatibility, such as the combination of name and some portion of a version number, but too many people depend on a name for marketing or other purposes for us to impose that it indicate technical aspects. If your OS distribution chooses to do things that way, then fine, when your OS builds the distribution, it can rename it to HardJSON2 but that shouldn't be required of every platform. -- Dave

Le mardi 30 septembre 2008 à 15:46 -0500, Dave Peterson a écrit :
Josselin Mouette wrote:
No, please stop here. That’s not OK. If a new version of HardJSON breaks your application, it is friggin’ broken. If that new version is not compatible, it should be called HardJSON2, and nothing will break.
I disagree with your assertion that the name HAS to imply API compatibility. There ought to be something that specifies API / ABI compatibility, such as the combination of name and some portion of a version number, but too many people depend on a name for marketing or other purposes for us to impose that it indicate technical aspects.
The marketing name does not have to be the same as the name of the module you import. The situation where they differ is even quite common. You can also argue for separating the name from the API version, like the soname of a library, and I’ll agree, but in the end it is very similar.
If your OS distribution chooses to do things that way, then fine, when your OS builds the distribution, it can rename it to HardJSON2 but that shouldn't be required of every platform.
We can do that, but we won’t as long as it is possible to do otherwise. It completely breaks compatibility with third-party packages or modules, and it is unnecessarily hard to maintain. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

On Tue, 30 Sep 2008 23:32:14 +0200, Josselin Mouette <joss@debian.org> wrote:
[snip]
The marketing name does not have to be the same as the name of the module you import. The situation where they differ is even quite common.
You can also argue for separating the name from the API version, like the soname of a library, and I’ll agree, but in the end it is very similar.
Do you think this is practical for non-trivial libraries? For any library which has more than one API, the possibility exists for one API to change incompatibly and the other to remain compatible. With larger libraries, the value of changing the module name because one (or some other small fraction of the whole) API changed incompatibly decreases as compared to the cost of updating all software which uses the library to use the new name (much of which may well be unaffected by the incompatible change). I am a huge fan of backward compatibility. You may not find a bigger one (at least in the Python community). I can't understand how this approach can be made feasible though. Should the next release of Twisted include a Python packaged named "twisted2" instead of "twisted"? And the one after that "twisted3"? There are thousands of APIs in Twisted, and we do change them incompatibly (after giving notice programmatically for no less than 12 months). Should we instead give up on this and make all users of Twisted update their code to reflect the new name with each release? Jean-Paul

Josselin Mouette wrote:
Le mardi 30 septembre 2008 à 15:46 -0500, Dave Peterson a écrit :
Josselin Mouette wrote:
No, please stop here. That’s not OK. If a new version of HardJSON breaks your application, it is friggin’ broken. If that new version is not compatible, it should be called HardJSON2, and nothing will break.
I disagree with your assertion that the name HAS to imply API compatibility. There ought to be something that specifies API / ABI compatibility, such as the combination of name and some portion of a version number, but too many people depend on a name for marketing or other purposes for us to impose that it indicate technical aspects.
The marketing name does not have to be the same as the name of the module you import. The situation where they differ is even quite common.
But we already have a separation between project name and module names that are contained within that project. We don't currently declare dependencies on the module names but on the project name. i.e. a dependency on HardJSON > 2.0 does not say anything about what modules you're expecting to import or use, only that you expect to use version 2 of a project called HardJSON. Were you suggesting that change? I think the rest of the comments are easily resolved after the above is clear. -- Dave

Le mardi 30 septembre 2008 à 16:57 -0500, Dave Peterson a écrit :
But we already have a separation between project name and module names that are contained within that project. We don't currently declare dependencies on the module names but on the project name. i.e. a dependency on HardJSON > 2.0 does not say anything about what modules you're expecting to import or use, only that you expect to use version 2 of a project called HardJSON. Were you suggesting that change?
As I understand PEP 345, it proposes to introduce dependencies based on module names, not on project names. This is one of the key points where it is superior to setuptools; especially, if developers correctly change the module name when changing the API, the dependencies are clear and reliable. At the distribution level, we will end up changing the binary package name, but this is not something you have to worry about. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
As I understand PEP 345, it proposes to introduce dependencies based on module names, not on project names. This is one of the key points where it is superior to setuptools; especially, if developers correctly change the module name when changing the API, the dependencies are clear and reliable.
I don't understand this. Why do you need to change the module name? You need to be able to request a given version of the module, yes, but you don't need to change the name itself. Changing the name is an implementation detail of the concept of retrieving a particular version, no?

I mean, when I build my C program against the glibc, I never have to change my compiler commands depending on which version of glibc I want. Internally, this is handled by versioned sonames, at the link and load stage, yes, but only because it is "easy" to do and doable for C, because the library itself is never referenced in the code; IOW, it is the link process in C and similar compiled languages that makes use of the name changes. But other languages use something different; I don't claim any deep understanding of the scheme, but Mono uses a totally different scheme to handle versioning: http://www.mono-project.com/Assemblies_and_the_GAC

Does Mono cause problems for Debian? I think this kind of system is much more adapted to Python than the C model is, because C has this two-step thing that Python does not have. Of course, it is more complicated than changing names. But I fear that people want to have their cake and eat it too: you can't have well maintained packages and reliability without thinking about API/ABI issues. There has to be a balance between OS distributors' needs (mostly only one version) and other people's needs, who may require several parallel installations.

I think setuptools and the whole eggs thing makes it far too easy to install several versions at the same time, making some people naively think it solves the dependency issues, without considering that it replaces a kind of DLL hell with a dependency hell. But OTOH, as a developer, I need to be able to develop my packages, distribute them, and break them from time to time before I reach 1.0. That's almost a requirement of open source development, at least if you follow release early/release often. cheers, David

Le mercredi 01 octobre 2008 à 17:24 +0900, David Cournapeau a écrit :
I mean, when I build my C program against the glibc, I never have to change my compiler commands depending on which version of glibc I want.
Indeed, and the reason is that *functions never disappear from the glibc*.
Internally, this is handled by versioned sonames, at the link and load stage, yes, but only because it is "easy" to do and doable for C, because the library itself is never referenced in the code; IOW, it is the link process in C and similar compiled languages that makes use of the name changes. But other languages use something different; I don't claim any deep understanding of the scheme, but Mono uses a totally different scheme to handle versioning:
http://www.mono-project.com/Assemblies_and_the_GAC
Does mono cause problems to debian ? I think this kind of system is much more adapted to python than the C model is, because C has this two steps thing that python does not have.
I don’t think Mono causes issues, but I’m pretty sure that allowing multiple versions like the GAC allows *will* cause issues. Not for purely-packaged things, where you can safely ignore those directory renames, but if you start mixing distributed packages and user-downloaded stuff, they will run into the same issues we have with setuptools.
Of couse, it is more complicated than changing names. But I fear that people wants their cake and eating it: you can't have well maintained packages and reliability without thinking about API/ABI issues. There has to be a balance between OS distributors need (mostly only one version) and other people needs who may need several parallel installations.
Changing the name is the only thing that is available at the language level, currently. Everything else that comes on top of it is a pile of gross hacks. If you really want something better, with e.g. a name and an API version, you need to convince the core Python developers to introduce such functionality in the interpreter itself. Something that allows you to say "import foo with apiver 2 minver 2.1".
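For comparison, the closest mechanism available today lives in setuptools' pkg_resources rather than in the interpreter; a rough sketch, where "foo" is a hypothetical project:

    # Roughly the closest existing equivalent of "import foo with apiver 2 minver 2.1",
    # using pkg_resources rather than an interpreter-level feature ("foo" is hypothetical).
    import pkg_resources

    pkg_resources.require("foo>=2.1,<3.0")  # put a matching distribution on sys.path
    import foo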
I think setuptools and the whole eggs thing makes it far too easy to install several versions at the same time, making some people naively think it solves the dependency issues, without considering that it replaces a kind of DLL hell with a dependency hell. But OTOH, as a developer, I need to be able to develop my packages, distribute them, and break them from time to time before I reach 1.0. That's almost a requirement of open source development, at least if you follow release early/release often.
C developers have already solved these kinds of issues, and there is no reason why you can’t do the same: keep the modules private until they are deemed stable, introduce deprecation warnings long (and I mean two years, not two weeks) before API breaks, require other developers to use e.g. specific flags to access unstable API, etc. They are all compromises, because there is no silver bullet for dealing with these issues. And setuptools is not a silver bullet either, except for shooting yourself in the foot. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
Indeed, and the reason is that *functions never disappear from the glibc*.
Yes and no. If you remove a function, you're indeed screwed, because you can't handle versioning in the header. But you can handle versioning in libraries at the link step, and the file name of the library is an implementation detail of this versioning.
I don’t think Mono causes issues, but I’m pretty sure that allowing multiple versions like the GAC allows *will* cause issues. Not for purely-packaged things, where you can safely ignore those directory renames, but if you start mixing distributed packages and user-downloaded stuff, they will run into the same issues we have with setuptools.
Please read the article carefully, it is not only about the GAC. It does handle the two conflicting issues: API stability installed globally vs ease of deployment. That's why it is an interesting read IMHO: it addresses both issues. I don't think there is a single chance of seeing something as strict as the C scheme for Python, because it would severely undermine the whole idea of the language being used for prototyping.
Changing name is the only thing that is available at the language level, currently. Everything else that comes on top of it is a pile of gross hacks. If you really want something better, with e.g. a name and an API version, you need to convince the core python developers to introduce such functionality in the interpreter itself. Something that allows to say "import foo with apiver 2 minver 2.1".
Yes, that's exactly what I am saying. But that's the only solution in the long term I can see. Setuptools and co will never be usable for robust deployment: it kind of works for simple cases, or for developers who know what they are doing. But it is inherently unable to handle more complicated cases.
C developers have already solved these kinds of issues, and they no reason why you can’t do the same: keep the modules private until they are deemed stable, introduce deprecation warnings long (and I mean two years, not two weeks) before API breaks, require other developers to use e.g. specific flags to access unstable API, etc.
Again, this is a pipe dream. You can do that for C because it is a dead language (dead in the sense of not evolving). Python is not like that. Changing Python's philosophy has zero chance of success. How many languages that appeared after C do it like C? I don't know of many. cheers, David

David Cournapeau wrote:
Josselin Mouette wrote:
Indeed, and the reason is that *functions never disappear from the glibc*.
Yes and no. If you remove a function, you're indeed screwed, because you can't handle versioning in the header. But you can handle versioning in libraries at the link step, and the file name of the library is an implementation detail of this versioning.
I'm not 100% certain but I think that Josselin is speaking of glibc in particular here and you're speaking of c libraries in general.
I don’t think Mono causes issues, but I’m pretty sure that allowing multiple versions like the GAC allows *will* cause issues. Not for purely-packaged things, where you can safely ignore those directory renames, but if you start mixing distributed packages and user-downloaded stuff, they will run into the same issues we have with setuptools.
Please read the article carefully, it is not only about the GAC. It does handle the two conflicting issues: API stability installed globally vs ease of deployment. That's why it is an interesting read IMHO: it addresses both issues. I don't think there is a single chance of seeing something as strict as the C scheme for Python, because it would severely undermine the whole idea of the language being used for prototyping.
Mono is absolutely horrid in this regard. Those who care about Mono (not our most persuasive speakers, I'm afraid) have asked upstream to stop making that a best practice for Mono applications.

I've said before that ideally a Linux distribution only wants one version of a library. In Fedora we're willing to have compat packages that hold old API versions if we must, but by and large we would rather help upstream apps port their applications forward than have compatibility packages. This is because upstream for the library will always be focusing on the newer versions of the libraries, not the older versions. If applications stay stuck on older versions, we end up having to support libraries by ourselves with no upstream to help us with the old version.

As much as I'd rather not have compat packages, having private versions of third party libraries as advocated in that Mono document is worse. The primary problem is security. If a distro allows applications to have their own private copies of libraries and a security flaw is discovered, we're going to hate life. We'll have to:
1) Find what packages include that library. Unlike when the link goes to a system installed library, this will not cause a dependency between packages. So we can't just query the package metadata to find out what packages are affected.
2) Fix all the versions in all the packages. Because each package includes its own version of the library, there are multiple versions of the library in these packages. If we're unlucky the change will be conceptual and we'll have to fix lots of different looking code.
3) Push rebuilds of all the fixed packages out that our users have to download. There's PR involved here: Security fix to Library Foo vs Security fix to Library Foo, Application, Bar, Baz, [...] Zod. There's also the burden for the users to download the packages.

Compare this with having to fix a set of compat library packages that's not included in other applications:
1) Find all the libraries in the affected set. It will probably be enough to look by package name since these will be like: python-foo1-1.0, python-foo2-2.2, python-foo-3.0
2) Fix the library (probably with help from upstream) and the compat libraries (maybe with upstream help or maybe on our own).
3) Push rebuilds of the library packages for our users to download.

Another concern is licensing. Anytime a package includes other, third party modules, the licensing situation becomes more complex. Are the licensing terms of any of the works being violated? How do we have to list the licensing terms in the package? Are licensing terms for all the packages available? Is everything open source? (Believe it or not, we do find non-OSS stuff in third party directories when we audit these bundled packages :-( )

Another concern is not giving back to upstream. Once a package starts including its own, private copies of a library it becomes more and more tempting for the package to make bug fixes and enhancements on its own copy. This has two serious problems: 1) It becomes harder to port the application forward to a new version because this is no longer what upstream has shipped at any time. 2) The changes may not get back to upstream at all. Those bug fixes and feature enhancements may end up being only part of this package, even though the whole community would benefit.

Another concern is build scripts that become tied to building with/installing the private versions. Distributions have policies on inclusion of third party libraries in another application.
Sometimes upstream has a reason to include a copy of a library for compatibility on Windows or for customers who aren't going to get it from their distribution. In these cases, if the distribution can give a command to the build scripts to not build or install the compat library, all is still well. However, using system libraries often bitrots in an upstream's build scripts because they start caring more about the pegged library version they control than about making things work with system libraries. This makes more work for the packager.

Another concern is shipping of prebuilt binaries. Just before Fedora 9 came out we had to go through our Mono packages, get rid of some, and do extensive work on the build scripts of others. This was because some packagers hadn't been watching their packages very well and upstream had started shipping prebuilt versions of the third party modules they required. For upstream, they were making things easier for their end-users. For us, we had to assure that everything was built from source that was auditable and available if needed (for instance, if one of those third party libraries had a security flaw).

Another case is upstream reluctance to take patches to forward port the application. Unfortunately, when upstreams start including their own, known good versions of libraries inside their packages, they sometimes become reluctant to forward port to a new version of the library. For them, it's moving from a version that they know about to an untested version from upstream. For distributions, which have a vested interest in helping upstreams forward port so they have only one version of a library to maintain in the distro, this is an impediment to helping upstream by providing patches to forward port.

These are some of the reasons that packaging Mono applications is something I personally avoid in Fedora :-) Please, do not go down this road with python. -Toshio

Toshio Kuratomi wrote:
I'm not 100% certain but I think that Josselin is speaking of glibc in particular here and you're speaking of c libraries in general.
Maybe, but I don't see how this changes the point: when you change the soname of a library, it has zero impact on the source code of the software which links against it. That's not true in Python.
I've said before that ideally a Linux distribution only wants one version of a library.
And ideally, developers do not want to care about versioning issues :) There is a middle ground to find; up to now, I think Python distutils and co did not care at all about this, but you cannot ask them to move 180 degrees and do only as Linux vendors want. I understand the reasons why OS vendors want to avoid distributing multiple versions as much as possible. But that's just not realistic in every case. Even in C-derived languages, this is sometimes a PITA. I could give you examples where distributions screwed up the packaging badly and hurt some projects I am involved with. But that would be unfair and beside the point (we all screw up); the point is that there are some practical reasons for sometimes including private copies. Because, for example, in Fortran's case there is this huge mess of gfortran and g77 not being ABI compatible; there are examples in C++, and even in C. You also can't impose on every software project to follow distributions' time-schedules. Reasons for a single version for OS vendors are valid; but so are the ones for having multiple versions. I think compat modules would cover most needs; the problem is that Python does not have a mechanism to request a particular version of a module. But wouldn't it help OS vendors to have such a mechanism (to decrease the burden of compat versions)?
In Fedora we're willing to have compat packages that hold old API versions if we must but by and large we would rather help upstream apps port their applications forward than to have compatibility packages.
Yes, but here again the C comparison breaks down. Some people use Python as a "tool", not so much as a "programming language". Their applications are scripts, or software for experiments, that are not released, because they can't open source them, or simply because they have no use for anyone else. You can't port that. cheers, David

David Cournapeau wrote:
Toshio Kuratomi wrote:
I'm not 100% certain but I think that Josselin is speaking of glibc in particular here and you're speaking of c libraries in general.
Maybe, but I don't see how this changes the point: when you change the soname of a library, it has zero impact on the source code of the software which links against it. That's not true in Python.
<nod>. I just noticed that you guys seemed to be speaking past each other and wanted to point it out.
I've said before that ideally a Linux distribution only wants one version of a library.
And ideally, developers do not want to care about versioning issues :) There is a middle ground to find; up to now, I think python distutils and co did not care at all about this, but you can not ask to move 180 degrees and do only as linux vendors want. I understand the reasons why OS vendors want to avoid distributing multiple versions as much as possible. But that's just not realistic in every case.
Even in C-derived languages, this is sometimes a PITA. I could give you examples where distributions screwed up the packaging badly and hurt some projects I am involved with. But that would be unfair and besides the point (we all screw up); the point is that there are some practical reasons for sometimes including private copies. Because for example in Fortran's case, there is this huge mess of gfortran and g77 not being ABI compatible; there are examples in C++, and even in C. You also can't impose every software to follow distributions time-schedule.
Reasons for a single version for OS vendors are valid; but so are the ones for having multiple versions. I think compat modules would cover most needs; the problem is that Python does not have a mechanism to request a particular version of a module. But wouldn't it help OS vendors to have such a mechanism (to decrease the burden of compat versions)?
Very true! Which is why I say single package versions are ideal for Linux distributions. Ideal being one of those ever striven for, never achieved goals, and Linux distributions being the demographic whose opinion I'm pretending to represent :-) I can definitely understand the need to develop packages with different versions of Python packages than a system might have. (I develop software as well as package it.)

So where the concerns intersect is when you go to distribute your package. To have your package run in as many places as possible, you want to guarantee the correct versions of libraries are installed in those places. OTOH, in order to get Linux distributions interested in packaging your project you really must not use your own private copies of those libraries. (BTW, Josselin seems to be saying something different is true on Debian but I had posted this question to the cross-distro distributions list at freedesktop.org two weeks ago after dealing with it in the banshee package and people seemed to agree that it was not proper packaging. I'll have to ask for clarification here. Perhaps I phrased my question poorly on that list :-)

Anyhow... the problems I outlined in my mail are the reasons that packagers have a wtf moment when they untar a source tarball and find that there's fifteen other upstream packages included. Ways that this can be remedied:
1) Have a source tarball that's separate from binary distribution. The binary distribution can contain the third party modules while the source tarball just contains your code. Distro packagers will love you for this because it means the source tarball is clean and they can just get to work packaging it.
2) If you must distribute the source tarball with third party modules, make sure your build scripts work with the installed system packages instead of the modules you are including. This lets a packager build and install just your code and ignore the rest.
3) Make sure you document how to do this. Good packagers read the README. If you have to rm -rf THIRD_PARTY-DIR prior to building to just build and install your code, mention that.
4) Make sure your package works with vanilla upstream versions of the third party modules. It's tempting to fix things in your local copies of modules. If at all possible don't. If that's not possible, make sure upstream has incorporated the patch and make a note in the README -- using a patched version of Foo-x.y project. The patch is in their svn as of DATE. patch is located in myproject/foo-x.y/patches. Doing this means that the distribution packager of your package can take your patch to the packager of Foo and ask that the patch be incorporated there.
In Fedora we're willing to have compat packages that hold old API versions if we must but by and large we would rather help upstream apps port their applications forward than to have compatibility packages.
Yes, but here again the C comparison breaks down. Some people use Python as a "tool", not so much as a "programming language". Their applications are scripts, or software for experiments, that are not released, because they can't open source them, or simply because they have no use for anyone else. You can't port that.
If complaints in Fedora are any indication this happens with C as well ;-). If by you, you mean me or a distribution, you're right. Of course, the "can't port that" doesn't apply to the person with the script.

The solution at the distro level for the non-OSS-software-in-deployment scenario is to run a distribution without a limited lifespan. If you need something to run reliably on python2.4 for years, run RHEL5 or CentOS or Debian stable. They won't change the major components without reason and you'll be able to run just your changes (more recent packages, etc) on top.

There is no solution at the distro level for experimental, in-development stuff. But I think that using eggs, workingenv, etc is a fine solution for the development case.

Scripts and small, local software are a problem. "I have a script to download por^Wfiles from the internet. You started shipping py3k and now urllib is busted!" Debian has solved one portion of this by shipping different versions of python that are parallel installable. Fedora has solved a different portion by shipping compat packages (using setuptools for parallel install) when there's a need. If the software is small the best answer may be to port. If the software is intricate, the best answer may be to use workingenv or something else from the development case.

I think it's important to note that I see different use cases for using distribution packages and using local solutions like workingenv. Local solutions let you use more current (or less current) versions of things. They let you experiment and give you the option to refuse to port your local scripts. Supporting people overriding what's installed on their OS distro is important for doing these things. But distro packaging serves a very useful purpose as well. It lets people experiment with your work who are just looking for something that can get a specific job done. It lets developers worry about something other than security fixes to packages which they depend on. -Toshio

Le jeudi 02 octobre 2008 à 10:08 -0700, Toshio Kuratomi a écrit :
So where the concerns intersect is when you go to distribute your package. To have your package run in as many places as possible, you want to guarantee the correct versions of libraries are installed in those places. OTOH, in order to get Linux distributions interested in packaging your project you really must not use your own private copies of those libraries.
(BTW, Josselin seems to be saying something different is true on Debian but I had posted this question to the cross-distro distributions-list freedesktop.org two weeks ago after dealing with it in the banshee package and people seemed to agree that it was not proper packaging. I'll have to ask for clarification here. Perhaps I phrased my question poorly on that list :-)
I’m not sure I understand what point you think is different on Debian. We ship several versions of Python at once, but we do not ship several versions of a module unless absolutely necessary.
1) Have a source tarball that's separate from binary distribution. The binary distribution can contain the third party modules while the source tarball just contains your code. Distro packagers will love you for this because it means the source tarball is clean and they can just get to work packaging it.
Full ACK. Repackaging is a pain.
4) make sure your package works with vanilla upstream versions of the third party modules. It's tempting to fix things in your local copies of modules. If at all possible don't. If that's not possible, make sure upstream has incorporated the patch and make a note in the README -- using a patched version of Foo-x.y project. The patch is in their svn as of DATE. patch is located in myproject/foo-x.y/patches. Doing this means that the distribution packager of your package can take your patch to the packager of Foo and ask that the patch be incorporated there.
Amen. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

On Thu, Oct 02, 2008 at 10:08:01AM -0700, Toshio Kuratomi wrote:
4) make sure your package works with vanilla upstream versions of the third party modules. It's tempting to fix things in your local copies of modules. If at all possible don't. If that's not possible, make sure upstream has incorporated the patch and make a note in the README -- using a patched version of Foo-x.y project. The patch is in their svn as of DATE. patch is located in myproject/foo-x.y/patches. Doing this means that the distribution packager of your package can take your patch to the packager of Foo and ask that the patch be incorporated there.
Mercurial's patch queues can be of great help for this. http://www.selenic.com/mercurial/wiki/index.cgi/MqExtension
development stuff. But I think that using eggs, workingenv, etc is a fine solution for the development case.
Someone told me about http://0install.net/ but I have not tested it and do not know how good/bad it is. -- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

David Cournapeau wrote:
when you change the soname of a library, it has 0 impact on the source code of the software which links against it. That's not true in python.
In the Eiffel world, there's a thing called an ACE, which stands for Assembly of Classes in Eiffel. It's a way of specifying a mapping between the class names used internally in the code and where to get them from in the environment. I think Python could do with something similar for managing version issues. Perhaps we could study how ACEs work and see if we can use any ideas from there. We could call it an AMP -- Assembly of Modules in Python. -- Greg
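As a purely speculative illustration of what such an "AMP" might look like in Python today, here is a sketch built on pkg_resources; the aliases, the project names (borrowed from the hypothetical example earlier in the thread), and the assumption that the module name is the lowercased project name are all made up:

    # Purely hypothetical "AMP"-style mapping from the names an application uses
    # to a project and version spec. No such mechanism exists; pkg_resources is
    # only used here for activation. All names below are invented.
    import pkg_resources

    ASSEMBLY = {
        "json_codec": ("HardJSON", ">=1.2.1,<1.3"),
        "orm":        ("Storchalmy", ">=0.4,<0.5"),
    }

    modules = {}
    for alias, (project, spec) in ASSEMBLY.items():
        pkg_resources.require(project + spec)         # activate a matching version
        modules[alias] = __import__(project.lower())  # assumes module == lowercased project name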

Greg Ewing wrote:
I think Python could do with something similar for managing version issues. Perhaps we could study how ACEs work and see if we can use any ideas from there.
Interesting. From your description, it does sound a bit like the GAC for Mono/.NET I was mentioning earlier: http://www.mono-project.com/Assemblies_and_the_GAC I know Bertrand Meyer says mostly good things about .NET technologies, and Eiffel software moved a lot toward .NET technologies, so I would not be surprised if it were somewhat similar. Do you have a link for this? I could not find much info with Google on these Assemblies of Classes for Eiffel. cheers, David

David Cournapeau wrote:
Do you have a link for this? I could not find much info with Google on these Assemblies of Classes for Eiffel.
It seems to be difficult to find any in-depth information about it on the web. There's a summary of the syntax here: http://archive.eiffel.com/nice/language/page.html#HDR199 although it doesn't give much idea of what it all means. There's an example of one here: http://archive.eiffel.com/doc/online/eiffel50/intro/language/invitation-15.h... There's some better info here on the GNU SmartEiffel site: http://smarteiffel.loria.fr/wiki/en/index.php/ACE -- Greg

Ian Bicking wrote:
Say I have a package that represents an application. We'll call it FooBlog. I release version 1.0. It uses the Turplango web framework (1.5 at the time of release) and the Storchalmy ORM (0.4), and Turplango uses HardJSON (1.2.1).
I want my version 1.0 to keep working. So, I figure I'll add the dependencies:
Turplango==1.5 Storchalmy==0.4
Why? I would have suggested:
Turplango>=1.5,<2.0
Storchalmy>=0.4,<0.5
Then HardJSON 2.0 is released, and Turplango only required HardJSON>=1.2, so new installations start installing HardJSON 2.0. But my app happens not to be compatible with that library, and so it's broken.
OK... so, I could add HardJSON==1.2.1 in my requirements.
Not could, should, in fact must. Relying on a dependency provided by a library you're using is suicide. Again, I'd suggest: HardJSON>=1.2.1,<1.3
But then a small bug fix, HardJSON 1.2.2, comes out, that fixes a security bug. Turplango releases version 1.5.1 that requires HardJSON>=1.2.2. I now have to update FooBlog to require both Turplango==1.5.1 and HardJSON==1.2.2.
Not if you'd followed my advice above.
Later on, I decide that Turplango 1.6 fixes some important bugs, and I want to try it with my app. I can install Turplango 1.6, but I can't start my app because I'll get a version conflict.
Not if you'd followed my advice above.
So to even experiment with a new version of the app, I have to check out FooBlog, update setup.py, reinstall (setup.py develop) the package, and then I can start using it.
Right, you're developing FooBlog by changing the software it uses, so it seems natural enough to have to edit FooBlog code. You don't have to check those edits into your SCM ;-)
But if I've made other hard requirements of packages like HardJSON, I'll have to update all those too.
Yes, that's true, and why I recommended what I did. That said, if you're paranoid enough to specify the exact versions (there's nothing wrong with this ;-) ) then it should be no surprise that you need to edit them...
more than 3 libraries involved. I now think it is best to only use version requirements to express known conflicts.
Or likely sources of known conflicts, such as major version increases, which is why I suggested what I did above...
For future versions of packages you can't really know if they will cause conflicts until they are released.
Right, which is why consistency in version numbering for backwards-incompatible changes is important. Myself, I stick pretty rigidly to x.y.z, where:
z = no API change
y = new APIs added
x = old APIs changed or removed
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
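As a sketch, that convention translates into dependency ranges like the following in a setup.py's install_requires; the project names are the hypothetical ones from Ian's example:

    # Range pins matching an x.y.z convention where only x bumps break APIs.
    # Project names are hypothetical, taken from the example in this thread.
    install_requires = [
        "Turplango>=1.5,<2.0",   # any later 1.x: APIs may be added, not removed
        "Storchalmy>=0.4,<0.5",  # for a 0.x project, treat y as the API version
        "HardJSON>=1.2.1,<1.3",
    ]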

Chris Withers wrote:
Ian Bicking wrote:
Say I have a package that represents an application. We'll call it FooBlog. I release version 1.0. It uses the Turplango web framework (1.5 at the time of release) and the Storchalmy ORM (0.4), and Turplango uses HardJSON (1.2.1).
I want my version 1.0 to keep working. So, I figure I'll add the dependencies:
Turplango==1.5 Storchalmy==0.4
Why?
I would have suggested:
Turplango>=1.5,<2.0 Storchalmy>=0.4,<0.5
Then when Turplango 1.6 comes out it'll break my code.
Then HardJSON 2.0 is released, and Turplango only required HardJSON>=1.2, so new installations start installing HardJSON 2.0. But my app happens not to be compatible with that library, and so it's broken.
OK... so, I could add HardJSON==1.2.1 in my requirements.
Not could, should, in fact must. Relying on a dependency provided by a library you're using is suicide.
Again, I'd suggest:
HardJSON >=1.2.1,<1.3
What does 1.3 mean? You imply there is a disciplined use of a versioning pattern, and that every release is a guarantee that the versioning has been properly declared. There isn't a common understanding of versions, and it's common that conflicts are released unintentionally.
But then a small bug fix, HardJSON 1.2.2, comes out, that fixes a security bug. Turplango releases version 1.5.1 that requires HardJSON>=1.2.2. I now have to update FooBlog to require both Turplango==1.5.1 and HardJSON==1.2.2.
Not if you'd followed my advice above.
OK, change that to "a small bug fix comes out as HardJSON 1.3", and the same problems follow. I don't know what the nature of future releases will be.
Later on, I decide that Turplango 1.6 fixes some important bugs, and I want to try it with my app. I can install Turplango 1.6, but I can't start my app because I'll get a version conflict.
Not if you'd followed my advice above.
Now you've introduced an entirely different requirement -- for some reason I am supposed to have known that HardJSON 1.3 would break my code, but that only Turplango 2.0 would cause a conflict, and that Turplango 1.6 wouldn't.
So to even experiment with a new version of the app, I have to check out FooBlog, update setup.py, reinstall (setup.py develop) the package, and then I can start using it.
Right, you're developing FooBlog by changing the software it uses, so it seems natural enough to have to edit FooBlog code. You don't have to check those edits into your SCM ;-)
But if I've made other hard requirements of packages like HardJSON, I'll have to update all those too.
Yes, that's true, and why I recommended what I did. That said, if you're paranoid enough to specify the exact versions (there's nothing wrong with this ;-) ) then it should be no surprise that you need to edit them...
It's not surprising, it's just very annoying.
more than 3 libraries involved. I now think it is best to only use version requirements to express known conflicts.
Or likely sources of known conflicts, such as major version increases, which is why I suggested what I did above...
You presume you can predict likely sources of known conflicts in software that doesn't exist yet. This is simply not true.
For future versions of packages you can't really know if they will cause conflicts until they are released.
Right, which is why consistency in version numbering for backwards incompatible changes is important.
There is no single concept of what backward compatibility even is. You can offer something that fixes my specific example, using knowledge that would not have been available to you at the time you were using the code. That doesn't really prove anything -- I could also come up with conflicts that would break any example you could provide. There's no version change so minor that it can't break anything, and no version change so major that it justifies a cascading set of updates that only change dependency information just to accommodate it. -- Ian Bicking : ianb@colorstudy.com : http://blog.ianbicking.org

Ian Bicking wrote:
Chris Withers wrote:
Ian Bicking wrote:
Say I have a package that represents an application. We'll call it FooBlog. I release version 1.0. It uses the Turplango web framework (1.5 at the time of release) and the Storchalmy ORM (0.4), and Turplango uses HardJSON (1.2.1).
I want my version 1.0 to keep working. So, I figure I'll add the dependencies:
Turplango==1.5 Storchalmy==0.4
Why?
I would have suggested:
Turplango>=1.5,<2.0 Storchalmy>=0.4,<0.5
Then when Turplango 1.6 comes out it'll break my code.
I'm assuming that you, as a consumer of Turplango, understand the versioning structure of Turplango. Based on the above, my model assumed that <2.0 would be api-compatible with 1.x. If that's not the case, adjust the dependencies as necessary.
Not could, should, in fact must. Relying on a dependency provided by a library you're using is suicide.
Again, I'd suggest:
HardJSON >=1.2.1,<1.3
What does 1.3 mean? You imply there is a disciplined use of a versioning pattern,
I think for each usable library, there *is* a versioning pattern. If it's extremely unstable, that *should* push users away from the library.
and that every release is a guarantee that the versioning has been properly declared.
This comes under stability. Shit software is shit software, whether it's because it contains tonnes of bugs or because it doesn't specify its dependencies properly.
There isn't a common understanding of versions,
...within a project, there generally is, which is all that's required here.
and it's common that conflicts are released unintentionally.
Well, if people have drummed into them how important accurate version dependencies are, then this won't happen...
But then a small bug fix, HardJSON 1.2.2 comes out, that fixes a security bug. Turplango releases version 1.5.1 that requires HardJSON>=1.2.2. I now have to update FooBlog to require both Turplango==1.5.1 and HardJSON==1.2.2.
Not if you'd followed my advice above.
OK, change that to "a small bug fix comes out as HardJSON 1.3", and the same problems follow. I don't know what the nature of future releases will be.
See previous comments on the versioning structure used by a library.
Later on, I decide that Turplango 1.6 fixes some important bugs, and I want to try it with my app. I can install Turplango 1.6, but I can't start my app because I'll get a version conflict.
Not if you'd followed my advice above.
Now you've introduced an entirely different requirement -- for some reason I am supposed to have known that HardJSON 1.3 would break my code, but only Turplango 2.0 would cause a conflict. And Turplango 1.6 wouldn't
You're trying to make something out of nothing here. If the version dependencies are specified in setup.py or some kind of KGS they still have to be specified correctly. If they are not, you're screwed...
But if I've made other hard requirements of packages like HardJSON, I'll have to update all those too.
Yes, that's true, and why I recommended what I did. That said, if you're paranoid enough to specify the exact versions (there's nothing wrong with this ;-) ) then it should be no surprise that you need to edit them...
It's not surprising, it's just very annoying.
Well then, only use libraries which properly specify their version dependencies and fix those that don't and you have no problem or annoyance.
Or likely sources of known conflicts, such as major version increases, which is why I suggested what I did above...
You presume you can predict likely sources of known conflicts in software that doesn't exist yet. This is simply not true.
Indeed, but I'm damned sure I can tell you what version ranges of *existing* software should be api compatible.
Right, which is why consistency in version numbering for backwards incompatible changes is important.
There is no single concept of what backward compatibility even is.
There doesn't have to be one single concept, just that each library has to have its own understanding of this so that consumers of that library can express their requirements properly.
You can offer something that fixes my specific example, using knowledge that would not have been available to you at the time you were using the code. That doesn't really prove anything -- I could also come up with conflicts that would break any example you could provide. There's no version change so minor that it can't break anything, and no version change so major that it justifies a cascading set of updates that only change dependency information just to accommodate it.
Well, if you want to be this negative about it, then you can lock down versions. No-one's stopping you, and current tools such as buildout support this. Personally, I just don't think it should be necessary... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk
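The kind of lock-down being referred to is, roughly, a fragment of a buildout.cfg with a pinned [versions] section; the package names and versions here are just the hypothetical ones from this thread:

    [buildout]
    parts = app
    versions = versions

    [versions]
    Turplango = 1.5
    Storchalmy = 0.4
    HardJSON = 1.2.1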

On Fri, Oct 03, 2008 at 04:14:22PM +0100, Chris Withers wrote:
Myself, I stick pretty rigidly to:
x.y.z:
z = no api change
y = new apis added
x = old apis changed or removed
That's our policy at www.logilab.org -- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

On Tuesday, 30 September 2008 at 16:36 +0100, Chris Withers wrote:
No, the problem we have today is that some developers are providing modules without API stability, which means you cannot simply depend on a module, you need a specific version.
This problem is never going away, it's the nature of software.
It doesn’t have to go away per se, but we need proper ways to deal with incompatible changes in the interfaces.
Again, when a C library changes its ABI, we do not allow it to keep the same name. It's as simple as that.
That's insane, and I bet that without trying too hard, I could find examples of violations of this supposed practice.
Of course, Python developers don’t have the monopoly on misunderstanding maintainability requirements. It even happens more often in C, where the ABI can change without any incompatibility in the API. When this happens without a soname change, we either change the soname ourselves (diverging from upstream) or change the package name, making it impossible to install two conflicting versions at once. In Python libraries, this is not possible without changing the code, since the file name and the module name are the same. If a Python module changes its API incompatibly, we are forced to update all reverse dependencies and add versioned conflicts, without being able to ensure none is forgotten, and without enforcing the change for third-party packages. [snip]
Besides, accurately specified dependency information, including versions, within a package should suffice.
It will suffice, but we will not be able to manage it in distributions if you allow too many weird things to be specified in these dependencies. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
On Tuesday, 30 September 2008 at 16:36 +0100, Chris Withers wrote:
No, the problem we have today is that some developers are providing modules without API stability, which means you cannot simply depend on a module, you need a specific version. This problem is never going away, it's the nature of software.
It doesn’t have to go away per se, but we need proper ways to deal with incompatible changes in the interfaces.
Well, the generally accepted way seems to be to increase the major version number...
In Python libraries, this is not possible without changing the code, since the file name and the module name are the same.
The distribution name and package/module name do not have to be the same...
It will suffice, but we will not be able to manage it in distributions if you allow too many weird things to be specified in these dependencies.
Explain "too many weird things"... Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

On Tuesday, 30 September 2008 at 17:08 +0100, Chris Withers wrote:
Josselin Mouette wrote:
It doesn’t have to go away per se, but we need proper ways to deal with incompatible changes in the interfaces.
Well, the generally accepted way seems to be to increase the major version number...
This information is not accessible directly at import time. If you want to rely on it to check the API compatibility, you’ll end up doing the horrible things pygtk and gst-python did. And believe me, that will not be helpful.
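The pygtk pattern being referred to selects an API version at import time, roughly:

    import pygtk
    pygtk.require('2.0')   # pick the 2.x API before importing gtk
    import gtk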
In Python libraries, this is not possible without changing the code, since the file name and the module name are the same.
The distribution name and package/module name do not have to be the same...
Indeed, but if we change the package name without changing the file name, we have to make both packages conflict with each other. This works for distribution packages, but it doesn’t help for third-party addons, and it can make things complicated if two packages need different APIs.
It will suffice, but we will not be able to manage it in distributions if you allow too many weird things to be specified in these dependencies.
Explain "too many weird things"...
I showed already two examples: versioned provides and exact dependencies. That’s just after thinking about it for 5 minutes; if we want it to really work, we need to thoroughly think of what exact kind of information we are able to use. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
This information is not accessible directly at import time.
Two questions as answers:
- why does it need to be?
- why not?
If you want to rely on it to check the API compatibility, you’ll end up doing the horrible things pygtk and gst-python did. And believe me, that will not be helpful.
It's unfortunately not a black'n'white thing... ...but if you require a particular API, just define the appropriate dependency in setup.py. What's the problem with that?
Indeed, but if we change the package name without changing the file name, we have to make both packages conflict with each other.
Why?
This works for distribution packages,
What do you mean by "distribution package"?
but it doesn’t help for third-party addons,
Why not?
and it can make things complicated if two packages need different APIs.
If two packages need two different versions of a third package, then your project is in trouble, api differences or not...
I showed already two examples: versioned provides and exact dependencies.
I don't know what either of these terms means I'm afraid :-S cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk

On Friday, 3 October 2008 at 16:06 +0100, Chris Withers wrote:
Josselin Mouette wrote:
This information is not accessible directly at import time.
Two questions as answers:
- why does it need to be?
It does not strictly need to be, it’s just more convenient; especially since you generally have to be compatible with several APIs and have compatibility code for each of them. Note that the complexity you save by removing the old API is more than offset by the added complexity of such compatibility code.
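A minimal sketch of the kind of compatibility code being described, using hypothetical module and function names:

    try:
        # newer API of the (hypothetical) turplango package
        from turplango import render_template
    except ImportError:
        # fall back to the older API
        from turplango.templates import render as render_template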
- why not?
Currently there is simply no way to express the API version of a module.
If you want to rely on it to check the API compatibility, you’ll end up doing the horrible things pygtk and gst-python did. And believe me, that will not be helpful.
It's unfortunately not a black'n'white thing...
...but if you require a particular API, just define the appropriate dependency in setup.py. What's the problem with that?
There are at least two problems with that:
* you cannot reliably translate this to package dependencies (even when doing it by hand as we do currently);
* it is not possible to install two different APIs for the same module name on the same system.
Indeed, but if we change the package name without changing the file name, we have to make both packages conflict with each other.
Why?
Because otherwise the package manager will choke when trying to install both of them together.
This works for distribution packages,
What do you mean by "distribution package"?
I mean a .deb or a .rpm.
but it doesn’t help for third-party addons,
Why not?
Because you can say in the package metadata: "Conflicts: python-foo", but you cannot express "this package will break local installations of module foo". The user will just break his python installation without knowing why.
and it can make things complicated if two packages need different APIs.
If two packages need two different versions of a third package, then your project is in trouble, api differences or not...
Your project is in trouble because of the total absence of API management. Otherwise there’s no reason why you couldn’t have some dependencies use one API and some other dependencies another one. But the problem is worse than that: you know, there’s not only one application installed on a system. If one project needs one version and the other project another, incompatible version, you cannot install both on the same system.
I showed already two examples: versioned provides and exact dependencies.
I don't know what either of these terms means I'm afraid :-S
Versioned provides mean "module bar at version 1.2 provides functionality of module foo at version 1.0". Exact dependencies mean "module bar at version 1.2 requires module foo at version 1.0 *and only this version*." Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.
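In distutils/PEP 314 metadata terms (purely illustrative, with the hypothetical names used earlier in this thread), the two concepts would look something like:

    from distutils.core import setup

    setup(
        name='bar',
        version='1.2',
        # versioned provide: bar 1.2 offers the foo 1.0 API
        provides=['foo (1.0)'],
        # exact dependency: foo 1.0 and only that version
        requires=['foo (==1.0)'],
    )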

On Tuesday, 30 September 2008 at 17:20 +0200, Tarek Ziadé wrote:
Again, when a C library changes its ABI, we do not allow it to keep the same name. It's as simple as that.
I see, so there's no deprecation process for a package?
Not per se. It is the job of the package manager to propose removing deprecated packages when they are no longer available in the repository.
I mean, if you change a public API of your package, you *have* to change its name?
Yes, this is the requirement for C libraries, and we try to enforce it as well for other languages.
My convention is to:
- keep the old API and the new API in the new version, let's say "2.0"
- mark the old API as deprecated (we have the "warnings" module in Python to do so)
- remove the old API in the next release, like "2.1"
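For illustration, the deprecation step in that list usually looks something like this, using the standard warnings module (the function names are hypothetical):

    import warnings

    def render(template):
        """The new API (hypothetical)."""
        return template

    def old_render(template):
        # deprecated API kept around for one more release
        warnings.warn("old_render() is deprecated; use render() instead",
                      DeprecationWarning, stacklevel=2)
        return render(template)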
But I don't want to change the package name.
And the development cycles in a python package are really short compared to OS systems, in fact we can have quite a few releases before a package is really stable.
I don’t think the requirements are different from those of C library developers. There are, of course, special cases for libraries that are in development; generally we take a snapshot, give it a specific soname and enforce the ABI compatibility in the Debian package. The other possibility is to distribute the library only in a private directory. Nothing in this process is specific to C; the technical details are different for python modules, but we should be able to handle it in a similar way.
This is not an improvement, it is a nightmare for the sysadmin. You cannot install things as simple (and as critical) as security updates if you allow several versions to be installed together.
mmm... unless the version is "part of the name" in a way....
Yes, this is what C libraries do with the SONAME, for which the convention is to postfix it with a number, which changes when the ABI is changed in an incompatible way. I don’t know whether it would be possible to do similar things with python modules, but it is certainly something to look at.
Two conflicting versions must not use the same module namespace.
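Purely as a sketch of the idea (no such mechanism exists for Python today), the analogue of a versioned SONAME would be putting the API version into the importable name itself, so that both could be installed side by side:

    # hypothetical: each incompatible API gets its own importable name
    import turplango1 as turplango    # code written against the 1.x API
    # while other code on the same system could do:
    # import turplango2 as turplango  # code written against the 2.x API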
I have an idea: what about having a "known good set" (KGS) like what Zope has built on its side.
a Known Good Set is a set of python package versions that are known to provide a good execution context for a given version of Python.
Maybe the Python community could maintain a known good set of python packages at PyPI, with real work on its integrity, like any OS vendor does, I believe.
Having a body that enforces API stability for a number of packages would probably prevent such issues from happening in those packages. However, that means relying too much on this body, and experience proves it will quickly lag behind. Furthermore, the need to add packages that are not in the KGS to distributions will arise sooner or later.
And maybe this KGS could be used by Debian as the reference of package versions.
We will always need, for some cases, more recent packages or packages that are not in the KGS.
-> if a package is listed in this KGS, it defines the version, for a given version of Python
You don’t have to define it so strictly. There is no reason why a new version couldn’t be accepted in the KGS for an existing python version, if it has been checked that it will not break existing applications using this module. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
On Tuesday, 30 September 2008 at 17:20 +0200, Tarek Ziadé wrote:
I mean, if you change a public API of your package, you *have* to change its name?
Yes, this is the requirement for C libraries, and we try to enforce it as well for other languages.
Things are somewhat different in C, because the filename of the .so isn't something you refer to in the source code. Applying the same thing to Python would require the version to be specified every time the module is mentioned in an import statement. -- Greg

On Thursday, 2 October 2008 at 12:36 +1200, Greg Ewing wrote:
Josselin Mouette wrote:
On Tuesday, 30 September 2008 at 17:20 +0200, Tarek Ziadé wrote:
I mean, if you change a public API of your package, you *have* to change its name?
Yes, this is the requirement for C libraries, and we try to enforce it as well for other languages.
Things are somewhat different in C, because the filename of the .so isn't something you refer to in the source code.
Applying the same thing to Python would require the version to be specified every time the module is mentioned in an import statement.
There are no ABI issues with Python; it is only when the API changes that software using it breaks. When the API changes incompatibly in C, the problem is the same. It is solved elegantly by pkg-config: the two versions can have a different (versioned) directory for the includes and a different library name, so you only change the pkg-config calls or the -I and -l build options. I wish we had a similar mechanism in Python, allowing the API version to be selected in some way. However, since there is no build step, you need to make it happen somewhere in the code. Cheers, -- .''`. : :' : We are debian.org. Lower your prices, surrender your code. `. `' We will add your hardware and software distinctiveness to `- our own. Resistance is futile.

Josselin Mouette wrote:
if you try to build a package of baz, there is no way to express correctly that you depend on python-bar (>= 1.3) or python-foo (>= 1.2).
Seems to me that baz shouldn't have to say that -- all it should have to say is that it requires bar version 1.3. It's up to the package manager to know how to look inside packages to see what versions of other packages they contain, if such a thing is going to be allowed. Otherwise, whenever one package is moved inside another, then in order to take advantage of that, all other packages that use it would have to have their dependencies updated, which doesn't seem reasonable. I can't see the point of nesting packages like this anyway. If bar really is usable independently of foo, then why not just leave it in a separate tar file and let foo declare a dependency on it? They can be bundled together for convenience of manual distribution if desired, but when installed, such a bundle should be split out into separate packages as far as the package manager sees them. If they're being automatically retrieved from a repository, it makes more sense to keep them separate. There's no more to download that way, and there may be less, since you can just download the packages actually needed. -- Greg

On Sep 29, 2008, at 6:09 AM, Tarek Ziadé wrote:
Now, the question is, what would debian miss in here to install:
It really seems to me that PEP-345's specification of dependency metadata is the wrong starting point. There are not, to my knowledge, any Python packages in existence which use this form of dependency metadata, and there are not, to my knowledge, any Python tools which are capable of producing or consuming it. In contrast, there are a large number of packages already in existence that declare their dependencies in their EGG-INFO/depends.txt. There are many tools -- I don't even know how many -- which already know how to produce and consume that dependency metadata. In fact, one such tool has a patch that I contributed myself to use that dependency metadata to automatically produce the Debian "Depends:" information [1]. I learned yesterday that there is a tool by David Malcolm to do likewise for Fedora RPM packages. We would gain power by continuing to use the format that is already implemented and deployed, instead of asking everyone to switch to a different format. So it seems like the next step is to write a PEP that supersedes the parts of PEP-345 which are about dependency metadata and instead says that the standard way to encode Python dependency metadata is in the EGG-INFO/requires.txt file. Regards, Zooko [1] https://code.launchpad.net/~astraw/stdeb/autofind-depends --- http://allmydata.org -- Tahoe, the Least-Authority Filesystem http://allmydata.com -- back up all your files for $5/month
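For reference, that EGG-INFO dependency metadata is just a flat list of requirement specifiers, one per line, with optional [extra] sections; a hypothetical example using the package names from this thread:

    Turplango >= 1.5, < 2.0
    Storchalmy >= 0.4, < 0.5

    [json]
    HardJSON >= 1.2.1, < 1.3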

On Tue, Sep 30, 2008 at 2:38 PM, zooko <zooko@zooko.com> wrote:
On Sep 29, 2008, at 6:09 AM, Tarek Ziadé wrote:
Now, the question is, what would debian miss in here to install:
It really seems to me that PEP-345's specification of dependency metadata is the wrong starting point.
There are not, to my knowledge, any Python packages in existence which use this form of dependency metadata, and there are not, to my knowledge, any Python tools which are capable of producing or consuming it.
In contrast, there are a large number of packages already in existence that declare their dependencies in their EGG-INFO/depends.txt. There are many tools -- I don't even know how many -- which already know how to produce and consume that dependency metadata.
In fact, one such tool has a patch that I contributed myself to use that dependency metadata to automatically produce the Debian "Depends:" information [1]. I learned yesterday that there is a tool by David Malcolm to do likewise for Fedora RPM packages.
We would gain power by continuing to use the format that is already implemented and deployed, instead of asking everyone to switch to a different format.
The point is not to switch to a different format but to make sure:
- we are able to read it without a setup.py magic call
- we have everything OS vendors need in this metadata to work with the package, and otherwise propose some extensions
So it seems like the next step is to write a PEP that supersedes the parts of PEP-345 which are about dependency metadata and instead says that the standard way to encode Python dependency metadata is in the EGG-INFO/requires.txt file.
I would go further and say that we shouldn't have to run a command to generate the EGG-INFO or PKG-INFO or whatever; they should be available in the package directly, in a flat file - maybe in a "package_info.py" file, I don't know, or a .cfg file (see the sketch after this message). But we shouldn't depend on a setup.py command call to read them or on a directory built by a command. That is one simple evolution I'd like to propose in the PEP I am working on. Regards,
-- Tarek Ziadé | Association AfPy | www.afpy.org Blog FR | http://programmation-python.org Blog EN | http://tarekziade.wordpress.com/
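Purely as a sketch of what is being proposed here (no such file exists today, and the file name and layout are invented for illustration), a flat, statically readable metadata file might look something like:

    # package_info.cfg -- hypothetical static metadata,
    # readable without executing setup.py
    [metadata]
    name = FooBlog
    version = 1.0

    [dependencies]
    requires =
        Turplango >= 1.5, < 2.0
        Storchalmy >= 0.4, < 0.5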

On Mon, Sep 29, 2008 at 08:46:15PM +0900, David Cournapeau wrote:
The problem is that debian packages are not always the solution (even on debian systems). Two big problems are:
- installation as non root
True, this is a common use case. Unfortunately, this use case is common because users get stuck on a system with no way to get things installed on it. No admin rights, no way to get the contracted sysadmin to install things, etc. So far I have not found a good solution to this problem when I had to face it. Tried several things including a "user-side gentoo". Did not work well. I see why easy_install could be a solution.
- developers deploying their own software on a custom debian repository does not scale at all.
Why do you think that?
Where distutils failed big time IMHO is that it made it more difficult for you (or for me for that matter), not easier. Autotools did help packagers; a distutils successor should be able to help without getting in the way.
Yes.
For example, by providing simple discoverable meta-data. Wouldn't it help a debian packager to have a simple description of the meta-data for the dependencies? Wouldn't it help if it was easy to set data_dir, doc_dir, etc... according to the FHS? Autotools "packages" are relatively easy to package; I don't see why we could not achieve the same for python packages.
Good. Let's do it. Do you agree with Tarek that writing a PEP is a good approach? -- Nicolas Chauvat logilab.fr - services en informatique scientifique et gestion de connaissances

Nicolas Chauvat wrote:
- developers deploying their own software on a custom debian repository does not scale at all.
Why do you think that?
With N vendors packaging P packages, it is not hard to imagine that many packages will overlap, and since they are independently built, it will fail pretty quickly for people who use several of those repositories. Package distribution is very centralized with rpm/deb. That's a big shortcoming of the current software distribution on Linux IMHO.
Good. Let's do it. Do you agree with Tarek that writing a PEP is a good approach?
I think there is already enough material from people familiar with the problems in the last few days emails on this ML to write something semi-formalized, as a basis for further discussion, yes. cheers, David
participants (17)
- Chris Withers
- Dave Peterson
- David Cournapeau
- Greg Ewing
- Ian Bicking
- Jean-Paul Calderone
- Josselin Mouette
- Kevin Teague
- Marius Gedminas
- Matthias Klose
- Nicolas Chauvat
- Russel Winder
- Stephen Pascoe
- Tarek Ziade
- Tarek Ziadé
- Toshio Kuratomi
- zooko