Hi all,

We need to make a decision about the packaging module in Python 3.3. Please read this message and breathe deeply before replying :)

[Sorry this ends up being so long; Tarek, Georg, Guido, I hope you have the time to read it.]

Let me first summarize the history of packaging in the standard library. (Please correct me if I misremember something; this email expresses my opinion and I did not talk with Tarek or Alexis before writing it.)

Two years ago the distutils2 (hereafter d2) project was started outside of the stdlib, to allow for fast-paced changes, releases and testing before merging it back. Changes in distutils were reverted to go back to misfeature-for-misfeature compatibility (but not bug-for-bug: bug fixes are allowed, unless we know for sure everyone is depending on them or working around them). Tarek’s first hope was to have something that could be included in 2.7 and 3.2, but these deadlines came too fast. At one point near the start of 2011 (didn’t find the email) there was a discussion with Martin about adding support for the stable ABI or parallel builds to distutils, in which Tarek and I opposed adding this new feature to distutils as per the freeze policy, and Martin declared he was not willing to work outside of the standard library. We (d2 developers and python-dev) then quickly agreed that distutils2 would be merged back after the release of 3.2, which was done.

There was no PEP requested for this addition, maybe because this was not a fully new module but an improvement of an existing one with real-world-tested features, or maybe just because nobody thought about the process. In retrospect, a PEP would have really helped define the scope of the changes and the roadmap for packaging.

Real life caused contributors to come and go, and the primary maintainer (Tarek at first, me since last December) to be at times very busy (like me these last three months), with the result that packaging is in my opinion just not ready. Many big and small things need more work: the three packaging PEPs implemented in d2 have small flaws or missing pieces (I’m not repeating the list here to avoid spawning subthreads) that need to be addressed; we’ve started to get feedback from users and developers only recently (pip and buildout devs since last PyCon, for example); the public Python API of d2 is far from great; the implementation is of very unequal quality; important features have patches that are not fully ready (and I do acknowledge that I am the blocker for reviews on many of them); the compiler system has not been revised; tests are not all clear and robust; some of the bdist commands need to be removed; a new bdist format needs to be designed; etc.

With beta coming, a way to deal with that unfortunate situation needs to be found. We could (a) grant an exception to packaging to allow changes after beta1; (b) keep packaging as it is now under a provisional status, with due warnings that many things are expected to change; (c) remove the unstable parts and deliver a subset that works (proposed by Tarek to the Pyramid author on distutils-sig); (d) not release packaging as part of Python 3.3 (I think that was also suggested on distutils-sig last month).

I don’t think (a) would give us enough time; we really want a few months (and releases) to hash out the API (most notably with the pip and buildout developers) and clean up the bdist situation. Likewise (c) would require developer (my) time that is currently in short supply. (b) also requires time and would make development harder, not to mention probable user pain. This leaves (d), after long reflection, as my preferred choice, even though I disliked the idea at first (and I fully expect Tarek to feel the same way).

I’d like to stress that this is not as bad as it appears at first. We (I) will have to craft reassuring wording to explain why 3.3b1 does not include packaging any more, but I believe that it would be worse for our users (code-wise and PR-wise) to deliver a half-finished version in 3.3 than to retract it and wait for 3.4. And if we keep in mind that many people are still using a 2.x version, releasing in 3.3 or 3.4 makes no difference for them: the standalone releases on PyPI will keep coming.

Developer-wise, this would *not* mean that the considerable effort that went into porting and merging, and the really appreciated patches from developers such as Vinay, would have been in vain: even if packaging is removed from the main repo (or just from the release systems), there would be a clone to continue development, or the module would be added back right after the 3.3 release, or we would develop in the d2 repo and merge it back when it’s ready; this is really an implementation detail for the decision. My point is that the work will not be lost.

Thanks for reading; please express your opinion (especially Tarek as d2 project lead, Georg as RM and Guido as BDFL).
Éric Araujo wrote:
This leaves (d), after long reflection, as my preferred choice, even though I disliked the idea at first (and I fully expect Tarek to feel the same way).
Thanks for reading; please express your opinion (especially Tarek as d2 project lead, Georg as RM and Guido as BDFL).
I would go with (d) -- it's still available on PyPI, and having a half-done product in the final release would not be good. ~Ethan~ (as ordinary user ;)
Reverting and writing a full packaging PEP for 3.4 sounds like a wise course of action to me. Regards, Nick. -- Sent from my phone, thus the relative brevity :)
Nick nailed it (again). On Tue, Jun 19, 2012 at 3:14 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Reverting and writing a full packaging PEP for 3.4 sounds like a wise course of action to me.
Regards, Nick. -- Sent from my phone, thus the relative brevity :)
-- --Guido van Rossum (python.org/~guido)
On Tue, 19 Jun 2012 21:36:35 -0700 Guido van Rossum <guido@python.org> wrote:
Nick nailed it (again).
Let's make things clear: packaging is suffering from a lack of developer involvement, and a lack of user interest. What makes you think that removing packaging from 3.3, and adding the constraint of a new PEP to be written, will actually *improve* things? Regards Antoine.
On 06/20, Antoine Pitrou wrote:
Let's make things clear: packaging is suffering from a lack of developer involvement,
Absolutely. And to be more precise: solid hands-on leadership. Eric wrote it in his original mail: both packaging maintainers are burned out/busy. That’s a state that is very unlikely to attract more developers – myself included.
and a lack of user interest.
Maybe I'm getting you wrong here, but ISTM that proper packaging is in the short list on nearly everybody’s “things I wish they'd fix in Python”.
On Wed, 20 Jun 2012 13:20:04 +0200 Hynek Schlawack <hs@ox.cx> wrote:
and a lack of user interest.
Maybe I'm getting you wrong here, but ISTM that proper packaging is in the short list on nearly everybody’s “things I wish they'd fix in Python”.
I agree, but I think people have also been burnt by the setuptools maintenance problem, the setuptools -> distribute migration, the easy_install / pip duality, and other stuff. I'm not sure they want to try out "yet another third-party distutils improvement from the cheeseshop". Regards Antoine.
Hi all,

Sorry I can’t take the time to reply to all messages; this week I’m fully busy with work and moving out. To answer or correct a few things:

- I am lacking time these months, but that’s because I’m still getting used to having a full-time job and being settled into a new country. With the feedback we’ve been getting from people recently, I am motivated, not burned out.
- I have started building a group of distutils2 contributors here in Montreal. People are motivated, but it takes time to learn the codebase and tackle the big things.
- The four modules identified as a minimal, standalone, good subset all have big problems (the PEPs have open issues, and the modules' APIs need improvements).
- Tarek, Georg and Guido have pronounced. With all the respect I have for Antoine’s opinion, and the valid concerns he raises and that I don’t answer here, I consider option (d) accepted and will scrape together an hour to do it before b1.

Regards
Am 20.06.2012 17:34, schrieb Éric Araujo:
Hi all,
Sorry I can’t take the time to reply to all messages, this week I’m fully busy with work and moving out.
To answer or correct a few things:
- I am lacking time these months, but that’s because I’m still getting used to having a full-time job and being settled into a new country. With the feedback we’ve been getting from people recently, I am motivated, not burned out.
- I have started building a group of distutils2 contributors here in Montreal. People are motivated, but it takes time to learn the codebase and tackle on the big things.
- The four modules identified as minimal, standalone, good subset all have big problems (the PEPs have open issues, and the modules' APIs need improvements).
Tarek seems to think otherwise... looks like in the end, this subset could only be included as "provisional". Georg
On 6/20/12 5:44 PM, Georg Brandl wrote:
Am 20.06.2012 17:34, schrieb Éric Araujo:
Hi all,
Sorry I can’t take the time to reply to all messages, this week I’m fully busy with work and moving out.
To answer or correct a few things:
- I am lacking time these months, but that’s because I’m still getting used to having a full-time job and being settled into a new country. With the feedback we’ve been getting from people recently, I am motivated, not burned out.
- I have started building a group of distutils2 contributors here in Montreal. People are motivated, but it takes time to learn the codebase and tackle on the big things.
- The four modules identified as minimal, standalone, good subset all have big problems (the PEPs have open issues, and the modules' APIs need improvements).
Tarek seems to think otherwise... looks like in the end, this subset could only be included as "provisional". Georg
I defer to Eric -- my answers are probably missing recent changes he knows about.
On 19 June 2012 22:46, Éric Araujo <merwok@netwok.org> wrote: [...]
This leaves (d), after long reflection, as my preferred choice, even though I disliked the idea at first (and I fully expect Tarek to feel the same way).
I agree with Nick. It's regrettable, but this is probably the wisest course of action. Remove packaging from 3.3, create a PEP clearly defining what packaging should be, and aim to implement for 3.4. It seems to me that there's a lot of interest in the packaging module, but it's fragmented and people have very different goals and expectations. Developing a PEP will likely be a big task in itself, but I'd hope that a well-crafted PEP will provide something the various people with an interest could get behind and work together on, which might help ease the developer time issue. (Assuming, of course, that championing the PEP doesn't burn Éric out completely...) Paul.
On 06/19/2012 05:46 PM, Éric Araujo wrote:
Hi all,
We need to make a decision about the packaging module in Python 3.3. Please read this message and breathe deeply before replying :)
...
With beta coming, a way to deal with that unfortunate situation needs to be found. We could (a) grant an exception to packaging to allow changes after beta1; (b) keep packaging as it is now under a provisional status, with due warnings that many things are expected to change; (c) remove the unstable parts and deliver a subset that works (proposed by Tarek to the Pyramid author on distutils-sig); (d) not release packaging as part of Python 3.3 (I think that was also suggested on distutils-sig last month).
I think it'd be very wise to choose (d) here. We've lived so long without a credible packaging story that waiting one (or even two) more major release cycles isn't going to make a huge difference in the long run but including a version of packaging now which gets fixed in a rush would probably end up muddying the already dark waters of Python software distribution. - C
On Tue, 19 Jun 2012 17:46:30 -0400 Éric Araujo <merwok@netwok.org> wrote:
I don’t think (a) would give us enough time; we really want a few months (and releases) to hash out the API (most notably with the pip and buildout developers) and clean the bdist situation. Likewise (c) would require developer (my) time that is currently in short supply. (b) also requires time and would make development harder, not to mention probable user pain. This leaves (d), after long reflection, as my preferred choice, even though I disliked the idea at first (and I fully expect Tarek to feel the same way).
The question is what will happen after 3.3. There doesn't seem to be a lot of activity around the project, does it? Regards Antoine.
On Wed, Jun 20, 2012 at 11:23 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
The question is what will happen after 3.3. There doesn't seem to be a lot of activity around the project, does it?
I think the desire is there, but there are enough "good enough" approaches around that people find other more immediately satisfying things to do with their time (hammering out consensus on packaging issues takes quite a bit of lead time to fully resolve). This will make a good guinea pig for my "release alphas early" proposal, though :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, 20 Jun 2012 15:00:52 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 11:23 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
The question is what will happen after 3.3. There doesn't seem to be a lot of activity around the project, does it?
I think the desire is there,
What makes you think that, exactly? Regards Antoine.
On 6/20/12 11:04 AM, Antoine Pitrou wrote:
On Wed, 20 Jun 2012 15:00:52 +1000 Nick Coghlan<ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 11:23 AM, Antoine Pitrou<solipsis@pitrou.net> wrote:
The question is what will happen after 3.3. There doesn't seem to be a lot of activity around the project, does it?
I think the desire is there,
What makes you think that, exactly?
Maybe because the packaging fatigue occurs around 3 years after you start fighting that beast, and we do have fresh blood working on it? :)
Regards
Antoine.
What is the status of the third party module on PyPI (distutils2)? Does it contain all fixes done in the packaging module? Does it have exactly the same API? Does it support Python 2.5 to 3.3, or maybe also 2.4? How is the distutils2 module installed? Installed manually? Using pip or setuptools? Is distutils2 included in some Linux distributions? If it's simple to install distutils2, it's not a big deal to not have it in the stdlib.

--

It is sometimes a pain to have a buggy module in Python. For example, I got a lot of issues with the subprocess module of Python 2.5. I started to include a copy of the subprocess module from Python 2.7 in my projects to work around these issues. In my previous work we also backported various modules to get the latest version of the xmlrpc client on Python 2.5 (especially for HTTP/1.1, to avoid opening a new TCP socket at each request).

I don't want to reopen the discussion "the stdlib should be an external project". I just want to confirm that it is better to wait until important users of the packaging API have finished their work (on porting their projects to distutils2, especially pip) before we can declare the module (and its API) as stable.

By the way, what is the status of "pip using distutils2"?

Victor
On Wednesday, June 20, 2012 at 2:36 AM, Victor Stinner wrote:
What is the status of the third party module on PyPI (distutils2)? Does it contain all fixes done in the packaging module? Does it have exactly the same API? Does it support Python 2.5 to 3.3, or maybe also 2.4?
How is the distutils2 module installed? Installed manually? Using pip or setuptools? Is distutils2 included in some Linux distributions?
If it's simple to install distutils2, it's not a big deal to not have it in the stdlib.
--
It is sometimes a pain to have a buggy module in Python. For example, I got a lot of issues with the subprocess module of Python 2.5. I started to include a copy of the subprocess module from Python 2.7 in my projects to workaround these issues.
In my previous work we did also backport various modules to get the last version of the xmlrpc client on Python 2.5 (especially for HTTP/1.1, to not open a new TCP socket at each request).
I don't want to reopen the discussion "the stdlib should be an external project". I just want to confirm that it is better to wait until important users of the packaging API have finished their work (on porting their project to distutils2, especially pip), before we can declare the module (and its API) as stable.
By the way, what is the status of "pip using distutils2"?
Some students started on a pip2 that was based on distutils2, but I don't think they've really done much/anything with actually using distutils2 and have mostly been working on other parts.
Victor
Am 19.06.2012 23:46, schrieb Éric Araujo: Thanks for the detailed explanation, Éric. Just quoting this paragraph, since it contains the possibilities to judge:
With beta coming, a way to deal with that unfortunate situation needs to be found. We could (a) grant an exception to packaging to allow changes after beta1; (b) keep packaging as it is now under a provisional status, with due warnings that many things are expected to change; (c) remove the unstable parts and deliver a subset that works (proposed by Tarek to the Pyramid author on distutils-sig); (d) not release packaging as part of Python 3.3 (I think that was also suggested on distutils-sig last month).
(a) and (b) are certainly out of the question. packaging must be solid when shipped, and there's not enough time. (c) might work (depending on what features we're talking about), but you say yourself that you won't be able to spend the time required, so I agree with basically everybody else that (d) is the way to go (together with a PEP). Georg
On Tue, Jun 19, 2012 at 11:46 PM, Éric Araujo <merwok@netwok.org> wrote:
I don’t think (a) would give us enough time; we really want a few months (and releases) to hash out the API (most notably with the pip and buildout developers) and clean the bdist situation. Likewise (c) would require developer (my) time that is currently in short supply. (b) also requires time and would make development harder, not to mention probable user pain. This leaves (d), after long reflection, as my preferred choice, even though I disliked the idea at first (and I fully expect Tarek to feel the same way).
It's a pity, but it sounds like the way to go. This may be crazy, but just idly wondering: is there an opportunity for the PSF to make things better by throwing some money at it? Packaging appears to be one of those Hard problems, it might be a good investment. Cheers, Dirkjan
This may be crazy, but just idly wondering: is there an opportunity for the PSF to make things better by throwing some money at it? Packaging appears to be one of those Hard problems, it might be a good investment.
Only if somebody steps forward to take the money - and somebody who can be trusted to achieve something, as well. The general problem is that issues may only occur when packages actually use the library; so it may even be difficult to fix it in a concerted effort since that fixing may actually spread over several months (or years). Regards, Martin
On 6/19/12 11:46 PM, Éric Araujo wrote: ...
I don’t think (a) would give us enough time; we really want a few months (and releases) to hash out the API (most notably with the pip and buildout developers) and clean the bdist situation. Likewise (c) would require developer (my) time that is currently in short supply. (b) also requires time and would make development harder, not to mention probable user pain. This leaves (d), after long reflection, as my preferred choice, even though I disliked the idea at first (and I fully expect Tarek to feel the same way).
Yeah I feel the same way. +1 for (d).

I had unfortunately no time lately. Thanks for picking up things.

We want a solid distutils replacement, and I think we wrote solid PEPs and seem to have found consensus for most issues in the past two years. So I prefer to hold it and have a solid implementation in the stdlib. The only thing I am asking is that we refrain from doing *anything* in distutils and continue to declare it frozen, because I know it will be tempting to do stuff there...

Cheers
Tarek
On Wed, Jun 20, 2012 at 10:55 AM, Tarek Ziadé <tarek@ziade.org> wrote:
So I prefer to hold it and have a solid implementation in the stdlib. The only thing I am asking is that we refrain from doing *anything* in distutils and continue to declare it frozen, because I know it will be tempting to do stuff there...
That policy has been a bit annoying. Gentoo has been carrying patches forever to improve compilation with C++ stuff (mostly about correctly passing on environment variables), and forward-porting them on every release gets tedious, but the packaging/distutils2 effort has made it harder to get them included in plain distutils. I understand there shouldn't be crazy patching in distutils, but allowing it to inch forward a little would make the lives of the Gentoo Python team easier, at least. Cheers, Dirkjan
On Wed, 20 Jun 2012 11:05:43 +0200 Dirkjan Ochtman <dirkjan@ochtman.nl> wrote:
On Wed, Jun 20, 2012 at 10:55 AM, Tarek Ziadé <tarek@ziade.org> wrote:
So I prefer to hold it and have a solid implementation in the stdlib. The only thing I am asking is that we refrain from doing *anything* in distutils and continue to declare it frozen, because I know it will be tempting to do stuff there...
That policy has been a bit annoying. Gentoo has been carrying patches forever to improve compilation with C++ stuff (mostly about correctly passing on environment variables), and forward-porting them on every release gets tedious, but the packaging/distutils2 effort has made it harder to get them included in plain distutils. I understand there shouldn't be crazy patching in distutils, but allowing it to inch forward a little would make the lives of the Gentoo Python team easier, at least.
I think the whole idea that distutils should be frozen and improvements should only go in distutils2 has been misguided. Had distutils been improved instead, many of those enhancements would already have been available in 3.2 (and others would soon be released in 3.3). Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO. Regards Antoine.
On 6/20/12 11:12 AM, Antoine Pitrou wrote:
I think the whole idea that distutils should be frozen and improvements should only go in distutils2 has been misguided. Had distutils been improved instead, many of those enhancements would already have been available in 3.2 (and others would soon be released in 3.3).
I tried to improve distutils and I was stopped and told to start distutils2, because distutils is so rotten that any *real* change/improvement potentially breaks the outside world. This has not changed.
Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO.
So what are you suggesting, since you seem to know what's a mistake and what's not? (time-travel machine not allowed)
Regards
Antoine.
On Wed, 20 Jun 2012 11:22:07 +0200 Tarek Ziadé <tarek@ziade.org> wrote:
I tried to improve Distutils and I was stopped and told to start distutils2, because distutils is so rotten, any *real* change/improvement potentially breaks the outside world.
If distutils was so rotten, distutils2 would not reuse much of its structure and concepts (and test suite), would it? Most of the distutils2 improvements (new PEPs, setup.cfg, etc.) were totally possible in distutils, weren't they?
Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO. So what are your suggesting, since you seem to know what's a mistake and what's not ?
I don't have any suggestion apart from keeping packaging in 3.3. But I also think it would be better for the community if people were not delusional when making decisions. Removing packaging from 3.3 is a big risk: users and potential contributors will be even less interested than they already are. Here's a datapoint: distribute (*) is downloaded 100x more times than distutils2 (**). (*) http://pypi.python.org/pypi/distribute/ (**) http://pypi.python.org/pypi/Distutils2/ Regards Antoine.
On 6/20/12 11:49 AM, Antoine Pitrou wrote:
If distutils was so rotten, distutils2 would not reuse much of its structure and concepts (and test suite), would it?
'much' is pretty vague here. distutils2 is a fork of distutils that has evolved a *lot*: if you look at the code, besides the compilation part and some commands, most things are different. distutils is "rotten" because when you change its internals, you might break some software that relies on them.
Most of the distutils2 improvements (new PEPs, setup.cfg, etc.) were totally possible in distutils, weren't they?
I started there, remember? And we ended up saying it was impossible to continue without breaking the packaging world.
I don't have any suggestion apart from keeping packaging in 3.3. But I also think it would be better for the community if people were not delusional when making decisions. Removing packaging from 3.3 is a big risk: users and potential contributors will be even less interested than they already are.
That's a good point. But if no one works on its polishing *now*, it's going to have the same effect on people: they'll likely be very annoyed if the replacement is not rock solid.
Here's a datapoint: distribute (*) is downloaded 100x more times than distutils2 (**).
(*) http://pypi.python.org/pypi/distribute/
(**) http://pypi.python.org/pypi/Distutils2/
Why would you expect a different datapoint?
- Distutils2 was released as beta software, and not really promoted yet
- Distribute is downloaded automatically by many stacks out there, and PyPI does not differentiate whether the hit was from a human behind pip or from a stack like zc.buildout
Regards
Antoine.
On Wed, 20 Jun 2012 12:30:51 +0200 Tarek Ziadé <tarek@ziade.org> wrote:
Most of the distutils2 improvements (new PEPs, setup.cfg, etc.) were totally possible in distutils, weren't they?
I started there, remember ? And we ended up saying it was impossible to continue without breaking the packaging world.
"we" were only certain people, AFAIR.
why would you expect a different datapoint ?
I wasn't expecting a different datapoint, I'm pointing that shipping packaging in the stdlib would provide a much better exposure. Regards Antoine.
On 6/20/12 12:39 PM, Antoine Pitrou wrote:
"we" were only certain people, AFAIR.
That was the BDFL decision after a language summit. Having tried to innovate in Distutils in the past, I think it's a very good decision,
Am 20.06.2012 12:39, schrieb Antoine Pitrou:
On Wed, 20 Jun 2012 12:30:51 +0200 Tarek Ziadé <tarek@ziade.org> wrote:
Most of the distutils2 improvements (new PEPs, setup.cfg, etc.) were totally possible in distutils, weren't they?
I started there, remember ? And we ended up saying it was impossible to continue without breaking the packaging world.
"we" were only certain people, AFAIR.
Yes. The people willing to work on packaging in Python, to be exact.
why would you expect a different datapoint ?
I wasn't expecting a different datapoint, I'm pointing that shipping packaging in the stdlib would provide a much better exposure.
But not if it's not ready for prime time. (And providing the finished distutils2 for Python 2 will provide even better exposure at the moment.) Georg
On Wed, 20 Jun 2012 09:51:03 +0000 (UTC) Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Antoine Pitrou <solipsis <at> pitrou.net> writes:
Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO.
What's the rationale for leaving it in, when it's known to be incomplete/unfinished?
As an incentive for users to start using the features that are finished enough, and exercise the migration path from distutils. The module can be marked "provisional" so as to allow further API variations. Regards Antoine.
On 6/20/12 11:54 AM, Antoine Pitrou wrote:
On Wed, 20 Jun 2012 09:51:03 +0000 (UTC) Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
What's the rationale for leaving it in, when it's known to be incomplete/unfinished?
As an incentive for users to start using the features that are finished enough, and exercise the migration path from distutils. The module can be marked "provisional" so as to allow further API variations.
It's true that some modules are quite mature and already useful:
- packaging.version (PEP 386)
- packaging.pypi
- packaging.metadata (PEP 345)
- packaging.database (PEP 376)
The part that is not ready is the installer and some setuptools bridging.
Regards
Antoine.
On 20 June 2012 11:34, Tarek Ziadé <tarek@ziade.org> wrote:
On 6/20/12 11:54 AM, Antoine Pitrou wrote:
On Wed, 20 Jun 2012 09:51:03 +0000 (UTC) Vinay Sajip<vinay_sajip@yahoo.co.uk> wrote:
Antoine Pitrou<solipsis<at> pitrou.net> writes:
Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO.
What's the rationale for leaving it in, when it's known to be incomplete/unfinished?
As an incentive for users to start using the features that are finished enough, and exercise the migration path from distutils. The module can be marked "provisional" so as to allow further API variations.
It's true that some modules are quite mature and already useful:
- packaging.version (PEP 386)
- packaging.pypi
- packaging.metadata (PEP 345)
- packaging.database (PEP 376)
the part that is not ready is the installer and some setuptools bridging
I've never seen that information mentioned before. So that's (good) news. A question, then. Why is it not an option to:
1. Rip out all bar those 4 modules.
2. Make sure they are documented and tested solidly (they may already be, I don't know).
3. Declare that to be what packaging *is* for Python 3.3.
Whether any of those modules are of any use in isolation is a slightly more complex question. As is whether the APIs are guaranteed to be sufficient for further development on "the rest" of packaging, given that by doing this we commit to API stability and backward compatibility. Your comment "quite mature and already useful" is not quite firm enough to reassure me that we're ready to set those modules in stone (although presumably the 3 relating to the PEPs are, simply because they implement what the PEPs say).
Paul.
On 6/20/12 1:19 PM, Paul Moore wrote:
On 20 June 2012 11:34, Tarek Ziadé<tarek@ziade.org> wrote:
On 6/20/12 11:54 AM, Antoine Pitrou wrote:
On Wed, 20 Jun 2012 09:51:03 +0000 (UTC) Vinay Sajip<vinay_sajip@yahoo.co.uk> wrote:
Antoine Pitrou<solipsis<at> pitrou.net> writes:
Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO.
What's the rationale for leaving it in, when it's known to be incomplete/unfinished?
As an incentive for users to start using the features that are finished enough, and exercise the migration path from distutils. The module can be marked "provisional" so as to allow further API variations.
It's true that some modules are quite mature and already useful:
- packaging.version (PEP 386)
- packaging.pypi
- packaging.metadata (PEP 345)
- packaging.database (PEP 376)
The part that is not ready is the installer and some setuptools bridging.
I've never seen that information mentioned before. So that's (good) news. A question, then. Why is it not an option to:
1. Rip out all bar those 4 modules.
2. Make sure they are documented and tested solidly (they may already be, I don't know).
3. Declare that to be what packaging *is* for Python 3.3.
Whether any of those modules are of any use in isolation is a slightly more complex question. As is whether the APIs are guaranteed to be sufficient for further development on "the rest" of packaging, given that by doing this we commit to API stability and backward compatibility. Your comment "quite mature and already useful" is not quite firm enough to reassure me that we're ready to set those modules in stone (although presumably the 3 relating to the PEPs are, simply because they implement what the PEPs say).
Paul.
The last time I checked:
- packaging.version is the implementation of PEP 386, and stable. It's one building block that would be helpful as-is in the stdlib; it's completely standalone.
- packaging.metadata is the implementation of all metadata versions. Standalone too.
- packaging.pypi is the PyPI crawler, and has fairly advanced features. I defer to Alexis to tell us if it's completely stable.
- packaging.database is where PEP 376 is. It has the most innovations; implements PEP 376.
- packaging.config is the setup.cfg reader. It's awesome because, together with packaging.database and packaging.markers, it gives you OS-independent data files. See http://docs.python.org/dev/packaging/setupcfg.html#resources
Yeah, maybe this subset could be left in 3.3 and we'd remove the packaging-the-installer part (pysetup, commands, compilers). I think it's a good idea!
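For readers who have not looked at these modules, here is a minimal sketch of what the PEP 386 version API looks like in use, going by the distutils2 documentation; the names NormalizedVersion and suggest_normalized_version are assumed to have carried over unchanged into packaging and may differ in detail.

    # Rough sketch, not from the thread: assumes packaging.version kept the
    # verlib-style API from distutils2 (PEP 386 reference implementation).
    from packaging.version import NormalizedVersion, suggest_normalized_version

    # Versions become comparable objects with a total ordering:
    assert NormalizedVersion('1.0a1') < NormalizedVersion('1.0')

    # Helper that maps common non-conforming strings onto the PEP 386 scheme:
    print(suggest_normalized_version('1.0alpha1'))  # expected: '1.0a1'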
On Wed, Jun 20, 2012 at 9:31 PM, Tarek Ziadé <tarek@ziade.org> wrote:
Yeah maybe this subset could be left in 3.3
and we'd remove packaging-the-installer part (pysetup, commands, compilers)
I think it's a good idea !
OK, to turn this into a concrete suggestion based on the packaging docs.

Declare stable, include in 3.3
------------------------------------------
packaging.version — Version number classes
packaging.metadata — Metadata handling
packaging.markers — Environment markers
packaging.database — Database of installed distributions

Maybe needed as dependencies for the above?
------------------------------------------------
packaging.errors — Packaging exceptions
packaging.util — Miscellaneous utility functions

It seems to me that stripping the library, docs and tests back to just these 4 modules and their dependencies shouldn't be much harder than stripping packaging in its entirety, but my question is what benefit would we gain from having these (and just these) in the 3.3 stdlib over having them available on PyPI in distutils2? Third party projects over the next couple of years are going to want to support more than just 3.3, so simply depending on distutils2 for this functionality seems like a far more sensible option.

OTOH, it does send a clear message that progress *is* being made; we just tried to take too big a jump from implementing these lower-level standards up to a "wholesale replacement of distutils" without first clearly documenting exactly what was wrong with the status quo and what we wanted to change about it as a sequence of PEPs.

I've broken up the rest of packaging's functionality below into a series of candidate PEPs that may be more manageable than a single monolithic "fix packaging" PEP. If we can get multiple interested parties focusing on different aspects, that may also help with reducing the burnout factor.

Python's current packaging and distribution story is held together with duct tape and baling wire due to decisions that were made years ago - unwinding some of those decisions and replacing them with an alternative that is built on a solid architectural foundation backed by some level of community consensus is *not* an easy task, and not one that will be completed quickly (undue haste will fail the "some level of community consensus" part, thus missing much of the point of the exercise). That said, I don't think it's unsolvable either, and there's a reasonable chance to get something cleaner in place for 3.4.

3.4 PEP: Distutils replacement: Packaging, Distribution & Installation
--------------------------------------------
# This is one of the big balls of mud w.r.t. distutils where third party projects dig deep into the implementation details because that is the only way to get things done
# It may even be the case that this can be broken up even further
packaging.install — Installation tools
packaging.dist — The Distribution class
packaging.manifest — The Manifest class
packaging.command — Standard Packaging commands
packaging.command.cmd — Abstract base class for Packaging commands

3.4 PEP: Distutils replacement: Compiling Extension Modules
--------------------------------------------
# Another big ball of mud. It sounds like the Gentoo folks may have some feedback in this space.
packaging.compiler — Compiler classes
packaging.compiler.ccompiler — CCompiler base class
packaging.compiler.extension — The Extension class

3.4 PEP: Standard library package downloader (pysetup)
----------------------------------
# Amongst other things, this needs to have a really good security story (refusing to install unsigned packages by default, etc.)
packaging.depgraph — Dependency graph builder
packaging.pypi — Interface to projects indexes
packaging.pypi.client — High-level query API
packaging.pypi.base — Base class for index crawlers
packaging.pypi.dist — Classes representing query results
packaging.pypi.simple — Crawler using the PyPI “simple” interface
packaging.pypi.xmlrpc — Crawler using the PyPI XML-RPC interface
packaging.tests.pypi_server — PyPI mock server
packaging.fancy_getopt — Wrapper around the getopt module  # Why does this exist?

3.4 PEP: Simple binary package distribution format
--------------------------------------------------------------------------
bdist_simple has been discussed enough times, finally seeing a PEP for it would be nice :)

I think the main lesson to be learned here is that "fix packaging" is simply too big a task to be managed sensibly. Smaller goals like "Standardise versioning", "Fix package metadata", "Support uninstall" are hard enough to achieve, but also provide concrete milestones along the way to the larger goal.

Cheers,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
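To make the "declare stable" list above a bit more concrete, here is a rough sketch of how the markers and database modules are meant to be used, based on the distutils2 documentation; the function names (interpret, get_distribution, get_distributions) and the Distribution attributes are assumptions and may not match the final API exactly.

    # Rough sketch, assuming the distutils2 names survived unchanged in packaging.
    from packaging.markers import interpret
    from packaging.database import get_distribution, get_distributions

    # PEP 345 environment markers evaluate to True/False for the running interpreter:
    if interpret("python_version >= '3.3' and sys.platform != 'win32'"):
        print('this conditional dependency would apply here')

    # PEP 376 database of installed distributions (*.dist-info directories):
    dist = get_distribution('packaging')      # None if nothing by that name is installed
    for d in get_distributions():
        print(d.name, d.version)              # assumed attributes on Distribution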
On 20 June 2012 13:53, Nick Coghlan <ncoghlan@gmail.com> wrote: [...]
3.4 PEP: Simple binary package distribution format --------------------------------------------------------------------------
bdist_simple has been discussed enough times, finally seeing a PEP for it would be nice :)
I had a PEP for this one part written - Éric had a brief look at it but I never posted it publicly. Before it'll go anywhere, a bit more of the "infrastructure PEPs" you mentioned and I trimmed would need to be completed, but I'd be willing to resurrect it when we get to that stage... Paul.
On 20/06/2012 14:53, Nick Coghlan wrote:
3.4 PEP: Standard library package downloader (pysetup)
----------------------------------
# Amongst other things, this needs to have a really good security story (refusing to install unsigned packages by default, etc.)
packaging.depgraph — Dependency graph builder
packaging.pypi — Interface to projects indexes
packaging.pypi.client — High-level query API
packaging.pypi.base — Base class for index crawlers
packaging.pypi.dist — Classes representing query results
packaging.pypi.simple — Crawler using the PyPI “simple” interface
packaging.pypi.xmlrpc — Crawler using the PyPI XML-RPC interface
packaging.tests.pypi_server — PyPI mock server
packaging.fancy_getopt — Wrapper around the getopt module  # Why does this exist?
I'm okay and willing to work on this part. I started a full review of the code I wrote years ago, which clearly needs some cleaning. Also, I'm not sure I understand what having a PEP to manage this means: should I describe the whole API in a text document (with examples) so we can discuss it and validate it before making the changes/cleanups to the API?
Alexis
On Wed, Jun 20, 2012 at 11:19 PM, Alexis Métaireau <alexis@notmyidea.org> wrote:
On 20/06/2012 14:53, Nick Coghlan wrote:
3.4 PEP: Standard library package downloader (pysetup)
----------------------------------
# Amongst other things, this needs to have a really good security story (refusing to install unsigned packages by default, etc.)
packaging.depgraph — Dependency graph builder
packaging.pypi — Interface to projects indexes
packaging.pypi.client — High-level query API
packaging.pypi.base — Base class for index crawlers
packaging.pypi.dist — Classes representing query results
packaging.pypi.simple — Crawler using the PyPI “simple” interface
packaging.pypi.xmlrpc — Crawler using the PyPI XML-RPC interface
packaging.tests.pypi_server — PyPI mock server
packaging.fancy_getopt — Wrapper around the getopt module  # Why does this exist?
I'm okay and willing to work on this part. I started a full review of the code I wrote years ago, which clearly needs some cleaning. Also, I'm not sure I understand what having a PEP to manage this means: should I describe the whole API in a text document (with examples) so we can discuss it and validate it before making the changes/cleanups to the API?
There would be two main parts to such a PEP: - defining the command line interface and capabilities (pysetup) - defining the programmatic API (packaging.pypi and the dependency graph management) I would suggest looking at PEP 405 (venv) and PEP 397 (Windows launcher) to get an idea of the kind of content that might be appropriate. It's definitely not necessary to reproduce the full API details verbatim in the PEP text - it's OK to provide highlights and point to a reference implementation for the full details. The PEP process can also be a good way to get feedback on an API design that otherwise may not be forthcoming (cf. the recent inspect.Signature discussions). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed. 20 June 2012 15:28:56 CEST, Nick Coghlan wrote:
There would be two main parts to such a PEP: - defining the command line interface and capabilities (pysetup) - defining the programmatic API (packaging.pypi and the dependency graph management)
Okay. I don't think that the command line has anything to do with packaging.pypi and the dependency management tools. One deals with the whole CLI for different things (install / remove / search etc.) while the other is only about how to communicate with the indexes and build dependency graphs from there. We probably should put the CLI part in a separate PEP, as its scope isn't the same as the one I see for the packaging.pypi / depgraph proposal.
I would suggest looking at PEP 405 (venv) and PEP 397 (Windows launcher) to get an idea of the kind of content that might be appropriate. It's definitely not necessary to reproduce the full API details verbatim in the PEP text - it's OK to provide highlights and point to a reference implementation for the full details.
Thanks for the pointers, will read them and try to come back with a PEP.
Alexis
On 6/20/12 2:53 PM, Nick Coghlan wrote:
On Wed, Jun 20, 2012 at 9:31 PM, Tarek Ziadé<tarek@ziade.org> wrote:
Yeah maybe this subset could be left in 3.3
and we'd remove packaging-the-installer part (pysetup, commands, compilers)
I think it's a good idea!
OK, to turn this into a concrete suggestion based on the packaging docs.
Declare stable, include in 3.3
------------------------------------------
packaging.version — Version number classes
packaging.metadata — Metadata handling
packaging.markers — Environment markers
packaging.database — Database of installed distributions
I think that's a good subset.
+1 on all of the things you said after that. If you succeed in getting the sci people working on the "PEP: Distutils replacement: Compiling Extension Modules" part, it will be a big win.
On 20/06/2012 13:31, Tarek Ziadé wrote:
packaging.metadata is the implementation of all metadata versions. standalone too.
packaging.pypi is the PyPI crawler, and has fairly advanced features. I defer to Alexis to tell us is it's completely stable
packaging.pypi is functionally working but IMO the API can (and probably should) be improved (we really lack feedback to know that).
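For reference, querying an index with packaging.pypi currently looks roughly like the sketch below, based on the distutils2 documentation; the Crawler class, the get_releases method and the release attributes are exactly the kind of API names that may still change, so treat them as assumptions rather than a stable interface.

    # Rough sketch of the packaging.pypi "simple" index crawler; names assumed
    # from the distutils2 docs and subject to the API changes discussed above.
    from packaging.pypi.simple import Crawler

    crawler = Crawler()                       # defaults to the PyPI simple index
    releases = crawler.get_releases('distribute')
    for release in releases:
        print(release.name, release.version)  # assumed attributes on the release objects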
On 20 June 2012 14:16, Alexis Métaireau <alexis@notmyidea.org> wrote:
On 20/06/2012 13:31, Tarek Ziadé wrote:
packaging.metadata is the implementation of all metadata versions. standalone too.
packaging.pypi is the PyPI crawler, and has fairly advanced features. I defer to Alexis to tell us is it's completely stable
packaging.pypi is functionally working but IMO the API can (and probably should) be improved (we really lack feedback to know that).
I wasn't aware of this - I've had a look and my first thought is that the documentation needs completing. At the moment, there's a lot that isn't documented, and we should avoid getting into the same situation as with distutils, where people have to use undocumented APIs to get anything done. There are a lot of examples, but not so much formal API documentation.

I don't mean to pick on this one module - unless things have changed a lot, the same is probably true of much of the rest of packaging. Lack of documentation is the #1 criticism I've seen.

Are there people willing to do some serious documentation work to get the docs for the "agreed stable" parts of packaging complete? There's more time to do this (doc changes don't have to be done before the beta), but by deciding to retain parts of packaging, we *are* making a commitment to complete that documentation, in my view.

Paul.

PS packaging.pypi is nice - I wish I'd known of its existence for a bit of work I was doing a little while ago...
On 20 June 2012 10:12, Antoine Pitrou <solipsis@pitrou.net> wrote:
I think the whole idea that distutils should be frozen and improvements should only go in distutils2 has been misguided. Had distutils been improved instead, many of those enhancements would already have been available in 3.2 (and others would soon be released in 3.3).
The problem seems to be that in the past, any changes in distutils have been met with cries of "you can't do that", basically because the lack of a clearly distinguished extension API means that the assumption is that for any bit of the internals, someone somewhere has monkeypatched or relied on it. The issue is compounded by the fact that a lot of distutils is undocumented, or at least badly documented, so saying "if it's not documented, it's internal" doesn't work :-( Maybe if we could be a bit more aggressive in saying what counts as "internal" and resist the cries of "but I use it", modifying distutils might be a more viable option, But there doesn't seem to be anyone willing to take and defend that step. IIRC, Tarek proposed distutils2/packaging after getting frustrated with how little he could usefully do on distutils itself.
Deciding to remove packaging from 3.3 is another instance of the same mistake, IMO.
I see your point, but without sufficient developer resource, the question is whether packaging is in a usable state at all. Nobody but Éric is actually working on packaging (and as he says, even he is not able to at the moment), so what alternatives are there?

I guess one extra option not mentioned by Éric is to make the packaging issues into release blocker bugs. That would effectively stall the release of 3.3 until packaging could be brought to an acceptable state, effectively a form of blackmail. I can't imagine anyone wants to take that approach. And yet, some of the existing bugs would clearly be release blockers if they were in any other part of Python.

I think the first question is, do we need an enhanced distutils in the stdlib? As far as I can see, this one has been answered strongly, in the affirmative, a few times now. And yet, the need seems to be a diffuse thing, with no real champion (Tarek and Éric have both tried to take that role, and both appear to have been unable to invest the necessary amount of time - which doesn't surprise me, it seems to be a massive task).

Removing packaging from 3.3, to my mind, acknowledges that as it stands the approach was a failed experiment[1]. Better to get it taken out before it appears in a released version of Python. We need to rethink the approach. I see a number of options going forward, all of which are based around the need to ensure enough developer involvement, so that Tarek and Éric get help and don't simply burn out before we have anything useful.

1. Reconsider the decision to freeze distutils, with a view to migrating incrementally to the feature set we want from packaging. That'll be hard, as we'd need to take a much stronger line on making changes that could break existing code (stronger in the sense of "so we broke your code, tough - you were using undocumented/internal APIs"). And I suspect Tarek wouldn't be willing to take this route, so we immediately lose one resource. Maybe the other core developers could take up the slack, though. For example, Antoine, you seem to be implying that you would have worked on distutils if this had happened.

2. Free up distutils2 to develop as an external package, and then have a PEP proposing its inclusion in the stdlib in due course, when it is ready and has been proven in the wild. The benefit here is that I assume that as a separate project, becoming a committer would be easier than becoming a Python core developer, so there's a wider pool of developers. The downside is that the timescales would be a lot longer (I doubt we'd see anything in 3.4 this way, and maybe not even 3.5).

3. Write a PEP describing precisely what the packaging module will do, get consensus/agreement, and then restart development based on a solid scope and spec. This is the correct route for getting something added directly to the stdlib, and it's a shame it didn't happen in the first place for packaging. Having said that, the PEP would likely be huge, given the scope of packaging, and would require a lot of time from a champion. There's no guarantee that championing a PEP wouldn't burn someone out just as rapidly as developing the module itself :-( And also, given that the packaging documentation is one of its weak areas, I'd have to say I have concerns as to whether a PEP would come together in any case... The assumption here, though, is that the PEP process creates the debate, and results in interested parties coming together in the discussion. If we can keep that momentum, we get a pool of interested developers who may well assist in the coding aspects.

The one option I don't like is taking packaging out, releasing 3.3, and then putting it straight back in as-is, simply carrying on as now in the hope that it'll be ready for 3.4. I honestly doubt that the only issue is that we've run out of time before 3.3. There are more fundamental problems that need to be addressed as well - specifically the reliance on one individual to bear all of the load.

Just my thoughts,
Paul.

[1] That reads really harshly. I don't mean to criticise any of the work that's been done; I'm a huge fan of the idea of packaging and its goals. The "experiment" in this case is around process - developing something as big and important as packaging with limited developer resource, relatively directly in the core (bringing it in from distutils2 sooner rather than later) and working from a series of smaller PEPs focused on particular details, rather than an overall one covering the whole package.
On Wed, 20 Jun 2012 12:11:03 +0100 Paul Moore <p.f.moore@gmail.com> wrote:
I think the first question is, do we need an enhanced distutils in the stdlib?
I would answer a different question: we definitely need a better distutils/packaging story. Whether it's in the form of distutils enhancements, or another package, is not clear-cut. By the way, let me point out that the "distutils freeze" has been broken to implement PEP 405 (I approve the breakage myself): http://hg.python.org/cpython/rev/294f8aeb4d4b#l4.1
As far as I can see, this one has been answered strongly, in the affirmative, a few times now. And yet, the need seems to be a diffuse thing, with no real champion
Packaging is not a very motivating topic for many developers (myself included). It's like the build process or the buildbot fleet :-)
2. Free up distutils2 to develop as an external package, and then have a PEP proposing its inclusion in the stdlib in due course, when it is ready and has been proven in the wild. [...] The downside is that the timescales would be a lot longer (I doubt we'd see anything in 3.4 this way, and maybe not even 3.5).
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
3. Write a PEP describing precisely what the packaging module will do, get consensus/agreement, and then restart development based on a solid scope and spec.
I think it's the best way to sink the project definitively. Our community isn't organized for such huge monolithic undertakings.
[1] That reads really harshly. I don't mean to criticise any of the work that's been done, I'm a huge fan of the idea of packaging, and its goals. The "experiment" in this case is around process - developing something as big and important as packaging with limited developer resource, relatively directly in the core (bringing it in from distutils2 sooner rather than later) and working from a series of smaller PEPs focused on particular details, rather than an overall one covering the whole package.
I cannot speak for Tarek, but one of the reasons it's been done as a set of smaller PEPs is that these PEPs were meant to be included in *distutils*, not distutils2. That is, the module already existed and the PEPs were individual, incremental improvements. Regards Antoine.
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never going to fly - a whole lot of people use setuptools and easy_install happily, because they just don't care about the downsides it has in terms of loss of control of a system configuration.
I cannot speak for Tarek, but one of the reasons it's been done as a set of smaller PEPs is that these PEPs were meant to be included in *distutils*, not distutils2. That is, the module already existed and the PEPs were individual, incremental improvements.
That initial set of PEPs were also aimed at defining interoperability standards that multiple projects could implement independently, even *without* support in the standard library. As I wrote in my other email, I think one key aspect of where we went wrong after that point was in never clearly spelling out just what we collectively meant by "fix packaging". Most of the burden of interpreting that phrase thus landed directly on the shoulders of the distutils2 project lead. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never going to fly - a whole lot of people use setuptools and easy_install happily, because they just don't care about the downsides it has in terms of loss of control of a system configuration.
Maybe not "happily" :-). Speaking for myself, I'd love to find an alternative, but setuptools seems to be the only system that knows how to build shared libraries across all my platforms. I've got little interest in a packaging module that doesn't include the compiler magic to do that. Bill
On Wed, Jun 20, 2012 at 9:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never going to fly - a whole lot of people use setuptools and easy_install happily, because they just don't care about the downsides it has in terms of loss of control of a system configuration.
Um, this may be a smidge off topic, but what "loss of control" are we talking about here? AFAIK, there isn't anything it does that you can't override with command line options or the config file. (In most cases, standard distutils options or config files.) Do you just mean that most people use the defaults and don't care about there being other options? And if that's the case, which other options are you referring to? If the long-term goal is to draw setuptools users over to packaging, then AFAIK the packaging effort is still missing a few things, like build-time dependencies and alternatives to setuptools' entry points and "extras", as well as the ability to integrate version control for building sdists (without requiring the sdist's recipient to *also* have the version control integration in order to build the package or recreate a new sdist). These are just the missing features that I know of, from recent distutils-sig discussions; I don't know how complete a list this is. While no single one of these features is directly used by every project or even a majority of such projects, there is a correlation between size of a project and the likelihood that they are depending on one or more of these features. i.e., the bigger and more widely-used the project, the more likely it is to either use one of these features, or depend on a project that does. Some of these features could be built on top of packaging, in more or less the same way setuptools is built on top of distutils. But whether they're done inside or outside of the packaging library, somebody's going to have to do them, for people to be able to migrate off of setuptools.
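(For readers who haven't used these setuptools features, here is a minimal illustrative setup.py sketch; the project and dependency names are made up, but the keyword arguments are the standard setuptools ones referred to above:)

    from setuptools import setup, find_packages

    setup(
        name="example-project",                  # hypothetical project name
        version="1.0",
        packages=find_packages(),
        # build-time dependencies, fetched before the build commands run
        setup_requires=["somebuildtool"],
        # runtime dependencies, resolved automatically at install time
        install_requires=["somelib>=1.2"],
        # optional feature sets ("extras") that pull in extra requirements
        extras_require={"ssl": ["pyOpenSSL"]},
        # entry points: named hooks other distributions can look up at runtime
        entry_points={
            "console_scripts": ["example-tool = example.cli:main"],
        },
    )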
On Thu, Jun 21, 2012 at 3:29 AM, PJ Eby <pje@telecommunity.com> wrote:
On Wed, Jun 20, 2012 at 9:02 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never going to fly - a whole lot of people use setuptools and easy_install happily, because they just don't care about the downsides it has in terms of loss of control of a system configuration.
Um, this may be a smidge off topic, but what "loss of control" are we talking about here? AFAIK, there isn't anything it does that you can't override with command line options or the config file. (In most cases, standard distutils options or config files.) Do you just mean that most people use the defaults and don't care about there being other options? And if that's the case, which other options are you referring to?
No, I mean there are design choices in setuptools that explain why many people don't like it and are irritated when software they want to use depends on it without a good reason. Clearly articulating the reasons that "just include setuptools" is no longer being considered as an option should be one of the goals of any PEPs associated with adding packaging back for 3.4.
The reasons I'm personally aware of:
- it's a unilateral runtime fork of the standard library that bears a lot of responsibility for the ongoing feature freeze in distutils. Standard assumptions about the behaviour of site and distutils cease to be valid once setuptools is installed
- overuse of "*.pth" files and the associated sys.path changes for all Python programs running on a system. setuptools gleefully encourages the inclusion of non-trivial code snippets in *.pth files that will be executed by all programs.
- advocacy for the "egg" format and the associated sys.path changes that result for all Python programs running on a system
- too much magic that is enabled by default and is hard to switch off (e.g. http://rhodesmill.org/brandon/2009/eby-magic/)
System administrators (and developers that think like system administrators when it comes to configuration management) *hate* what setuptools (and setuptools based installers) can do to their systems. It doesn't matter that package developers don't *have* to do those things - what matters is that the needs and concerns of system administrators simply don't appear to have been anywhere on the radar when setuptools was being designed. (If those concerns actually were taken into account at some point, it's sure hard to tell from the end result and the choices of default behaviour)
setuptools is a masterful achievement built on shaky foundations that will work most of the time. However, when it doesn't work, you're probably screwed, and as soon as it's present on a system, you know that your assumptions about understanding the Python interpreter's startup sequences are probably off. The efforts around distutils2/packaging have been focused on taking the time to *fix the foundations first* rather than accepting the inevitable shortcomings of trying to build something in the middle of a swamp.
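(To make the *.pth concern concrete, here is a rough sketch of what the interpreter does with each line of a *.pth file at startup, simplified from the stdlib's site.addpackage; the point is that an "import" line is executed for every Python program on the machine:)

    import os, sys

    def process_pth_line(sitedir, line):
        # simplified sketch of site.addpackage's per-line handling
        line = line.rstrip()
        if not line or line.startswith("#"):
            return
        if line.startswith(("import ", "import\t")):
            exec(line)                        # arbitrary code, run at startup
        else:
            path = os.path.join(sitedir, line)
            if os.path.exists(path) and path not in sys.path:
                sys.path.append(path)         # extends sys.path for everything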
If the long-term goal is to draw setuptools users over to packaging, then AFAIK the packaging effort is still missing a few things, like build-time dependencies and alternatives to setuptools' entry points and "extras", as well as the ability to integrate version control for building sdists (without requiring the sdist's recipient to *also* have the version control integration in order to build the package or recreate a new sdist).
Right - clearly enumerating the features that draw people to use setuptools over just using distutils should be a key element in any PEP for 3.4.
I honestly think a big part of why packaging ended up being incomplete for 3.3 is that we still don't have a clearly documented answer to two critical questions:
1. Why do people choose setuptools over distutils?
2. What's wrong with setuptools that meant the idea of including it directly in the stdlib was ultimately dropped and eventually replaced with the goal of incorporating distutils2?
I imagine there are answers to both of those questions embedded in past python-dev, distutils-sig, setuptools and distutils2 mailing list discussions, but that's no substitute for having them clearly documented in a PEP (or PEPs, given the scope of the questions). We've tried to shortcircuit this process twice now, first with "just include setuptools" back around 2.5, and again now with "just include distutils2 as packaging" for 3.3. It hasn't worked, so maybe it's time to try doing it properly and clearly articulating the desired end result. If the end goal is "the bulk of the setuptools feature set without the problematic features and default behaviours that make system administrators break out the torches and pitchforks", then we should *write that down* (and spell out the implications) rather than assuming that everyone knows the purpose of the exercise. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 06/20/2012 11:57 PM, Nick Coghlan wrote:
On Thu, Jun 21, 2012 at 3:29 AM, PJ Eby<pje@telecommunity.com> wrote:
On Wed, Jun 20, 2012 at 9:02 AM, Nick Coghlan<ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou<solipsis@pitrou.net> wrote:
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never going to fly - a whole lot of people use setuptools and easy_install happily, because they just don't care about the downsides it has in terms of loss of control of a system configuration.
Um, this may be a smidge off topic, but what "loss of control" are we talking about here? AFAIK, there isn't anything it does that you can't override with command line options or the config file. (In most cases, standard distutils options or config files.) Do you just mean that most people use the defaults and don't care about there being other options? And if that's the case, which other options are you referring to?
No, I mean there are design choices in setuptools that explain why many people don't like it and are irritated when software they want to use depends on it without a good reason. Clearly articulating the reasons that "just include setuptools" is no longer being considered as an option should be one of the goals of any PEPs associated with adding packaging back for 3.4.
The reasons I'm personally aware of: - it's a unilateral runtime fork of the standard library that bears a lot of responsibility for the ongoing feature freeze in distutils. Standard assumptions about the behaviour of site and distutils cease to be valid once setuptools is installed - overuse of "*.pth" files and the associated sys.path changes for all Python programs running on a system. setuptools gleefully encourages the inclusion of non-trivial code snippets in *.pth files that will be executed by all programs. - advocacy for the "egg" format and the associated sys.path changes that result for all Python programs running on a system - too much magic that is enabled by default and is hard to switch off (e.g. http://rhodesmill.org/brandon/2009/eby-magic/)
All of these are really pretty minor issues compared with the main benefit of not needing to ship everything with everything else. The killer feature is that developers can specify dependencies and users can have those dependencies installed automatically in a cross-platform way. Everything else is complete noise if this use case is not served. IMO, the second and third things you mention above (use of pth files and eggs) are actually features when compared against the result of something like pip, which installs things using --single-version-externally-managed and then tries to manage the resulting potentially-intertwined directories. Eggs are *easier* to manage than potentially overlapping files and directories installed into some other directory. Either they exist or they don't. Either they're mentioned in a .pth file or they aren't. It's not really that hard. In any case, any tool that tries to manage distribution installation will need somewhere to keep distribution metadata. It's a minor mystery to me why people think it could be done much better than in something very close to egg format.
System administrators (and developers that think like system administrators when it comes to configuration management) *hate* what setuptools (and setuptools based installers) can do to their systems. It doesn't matter that package developers don't *have* to do those things - what matters is that the needs and concerns of system administrators simply don't appear to have been anywhere on the radar when setuptools was being designed. (If those concerns actually were taken into account at some point, it's sure hard to tell from the end result and the choices of default behaviour)
I think you mean easy_install here. And I guess you mean managing .pth files. Note that if you use pip, neither thing needs to happen. And even easy_install lets you install a distribution that way (with --single-version-externally-managed). So I think, as you mention, this is a matter of defaults (tool and/or flag defaults) rather than core functionality.
setuptools is a masterful achievement built on shaky foundations that will work most of the time. However, when it doesn't work, you're probably screwed, and as soon as it's present on a system, you know that your assumptions about understanding the Python interpreter's startup sequences are probably off.
It's true setuptools is based on shaky foundations. The rest of the stuff you say above is pretty darn specious, I think.
The efforts around distutils2/packaging have been focused on taking the time to *fix the foundations first* rather than accepting the inevitable shortcomings of trying to build something in the middle of a swamp.
If the long-term goal is to draw setuptools users over to packaging, then AFAIK the packaging effort is still missing a few things, like build-time dependencies and alternatives to setuptools' entry points and "extras", as well as the ability to integrate version control for building sdists (without requiring the sdist's recipient to *also* have the version control integration in order to build the package or recreate a new sdist).
Right - clearly enumerating the features that draw people to use setuptools over just using distutils should be a key element in any PEP for 3.4
I honestly think a big part of why packaging ended up being incomplete for 3.3 is that we still don't have a clearly documented answer to two critical questions: 1. Why do people choose setuptools over distutils?
Because it supports automated installation of dependencies. Almost everything else is noise (although some of the other things that setuptools provides, like entry points and console scripts, are important noise).
2. What's wrong with setuptools that meant the idea of including it directly in the stdlib was ultimately dropped and eventually replaced with the goal of incorporating distutils2?
Because distutils sucks and setuptools is based on distutils. It's horrible to need to hack on. Setuptools also has documentation which is effectively deltas to the distutils docs. As a result, it's very painful to try to follow the setuptools docs. IMO, it's not that the ideas in setuptools are bad, it's that setuptools requires a *lot* more docs to be consumable by normal humans, and those docs need to be a lot more accessible.
I imagine there are answers to both of those questions embedded in past python-dev, distutils-sig, setuptools and distutils2 mailing list discussions, but that's no substitute for having them clearly documented in a PEP (or PEPs, given the scope of the questions).
We've tried to shortcircuit this process twice now, first with "just include setuptools" back around 2.5, and again now with "just include distutils2 as packaging" for 3.3. It hasn't worked, so maybe it's time to try doing it properly and clearly articulating the desired end result. If the end goal is "the bulk of the setuptools feature set without the problematic features and default behaviours that make system administrators break out the torches and pitchforks", then we should *write that down* (and spell out the implications) rather than assuming that everyone knows the purpose of the exercise.
There's all kinds of built in conflict here wrt those pitchforks. Most of it is stupid. System administrators tend to be stuck in a "one package to rule them all" model of deployment and that model *just can't work* on a system where you need repeatable deployments of multiple pieces of Python-based software which may require mutually exclusive different Python and library versions. Trying to pretend it can work is just plain madness. Telling developers they must work on an exact replica of the production system in order to develop the software is also a terrible, unproductive idea. This is a hopeless, 1990s waterfall model of deployment and development. This is why packages like virtualenv and buildout are so popular. Using them gets developers what they need. Developers get repeatable cross-platform deployments without requiring special privilege, and this allows for a *reduction* in the system administrator's role in deployment. Sometimes a certain type of system administrator can be a hindrance to deployment and maintenance, like sometimes a DBA can be a hindrance to a developer who just needs to add a damn table. With the tools available today (Fabric, buildout, salt, virtualenv, pip), it's a heck of a lot easier to script a cross-platform deployment that will work simultaneously on Debian, Red Hat, BSD, and Mac OS X than it is to build system-level packages for multiple platforms or even *one* platform. And to be honest, if a system administrator can't cope with the notion that he may need to forsake his system-level package installer and instead follow the instructions we give to him to type four or five commands to get a completely working system deployed or updated, he probably should not be a system administrator. His job is going to quickly be taken by folks who *can* cope with such deployment mechanisms, like any cloud service: all the existing Python cloud deployment services handle distutils/setuptools installs just fine and these tend to be the *only* way you can get Python software installed into a system on them. - C
On Thu, Jun 21, 2012 at 2:44 PM, Chris McDonough <chrism@plope.com> wrote:
All of these are really pretty minor issues compared with the main benefit of not needing to ship everything with everything else. The killer feature is that developers can specify dependencies and users can have those dependencies installed automatically in a cross-platform way. Everything else is complete noise if this use case is not served.
Cool. This is the kind of thing we need recorded in a PEP - there's a lot of domain knowledge floating around in the heads of packaging folks that needs to be captured so we can know *what the addition of packaging to the standard library is intended to fix*. And, like it or not, setuptools has a serious PR problem due to the fact it monkeypatches the standard library, uses *.pth files to alter sys.path for every installed application by default, actually *uses* the ability to run code in *.pth files and has hard to follow documentation to boot. I *don't* trust that I fully understand the import system on any machine with setuptools installed, because it is demonstrably happy to install state to the file system that will affect *all* Python programs running on the machine.
A packaging PEP needs to explain:
- what needs to be done to eliminate any need for monkeypatching
- what's involved in making sure that *.pth are *not* needed by default
- making sure that executable code in implicitly loaded *.pth files isn't used *at all*
I *think* trying to achieve this is actually the genesis of the original distribute fork, that subsequently became distutils2 as Tarek discovered how much of the complexity in setuptools was actually due to the desire to *not* officially fork distutils (and instead monkeypatch it, effectively creating a runtime fork). However, for those of us that weren't directly involved, this is all still a strange mystery dealt with by other people. I've cribbed together bits and pieces just from following the fragments of the discussions that have happened on python-dev and at PyCon US, but if we want the madness to ever stop, then *the problems with the status quo* need to be written down so that other core developers can understand them. In fact, I just remembered that Tarek *has* written a lot of this down, just not in PEP form: http://www.aosabook.org/en/packaging.html Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Jun 21, 2012 at 9:45 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Thu, Jun 21, 2012 at 2:44 PM, Chris McDonough <chrism@plope.com> wrote:
All of these are really pretty minor issues compared with the main benefit of not needing to ship everything with everything else. The killer feature is that developers can specify dependencies and users can have those dependencies installed automatically in a cross-platform way. Everything else is complete noise if this use case is not served.
Cool. This is the kind of thing we need recorded in a PEP - there's a lot of domain knowledge floating around in the heads of packaging folks that needs to be captured so we can know *what the addition of packaging to the standard library is intended to fix*.
And, like it or not, setuptools has a serious PR problem due to the fact it monkeypatches the standard library, uses *.pth files to alter sys.path for every installed application by default, actually *uses* the ability to run code in *.pth files and has hard to follow documentation to boot. I *don't* trust that I fully understand the import system on any machine with setuptools installed, because it is demonstrably happy to install state to the file system that will affect *all* Python programs running on the machine.
A packaging PEP needs to explain: - what needs to be done to eliminate any need for monkeypatching - what's involved in making sure that *.pth are *not* needed by default - making sure that executable code in implicitly loaded *.pth files isn't used *at all*
It is not a PEP, but here are a few reasons why extending distutils is difficult (taken from our experience in the scipy community, which has by far the biggest extension of distutils AFAIK): http://cournape.github.com/Bento/html/faq.html#why-not-extending-existing-to... While I believe setuptools has been a net negative for the scipy community because of the way it works and for the reason you mentioned, I think it is fair to say it is not really possible to do any differently if you rely on distutils. If specifying install dependencies is the killer feature of setuptools, why can't we have a very simple module that adds the necessary 3 keywords to record it, and let 3rd party tools deal with it as they wish? That would not even require specifying the format, and would leave us more time to deal with the other, more difficult questions. David
On Thu, Jun 21, 2012 at 7:28 PM, David Cournapeau <cournape@gmail.com> wrote:
If specifying install dependencies is the killer feature of setuptools, why can't we have a very simple module that adds the necessary 3 keywords to record it, and let 3rd party tools deal with it as they wish? That would not even require specifying the format, and would leave us more time to deal with the other, more difficult questions.
That low level role is filled by PEP 345 (the latest PyPI metadata format, which adds the new fields), PEP 376 (local installation database) and PEP 386 (version numbering schema). The corresponding packaging submodules are the ones that were being considered for retention as a reference implementation in 3.3, but are still slated for removal along with the rest of the package (the reference implementations will remain available as part of distutils2 on PyPI). Whatever UI a Python packaging solution presents to a user, it needs to support those 3 PEPs on the back end for interoperability with other tools (including, eventually, the packaging module in the standard library). Your feedback on the commands/compilers design sounds valuable, and I would be very interested in seeing a PEP targeting that aspect of the new packaging module (if you look at the start of this thread, the failure to improve the compiler API is one of the reasons for pulling the code from 3.3). If python-dev ends up playing referee on multiple competing PEPs, that's not necessarily a bad thing. If a consensus solution doesn't meet the needs of key parties that aren't well served by existing approaches (specifically, the scientific community, and enterprise users that want to be able to translate the plethora of language specific packaging systems to a common format for internal use to simplify system administration and configuration management and auditing), then we may as well not bother and let the status quo continue indefinitely. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
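(For the curious, the PEP 345 metadata in question is just RFC 822-style headers, so the new dependency fields can be read with nothing but the stdlib; the field names below are taken from PEP 345, while the project names are invented:)

    import textwrap
    from email import message_from_string

    PKG_INFO = textwrap.dedent("""\
        Metadata-Version: 1.2
        Name: example-project
        Version: 1.0
        Requires-Python: >=2.6
        Requires-Dist: somelib (>=1.2)
        Requires-Dist: pywin32 (>=1.0); sys.platform == 'win32'
        """)

    meta = message_from_string(PKG_INFO)
    print(meta["Name"], meta["Version"])
    print(meta.get_all("Requires-Dist"))    # both dependency declarations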
On Thu, Jun 21, 2012 at 12:58 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Thu, Jun 21, 2012 at 7:28 PM, David Cournapeau <cournape@gmail.com> wrote:
If specifying install dependencies is the killer feature of setuptools, why can't we have a very simple module that adds the necessary 3 keywords to record it, and let 3rd party tools deal with it as they wish? That would not even require specifying the format, and would leave us more time to deal with the other, more difficult questions.
That low level role is filled by PEP 345 (the latest PyPI metadata format, which adds the new fields), PEP 376 (local installation database) and PEP 386 (version numbering schema).
The corresponding packaging submodules are the ones that were being considered for retention as a reference implementation in 3.3, but are still slated for removal along with the rest of the package (the reference implementations will remain available as part of distutils2 on PyPI).
I understand the code is already implemented, but I meant that it may be a good idea to have a simple, self-contained module that does just provide the necessary bits for the "setuptools killer feature", and let competing tools deal with it as they please.
Whatever UI a Python packaging solution presents to a user, it needs to support those 3 PEPs on the back end for interoperability with other tools (including, eventually, the packaging module in the standard library).
Your feedback on the commands/compilers design sounds valuable, and I would be very interested in seeing a PEP targeting that aspect of the new packaging module (if you look at the start of this thread, the failure to improve the compiler API is one of the reasons for pulling the code from 3.3).
The problem with compilation is not just the way the compiler classes work. It is how they interact with commands and the like, which ends up being most of the original distutils code. What's wrong with distutils is the whole underlying model, if one can call that. No PEP will fix the issue if the premise is to work within that model. There are similar kinds of arguments around the extensibility of distutils: it is not just about monkey-patching, but what kind of API you offer to allow for extensibility, and I think the only way to design this sensibly is to work on real packages and iterate, not writing a PEP as a first step. David
On Thu, Jun 21, 2012 at 10:19 PM, David Cournapeau <cournape@gmail.com> wrote:
On Thu, Jun 21, 2012 at 12:58 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Thu, Jun 21, 2012 at 7:28 PM, David Cournapeau <cournape@gmail.com> wrote:
If specifying install dependencies is the killer feature of setuptools, why can't we have a very simple module that adds the necessary 3 keywords to record it, and let 3rd party tools deal with it as they wish? That would not even require specifying the format, and would leave us more time to deal with the other, more difficult questions.
That low level role is filled by PEP 345 (the latest PyPI metadata format, which adds the new fields), PEP 376 (local installation database) and PEP 386 (version numbering schema).
The corresponding packaging submodules are the ones that were being considered for retention as a reference implementation in 3.3, but are still slated for removal along with the rest of the package (the reference implementations will remain available as part of distutils2 on PyPI).
I understand the code is already implemented, but I meant that it may be a good idea to have a simple, self-contained module that does just provide the necessary bits for the "setuptools killer feature", and let competing tools deal with it as they please.
If you're genuinely interested in that prospect, I suggest collaborating with the distutils2 team to extract the four identified modules (and any necessary support code) as a "distmeta" project on PyPI:
distmeta.version — Version number classes
distmeta.metadata — Metadata handling
distmeta.markers — Environment markers
distmeta.database — Database of installed distributions
That will allow faster iteration on the core interoperability standards prior to reincorporation in 3.4, and explicitly decouple them from the higher level (more contentious) features.
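(As a taste of the version-handling piece, the PEP 386 reference implementation is already usable from the standalone distutils2 releases; the import path below is from the distutils2 distribution on PyPI as I recall it, so treat it as an assumption and check the d2 docs:)

    # rough sketch against the distutils2 release on PyPI
    from distutils2.version import NormalizedVersion

    # PEP 386 ordering: pre-releases sort before the final release,
    # post-releases after it
    assert NormalizedVersion("1.0a1") < NormalizedVersion("1.0")
    assert NormalizedVersion("1.0") < NormalizedVersion("1.0.post1")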
Whatever UI a Python packaging solution presents to a user, it needs to support those 3 PEPs on the back end for interoperability with other tools (including, eventually, the packaging module in the standard library).
Your feedback on the commands/compilers design sounds valuable, and I would be very interested in seeing a PEP targeting that aspect of the new packaging module (if you look at the start of this thread, the failure to improve the compiler API is one of the reasons for pulling the code from 3.3).
The problem with compilation is not just the way the compiler classes work. It is how they interact with commands and the like, which ends up being most of the original distutils code. What's wrong with distutils is the whole underlying model, if one can call that. No PEP will fix the issue if the premise is to work within that model.
I don't accept the premise that the 3.4 packaging solution must be restricted to the distutils semantic model. However, no alternative strategy has been formally presented to python-dev. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 06/21/2012 04:45 AM, Nick Coghlan wrote:
On Thu, Jun 21, 2012 at 2:44 PM, Chris McDonough<chrism@plope.com> wrote:
All of these are really pretty minor issues compared with the main benefit of not needing to ship everything with everything else. The killer feature is that developers can specify dependencies and users can have those dependencies installed automatically in a cross-platform way. Everything else is complete noise if this use case is not served.
Cool. This is the kind of thing we need recorded in a PEP - there's a lot of domain knowledge floating around in the heads of packaging folks that needs to be captured so we can know *what the addition of packaging to the standard library is intended to fix*.
And, like it or not, setuptools has a serious PR problem due to the fact it monkeypatches the standard library, uses *.pth files to alter sys.path for every installed application by default, actually *uses* the ability to run code in *.pth files and has hard to follow documentation to boot. I *don't* trust that I fully understand the import system on any machine with setuptools installed, because it is demonstrably happy to install state to the file system that will affect *all* Python programs running on the machine.
I don't know about Red Hat but both Ubuntu and Apple put all kinds of stuff on the default sys.path of the system Python on the box that's related to their software's concerns only. I don't understand why people accept this but get crazy about the fact that installing a setuptools distribution using easy_install changes the default sys.path. Installing a distribution will change behavior whether or not sys.path is changed as a result. That's its purpose. The code that runs in the .pth *file* (there's only one that matters: easy_install.pth) just mutates sys.path. The end result is this: if you understand how sys.path works, you understand how eggs work. Each egg is added to sys.path. That's all there is to it. It's the same as manually mutating a global PYTHONPATH, except you don't need to do it. And note that this is not "setuptools" in general. It's easy_install in particular. Everything you've brought up so far I think is limited to easy_install. It doesn't happen when you use pip. I think it's a mistake that pip doesn't do it, but I think you have to make more accurate distinctions.
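(Stripped of the bookkeeping import lines a real one also carries, an easy_install.pth is essentially one line per installed egg, and the net effect is the same as the sys.path mutation sketched below; the paths are made up:)

    import sys

    # what each (simplified) easy_install.pth entry amounts to at startup
    sys.path.append("/usr/lib/python2.7/site-packages/somelib-1.2-py2.7.egg")
    sys.path.append("/usr/lib/python2.7/site-packages/othertool-0.9-py2.7.egg")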
A packaging PEP needs to explain: - what needs to be done to eliminate any need for monkeypatching - what's involved in making sure that *.pth are *not* needed by default - making sure that executable code in implicitly loaded *.pth files isn't used *at all*
I'll note that these goals are completely sideways to any actual functional goal. It'd be a shame to have monkeypatching going on, but the other stuff I don't think are reasonable goals. Instead they represent fears, and those fears just need to be managed.
I *think* trying to achieve this is actually the genesis of the original distribute fork, that subsequently became distutils2 as Tarek discovered how much of the complexity in setuptools was actually due to the desire to *not* officially fork distutils (and instead monkeypatch it, effectively creating a runtime fork).
However, for those of us that weren't directly involved, this is all still a strange mystery dealt with by other people. I've cribbed together bits and pieces just from following the fragments of the discussions that have happened on python-dev and at PyCon US, but if we want the madness to ever stop, then *the problems with the status quo* need to be written down so that other core developers can understand them.
It'd also be useful if other core developers actually tried to use setuptools in anger. That'd be a good start towards understanding some of its tradeoffs. People can write this stuff down til they're blue in the face, but if core devs don't try the stuff, they'll always fear it.
In fact, I just remembered that Tarek *has* written a lot of this down, just not in PEP form: http://www.aosabook.org/en/packaging.html
Cool. - C
On 21 June 2012 12:48, Chris McDonough <chrism@plope.com> wrote:
On 06/21/2012 04:45 AM, Nick Coghlan wrote:
On Thu, Jun 21, 2012 at 2:44 PM, Chris McDonough<chrism@plope.com> wrote:
All of these are really pretty minor issues compared with the main benefit of not needing to ship everything with everything else. The killer feature is that developers can specify dependencies and users can have those dependencies installed automatically in a cross-platform way. Everything else is complete noise if this use case is not served.
Cool. This is the kind of thing we need recorded in a PEP - there's a lot of domain knowledge floating around in the heads of packaging folks that needs to be captured so we can know *what the addition of packaging to the standard library is intended to fix*.
And, like it or not, setuptools has a serious PR problem due to the fact it monkeypatches the standard library, uses *.pth files to alter sys.path for every installed application by default, actually *uses* the ability to run code in *.pth files and has hard to follow documentation to boot. I *don't* trust that I fully understand the import system on any machine with setuptools installed, because it is demonstrably happy to install state to the file system that will affect *all* Python programs running on the machine.
I don't know about Red Hat but both Ubuntu and Apple put all kinds of stuff on the default sys.path of the system Python of the box that's related to their software's concerns only. I don't understand why people accept this but get crazy about the fact that installing a setuptools distribution using easy_install changes the default sys.path.
I don't like the particular way that easy_install modifies sys.path so that it can no longer be overridden by PYTHONPATH. For a discussion, see: http://stackoverflow.com/questions/5984523/eggs-in-path-before-pythonpath-en... The fact that ubuntu does this for some system ubuntu packages has never bothered me, but the fact that it happens for packages that I install with easy_install has. The typical scenario would be that I:
1) Install some package X with easy_install.
2) Find a bug or some aspect of X that I want to change and checkout the latest version from e.g. github.
3) Try to use PYTHONPATH to test the checked out version and find that easy_install's path modification prevents me from doing so.
4) Run the quickfix script in the stackoverflow question above and consider not using easy_install for X in future.
Oscar
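(A quick stdlib-only way to check which copy actually wins in that scenario; "X" above just stands for whatever package is being tested, substituted here with a module that is always importable so the snippet runs as-is:)

    import importlib, sys

    def where(name):
        # show where a module was actually imported from
        mod = importlib.import_module(name)
        return getattr(mod, "__file__", "<builtin>")

    print(sys.path[:5])     # entries injected via easy_install.pth sort ahead of PYTHONPATH
    print(where("email"))   # substitute the package from the checkout here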
On Thu, Jun 21, 2012 at 9:48 PM, Chris McDonough <chrism@plope.com> wrote:
On 06/21/2012 04:45 AM, Nick Coghlan wrote:
And, like it or not, setuptools has a serious PR problem due to the fact it monkeypatches the standard library, uses *.pth files to alter sys.path for every installed application by default, actually *uses* the ability to run code in *.pth files and has hard to follow documentation to boot. I *don't* trust that I fully understand the import system on any machine with setuptools installed, because it is demonstrably happy to install state to the file system that will affect *all* Python programs running on the machine.
I don't know about Red Hat but both Ubuntu and Apple put all kinds of stuff on the default sys.path of the system Python of the box that's related to their software's concerns only. I don't understand why people accept this but get crazy about the fact that installing a setuptools distribution using easy_install changes the default sys.path.
Because the vendor gets to decide what goes into the base install of the OS. If I'm using the system Python, then I expect sys.path to contain the system paths, just as I expect gcc to be able to see the system include paths. If I don't want that, I'll use virtualenv or a completely separate Python installation. However, when I install a new Python package into site-packages it *should* just sit there and have zero impact on other Python applications that don't import that package. As soon as someone installs a *.pth file, however, that's *no longer the case* - every Python application on that machine will now be scanning additional paths for modules whether it wants to or not. It's unnecessary coupling between components that *should* be completely independent of each other. Now, *.pth support in the interpreter certainly cannot be blamed on setuptools, but encouraging use of a packaging format that effectively requires them certainly can be. It's similar to the reason why monkeypatching and global environment variable modifications (including PYTHONPATH) are a problem: as soon as you start doing that kind of thing, you're introducing coupling that *shouldn't exist*. If there is no better solution, then sure, do it as a near term workaround, but that isn't the same as accepting it as the long term answer.
Installing a distribution will change behavior whether or not sys.path is changed as a result. That's its purpose.
No it won't. An ordinary package will only change the behaviour of Python applications that import a package by that name. Other Python applications will be completely unaffected (as it should be).
The code that runs in the .pth *file* (there's only one that matters: easy_install.pth) just mutates sys.path. The end result is this: if you understand how sys.path works, you understand how eggs work. Each egg is added to sys.path. That's all there is to it. It's the same as manually mutating a global PYTHONPATH, except you don't need to do it.
Yes, it's the same as mutating PYTHONPATH. That's a similarly bad system global change. Individual libraries do not have the right to change the sys.path seen on initialisation by every other Python application on that system.
And note that this is not "setuptools" in general. It's easy_install in particular. Everything you've brought up so far I think is limited to easy_install. It doesn't happen when you use pip. I think it's a mistake that pip doesn't do it, but I think you have to make more accurate distinctions.
What part of "PR problem" was unclear? setuptools and easy_install are inextricably linked in everyone's minds, just like pip and distribute.
A packaging PEP needs to explain: - what needs to be done to eliminate any need for monkeypatching - what's involved in making sure that *.pth are *not* needed by default - making sure that executable code in implicitly loaded *.pth files isn't used *at all*
I'll note that these goals are completely sideways to any actual functional goal. It'd be a shame to have monkeypatching going on, but the other stuff I don't think are reasonable goals. Instead they represent fears, and those fears just need to be managed.
No, they reflect the mindset of someone with configuration management and auditing responsibilities for shared systems with multiple applications installed which may be written in a variety of languages, not just Python. You may not care about those people, but I do.
It'd also be useful if other core developers actually tried to use setuptools in anger. That'd be a good start towards understanding some of its tradeoffs. People can write this stuff down til they're blue in the face, but if core devs don't try the stuff, they'll always fear it.
setuptools (or, perhaps, easy_install, although I've seen enough posts about eggs being uploaded to PyPI to suspect otherwise), encourages the deployment of system configuration changes that alter the runtime environment of every single Python application executed on the system. That's simply not cool. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 06/21/2012 08:21 AM, Nick Coghlan wrote:
Installing a distribution will change behavior whether or not sys.path is changed as a result. That's its purpose.
No it won't. An ordinary package will only change the behaviour of Python applications that import a package by that name. Other Python applications will be completely unaffected (as it should be).
If a Python application is affected by a change to sys.path which doesn't impact modules it uses, then that Python application is plain broken, because the developer of that application cannot make assumptions about what a user does to sys.path unrelated to the modules it requires. This is completely independent of easy_install. Any Python application is going to be affected by the installation of a distribution that does impact modules it imports, whether sys.path is used to change the working set of modules or not. So what concrete situation are we actually talking about here?
The code that runs in the .pth *file* (there's only one that matters: easy_install.pth) just mutates sys.path. The end result is this: if you understand how sys.path works, you understand how eggs work. Each egg is added to sys.path. That's all there is to it. It's the same as manually mutating a global PYTHONPATH, except you don't need to do it.
Yes, it's the same as mutating PYTHONPATH. That's a similarly bad system global change. Individual libraries do not have the right to change the sys.path seen on initialisation by every other Python application on that system.
Is it reasonable to even assume there is only one-sys.path-to-rule-them-all? And that users install "the set of libraries they need" into a common place? This quickly turns into failure, because Python is used for many, many tasks, and those tasks sometimes *require conflicting versions of libraries*. This is the root cause of why virtualenv exists and is popular. The reason it's disappointing to see OS vendors mutating the default sys.path is because they put *very old versions of very common non-stdlib packages* (e.g. zope.interface, lxml) on sys.path by default. The path is tainted out of the box for anyone who wants to use the system Python for development of newer software. So at some point they invariably punt to virtualenv or a virtualenv-like system where the OS-vendor-provided path is not present. If Python supported the installation of multiple versions of the same module and versioned imports, both PYTHONPATH and virtualenv would be much less important. But given lack of enthusiasm for that, I don't think it's reasonable to assume there is only one sys.path on every system. I sympathize, however, with Oscar's report that PYTHONPATH can't override the setuptools-derived path. That's indeed a mistake that a future tool should not make.
And note that this is not "setuptools" in general. It's easy_install in particular. Everything you've brought up so far I think is limited to easy_install. It doesn't happen when you use pip. I think it's a mistake that pip doesn't do it, but I think you have to make more accurate distinctions.
What part of "PR problem" was unclear? setuptools and easy_install are inextricably linked in everyone's minds, just like pip and distribute.
Hopefully for the purposes of the discussion, folks here can make the mental separation between setuptools and easy_install. We can't help what other folks think in the meantime, certainly not solely by making technological compromises anyway.
A packaging PEP needs to explain: - what needs to be done to eliminate any need for monkeypatching - what's involved in making sure that *.pth are *not* needed by default - making sure that executable code in implicitly loaded *.pth files isn't used *at all*
I'll note that these goals are completely sideways to any actual functional goal. It'd be a shame to have monkeypatching going on, but the other stuff I don't think are reasonable goals. Instead they represent fears, and those fears just need to be managed.
No, they reflect the mindset of someone with configuration management and auditing responsibilities for shared systems with multiple applications installed which may be written in a variety of languages, not just Python. You may not care about those people, but I do.
I care about deploying Python-based applications to many platforms. You care about deploying multilanguage-based applications to a single platform. There's going to be conflict there. My only comment on that is this: Since this is a problem related to the installation of Python distributions, it should deal with the problems that Python developers have more forcefully than non-Python developers and non-programmers.
It'd also be useful if other core developers actually tried to use setuptools in anger. That'd be a good start towards understanding some of its tradeoffs. People can write this stuff down til they're blue in the face, but if core devs don't try the stuff, they'll always fear it.
setuptools (or, perhaps, easy_install, although I've seen enough posts about eggs being uploaded to PyPI to suspect otherwise), encourages the deployment of system configuration changes that alter the runtime environment of every single Python application executed on the system. That's simply not cool.
Again, it would help if you tried it in anger. What's the worst that could happen? You might like it! ;-) - C
On Thu, Jun 21, 2012 at 10:51 PM, Chris McDonough <chrism@plope.com> wrote:
Is it reasonable to even assume there is only one-sys.path-to-rule-them-all? And that users install "the set of libraries they need" into a common place? This quickly turns into failure, because Python is used for many, many tasks, and those tasks sometimes *require conflicting versions of libraries*. This is the root cause of why virtualenv exists and is popular.
And why I'm very happy to see pyvenv make its way into the standard library :)
I care about deploying Python-based applications to many platforms. You care about deploying multilanguage-based applications to a single platform. There's going to be conflict there.
My only comment on that is this: Since this is a problem related to the installation of Python distributions, it should deal with the problems that Python developers have more forcefully than non-Python developers and non-programmers.
Thanks to venv, there's an alternative available that may be able to keep both of us happy: split the defaults. For system installs, adopt a vendor-centric, multi-language, easy-to-translate-to-language-neutral-packaging mindset (e.g. avoiding *.pth files by unpacking eggs to the file system). For venv installs, do whatever is most convenient for pure Python developers (e.g. leaving eggs packed and using *.pth files to extend sys.path within the venv). One of Python's great virtues is its role as a glue language, and part of being an effective glue language is playing well with others. That should apply to packaging & distribution as well, not just to runtime bindings to tools written in other languages. When we add the scientific users into the mix, we're actually getting to a *third* audience: multi-language developers that want to use *Python's* packaging utilities for their source and binary distribution formats. The Python community covers a broad spectrum of use cases, and I suspect that's one of the big reasons packaging can get so contentious - the goals end up being in direct conflict. Currently, I've identified at least half a dozen significant communities with very different needs (the names aren't meant to be all encompassing, just good representatives of each category, and many individuals will span multiple categories depending on which hat they're wearing at the time):
Library authors: just want to quickly and easily publish their work on the Python package index in a way that is discoverable by others and allows feedback to reach them at their development site
Web developers: creators of Python applications, relying primarily on other Python software and underlying OS provided functionality, potentially with some native extensions, that may need to run on multiple platforms, but can require installation using a language specific mechanism by technical staff
Rich client developers: creators of Python applications relying primarily on other Python software and underlying OS provided functionality, potentially with native extensions, that need to run on multiple platforms, but must be installed using standard system utilities for the benefit of non-technical end users
Enterprise developers: creators of Python or mixed language applications that need to integrate with corporate system administration policies (including packaging, auditing and configuration management)
Scientists: creators of Python data analysis and modelling applications, with complex dependencies on software written in a variety of other languages and using various build systems
Python embedders: developers that embed a Python runtime inside a larger application
setuptools (or, perhaps, easy_install, although I've seen enough posts about eggs being uploaded to PyPI to suspect otherwise), encourages the deployment of system configuration changes that alter the runtime environment of every single Python application executed on the system. That's simply not cool.
Again, it would help if you tried it in anger. What's the worst that could happen? You might like it! ;-)
Oh, believe me, if I ever had distribution needs that required the power and flexibility of setuptools, I would reach for it in a heartbeat (in fact, I already use it today, albeit for tasks that ordinary distutils could probably handle). That said, I do get to cheat though - since I don't need to worry about cross-platform deployment, I can just use the relevant RPM hooks directly :) You're right that most of my ire should be directed at the default behaviour of easy_install rather than at setuptools itself, though. I shall moderate my expressed opinions accordingly. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 06/21/2012 09:29 AM, Nick Coghlan wrote:
My only comment on that is this: Since this is a problem related to the installation of Python distributions, it should deal with the problems that Python developers have more forcefully than non-Python developers and non-programmers.
Thanks to venv, there's an alternative available that may be able to keep both of us happy: split the defaults. For system installs, adopt a vendor-centric, multi-language, easy-to-translate-to-language-neutral-packaging mindset (e.g. avoiding *.pth files by unpacking eggs to the file system). For venv installs, do whatever is most convenient for pure Python developers (e.g. leaving eggs packed and using *.pth files to extend sys.path within the venv).
I'd like to agree with this, but I think there's a distinction that needs to be made here that's maybe not obvious to everyone. A tool to generate an OS-specific system package from a Python library project should be unrelated to a Python distribution *installer*. Instead, you'd use related tools that understood how to unpack the distribution packaging format to build one or more package structures. The resulting structures will be processed and then eventually installed by native OS install tools. But the Python distribution installer (e.g easy_install, pip, or some future similar tool) would just never come into play to create those structures. The Python distribution installer and the OS-specific build tool might share code to introspect and unpack files from the packaging format, but they'd otherwise have nothing to do with one another. This seems like the most reasonable separation of concerns to me anyway, and I'd be willing to work on the code that would be shared by both the Python-level installer and by OS-level packaging tools.
One of Python's great virtues is its role as a glue language, and part of being an effective glue language is playing well with others. That should apply to packaging& distribution as well, not just to runtime bindings to tools written in other languages.
When we add the scientific users into the mix, we're actually getting to a *third* audience: multi-language developers that want to use *Python's* packaging utilities for their source and binary distribution formats.
The Python community covers a broad spectrum of use cases, and I suspect that's one of the big reasons packaging can get so contentious - the goals end up being in direct conflict. Currently, I've identified at least half a dozen significant communities with very different needs (the names aren't meant to be all encompassing, just good representatives of each category, and many individuals will span multiple categories depending on which hat they're wearing at the time):
Library authors: just want to quickly and easily publish their work on the Python package index in a way that is discoverable by others and allows feedback to reach them at their development site
Web developers: creators of Python applications, relying primarily on other Python software and underlying OS provided functionality, potentially with some native extensions, that may need to run on multiple platforms, but can require installation using a language specific mechanism by technical staff
Rich client developers: creators of Python applications relying primarily on other Python software and underlying OS provided functionality, potentially with native extensions, that need to run on multiple platforms, but must be installed using standard system utilities for the benefit of non-technical end users
Enterprise developers: creators of Python or mixed language applications that need to integrate with corporate system administration policies (including packaging, auditing and configuration management)
Scientists: creators of Python data analysis and modelling applications, with complex dependencies on software written in a variety of other languages and using various build systems
Python embedders: developers that embed a Python runtime inside a larger application
I think we'll also need to put some limits on the goal independent of the union of everything all the audiences require. Here's some scope suggestions that I believe could be shared by all of the audiences you list above except for embedders; I think that use case is pretty much separate. It might also leave "rich client developers" wanting, but no more than they're already wanting.
- Install code that can *later be imported*. This could be pure Python code or C code which requires compilation. But it's not for the purpose of compiling and installing completely arbitrary C code to arbitrary locations, it's just written for the purpose of compiling C code which then *lives in the installed distribution* to provide an importable Python module that lives in the same distribution with logic.
- Install "console scripts" which are shell-scripts/batch-files that cause some logic written in Python to get run as a result. These console scripts are written to sys.prefix + '/{bin/Scripts}' depending on the platform.
- Install "package resources", which are non-Python source files that happen to live in package directories.
IOW, an installer should be about installing Python libraries and supporting files to a well-known location defined by the interpreter or venv that runs it, not full applications-that-require-persistent-state which just happen to be written in Python and which require deployment to arbitrary locations. You shouldn't expect the Python packaging tools to install an instance of an application on a system, you should expect them to install enough code that would allow you to *generate* an instance of such an application. Most tools make that possible by installing a console script which can generate a sandbox that can be used to keep application state. Hopefully this is preaching to the choir.
setuptools (or, perhaps, easy_install, although I've seen enough posts about eggs being uploaded to PyPI to suspect otherwise), encourages the deployment of system configuration changes that alter the runtime environment of every single Python application executed on the system. That's simply not cool.
Again, it would help if you tried it in anger. What's the worst that could happen? You might like it! ;-)
Oh, believe me, if I ever had distribution needs that required the power and flexibility of setuptools, I would reach for it in a heartbeat (in fact, I already use it today, albeit for tasks that ordinary distutils could probably handle). That said, I do get to cheat though - since I don't need to worry about cross-platform deployment, I can just use the relevant RPM hooks directly :)
Ideally this is all you'd ever need to care deeply about in an ideal world, too, given the separation of installer vs. system-packaging-support-tools outlined above.
You're right that most of my ire should be directed at the default behaviour of easy_install rather than at setuptools itself, though. I shall moderate my expressed opinions accordingly.
Woot! - C
On Fri, Jun 22, 2012 at 12:12 AM, Chris McDonough <chrism@plope.com> wrote:
On 06/21/2012 09:29 AM, Nick Coghlan wrote:
My only comment on that is this: Since this is a problem related to the installation of Python distributions, it should deal with the problems that Python developers have more forcefully than with those of non-Python developers and non-programmers.
Thanks to venv, there's an alternative available that may be able to keep both of us happy: split the defaults. For system installs, adopt a vendor-centric, multi-language, easy-to-translate-to-language-neutral-packaging mindset (e.g. avoiding *.pth files by unpacking eggs to the file system). For venv installs, do whatever is most convenient for pure Python developers (e.g. leaving eggs packed and using *.pth files to extend sys.path within the venv).
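For reference, the venv support mentioned here is a stdlib feature in 3.3; a minimal sketch of creating an isolated environment (the path is made up):

    import venv

    # Installs targeted at this environment go to its own site-packages,
    # leaving the system interpreter's sys.path untouched.
    venv.create("/tmp/myapp-env")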
I'd like to agree with this, but I think there's a distinction that needs to be made here that's maybe not obvious to everyone.
A tool to generate an OS-specific system package from a Python library project should be unrelated to a Python distribution *installer*. Instead, you'd use related tools that understood how to unpack the distribution packaging format to build one or more package structures. The resulting structures will be processed and then eventually installed by native OS install tools. But the Python distribution installer (e.g easy_install, pip, or some future similar tool) would just never come into play to create those structures. The Python distribution installer and the OS-specific build tool might share code to introspect and unpack files from the packaging format, but they'd otherwise have nothing to do with one another.
This seems like the most reasonable separation of concerns to me anyway, and I'd be willing to work on the code that would be shared by both the Python-level installer and by OS-level packaging tools.
Right, but if the standard library grows a dist installer (and I think it eventually should), we're going to need to define how it should behave when executed with the *system* Python. That will give at least 3 mechanisms for Python code to get onto a system: 1. Python dist -> converter -> system package -> system Python path 2. Python dist -> system Python installer -> system Python path 3. Python dist -> venv Python installer -> venv Python path While I agree that path 2 should be discouraged for production systems, I don't think it should be prevented altogether (since it can be very convenient on personal systems). As far as the scope of the packaging utilities and what they can install goes, I think the distutils2 folks have done a pretty good job of defining that with their static metadata format: http://alexis.notmyidea.org/distutils2/setupcfg.html#files Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
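For readers who have not followed the link, the static metadata lives in setup.cfg; a stripped-down, illustrative sketch (project and file names are made up, and the [files] section supports more fields, including resources; see the linked page for the authoritative syntax):

    [metadata]
    name = example-dist
    version = 0.1

    [files]
    packages = example
    scripts = scripts/example-cli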
On 06/21/2012 10:30 AM, Nick Coghlan wrote:
A tool to generate an OS-specific system package from a Python library project should be unrelated to a Python distribution *installer*. Instead, you'd use related tools that understood how to unpack the distribution packaging format to build one or more package structures. The resulting structures will be processed and then eventually installed by native OS install tools. But the Python distribution installer (e.g easy_install, pip, or some future similar tool) would just never come into play to create those structures. The Python distribution installer and the OS-specific build tool might share code to introspect and unpack files from the packaging format, but they'd otherwise have nothing to do with one another.
This seems like the most reasonable separation of concerns to me anyway, and I'd be willing to work on the code that would be shared by both the Python-level installer and by OS-level packaging tools.
Right, but if the standard library grows a dist installer (and I think it eventually should), we're going to need to define how it should behave when executed with the *system* Python.
That will give at least 3 mechanisms for Python code to get onto a system:
1. Python dist -> converter -> system package -> system Python path
2. Python dist -> system Python installer -> system Python path
3. Python dist -> venv Python installer -> venv Python path
While I agree that path 2 should be discouraged for production systems, I don't think it should be prevented altogether (since it can be very convenient on personal systems).
I'm not sure under what circumstance 2 and 3 wouldn't do the same thing. Do you have a concrete idea?
As far as the scope of the packaging utilities and what they can install goes, I think the distutils2 folks have done a pretty good job of defining that with their static metadata format: http://alexis.notmyidea.org/distutils2/setupcfg.html#files
Yeah definitely a good start. - C
On Fri, Jun 22, 2012 at 12:59 AM, Chris McDonough <chrism@plope.com> wrote:
On 06/21/2012 10:30 AM, Nick Coghlan wrote:
That will give at least 3 mechanisms for Python code to get onto a system:
1. Python dist -> converter -> system package -> system Python path
2. Python dist -> system Python installer -> system Python path
3. Python dist -> venv Python installer -> venv Python path
While I agree that path 2 should be discouraged for production systems, I don't think it should be prevented altogether (since it can be very convenient on personal systems).
I'm not sure under what circumstance 2 and 3 wouldn't do the same thing. Do you have a concrete idea?
Yep, this is what I was talking about in terms of objecting to installation of *.pth files: I think automatically installing *.pth files into the system Python path is *wrong* (just like globally editing PYTHONPATH), and that includes any *.pth files needed for egg installation. In a venv however, I assume the entire thing is application specific, so using *.pth files and eggs for ease of management makes a lot of sense and I would be fine with using that style of installation by default. If the *same* default were going to be used in both places, my preference would be to avoid *.pth files by default and require them to be explicitly requested regardless of the nature of the target environment. I really just wanted to be clear that I don't mind *.pth files at all in the venv case, because they're not affecting the runtime state of other applications. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
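For anyone unfamiliar with the mechanism being objected to: site.py executes any line of a *.pth file that starts with "import" at interpreter startup and treats the remaining lines as path entries. A simplified sketch loosely modelled on what easy_install writes (file and project names are made up):

    # site-packages/easy-install.pth (illustrative, simplified)
    import sys; sys.__plen = len(sys.path)
    ./SomeDist-1.0-py3.3.egg
    ./OtherDist-2.1-py3.3.egg
    import sys; new = sys.path[sys.__plen:]; del sys.path[sys.__plen:]; sys.path[0:0] = new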
More fuel for the fire: http://lucumr.pocoo.org/2012/6/22/hate-hate-hate-everywhere/
On Jun 21, 2012 10:12 AM, "Chris McDonough" <chrism@plope.com> wrote:
- Install "package resources", which are non-Python source files that happen to live in package directories.
I love this phrasing, by the way ("non-Python source files"). A pet peeve of mine is the insistence by some people that such files are "data" and don't belong in package directories, despite the fact that if you gave them a .py extension and added data="""...""" around them, they'd be considered part of the code. A file's name and internal format aren't what distinguishes code from data; it's the way it's *used* that matters. I think "packaging" has swung the wrong way on this particular point, and that resources and data files should be distinguished in setup.cfg, with sysadmins *not* being given the option to muck about with resources -- especially not to install them in locations where they might be mistaken for something editable.
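A contrived sketch of that point (module and file names invented): the same template text plays exactly the same role whether it ships as mypkg/templates/page.html or as a module attribute.

    # mypkg/templates.py -- the "wrapped in a .py file" version of a package resource
    PAGE_HTML = """\
    <html>
      <body>{content}</body>
    </html>
    """

    def render(content):
        # Used exactly as the file-based resource would be.
        return PAGE_HTML.format(content=content)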
On 06/21/2012 11:45 AM, PJ Eby wrote:
On Jun 21, 2012 10:12 AM, "Chris McDonough" <chrism@plope.com <mailto:chrism@plope.com>> wrote:
- Install "package resources", which are non-Python source files that happen to live in package directories.
I love this phrasing, by the way ("non-Python source files").
A pet peeve of mine is the insistence by some people that such files are "data" and don't belong in package directories, despite the fact that if you gave them a .py extension and added data="""...""" around them, they'd be considered part of the code. A file's name and internal format aren't what distinguishes code from data; it's the way it's *used* that matters.
I think "packaging" has swung the wrong way on this particular point, and that resources and data files should be distinguished in setup.cfg, with sysadmins *not* being given the option to muck about with resources -- especially not to install them in locations where they might be mistaken for something editable.
+1. A good number of the "package resource" files we deploy are not data files at all. In particular, a lot of them are files which represent HTML templates. These templates are exclusively the domain of the software being installed, and considering them explicitly "more editable" than the Python source they sit next to in the package structure is a grave mistake. They have exactly the same editability candidacy as the Python source files they are mixed in with. - C
Nick Coghlan <ncoghlan <at> gmail.com> writes:
The Python community covers a broad spectrum of use cases, and I suspect that's one of the big reasons packaging can get so contentious - the goals end up being in direct conflict. Currently, I've identified at least half a dozen significant communities with very different needs (the names aren't meant to be all encompassing, just good representatives of each category, and many individuals will span multiple categories depending on which hat they're wearing at the time):
One set of users not covered by your list is people who need to Cross-Compile Python to another CPU architecture (i.e. x86 to ARM/PowerPC) for use with embedded computers. Distutils does not handle this very well. If you want a recent overview of what these users go through you should see my talk from PyCon 2012: http://pyvideo.org/video/682/cross-compiling-python-c-extensions-for-embedde -Chris
On Jun 21, 2012, at 08:51 AM, Chris McDonough wrote:
The reason it's disappointing to see OS vendors mutating the default sys.path is because they put *very old versions of very common non-stdlib packages* (e.g. zope.interface, lxml) on sys.path by default. The path is tainted out of the box for anyone who wants to use the system Python for development of newer software. So at some point they invariably punt to virtualenv or a virtualenv-like system where the OS-vendor-provided path is not present.
If Python supported the installation of multiple versions of the same module and versioned imports, both PYTHONPATH and virtualenv would be much less important. But given lack of enthusiasm for that, I don't think it's reasonable to assume there is only one sys.path on every system.
This is really the key insight that should be driving us IMO. From the system vendor point of view, my job is to ensure the *system* works right, and that everything written in Python that provides system functionality is compatible with whatever versions of third party Python packages I provide in a particular OS version. That's already a hard enough problem that, frankly, any illusion that I can also provide useful versions for higher level applications that people will deploy on my OS is just madness. This is why I get lots of people requesting versioned imports, or simply resorting to venv/buildout/chef/puppet/juju to deploy *their* applications on the OS. There's just no other sane way to do it. I do think Python could do better, but obviously it's a difficult problem. I suspect that having venv support out of the box in 3.3 will go a long way to solving some class of these problems. I don't know if that will be the *only* answer. -Barry
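For context on the "versioned imports" idea mentioned in the quote above: the closest existing approximation is setuptools' runtime requirement API, which picks one of several side-by-side installed versions and activates it on sys.path before import (a sketch; the distribution and module names are made up):

    import pkg_resources

    # Select an installed version of a hypothetical distribution that satisfies
    # the requirement and add it to sys.path for this process only.
    pkg_resources.require("SomeDist>=1.2,<2.0")

    import somedist  # hypothetical module provided by SomeDist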
Chris McDonough <chrism <at> plope.com> writes:
On 06/21/2012 04:45 AM, Nick Coghlan wrote:
A packaging PEP needs to explain:
- what needs to be done to eliminate any need for monkeypatching
- what's involved in making sure that *.pth are *not* needed by default
- making sure that executable code in implicitly loaded *.pth files isn't used *at all*
I'll note that these goals are completely sideways to any actual functional goal. It'd be a shame to have monkeypatching going on, but I don't think the other items are reasonable goals. Instead they represent fears, and those fears just need to be managed.
Managed how? Whose functional goals? It's good to have something that works here and now, but surely there's more to it. Presumably distutils worked for some value of "worked" up until the point where it didn't, and setuptools needed to improve on it. Oscar's example shows how setuptools is broken for some use cases. Nor does it consider, for example, the goals of OS distro packagers in the same way that packaging has tried to. You're encouraging core devs to use setuptools, but as most seem to agree that distutils is (quick-)sand and setuptools is built on sand, it's hard to see setuptools as anything other than a stopgap, the best we have until something better can be devised. The command-class based design of distutils and hence setuptools doesn't seem to be something to bet the future on. As an infrastructure concern, this area of functionality definitely needs to be supported in the stdlib, even if it's a painful process getting there. The barriers seem more social than technical, but hopefully the divide-and-conquer-with-multiple-PEPs approach will prevail. Regards, Vinay Sajip
On Jun 21, 2012, at 07:48 AM, Chris McDonough wrote:
I don't know about Red Hat but both Ubuntu and Apple put all kinds of stuff on the default sys.path of the system Python of the box that's related to their software's concerns only. I don't understand why people accept this but get crazy about the fact that installing a setuptools distribution using easy_install changes the default sys.path.
Frankly, I've long thought that distros like Debian/Ubuntu which rely so much on Python for essential system functions should basically have two Python stacks. One would be used for just those system functions and the other would be for application deployment. OTOH, I often hear from application developers on Ubuntu that they basically have to build up their own stack *anyway* if they want to ensure they've got the right suite of dependencies. This is where tools like virtualenv and buildout on the lower end and chef/puppet/juju on the higher end come into play. -Barry
On Thu, Jun 21, 2012 at 11:57 PM, Barry Warsaw <barry@python.org> wrote:
On Jun 21, 2012, at 07:48 AM, Chris McDonough wrote:
I don't know about Red Hat but both Ubuntu and Apple put all kinds of stuff on the default sys.path of the system Python of the box that's related to their software's concerns only. I don't understand why people accept this but get crazy about the fact that installing a setuptools distribution using easy_install changes the default sys.path.
Frankly, I've long thought that distros like Debian/Ubuntu which rely so much on Python for essential system functions should basically have two Python stacks. One would be used for just those system functions and the other would be for application deployment. OTOH, I often hear from application developers on Ubuntu that they basically have to build up their own stack *anyway* if they want to ensure they've got the right suite of dependencies. This is where tools like virtualenv and buildout on the lower end and chef/puppet/juju on the higher end come into play.
Yeah, I liked Hynek's method for blending a Python-centric application development approach with a system packaging centric configuration management approach: take an entire virtualenv and package *that* as a single system package. Another strategy that can work is application specific system package repos, but you have to be very committed to a particular OS and packaging system for that approach to make a lot of sense :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 06/21/2012 05:57 AM, Nick Coghlan wrote:
On Thu, Jun 21, 2012 at 3:29 AM, PJ Eby<pje@telecommunity.com> wrote:
On Wed, Jun 20, 2012 at 9:02 AM, Nick Coghlan<ncoghlan@gmail.com> wrote:
On Wed, Jun 20, 2012 at 9:46 PM, Antoine Pitrou<solipsis@pitrou.net> wrote:
Agreed, especially if the "proven in the wild" criterion is required (people won't rush to another third-party distutils replacement, IMHO).
The existence of setuptools means that "proven in the wild" is never going to fly - a whole lot of people use setuptools and easy_install happily, because they just don't care about the downsides it has in terms of loss of control of a system configuration.
Um, this may be a smidge off topic, but what "loss of control" are we talking about here? AFAIK, there isn't anything it does that you can't override with command line options or the config file. (In most cases, standard distutils options or config files.) Do you just mean that most people use the defaults and don't care about there being other options? And if that's the case, which other options are you referring to?
No, I mean there are design choices in setuptools that explain why many people don't like it and are irritated when software they want to use depends on it without a good reason. Clearly articulating the reasons that "just include setuptools" is no longer being considered as an option should be one of the goals of any PEPs associated with adding packaging back for 3.4.
The reasons I'm personally aware of:
- it's a unilateral runtime fork of the standard library that bears a lot of responsibility for the ongoing feature freeze in distutils. Standard assumptions about the behaviour of site and distutils cease to be valid once setuptools is installed
- overuse of "*.pth" files and the associated sys.path changes for all Python programs running on a system. setuptools gleefully encourages the inclusion of non-trivial code snippets in *.pth files that will be executed by all programs.
- advocacy for the "egg" format and the associated sys.path changes that result for all Python programs running on a system
- too much magic that is enabled by default and is hard to switch off (e.g. http://rhodesmill.org/brandon/2009/eby-magic/)
System administrators (and developers that think like system administrators when it comes to configuration management) *hate* what setuptools (and setuptools based installers) can do to their systems. It doesn't matter that package developers don't *have* to do those things - what matters is that the needs and concerns of system administrators simply don't appear to have been anywhere on the radar when setuptools was being designed. (If those concerns actually were taken into account at some point, it's sure hard to tell from the end result and the choices of default behaviour)
David Cournapeau's Bento project takes the opposite approach, everything is explicit and without any magic. http://cournape.github.com/Bento/ It had its 0.1.0 release a week ago. Please, I don't want to reopen any discussions about Bento here -- distutils2 vs. Bento discussions have been less than constructive in the past -- I just wanted to make sure everybody is aware that distutils2 isn't the only horse in this race. I don't know if there are others too? -- Dag Sverre Seljebotn
On 6/21/12 11:08 AM, Dag Sverre Seljebotn wrote:
... David Cournapeau's Bento project takes the opposite approach, everything is explicit and without any magic.
http://cournape.github.com/Bento/
It had its 0.1.0 release a week ago.
Please, I don't want to reopen any discussions about Bento here -- distutils2 vs. Bento discussions have been less than constructive in the past -- I just wanted to make sure everybody is aware that distutils2 isn't the only horse in this race. I don't know if there are others too?
That's *exactly* the kind of approach that has made me not want to continue.
People are too focused on implementations, and 'how distutils sucks' 'how setuptools sucks' etc 'I'll do better' etc
Instead of having all the folks involved in packaging sit down together and try to fix the issues together by building PEPs describing what would be a common set of standards, they want to create their own tools from scratch.
That will not work. And I will say here again what I think we should do imho:
1/ take all the packaging PEPs and rework them until everyone is happy (compilation sucks in distutils ? write a PEP !!!)
2/ once we have a consensus, write as many tools as you want, if they rely on the same standards => interoperability => win.
But I must be naive, because every time I tried to reach the people that were building their own tools to ask them to work with us on the PEPs, all I was getting was "distutils sucks!" It worked with the OS packager guys though; we have built a great data files management system in packaging, plus the version scheme (PEP 386).
-- Dag Sverre Seljebotn
On 06/21/2012 01:56 PM, Tarek Ziadé wrote:
On 6/21/12 11:08 AM, Dag Sverre Seljebotn wrote:
... David Cournapeau's Bento project takes the opposite approach, everything is explicit and without any magic.
http://cournape.github.com/Bento/
It had its 0.1.0 release a week ago.
Please, I don't want to reopen any discussions about Bento here -- distutils2 vs. Bento discussions have been less than constructive in the past -- I just wanted to make sure everybody is aware that distutils2 isn't the only horse in this race. I don't know if there are others too?
That's *exactly* the kind of approach that has made me not want to continue.
People are too focused on implementations, and 'how distutils sucks' 'how setuptools sucks' etc 'I'll do better' etc
Instead of having all the folks involved in packaging sit down together and try to fix the issues together by building PEPs describing what would be a common set of standards, they want to create their own tools from scratch.
Guido was asked about build issues and scientific software at PyData this spring, and his take was that "if scientific users have concerns that are that special, perhaps you just need to go and do your own thing". Which is what David is doing.
Trailing Q&A session here: http://www.youtube.com/watch?v=QjXJLVINsSA
Generalizing a bit I think it's "web developers" and "scientists" typically completely failing to see each others' usecases. I don't know if that bridge can be crossed through mailing list discussion alone. I know that David tried but came to a point where he just had to unsubscribe to distutils-sig.
Sometimes design by committee is just what you want, and sometimes design by committee doesn't work. ZeroMQ, for instance, is a great piece of software resulting from dropping out of the AMQP committee.
That will not work. And I will say here again what I think we should do imho:
1/ take all the packaging PEPs and rework them until everyone is happy (compilation sucks in distutils ? write a PEP !!!)
I think the only way of making scientists happy is to make the build tool choice arbitrary (and allow the use of waf, scons, cmake, jam, ant, etc. for the build). After all, many projects contains more C++ and Fortran code than Python code. (Of course, one could make a PEP saying that.)
Right now things are so horribly broken for the scientific community that I'm not sure if one *can* sanely specify PEPs. It's more a question of playing around and throwing things at the wall and see what sticks -- 5 years from now one is perhaps in a position where the problem is really understood and one can write PEPs.
Perhaps the "web developers" are at the PEP-ing stage already. Great for you. But the usecases are really different.
Anyway: I really don't want to start a flame-war here. So let's accept up front that we likely won't agree here; I just wanted to clarify my position.
(Some context: I might have funding to work 2 months full-time on distributing Python software on HPC clusters this autumn. It's not really related to Bento or distutils, though; it would be more of a client tool using those libraries.)
Dag Sverre Seljebotn
On 6/21/12 2:45 PM, Dag Sverre Seljebotn wrote:
Guido was asked about build issues and scientific software at PyData this spring, and his take was that "if scientific users have concerns that are that special, perhaps you just need to go and do your own thing". Which is what David is doing.
Trailing Q&A session here: http://www.youtube.com/watch?v=QjXJLVINsSA
if you know what you want and have a tool that does it, why bother using distutils? But then, what will your community do with the guy that creates packages with distutils? Just tell him he sucks? The whole idea is *interoperability*, not the tool used.
Generalizing a bit I think it's "web developers" and "scientists" typically completely failing to see each others' usecases. I don't know if that bridge can be crossed through mailing list discussion alone. I know that David tried but came to a point where he just had to unsubscribe to distutils-sig.
I was there, and sorry to be blunt, but he came to tell us we had to drop distutils because it sucked, and left because we did not follow that path
Sometimes design by committee is just what you want, and sometimes design by committee doesn't work. ZeroMQ, for instance, is a great piece of software resulting from dropping out of the AMQP committee.
That will not work. And I will say here again what I think we should do imho:
1/ take all the packaging PEPs and rework them until everyone is happy (compilation sucks in distutils ? write a PEP !!!)
I think the only way of making scientists happy is to make the build tool choice arbitrary (and allow the use of waf, scons, cmake, jam, ant, etc. for the build). After all, many projects contains more C++ and Fortran code than Python code. (Of course, one could make a PEP saying that.)
Right now things are so horribly broken for the scientific community that I'm not sure if one *can* sanely specify PEPs. It's more a question of playing around and throwing things at the wall and see what sticks -- 5 years from now one is perhaps in a position where the problem is really understood and one can write PEPs.
Perhaps the "web developers" are at the PEP-ing stage already. Great for you. But the usecases are really different.
If you sit down and ask yourself: "what information should a python project give me so I can compile its extensions?" I think this has nothing to do with the tools/implementations. And if we're able to write down in a PEP this, e.g. the information a compiler is looking for to do its job, then any tool out there waf, scons, cmake, jam, ant, etc, can do the job, no ?
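For concreteness, this is roughly the information a project already hands distutils for a simple extension today (standard distutils API; the project and file names are invented):

    # setup.py (illustrative)
    from distutils.core import setup, Extension

    setup(
        name="example",
        version="0.1",
        ext_modules=[
            Extension(
                "example._speedups",         # importable name of the built module
                sources=["src/speedups.c"],  # C sources to compile
                include_dirs=["src"],        # extra header search paths
                libraries=["m"],             # extra libraries to link against
            ),
        ],
    )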
Anyway: I really don't want to start a flame-war here. So let's accept up front that we likely won't agree here; I just wanted to clarify my position.
After 4 years I still don't understand what "we won't agree" means in this context. *NO ONE* ever ever came and told me : here's what I want a Python project to describe for its extensions. Just "we won't agree" or "distutils sucks" :) Gosh I hope we will overcome this lock one day, and move forward :D
On 06/21/2012 03:23 PM, Tarek Ziadé wrote:
On 6/21/12 2:45 PM, Dag Sverre Seljebotn wrote:
Guido was asked about build issues and scientific software at PyData this spring, and his take was that "if scientific users have concerns that are that special, perhaps you just need to go and do your own thing". Which is what David is doing.
Trailing Q&A session here: http://www.youtube.com/watch?v=QjXJLVINsSA
if you know what you want and have a tool that does it, why bother using distutils ?
But then, what will your community do with the guy that creates packages with distutils? Just tell him he sucks?
The whole idea is *interoperability*, not the tool used.
Generalizing a bit I think it's "web developers" and "scientists" typically completely failing to see each others' usecases. I don't know if that bridge can be crossed through mailing list discussion alone. I know that David tried but came to a point where he just had to unsubscribe to distutils-sig.
I was there, and sorry to be blunt, but he came to tell us we had to drop distutils because it sucked, and left because we did not follow that path
Sometimes design by committee is just what you want, and sometimes design by committee doesn't work. ZeroMQ, for instance, is a great piece of software resulting from dropping out of the AMQP committee.
That will not work. And I will say here again what I think we should do imho:
1/ take all the packaging PEPs and rework them until everyone is happy (compilation sucks in distutils ? write a PEP !!!)
I think the only way of making scientists happy is to make the build tool choice arbitrary (and allow the use of waf, scons, cmake, jam, ant, etc. for the build). After all, many projects contains more C++ and Fortran code than Python code. (Of course, one could make a PEP saying that.)
Right now things are so horribly broken for the scientific community that I'm not sure if one *can* sanely specify PEPs. It's more a question of playing around and throwing things at the wall and see what sticks -- 5 years from now one is perhaps in a position where the problem is really understood and one can write PEPs.
Perhaps the "web developers" are at the PEP-ing stage already. Great for you. But the usecases are really different.
If you sit down and ask yourself: "what information should a python project give me so I can compile its extensions?" I think this has nothing to do with the tools/implementations.
I'm not sure if I understand. A project can't "give the information needed to build it". The build system is an integrated piece of the code and package itself. Making the build of library X work on some ugly HPC setup Y is part of the development of X.
To my mind a solution looks something like (and Bento is close to this):
Step 1) "Some standard" to do configuration of a package (--prefix and other what-goes-where options, what libraries to link with, what compilers to use...)
Step 2) Launch the package's custom build system (may be Unix shell script or makefile in some cases (sometimes portability is not a goal), may be a waf build)
Step 3) "Some standard" to be able to cleanly install/uninstall/upgrade the product of step 2)
An attempt to do Step 2) in a major way in the packaging framework itself, and have the package just "declare" its C extensions, would not work. It's fine to have a way in the packaging framework that works for trivial cases, but it's impossible to create something that works for every case.
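A minimal command-line sketch of that three-step split (the tool name and options are entirely hypothetical and only illustrate the separation; Bento's actual interface differs):

    $ pkgtool configure --prefix=/opt/myapp --with-blas=/opt/openblas
    $ pkgtool build      # delegates to the project's own build system (waf, make, ...)
    $ pkgtool install    # standardized install/uninstall of whatever the build produced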
And if we're able to write down in a PEP this, e.g. the information a compiler is looking for to do its job, then any tool out there waf, scons, cmake, jam, ant, etc, can do the job, no ?
Anyway: I really don't want to start a flame-war here. So let's accept up front that we likely won't agree here; I just wanted to clarify my position.
After 4 years I still don't understand what "we won't agree" means in this context. *NO ONE* ever ever came and told me : here's what I want a Python project to describe for its extensions.
That's unfortunate. To be honest, it's probably partly because it's easier to say what won't work than come with a constructive suggestion. A lot of people (me included) just use waf/cmake/autotools, and forget about making the code installable through PyPI or any of the standard Python tools. Just because that works *now* for us, but we don't have any good ideas for how to make this into something that works on a wider scale. I think David is one of the few who has really dug into the matter and tried to find something that can both do builds and work through standard install mechanisms. I can't answer for why you haven't been able to understand one another. It may also be an issue with how much one can constructively do on mailing lists. Perhaps the only route forward is to bring people together in person and walk distutils2 people through some hairy scientific HPC builds (and vice versa).
Just "we won't agree" or "distutils sucks" :)
Gosh I hope we will overcome this lock one day, and move forward :D
Well, me too. Dag
On 6/21/12 4:26 PM, Dag Sverre Seljebotn wrote:
If you sit down and ask yourself: "what information should a python project give me so I can compile its extensions?" I think this has nothing to do with the tools/implementations.
I'm not sure if I understand. A project can't "give the information needed to build it". The build system is an integrated piece of the code and package itself. Making the build of library X work on some ugly HPC setup Y is part of the development of X.
To my mind a solution looks something like (and Bento is close to this):
Step 1) "Some standard" to do configuration of a package (--prefix and other what-goes-where options, what libraries to link with, what compilers to use...)
Step 2) Launch the package's custom build system (may be Unix shell script or makefile in some cases (sometimes portability is not a goal), may be a waf build)
Step 3) "Some standard" to be able to cleanly install/uninstall/upgrade the product of step 2)
An attempt to do Step 2) in a major way in the packaging framework itself, and have the package just "declare" its C extensions, would not work. It's fine to have a way in the packaging framework that works for trivial cases, but it's impossible to create something that works for every case.
I think we should, as you proposed, list a few projects w/ compilation needs -- from the simplest to the more complex, then see how a standard *description* could be used by any tool
And if we're able to write down in a PEP this, e.g. the information a compiler is looking for to do its job, then any tool out there waf, scons, cmake, jam, ant, etc, can do the job, no ?
Anyway: I really don't want to start a flame-war here. So let's accept up front that we likely won't agree here; I just wanted to clarify my position.
After 4 years I still don't understand what "we won't agree" means in this context. *NO ONE* ever ever came and told me : here's what I want a Python project to describe for its extensions.
That's unfortunate. To be honest, it's probably partly because it's easier to say what won't work than come with a constructive suggestion. A lot of people (me included) just use waf/cmake/autotools, and forget about making the code installable through PyPI or any of the standard Python tools. Just because that works *now* for us, but we don't have any good ideas for how to make this into something that works on a wider scale.
I think David is one of the few who has really dug into the matter and tried to find something that can both do builds and work through standard install mechanisms. I can't answer for why you haven't been able to understand one another.
It may also be an issue with how much one can constructively do on mailing lists. Perhaps the only route forward is to bring people together in person and walk distutils2 people through some hairy scientific HPC builds (and vice versa).
Like the version scheme, I think it's fine if you guys have a more complex system to build software. But there should be a way to share a common standard for compilation, even if people that use distutils2 or xxx are just doing the dumbest things, like simple C libs compilation.
Just "we won't agree" or "distutils sucks" :)
Gosh I hope we will overcome this lock one day, and move forward :D
Well, me too.
The other thing is, the folks in distutils2 and myself, have zero knowledge about compilers. That's why we got very frustrated not to see people with that knowledge come and help us in this area. So, I reiterate my proposal, and it could also be expressed like this: 1/ David writes a PEP where he describes how Bento interact with a project -- metadata, description files, etc 2/ Someone from distutils2 completes the PEP by describing how setup.cfg works wrt Extensions 3/ we see if we can have a common standard even if it's a subset of bento capabilities
Dag
On 06/21/2012 09:05 PM, Tarek Ziadé wrote:
On 6/21/12 4:26 PM, Dag Sverre Seljebotn wrote:
If you sit down and ask yourself: "what information should a python project give me so I can compile its extensions?" I think this has nothing to do with the tools/implementations.
I'm not sure if I understand. A project can't "give the information needed to build it". The build system is an integrated piece of the code and package itself. Making the build of library X work on some ugly HPC setup Y is part of the development of X.
To my mind a solution looks something like (and Bento is close to this):
Step 1) "Some standard" to do configuration of a package (--prefix and other what-goes-where options, what libraries to link with, what compilers to use...)
Step 2) Launch the package's custom build system (may be Unix shell script or makefile in some cases (sometimes portability is not a goal), may be a waf build)
Step 3) "Some standard" to be able to cleanly install/uninstall/upgrade the product of step 2)
An attempt to do Step 2) in a major way in the packaging framework itself, and have the package just "declare" its C extensions, would not work. It's fine to have a way in the packaging framework that works for trivial cases, but it's impossible to create something that works for every case.
I think we should, as you proposed, list a few projects w/ compilation needs -- from the simplest to the more complex, then see how a standard *description* could be used by any tool
It's not clear to me what you mean by description. Package metadata, install information or description of what/how to build? I hope you don't mean the latter, that would be insane...it would effectively amount to creating a build tool that's both more elegant and more powerful than any option that's currently already out there. Assuming you mean the former, that's what David did to create Bento. Reading and understanding Bento and the design decisions going into it would be a better use of time than redoing a discussion, and would at least be a very good starting point.
But anyway, some project types from simple to advanced:
- Simple library using Cython + NumPy C API
- Wrappers around HPC codes like mpi4py, petsc4py
- NumPy
- SciPy (uses Fortran compilers too)
- Library using code generation, Cython, NumPy C API, Fortran 90 code, some performance tuning with CPU characteristics (instruction set, cache size, optimal loop structure) decided compile-time
And if we're able to write down in a PEP this, e.g. the information a compiler is looking for to do its job, then any tool out there waf, scons, cmake, jam, ant, etc, can do the job, no ?
Anyway: I really don't want to start a flame-war here. So let's accept up front that we likely won't agree here; I just wanted to clarify my position.
After 4 years I still don't understand what "we won't agree" means in this context. *NO ONE* ever ever came and told me : here's what I want a Python project to describe for its extensions.
That's unfortunate. To be honest, it's probably partly because it's easier to say what won't work than come with a constructive suggestion. A lot of people (me included) just use waf/cmake/autotools, and forget about making the code installable through PyPI or any of the standard Python tools. Just because that works *now* for us, but we don't have any good ideas for how to make this into something that works on a wider scale.
I think David is one of the few who has really dug into the matter and tried to find something that can both do builds and work through standard install mechanisms. I can't answer for why you haven't been able to understand one another.
It may also be an issue with how much one can constructively do on mailing lists. Perhaps the only route forward is to bring people together in person and walk distutils2 people through some hairy scientific HPC builds (and vice versa).
Like the version scheme, I think it's fine if you guys have a more complex system to build software. But there should be a way to share a common standard for compilation, even if people that use distutils2 or xxx are just doing the dumbest things, like simple C libs compilation.
Just "we won't agree" or "distutils sucks" :)
Gosh I hope we will overcome this lock one day, and move forward :D
Well, me too.
The other thing is, the folks in distutils2 and myself, have zero knowledge about compilers. That's why we got very frustrated not to see people with that knowledge come and help us in this area.
Here's the flip side: If you have zero knowledge about compilers, it's going to be almost impossible to have a meaningful discussion about a compilation PEP. It's very hard to discuss standards unless everybody involved has the necessary prerequisite knowledge. You don't go discussing details of the Linux kernel without some solid C experience either.
The necessary prerequisites in this case are not merely "knowledge of compilers". To avoid repeating mistakes of the past, the prerequisites for a meaningful discussion are years of hard-won experience building software in various languages, on different platforms, using different build tools.
Look, these problems are really hard to deal with. Myself I have experience with building 2-3 languages using 2-3 build tools on 2 platforms, and I consider myself a complete novice and usually decide to trust David's instincts over trying to make up an opinion of my own -- simply because I know he's got a lot more experience than I have.
Theoretically it is possible to separate and isolate concerns so that one set of people discuss build integration and another set of people discuss installation. Problem is that all the problems tangle -- in particular when the starting point is distutils!
That's why *sometimes*, not always, design by committee is the wrong approach, and one-man-shows is what brings technology forwards.
So, I reiterate my proposal, and it could also be expressed like this:
1/ David writes a PEP where he describes how Bento interact with a project -- metadata, description files, etc 2/ Someone from distutils2 completes the PEP by describing how setup.cfg works wrt Extensions 3/ we see if we can have a common standard even if it's a subset of bento capabilities
bento isn't a build tool, it's a packaging tool, competing directly with distutils2. It can deal with simple distutils-like builds using a bundled build tool, and currently has integration with waf for complicated builds; integration with other build systems will presumably be added later as people need it (the main point is that bento is designed for it). Dag
On 6/21/12 10:46 PM, Dag Sverre Seljebotn wrote: ...
I think we should, as you proposed, list a few projects w/ compilation needs -- from the simplest to the more complex, then see how a standard *description* could be used by any tool
It's not clear to me what you mean by description. Package metadata, install information or description of what/how to build?
I hope you don't mean the latter, that would be insane...it would effectively amount to creating a build tool that's both more elegant and more powerful than any option that's currently already out there.
Assuming you mean the former, that's what David did to create Bento. Reading and understanding Bento and the design decisions going into it would be a better use of time than redoing a discussion, and would at least be a very good starting point.
What I mean is : what would it take to use Bento (or another tool) as the compiler in a distutils-based project, without having to change the distutils metadata.
But anyway, some project types from simple to advanced:
- Simple library using Cython + NumPy C API
- Wrappers around HPC codes like mpi4py, petsc4py
- NumPy
- SciPy (uses Fortran compilers too)
- Library using code generation, Cython, NumPy C API, Fortran 90 code, some performance tuning with CPU characteristics (instruction set, cache size, optimal loop structure) decided compile-time
I'd add:
- A Distutils project with a few Extensions
The other thing is, the folks in distutils2 and myself, have zero knowledge about compilers. That's why we got very frustrated not to see people with that knowledge come and help us in this area.
Here's the flip side: If you have zero knowledge about compilers, it's going to be almost impossible to have a meaningful discussion about a compilation PEP. It's very hard to discuss standards unless everybody involved has the necessary prerequisite knowledge. You don't go discussing details of the Linux kernel without some solid C experience either.
Consider me as the end user who wants to have his 2 C modules compiled in their Python project.
The necessary prerequisites in this case are not merely "knowledge of compilers". To avoid repeating mistakes of the past, the prerequisites for a meaningful discussion are years of hard-won experience building software in various languages, on different platforms, using different build tools.
Look, these problems are really hard to deal with. Myself I have experience with building 2-3 languages using 2-3 build tools on 2 platforms, and I consider myself a complete novice and usually decide to trust David's instincts over trying to make up an opinion of my own -- simply because I know he's got a lot more experience than I have.
Theoretically it is possible to separate and isolate concerns so that one set of people discuss build integration and another set of people discuss installation. Problem is that all the problems tangle -- in particular when the starting point is distutils!
That's why *sometimes*, not always, design by committee is the wrong approach, and one-man-shows is what brings technology forwards.
I am not saying this should be designed by a committee, but rather - if such a tool can be made compatible with simple Distutils projects, the guy behind this tool can probably help on a PEP with feedback from a larger audience than the sci community. What bugs me is to say that we live in two separate worlds and cannot build common pieces. This is not True.
So, I reiterate my proposal, and it could also be expressed like this:
1/ David writes a PEP where he describes how Bento interact with a project -- metadata, description files, etc 2/ Someone from distutils2 completes the PEP by describing how setup.cfg works wrt Extensions 3/ we see if we can have a common standard even if it's a subset of bento capabilities
bento isn't a build tool, it's a packaging tool, competing directly with distutils2. It can deal with simple distutils-like builds using a bundled build tool, and currently has integration with waf for complicated builds; integration with other build systems will presumably be added later as people need it (the main point is that bento is designed for it).
I am not interested in Bento-the-tool. I am interested in what such a tool needs from a project to use it => "It can deal with simple distutils-like builds using a bundled build tool" => If I understand this correctly, does that mean that Bento can build a distutils project with the distutils Metadata ? If this is the case it means that there is a piece of functionality that translates Distutils metadata into something Bento deals with. That's the part I am interested in for interoperability.
Dag
On Thu, Jun 21, 2012 at 10:04 PM, Tarek Ziadé <tarek@ziade.org> wrote:
On 6/21/12 10:46 PM, Dag Sverre Seljebotn wrote: ...
I think we should, as you proposed, list a few projects w/ compilation
needs -- from the simplest to the more complex, then see how a standard *description* could be used by any tool
It's not clear to me what you mean by description. Package metadata, install information or description of what/how to build?
I hope you don't mean the latter, that would be insane...it would effectively amount to creating a build tool that's both more elegant and more powerful than any option that's currently already out there.
Assuming you mean the former, that's what David did to create Bento. Reading and understanding Bento and the design decisions going into it would be a better use of time than redoing a discussion, and would at least be a very good starting point.
What I mean is : what would it take to use Bento (or another tool) as the compiler in a distutils-based project, without having to change the distutils metadata.
I think there is a misunderstanding of what bento is: bento is not a compiler or anything like that. It is a set of libraries that work together to configure, build and install a python project. Concretely, in bento, there is
- a part that builds a package description (Distribution-like in distutils parlance) from a bento.info (a bit like setup.cfg)
- a set of tools and commands around this package description.
- a set of "backends" to e.g. use waf to build C extensions with full and automatic dependency analysis (rebuild this if this other thing is out of date), parallel builds and configuration. Bento scripts build numpy more efficiently and reliably while being 50 % shorter than our setup.py.
- a small library to build a distutils-compatible Distribution so that you can write a 3-line setup.py that takes all its info from bento.info and allows pip to work.
Now, you could produce a similar package description from the setup.cfg to be fed to bento, but I don't really see the point since AFAIK, bento.info is strictly more powerful as a format than setup.cfg. Another key point is that the commands around this package description are almost entirely decoupled from each other: this is the hard part, and something that is not really possible to do with the current distutils design in an incremental way.
- Commands don't know about each other and dependencies between commands are *external* to commands. You say command "build" depends on command "configure", and those dependencies are resolved at runtime. This allows 3rd parties to insert new commands without interfering with each other.
- Options are registered and handled outside commands as well: each command can query any other command's options. I believe something similar is now available in distutils2, though. Bento allows adding arbitrary configure options to customize library directories (a la autoconf).
- Bento internally has an explicit "database" of built files, with associated categories, and the build command produces a build "manifest". The build manifest + the build tree completely define the input for the install and installer commands. The different binary installers use the same build manifest, and the build manifest is actually designed to allow lossless conversion between different installers (e.g. wininst <-> msi, egg <-> mpkg on mac, etc.). This is what allows, in principle, using make, gyp, etc. to produce this build manifest
"It can deal with simple distutils-like builds using a bundled build tool" => If I understand this correctly, does that mean that Bento can build a distutils project with the distutils Metadata ?
I think Dag meant that bento has a system where you can basically do
    # setup.py
    from distutils.core import setup
    import bento.distutils
    bento.distutils.monkey_patch()
    setup()
and this setup.py will automatically build a distutils Distribution populated from bento.info. This allows a bento package to be installable with pip or anything that expects a setup.py. This allows for interoperability without having to depend on all the distutils issues. David
On 6/21/12 11:55 PM, David Cournapeau wrote:
I think there is a misunderstanding of what bento is: bento is not a compiler or anything like that. It is a set of libraries that work together to configure, build and install a python project.
Concretely, in bento, there is
- a part that builds a package description (Distribution-like in distutils parlance) from a bento.info (a bit like setup.cfg)
- a set of tools and commands around this package description.
- a set of "backends" to e.g. use waf to build C extensions with full and automatic dependency analysis (rebuild this if this other thing is out of date), parallel builds and configuration. Bento scripts build numpy more efficiently and reliably while being 50 % shorter than our setup.py.
- a small library to build a distutils-compatible Distribution so that you can write a 3-line setup.py that takes all its info from bento.info and allows pip to work.
Now, you could produce a similar package description from the setup.cfg to be fed to bento, but I don't really see the point since AFAIK, bento.info is strictly more powerful as a format than setup.cfg.
So that means that *today*, Bento can consume Distutils2 projects and compile them, just by reading their setup.cfg, right? And the code you have to convert setup.cfg into bento.info is what I was talking about. It means that I can create a project without a setup.py file, just a setup.cfg, and have it working with distutils2 *or* bento. That's *exactly* what I was talking about. The setup.cfg is the *common* standard, and is planned to be published at PyPI statically. Let people out there use the tool of their choice to install a project defined by a setup.cfg.
So 2 questions:
1/ does Bento install things following PEP 376 ?
2/ how do the setup.cfg hooks work wrt Bento ?
And one last proposal: how would a PEP that defines a setup.cfg standard that is Bento-friendly, but still distutils2-friendly, sound?
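On question 1/, PEP 376 specifies the on-disk installation record: each installed distribution gets a *.dist-info directory next to the code, roughly like this (names illustrative):

    site-packages/
        example/                  # the installed package
        example-0.1.dist-info/
            METADATA              # the distribution's metadata (PEP 345)
            RECORD                # installed files with hashes, used for uninstall
            INSTALLER             # name of the tool that performed the installation
            REQUESTED             # present only if the install was explicitly requested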
On 06/21/2012 11:04 PM, Tarek Ziadé wrote:
On 6/21/12 10:46 PM, Dag Sverre Seljebotn wrote: ...
I think we should, as you proposed, list a few projects w/ compilation needs -- from the simplest to the more complex, then see how a standard *description* could be used by any tool
It's not clear to me what you mean by description. Package metadata, install information or description of what/how to build?
I hope you don't mean the latter, that would be insane...it would effectively amount to creating a build tool that's both more elegant and more powerful than any option that's currently already out there.
Assuming you mean the former, that's what David did to create Bento. Reading and understanding Bento and the design decisions going into it would be a better use of time than redoing a discussion, and would at least be a very good starting point.
What I mean is : what would it take to use Bento (or another tool) as the compiler in a distutils-based project, without having to change the distutils metadata.
As for current distutils/setuptools/distribute metadata, the idea is you run the bento conversion utility to convert it to Bento metadata, then use Bento. Please read http://cournape.github.com/Bento/ There may be packages where this doesn't work and you'd need to tweak the results yourself though.
Here's the flip side: If you have zero knowledge about compilers, it's going to be almost impossible to have a meaningful discussion about a compilation PEP. It's very hard to discuss standards unless everybody involved has the necessary prerequisite knowledge. You don't go discussing details of the Linux kernel without some solid C experience either.
Consider me as the end user who wants to have his 2 C modules compiled in their Python project.
OK, so can I propose that you kill off distutils2 and use bento wholesale instead? Obviously not. So you're not just an end-user. That illusion would wear rather thin very quickly.
The necessary prerequisites in this case are not merely "knowledge of compilers". To avoid repeating mistakes of the past, the prerequisite for a meaningful discussion is years of hard-won experience building software in various languages, on different platforms, using different build tools.
Look, these problems are really hard to deal with. Myself I have experience with building 2-3 languages using 2-3 build tools on 2 platforms, and I consider myself a complete novice and usually decide to trust David's instincts over trying to make up an opinion of my own -- simply because I know he's got a lot more experience than I have.
Theoretically it is possible to separate and isolate concerns so that one set of people discuss build integration and another set of people discuss installation. Problem is that all the problems tangle -- in particular when the starting point is distutils!
That's why *sometimes*, not always, design by committee is the wrong approach, and one-man-shows is what brings technology forwards.
I am not saying this should be designed by a committee, but rather that if such a tool can be made compatible with simple Distutils projects, the guy behind this tool can probably help on a PEP with feedback from a larger audience than the sci community.
What bugs me is to say that we live in two separate worlds and cannot build common pieces. This is not True.
I'm not saying it's *impossible* to build common pieces, I'm suggesting that it's not cost-effective in terms of man-hours going into it. And the problem isn't technical as much as social and the mix of people and skill sets involved. But David really made that decision for me when he left distutils-sig, I'm not going to spend my own time and energy trying to get decent builds shoehorned into distutils2 when he is busy working on a solution. (David already spent loads of time on trying to integrate scons with distutils (the numscons project) and maintained numpy.distutils and scipy builds for years; I trust his judgement above pretty much anybody else's.)
So, I reiterate my proposal, and it could also be expressed like this:
1/ David writes a PEP where he describes how Bento interacts with a project -- metadata, description files, etc.
2/ Someone from distutils2 completes the PEP by describing how setup.cfg works wrt Extensions.
3/ We see if we can have a common standard, even if it's a subset of Bento's capabilities.
bento isn't a build tool, it's a packaging tool, competing directly with distutils2. It can deal with simple distutils-like builds using a bundled build tool, and currently has integration with waf for complicated builds; integration with other build systems will presumably be added later as people need it (the main point is that bento is designed for it).
I am not interested in Bento-the-tool. I am interested in what such a tool needs from a project to use it =>
Again, you should read the elevator pitch at http://cournape.github.com/Bento/ + the Bento documentation.
"It can deal with simple distutils-like builds using a bundled build tool" => If I understand this correctly, does that mean that Bento can build a distutils project with the distutils Metadata ?
Sorry, what I meant with "distutils-like builds" is "two simple C extensions", i.e. the trivial build case. Dag
On 06/22/2012 12:05 AM, Dag Sverre Seljebotn wrote:
On 06/21/2012 11:04 PM, Tarek Ziadé wrote:
On 6/21/12 10:46 PM, Dag Sverre Seljebotn wrote: ...
I think we should, as you proposed, list a few projects w/ compilation needs -- from the simplest to the more complex, then see how a standard *description* could be used by any tool
It's not clear to me what you mean by description. Package metadata, install information or description of what/how to build?
I hope you don't mean the latter, that would be insane...it would effectively amount to creating a build tool that's both more elegant and more powerful than any option that's currently already out there.
Assuming you mean the former, that's what David did to create Bento. Reading and understanding Bento and the design decisions going into it would be a better use of time than redoing a discussion, and would at least be a very good starting point.
What I mean is : what would it take to use Bento (or another tool) as the compiler in a distutils-based project, without having to change the distutils metadata.
As for current distutils/setuptools/distribute metadata, the idea is you run the bento conversion utility to convert it to Bento metadata, then use Bento.
Please read
http://cournape.github.com/Bento/
There may be packages where this doesn't work and you'd need to tweak the results yourself though.
Here's the flip side: If you have zero knowledge about compilers, it's going to be almost impossible to have a meaningful discussion about a compilation PEP. It's very hard to discuss standards unless everybody involved has the necessary prerequisite knowledge. You don't go discussing details of the Linux kernel without some solid C experience either.
Consider me as the end user that wants to have his 2 C modules compiled in his Python project.
OK, so can I propose that you kill off distutils2 and use bento wholesale instead?
Obviously not. So you're not just an end-user. That illusion would wear rather thin very quickly.
I regret this comment, it's not helpful to the discussion. Trying again: David's numscons project was a large effort and it tried to integrate a proper build system (scons) with distutils. That effort didn't in the end go anywhere. But I think it did show that everything is coupled to everything, and that build system integration (and other "special" needs of the scipy community) affects everything in the package system. It's definitely not as simple as having somebody with compiler experience chime in on the isolated topic of how to build extensions. It's something that needs to drive the entire design process. Which is perhaps why it is difficult to have a package system designed by people who don't know compilers to be usable by people who need to use them in non-trivial ways. Dag
On Thu, 21 Jun 2012 22:46:58 +0200 Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
The other thing is, the folks in distutils2 and myself, have zero knowledge about compilers. That's why we got very frustrated not to see people with that knowledge come and help us in this area.
Here's the flip side: If you have zero knowledge about compilers, it's going to be almost impossible to have a meaningful discussion about a compilation PEP.
If a PEP is being discussed, even a packaging PEP, it involves all of python-dev, so Tarek and Éric not being knowledgeable in compilers is not a big problem.
The necessary prerequisites in this case are not merely "knowledge of compilers". To avoid repeating mistakes of the past, the prerequisite for a meaningful discussion is years of hard-won experience building software in various languages, on different platforms, using different build tools.
This is precisely the kind of knowledge that a PEP is aimed at distilling. Regards Antoine.
On Thu, Jun 21, 2012 at 11:00 PM, Antoine Pitrou <solipsis@pitrou.net>wrote:
On Thu, 21 Jun 2012 22:46:58 +0200 Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
The other thing is, the folks in distutils2 and myself, have zero knowledge about compilers. That's why we got very frustrated not to see people with that knowledge come and help us in this area.
Here's the flip side: If you have zero knowledge about compilers, it's going to be almost impossible to have a meaningful discussion about a compilation PEP.
If a PEP is being discussed, even a packaging PEP, it involves all of python-dev, so Tarek and Éric not being knowledgeable in compilers is not a big problem.
The necessary prerequisites in this case are not merely "knowledge of compilers". To avoid repeating mistakes of the past, the prerequisite for a meaningful discussion is years of hard-won experience building software in various languages, on different platforms, using different build tools.
This is precisely the kind of knowledge that a PEP is aimed at distilling.
What would you imagine such a PEP would contain ? If you don't need to customize the compilation, then I would say refactoring what's in distutils is good enough. If you need customization, then I am convinced one should just use one of the existing build tools (waf, fbuild, scons, etc…). Python has more than enough of them already. By refactoring, I mean extracting it completely from command, and have an API similar to e.g. fbuild ( https://github.com/felix-lang/fbuild/blob/master/examples/c/fbuildroot.py), i.e. you basically have a class PythonBuilder.build_extension(name, sources, options). The key point is to remove any dependency on commands. If fbuild were not python3-specific, I would say just use that. It would cover most use cases. Actually,
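For concreteness, here is a rough, hypothetical sketch of what such a command-free compiler API could look like -- the class and method names are made up for illustration, and it merely wraps the existing distutils CCompiler rather than replacing it:

import sysconfig
from distutils.ccompiler import new_compiler

class PythonBuilder:
    """Builds a C extension from a list of sources, with no Command objects."""

    def __init__(self, build_dir="build"):
        self.build_dir = build_dir
        self.compiler = new_compiler()
        self.include_dirs = [sysconfig.get_path("include")]

    def build_extension(self, name, sources, options=None):
        options = options or {}
        objects = self.compiler.compile(
            sources,
            output_dir=self.build_dir,
            include_dirs=self.include_dirs + options.get("include_dirs", []),
        )
        ext_filename = name + (sysconfig.get_config_var("EXT_SUFFIX") or ".so")
        self.compiler.link_shared_object(
            objects, ext_filename, output_dir=self.build_dir)
        return ext_filename

# Usage would then be a single call, with no setup() and no command classes:
# PythonBuilder().build_extension("_speedups", ["src/_speedups.c"])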
Hi, On 6/21/12 7:56 AM, Tarek Ziadé wrote:
On 6/21/12 11:08 AM, Dag Sverre Seljebotn wrote:
... David Cournapeau's Bento project takes the opposite approach, everything is explicit and without any magic.
http://cournape.github.com/Bento/
It had its 0.1.0 release a week ago.
Please, I don't want to reopen any discussions about Bento here -- distutils2 vs. Bento discussions have been less than constructive in the past -- I just wanted to make sure everybody is aware that distutils2 isn't the only horse in this race. I don't know if there are others too?
That's *exactly* the kind of approach that has made me not want to continue.
People are too focused on implementations, and 'how distutils sucks' 'how setuptools sucks' etc 'I'll do better' etc
Instead of having all the folks involved in packaging sit down together and try to fix the issues together by building PEPs describing what would be a common set of standards, they want to create their own tools from scratch.
That will not work.
But you can't tell someone or some group of folks that, and expect them to listen. Most times NIH is pejorative[1], but sometimes something positive comes out of it.
And I will say here again what I think we should do imho:
1/ take all the packaging PEPs and rework them until everyone is happy (compilation sucks in distutils ? write a PEP !!!)
2/ once we have a consensus, write as many tools as you want, if they rely on the same standards => interoperability => win.
But I must be naive, because every time I tried to reach people that were building their own tools to ask them to work with us on the PEPs, all I was getting was "distutils sucks!"
And that's the best you can do: give your opinion. I understand the frustration, but we have to let people succeed and/or fail on their own[2].
It worked with the OS packagers guys though: we have built a great data files management system in packaging + the versions (PEP 386)
Are you referring to "the" packaging/distutils2 or something else?
Alex
[1] http://en.wikipedia.org/wiki/Not_invented_here
[2] http://docs.pythonpackages.com/en/latest/advanced.html#buildout-easy-install...
-- Alex Clark · http://pythonpackages.com
On Wed, Jun 20, 2012 at 11:57 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Right - clearly enumerating the features that draw people to use setuptools over just using distutils should be a key element in any PEP for 3.4
I honestly think a big part of why packaging ended up being incomplete for 3.3 is that we still don't have a clearly documented answer to two critical questions: 1. Why do people choose setuptools over distutils?
Some of the reasons:
* Dependencies
* Namespace packages
* Less boilerplate in setup.py (revision control, data files support, find_packages(), etc.)
* Entry points system for creating extensible applications and frameworks that need runtime plugin discovery
* Command-line script wrappers
* Binary plugin installation system for apps (i.e. dump eggs in a directory and let pkg_resources figure out what to put on sys.path)
* "Test" command
* Easy distribution of (and runtime access to) static data resources
Of these, automatic dependency resolution with as close to 100% backward compatibility for installing other projects on PyPI was almost certainly the #1 factor driving setuptools' initial adoption. The 20% that drives the 80%, as it were. The rest are the 80% that brings in the remaining 20%.
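For readers less familiar with setuptools, a minimal, made-up setup.py exercising several of the features in that list might look roughly like this:

from setuptools import setup, find_packages

setup(
    name="example",
    version="0.1",
    packages=find_packages(),                   # less boilerplate: auto-discovery
    install_requires=["requests>=1.0"],         # dependency declaration/resolution
    extras_require={"couchdb": ["couchdb"]},    # optional dependencies ("extras")
    package_data={"example": ["data/*.json"]},  # static data resources
    entry_points={
        # command-line script wrappers generated at install time
        "console_scripts": ["example = example.cli:main"],
        # runtime plugin discovery for extensible applications
        "example.plugins": ["default = example.plugins:DefaultPlugin"],
    },
    test_suite="example.tests",                 # "python setup.py test"
)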
2. What's wrong with setuptools that meant the idea of including it directly in the stdlib was ultimately dropped and eventually replaced with the goal of incorporating distutils2?
Based on the feedback from Python-Dev, I withdrew setuptools from 2.5 because of what I considered valid concerns raised regarding:
1. Lack of available persons besides myself familiar with the code base and design
2. Lack of design documents to remedy #1
3. Lack of unified end-user documentation
And there was no time for me to fix all of that before 2.5 came out, although I did throw together the EggFormats documentation. After that, the time window where I was being paid (by OSAF) for setuptools improvements came to an end, and other projects started taking precedence. Since then, setuptools *itself* has become stable legacy code in much the same way that the distutils has: pip, buildout, and virtualenv all built on top of it, as it built on top of the distutils. Problem #3 remains, but at least now there are other people working on the codebase.
If the end goal is "the bulk of the setuptools feature set without the problematic features and default behaviours that make system administrators break out the torches and pitchforks", then we should *write that down* (and spell out the implications) rather than assuming that everyone knows the purpose of the exercise.
That's why I brought this up. ISTM that far too much of the knowledge of what those use cases and implications are, has been either buried in my head or spread out among diverse user communities in the past. Luckily, a lot of people from those communities are now getting considerably more involved in this effort.
At the time of, say, the 2.5 setuptools question, there wasn't anybody around but me who was able to argue the "why eggs are good and useful" side of the discussion, for example. (If you look back to the early days of setuptools, I often asked on distutils-sig for people who could help assemble specs for various things... which I ended up just deciding for myself, because nobody was there to comment on them. It took *years* of setuptools actually being in the field and used before enough people knew enough to *want* to take part in the design discussions. The versioning and metadata PEPs were things I asked about many years prior, but nobody knew what they wanted yet, or even knew yet why they should care.)
Similarly, in the years since then, MvL -- who originally argued against all things setuptools at 2.5 time -- actually proposed the original namespace package PEP. So I don't think it's unfair to say that, seven years ago, the ideas in setuptools were still a few years ahead of their "time". Today, console script generation, virtual environments, namespace packages, entry point discovery, setup.py-driven testing tools, static file inclusion, etc. are closer to "of course we should have that/everybody uses that" features, rather than esoteric oddities.
That being said, setuptools *itself* is not such a good thing. It was originally a *private* add-on to distutils (like numpy's distutils extensions) and a prototyping sandbox for additions to the distutils. (E.g. setuptools features were added to distutils in 2.4 and 2.5.) I honestly didn't think at the time that I was writing those features (or even the egg stuff), that the *long term* goal would be for those things to be maintained in a separate package. Instead, I (rather optimistically) assumed that the value of the approaches would be self-evident, and copied the way the other setuptools features were. (To this day, there are an odd variety of other little experimental "future distutils enhancements" still living in the setuptools code base, like support for building shared libraries to be used in common between multiple C extensions.)
By the way, for an overview of setuptools' components and use cases, and what happened with 2.5, see here: http://mail.python.org/pipermail/python-dev/2006-April/064145.html
The plan I proposed was to phase out setuptools and merge its functionality into distutils for 2.6, but as I mentioned above, my available bandwidth to work on the project essentially vanished shortly after the above post; setuptools was pretty much "good enough" for OSAF's needs at the time, and they had other development priorities for my time.
So, if we are to draw any lesson from the past, it would seem to be, "make sure that the people who'll be doing the work are actually going to be available through to the next Python version". After all, if they are not, it may not much matter whether the code is in the stdlib or not. ;-)
On Thu, Jun 21, 2012 at 11:31 PM, PJ Eby <pje@telecommunity.com> wrote:
So, if we are to draw any lesson from the past, it would seem to be, "make sure that the people who'll be doing the work are actually going to be available through to the next Python version".
Thanks for that write-up - I learned quite a few things I didn't know, even though I was actually around for 2.5 development (the fact I had less of a vested interest in packaging issues then probably made a big difference, too).
After all, if they are not, it may not much matter whether the code is in the stdlib or not. ;-)
Yeah, I think Tarek had the right idea with working through the slow painful process of reaching consensus from the bottom up, feature by feature - we just got impatient and tried to skip to the end without working through the rest of the list. It's worth reflecting on the progress we've made so far, and looking ahead to see what else remains.
In the standard library for 3.3:
- native namespace packages (PEP 420)
- native venv support (PEP 405)
Packaging tool interoperability standards as Accepted PEPs (may still require further tweaks):
- updated PyPI metadata standard (PEP 345)
- PyPI enforced orderable dist versioning standard (PEP 386)
- common dist installation database format (PEP 376)
As I noted earlier in the thread, it would be good to see the components of distutils2/packaging aimed at this interoperability level split out as a separate utility library that can more easily be shared between projects (distmeta was my suggested name for such a PyPI project).
Other components where python-dev has a role to play as an interoperability clearing house:
- improved command and compiler extension API
Other components where python-dev has a role to play in smoothing the entry of beginners into the Python ecosystem:
- a package installer shipped with Python to reduce bootstrapping issues
- a pypi client for the standard library
- dependency graph builder
- reduced boilerplate in package definition (setup.cfg should help there)
Other components where standard library inclusion is a "nice-to-have" but not critical:
- most of the other convenience features in setuptools
Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Jun 21, 2012 at 12:57 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Standard assumptions about the behaviour of site and distutils cease to be valid once setuptools is installed
…
- advocacy for the "egg" format and the associated sys.path changes that result for all Python programs running on a system … System administrators (and developers that think like system administrators when it comes to configuration management) *hate* what setuptools (and setuptools based installers) can do to their systems.
I have extensive experience with this, including quite a few bug reports and a few patches in setuptools and distribute, plus maintaining my own fork of setuptools to build and deploy my own projects, plus interviewing quite a few Python developers about why they hated setuptools, plus supporting one of them who hates setuptools even though he and I use it in a build system (https://tahoe-lafs.org).
I believe that 80% to 90% of the hatred alluded to above is due to a single issue: the fact that setuptools causes your Python interpreter to disrespect the PYTHONPATH, in violation of the documentation in http://docs.python.org/release/2.7.2/install/index.html#inst-search-path , which says:
"""
The PYTHONPATH variable can be set to a list of paths that will be added to the beginning of sys.path. For example, if PYTHONPATH is set to /www/python:/opt/py, the search path will begin with ['/www/python', '/opt/py']. (Note that directories must exist in order to be added to sys.path; the site module removes paths that don’t exist.)
"""
Fortunately, this issue is fixable! I opened a bug report, and I and others have provided patches that make setuptools stop doing this behavior. This makes the above documentation true again. The negative impact on features or backwards-compatibility doesn't seem to be great.
http://bugs.python.org/setuptools/issue53
Philip J. Eby provisionally approved of one of the patches, except for some specific requirement that I didn't really understand how to fix and that now I don't exactly remember:
http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html
Regards, Zooko
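As a quick way to see whether that documented behaviour actually holds on a given installation, a small diagnostic like the following (my own sketch, not from the bug report) can be run under the interpreter in question:

import os
import sys

# Print where each PYTHONPATH entry actually ended up on sys.path; with the
# behaviour the documentation promises, these should be at the front, ahead
# of site-packages (and ahead of any easy-install.pth additions).
for entry in filter(None, os.environ.get("PYTHONPATH", "").split(os.pathsep)):
    try:
        print("%s -> sys.path position %d" % (entry, sys.path.index(entry)))
    except ValueError:
        print("%s -> not on sys.path (directory may not exist)" % entry)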
On Thu, 21 Jun 2012 12:02:58 -0300 "Zooko Wilcox-O'Hearn" <zooko@zooko.com> wrote:
Fortunately, this issue is fixable! I opened a bug report and I and a others have provided patches that makes setuptools stop doing this behavior. This makes the above documentation true again. The negative impact on features or backwards-compatibility doesn't seem to be great.
http://bugs.python.org/setuptools/issue53
Philip J. Eby provisionally approved of one of the patches, except for some specific requirement that I didn't really understand how to fix and that now I don't exactly remember:
http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html
These days, I think you should really target distribute, not setuptools. Regards Antoine.
On Jun 21, 2012 11:02 AM, "Zooko Wilcox-O'Hearn" <zooko@zooko.com> wrote:
Philip J. Eby provisionally approved of one of the patches, except for some specific requirement that I didn't really understand how to fix and that now I don't exactly remember:
http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html
I don't remember either; I just reviewed the patch and discussion, and I'm not finding what the holdup was, exactly. Looking at it now, it looks to me like a good idea... oh wait, *now* I remember the problem, or at least, what needs reviewing. Basically, the challenge is that it doesn't allow an .egg in a PYTHONPATH directory to take precedence over that *specific* PYTHONPATH directory. With the perspective of hindsight, this was purely a transitional concern, since it only *really* mattered for site-packages; anyplace else you could just delete the legacy package if it was a problem. (And your patch works fine for that case.) However, for setuptools as it was when you proposed this, it was a potential backwards-compatibility problem. My best guess is that I was considering the approach for 0.7... which never got any serious development time. (It may be too late to fix the issue, in more than one sense. Even if the problem ceased to be a problem today, nobody's going to re-evaluate their position on setuptools, especially if their position wasn't even based on a personal experience with the issue.)
On 06/21/2012 11:37 AM, PJ Eby wrote:
On Jun 21, 2012 11:02 AM, "Zooko Wilcox-O'Hearn" <zooko@zooko.com> wrote:
Philip J. Eby provisionally approved of one of the patches, except for some specific requirement that I didn't really understand how to fix and that now I don't exactly remember:
http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html
I don't remember either; I just reviewed the patch and discussion, and I'm not finding what the holdup was, exactly. Looking at it now, it looks to me like a good idea... oh wait, *now* I remember the problem, or at least, what needs reviewing.
Basically, the challenge is that it doesn't allow an .egg in a PYTHONPATH directory to take precedence over that *specific* PYTHONPATH directory.
With the perspective of hindsight, this was purely a transitional concern, since it only *really* mattered for site-packages; anyplace else you could just delete the legacy package if it was a problem. (And your patch works fine for that case.)
However, for setuptools as it was when you proposed this, it was a potential backwards-compatibility problem. My best guess is that I was considering the approach for 0.7... which never got any serious development time.
(It may be too late to fix the issue, in more than one sense. Even if the problem ceased to be a problem today, nobody's going to re-evaluate their position on setuptools, especially if their position wasn't even based on a personal experience with the issue.)
A minor backwards incompat here to fix that issue would be appropriate, if only to be able to say "hey, that issue no longer exists" to folks who condemn the entire ecosystem based on that bug. At least, that is, if there will be another release of setuptools. Is that likely? - C
On 6/21/12 5:50 PM, Chris McDonough wrote:
A minor backwards incompat here to fix that issue would be appropriate, if only to be able to say "hey, that issue no longer exists" to folks who condemn the entire ecosystem based on that bug. At least, that is, if there will be another release of setuptools. Is that likely?
Or simply do that fix in distribute, since it's Python 3 compatible -- and have setuptools officially discontinued for the sake of clarity.
On Thu, Jun 21, 2012 at 11:50 AM, Chris McDonough <chrism@plope.com> wrote:
On 06/21/2012 11:37 AM, PJ Eby wrote:
On Jun 21, 2012 11:02 AM, "Zooko Wilcox-O'Hearn" <zooko@zooko.com> wrote:
Philip J. Eby provisionally approved of one of the patches, except for some specific requirement that I didn't really understand how to fix and that now I don't exactly remember:
http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html
I don't remember either; I just reviewed the patch and discussion, and I'm not finding what the holdup was, exactly. Looking at it now, it looks to me like a good idea... oh wait, *now* I remember the problem, or at least, what needs reviewing.
Basically, the challenge is that it doesn't allow an .egg in a PYTHONPATH directory to take precedence over that *specific* PYTHONPATH directory.
With the perspective of hindsight, this was purely a transitional concern, since it only *really* mattered for site-packages; anyplace else you could just delete the legacy package if it was a problem. (And your patch works fine for that case.)
However, for setuptools as it was when you proposed this, it was a potential backwards-compatibility problem. My best guess is that I was considering the approach for 0.7... which never got any serious development time.
(It may be too late to fix the issue, in more than one sense. Even if the problem ceased to be a problem today, nobody's going to re-evaluate their position on setuptools, especially if their position wasn't even based on a personal experience with the issue.)
A minor backwards incompat here to fix that issue would be appropriate, if only to be able to say "hey, that issue no longer exists" to folks who condemn the entire ecosystem based on that bug. At least, that is, if there will be another release of setuptools. Is that likely?
Yes. At the very least, there will be updated development snapshots (which are what buildout uses anyway). (Official releases are in a bit of a weird holding pattern. distribute's versioning scheme leads to potential confusion: if I release e.g. 0.6.1, then it sounds like it's a lesser version than whatever distribute is up to now. OTOH, releasing a later version number than distribute implies that I'm supporting their feature enhancements, and I really don't want to add new features to 0.6... but don't have time right now to clean up all the stuff I started in the 0.7 line either, since I've been *hoping* that the work on packaging would make 0.7 unnecessary. And let's not even get started on the part where system-installed copies of distribute can prevent people from downloading or installing setuptools in the first place.) Anyway, changing this in a snapshot release shouldn't be a big concern; the main user of snapshots is buildout, and buildout doesn't use .pth files anyway, it just writes scripts that do sys.path manipulation. (A better approach, for everything except having stuff importable from the standard interpreter.) Of course, the flip side is that it means there won't be many people testing the fix.
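For anyone who hasn't seen one, the script-generation approach looks roughly like this (an illustrative sketch with made-up egg paths, not actual buildout output):

import sys

# The generated script pins the exact eggs it needs at the front of sys.path
# itself, instead of relying on a .pth file in site-packages.
sys.path[0:0] = [
    "/opt/app/eggs/somepackage-1.0-py2.7.egg",
    "/opt/app/eggs/somedependency-2.3-py2.7.egg",
]

print(sys.path[:3])  # the explicit entries now take precedence for imports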
On 06/21/2012 12:26 PM, PJ Eby wrote:
On Thu, Jun 21, 2012 at 11:50 AM, Chris McDonough <chrism@plope.com> wrote:
On 06/21/2012 11:37 AM, PJ Eby wrote:
On Jun 21, 2012 11:02 AM, "Zooko Wilcox-O'Hearn" <zooko@zooko.com> wrote:
> Philip J. Eby provisionally approved of one of the patches, except for some specific requirement that I didn't really understand how to fix and that now I don't exactly remember:
> http://mail.python.org/pipermail/distutils-sig/2009-January/010880.html
I don't remember either; I just reviewed the patch and discussion, and I'm not finding what the holdup was, exactly. Looking at it now, it looks to me like a good idea... oh wait, *now* I remember the problem, or at least, what needs reviewing.
Basically, the challenge is that it doesn't allow an .egg in a PYTHONPATH directory to take precedence over that *specific* PYTHONPATH directory.
With the perspective of hindsight, this was purely a transitional concern, since it only *really* mattered for site-packages; anyplace else you could just delete the legacy package if it was a problem. (And your patch works fine for that case.)
However, for setuptools as it was when you proposed this, it was a potential backwards-compatibility problem. My best guess is that I was considering the approach for 0.7... which never got any serious development time.
(It may be too late to fix the issue, in more than one sense. Even if the problem ceased to be a problem today, nobody's going to re-evaluate their position on setuptools, especially if their position wasn't even based on a personal experience with the issue.)
A minor backwards incompat here to fix that issue would be appropriate, if only to be able to say "hey, that issue no longer exists" to folks who condemn the entire ecosystem based on that bug. At least, that is, if there will be another release of setuptools. Is that likely?
Yes. At the very least, there will be updated development snapshots (which are what buildout uses anyway).
(Official releases are in a bit of a weird holding pattern. distribute's versioning scheme leads to potential confusion: if I release e.g. 0.6.1, then it sounds like it's a lesser version than whatever distribute is up to now. OTOH, releasing a later version number than distribute implies that I'm supporting their feature enhancements, and I really don't want to add new features to 0.6... but don't have time right now to clean up all the stuff I started in the 0.7 line either, since I've been *hoping* that the work on packaging would make 0.7 unnecessary. And let's not even get started on the part where system-installed copies of distribute can prevent people from downloading or installing setuptools in the first place.)
Welp, I don't want to get in the middle of that whole mess. But maybe the distribute folks would be kind enough to do a major version bump in their next release; e.g. 1.67 instead of 0.67. That said, I don't think anyone would be confused by overlapping version numbers between the two projects. It's known that they have been diverging for a while. - C
On 6/21/12 6:44 PM, Chris McDonough wrote:
Yes. At the very least, there will be updated development snapshots (which are what buildout uses anyway).
(Official releases are in a bit of a weird holding pattern. distribute's versioning scheme leads to potential confusion: if I release e.g. 0.6.1, then it sounds like it's a lesser version than whatever distribute is up to now. OTOH, releasing a later version number than distribute implies that I'm supporting their feature enhancements, and I really don't want to add new features to 0.6... but don't have time right now to clean up all the stuff I started in the 0.7 line either, since I've been *hoping* that the work on packaging would make 0.7 unnecessary. And let's not even get started on the part where system-installed copies of distribute can prevent people from downloading or installing setuptools in the first place.)
Welp, I don't want to get in the middle of that whole mess. But maybe the distribute folks would be kind enough to do a major version bump in their next release; e.g. 1.67 instead of 0.67. That said, I don't think anyone would be confused by overlapping version numbers between the two projects.
Oh yeah no problem, if Philip backports all the things we've done like Py3 compat, and blesses more people to maintain setuptools, we can even discontinue distribute! If not, I think you are just joking here -- we don't want to go back into the locked situation we've suffered for many years, where PJE is the only maintainer and then suddenly disappears for a year, telling us no one that is willing to maintain setuptools is able to do so (according to him).
It's known that they have been diverging for a while.
Yeah the biggest difference is Py3 compat, other than that afaik I don't think any API has been removed or modified.
In my opinion, distribute is the only project that should go forward since it's actively maintained and does not suffer from the bus factor.
Hi, On 6/21/12 1:20 PM, Tarek Ziadé wrote:
On 6/21/12 6:44 PM, Chris McDonough wrote:
Yes. At the very least, there will be updated development snapshots (which are what buildout uses anyway).
(Official releases are in a bit of a weird holding pattern. distribute's versioning scheme leads to potential confusion: if I release e.g. 0.6.1, then it sounds like it's a lesser version than whatever distribute is up to now. OTOH, releasing a later version number than distribute implies that I'm supporting their feature enhancements, and I really don't want to add new features to 0.6... but don't have time right now to clean up all the stuff I started in the 0.7 line either, since I've been *hoping* that the work on packaging would make 0.7 unnecessary. And let's not even get started on the part where system-installed copies of distribute can prevent people from downloading or installing setuptools in the first place.)
Welp, I don't want to get in the middle of that whole mess. But maybe the distribute folks would be kind enough to do a major version bump in their next release; e.g. 1.67 instead of 0.67. That said, I don't think anyone would be confused by overlapping version numbers between the two projects.
Oh yeah no problem, if Philip backports all the things we've done like Py3 compat, and bless more people to maintain setuptools, we can even discontinue distribute !
If not, I think you are just joking here -- we don't want to go back into the locked situation we've suffered for many years, where PJE is the only maintainer and then suddenly disappears for a year, telling us no one that is willing to maintain setuptools is able to do so (according to him).
It's known that they have been diverging for a while. Yeah the biggest difference is Py3 compat, other than that afaik I don't think any API has been removed or modified.
In my opinion, distribute is the only project that should go forward since it's actively maintained and does not suffer from the bus factor.
+1. I can't help but cringe when I read this (sorry, PJ Eby!): "Official releases are in a bit of a weird holding pattern." due to distribute. Weren't they in a bit of a weird holding pattern before distribute? Haven't they always been in a bit of a weird holding pattern? Let's let setuptools be setuptools and distribute be distribute i.e. as long as distribute exists, I don't care at all about setuptools' release schedule (c.f. PIL/Pillow) and I like it that way :-). If one day setuptools or packaging/distutils2 comes along and fixes everything, then distribute can cease to exist. Alex -- Alex Clark · http://pythonpackages.com
On Thu, Jun 21, 2012 at 1:20 PM, Tarek Ziadé <tarek@ziade.org> wrote:
telling us no one that is willing to maintain setuptools is able to do so. (according to him)
Perhaps there is some confusion or language barrier here: what I said at that time was that the only people who I already *knew* to be capable of taking on full responsibility for *continued development* of setuptools, were not available/interested in the job, to my knowledge. Specifically, the main people I had in mind were Ian Bicking and/or Jim Fulton, both of whom had developed extensions to or significant chunks of setuptools' functionality themselves, during which they demonstrated exemplary levels of understanding both of the code base and the wide variety of scenarios in which that code base had to operate. They also both demonstrated conservative, user-oriented design choices, that made me feel comfortable that they would not do anything to disrupt the existing user base, and that if they made any compatibility-breaking changes, they would do so in a way that avoided disruption. (I believe I also gave Philip Jenvey as an example of someone who, while not yet proven at that level, was someone I considered a good potential candidate as well.) This was not a commentary on anyone *else's* ability, only on my then-present *knowledge* of clearly-suitable persons and their availability, or lack thereof. I would guess that the pool of qualified persons is even larger now, but the point is moot: my issue was never about who would "maintain" setuptools, but who would *develop* it. And I expect that we would at this point agree that future *development* of setuptools is not something either of us are seeking. Rather, we should be seeking to develop tools that can properly supersede it. This is why I participated in Distutils-SIG discussion of the various packaging PEPs, and hope to see more of them there.
On 6/21/12 7:49 PM, PJ Eby wrote:
On Thu, Jun 21, 2012 at 1:20 PM, Tarek Ziadé <tarek@ziade.org> wrote:
telling us no one that is willing to maintain setuptools is able to do so. (according to him)
Perhaps there is some confusion or language barrier here: what I said at that time was that the only people who I already *knew* to be capable of taking on full responsibility for *continued development* of setuptools, were not available/interested in the job, to my knowledge.
Specifically, the main people I had in mind were Ian Bicking and/or Jim Fulton, both of whom had developed extensions to or significant chunks of setuptools' functionality themselves, during which they demonstrated exemplary levels of understanding both of the code base and the wide variety of scenarios in which that code base had to operate. They also both demonstrated conservative, user-oriented design choices, that made me feel comfortable that they would not do anything to disrupt the existing user base, and that if they made any compatibility-breaking changes, they would do so in a way that avoided disruption. (I believe I also gave Philip Jenvey as an example of someone who, while not yet proven at that level, was someone I considered a good potential candidate as well.)
This was not a commentary on anyone *else's* ability, only on my then-present *knowledge* of clearly-suitable persons and their availability, or lack thereof.
Yes, so I double-checked my sentence; I think we are in agreement: you would not let folks who *wanted* to maintain it back then do it. Sorry if this was not clear to you.
But let's forget about this, old story I guess.
I would guess that the pool of qualified persons is even larger now, but the point is moot: my issue was never about who would "maintain" setuptools, but who would *develop* it.
And I expect that we would at this point agree that future *development* of setuptools is not something either of us are seeking. Rather, we should be seeking to develop tools that can properly supersede it.
This is why I participated in Distutils-SIG discussion of the various packaging PEPs, and hope to see more of them there.
I definitely agree, and I think your feedback on the various PEPs was very important. My point is just that we could (and *should*), in my opinion, merge back setuptools and distribute, just to have a py3-enabled setuptools that is in maintenance mode, and work on the new stuff in packaging besides it. The merged setuptools/distribute project could also be the place where we start to do the work to be compatible with the new standards. That's my proposal. Tarek
On 06/21/2012 01:20 PM, Tarek Ziadé wrote:
On 6/21/12 6:44 PM, Chris McDonough wrote:
Yes. At the very least, there will be updated development snapshots (which are what buildout uses anyway).
(Official releases are in a bit of a weird holding pattern. distribute's versioning scheme leads to potential confusion: if I release e.g. 0.6.1, then it sounds like it's a lesser version than whatever distribute is up to now. OTOH, releasing a later version number than distribute implies that I'm supporting their feature enhancements, and I really don't want to add new features to 0.6... but don't have time right now to clean up all the stuff I started in the 0.7 line either, since I've been *hoping* that the work on packaging would make 0.7 unnecessary. And let's not even get started on the part where system-installed copies of distribute can prevent people from downloading or installing setuptools in the first place.)
Welp, I don't want to get in the middle of that whole mess. But maybe the distribute folks would be kind enough to do a major version bump in their next release; e.g. 1.67 instead of 0.67. That said, I don't think anyone would be confused by overlapping version numbers between the two projects.
Oh yeah no problem, if Philip backports all the things we've done like Py3 compat, and bless more people to maintain setuptools, we can even discontinue distribute !
If not, I think you are just joking here -- we don't want to go back into the locked situation we've suffered for many years, where PJE is the only maintainer and then suddenly disappears for a year, telling us no one that is willing to maintain setuptools is able to do so (according to him).
It's known that they have been diverging for a while. Yeah the biggest difference is Py3 compat, other than that afaik I don't think any API has been removed or modified.
In my opinion, distribute is the only project that should go forward since it's actively maintained and does not suffer from the bus factor.
I'm not too interested in the drama/history of the fork situation so I don't care whether setuptools has the fix or distribute has it or both have it, but being able to point at some package which doesn't prevent folks from overriding sys.path ordering using PYTHONPATH would be a good thing. - C
On 6/21/12 7:56 PM, Chris McDonough wrote:
...
Yeah the biggest difference is Py3 compat, other than that afaik I don't think any API has been removed or modified.
In my opinion, distribute is the only project that should go forward since it's actively maintained and does not suffer from the bus factor.
I'm not too interested in the drama/history of the fork situation
You are the one currently adding drama by asking for a new setuptools release and saying distribute is diverging.
so I don't care whether setuptools has the fix or distribute has it or both have it, but being able to point at some package which doesn't prevent folks from overriding sys.path ordering using PYTHONPATH would be a good thing.
It has to be in Distribute if we want it in most major Linux distros. And as I proposed to PJE I think the best thing would be to have a single project code base, working with Py3 and receiving maintenance fixes with several maintainers. Since it's clear we're not going to add feature in any of the projects, I think we can safely trust a larger list of maintainers, and just keep the project working until the replacement is used
- C
Can I take a step back and make a somewhat different point. Developer requirements are very relevant, sure. But the most important requirements are those of the end user. The person who simply wants to *use* a distribution, couldn't care less how it was built, whether it uses setuptools, or whatever. End users should not need packaging tools on their machines. At the moment, to install from source requires the tools the developer chooses to use for his convenience (distribute/setuptools, distutils2/packaging, bento) to be installed on the target machine. And binary installers are only normally available for code that needs a C extension, and in that case the developer's choice is still visible in terms of the binary format provided. I would argue that we should only put *end user* tools in the stdlib. - A unified package format, suitable for binaries, but also for pure Python code that wants to ship that way. - Installation management tools (download, install, remove, list, and dependency management) that handle the above package format - Maybe support in the package format and/or installation tools for managing "wrapper executables" for executable scripts in distributions Development tools like distutils2, distribute/setuptools, bento would *only* be needed on developer machines, and would be purely developer choice. They would all interact with end users via the stdlib-supported standard formats. They could live outside the stdlib, and developers could use whichever tool suited them. This is a radical idea in that it does not cater for the "zipped up development directory as a distribution format" mental model that current Python uses. That model could still work, but only if all the tools generated a stdlib-supported build definition (which could simply be a Python script that runs the various compile/copy commands, plus some compiler support classes in the stdlib) in the same way that lex/yacc generate C, and projects often distribute the generated C along with the grammar files. Legacy support in the form of distutils, converters from bdist_xxx formats to the new binary format, and maybe pip-style "hide the madness under a unified interface" tools could support this, either in the stdlib or as 3rd party tools. I realise this is probably too radical to happen, but at least, it might put the debate into context if people try to remember that end users, as well as package developers, are affected by this (and there are a lot more end users than package developers...). Paul. PS I know that setuptools includes some end-user aspects - multi-versioning, entry points and optional dependencies, for example. Maybe these are needed - personally, I have never had a need for any of these, so I'm not the best person to comment.
On Thu, Jun 21, 2012 at 4:01 PM, Paul Moore <p.f.moore@gmail.com> wrote:
End users should not need packaging tools on their machines.
Well, unless they're developers. ;-) Sometimes, the "end user" is a developer making use of a library. Development tools like distutils2, distribute/setuptools, bento would
*only* be needed on developer machines, and would be purely developer choice. They would all interact with end users via the stdlib-supported standard formats. They could live outside the stdlib, and developers could use whichever tool suited them.
AFAIK, this was the goal behind setup.cfg in packaging, and it's a goal I agree with.
This is a radical idea in that it does not cater for the "zipped up development directory as a distribution format" mental model that current Python uses. That model could still work, but only if all the tools generated a stdlib-supported build definition
Again, packaging's setup.cfg is, or should be, this. I think there are some technical challenges with the current state of setup.cfg, but AFAIK they aren't anything insurmountable. (Background: the general idea is that setup.cfg contains "hooks", which name Python callables to be invoked at various stages of the process. These hooks can dynamically add to the setup.cfg data, e.g. to list newly-built files, binaries, etc., as well as to do any actual building.)
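A rough sketch of what one of those hooks could look like as a plain Python callable -- the hook signature and the setup.cfg option that would name it are assumptions here, not a statement of packaging's settled API:

# Named from setup.cfg, e.g. via something like:
#   [global]
#   setup_hooks = example.hooks.add_built_files
# (the exact option name is assumed; check the packaging docs for the real one)

def add_built_files(config):
    # 'config' is assumed to be the parsed setup.cfg content as a dict of
    # sections; a hook may amend it, e.g. to list files generated at build time.
    files = config.setdefault("files", {})
    extra = files.get("extra_files", "")
    files["extra_files"] = (extra + "\nexample/_version.py").strip()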
PS I know that setuptools includes some end-user aspects - multi-versioning, entry points and optional dependencies, for example. Maybe these are needed - personally, I have never had a need for any of these, so I'm not the best person to comment.
Entry points are a developer tool, and cross-project co-ordination facility. They allow packages to advertise classes, modules, functions, etc. that other projects may wish to import and use in a programmatic way. For example, a web framework may say, "if you want to provide a page template file format, register an entry point under this naming convention, and we will automatically use it when a template has a matching file extension."
So entry points are not really consumed by end users; libraries and frameworks use them as ways to dynamically co-ordinate with other installed libraries, plugins, etc.
Optional dependencies ("extras"), OTOH, are for end-user convenience: they allow an author to suggest configurations that might be of interest. Without them, people have to do things like this: http://pypi.python.org/pypi/celery-with-couchdb in order to advertise what else should be installed. If Celery were instead to list its couchdb and SQLAlchemy requirements as "extras" in setup.py, then one could "easy_install celery[couchdb]" or "easy_install celery[sqla]" instead of needing to register separate project names on PyPI for each of these scenarios.
As it happens, however, two of the most popular setuptools add-ons (pip and buildout) either did not or still do not support "extras", because they were not frequently used. Unfortunately, this meant that projects had to do things like setup dummy projects on PyPI, because the popular tools didn't support the scenario.
In short, nobody's likely to mourn the passing of extras to any great degree. They're a nice idea, but hard to bootstrap into use due to the chicken-and-egg problem. If you don't know what they're for, you won't use them, and without common naming conventions (like mypackage[c_speedups] or mypackage[test_support]), nobody will get used to asking for them. I think at some point we will end up reinventing them, but essentially the challenge is that they are a generalized solution to a variety of small problems that are not individually very motivating to anybody. They were only motivating to me in the aggregate because I saw lots of individual people being bothered by their particular variation on the theme of auxiliary dependencies or recommended options.
As for multi-versioning, it's pretty clearly a dead duck, a proof-of-concept that was very quickly obsoleted by buildout and virtualenv. Buildout is a better implementation of multi-versioning for actual scripts, and virtualenvs work fine for people who haven't yet discovered the joys of buildout. (I'm a recent buildout convert, in case you can't tell. ;-) )
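To illustrate the entry-point pattern described above (the group and project names are invented for the example; the pkg_resources calls are the real API):

import pkg_resources

def load_template_engines():
    # Collect template engines advertised by any installed project.
    engines = {}
    for ep in pkg_resources.iter_entry_points("myframework.template_engines"):
        engines[ep.name] = ep.load()  # imports the advertised class/callable
    return engines

# A plugin project would advertise itself in its own setup.py with e.g.:
#   entry_points={"myframework.template_engines":
#                 ["jinja2 = myplugin.engines:Jinja2Engine"]}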
On Thursday, June 21, 2012 at 4:01 PM, Paul Moore wrote:
End users should not need packaging tools on their machines.
Sort of riffing on this idea, I cannot seem to find a specification for what a Python package actually is. Maybe the first effort should focus on this instead of arguing one implementation or another.
As a packager: I should not (in general) care what tool (pip, pysetup, easy_install, buildout, whatever) is used to install my package. My package should just describe what to do to install itself.
As an end user: I should not (in general) care what tool was used to create a package (setuptools, bento, distutils, whatever). My tool of choice should look at the package and perform the operations that the package says are needed for install.
Ideally the package could have some basic primitives that are enough to tell the package installer tool what to do to install it. These primitives should be enough to cover the common cases (pure Python modules at the very least, maybe additionally some C modules). Now, as others have remarked, it would be insane to attempt to do this in every case, as it would involve writing a build system that is more advanced than anything else existing, so a required primitive would be something that allows calling out to a specific package-decided build system (waf, make, whatever) to handle the build configuration.
The eventual end goal here being to take a package from something that varies from implementation to implementation to a standardized format that any number of tools can build on top of. It would likely include some things defining where metadata MUST be defined.
For instance, if metadata in setuptools was "compiled" down to a static file, and easy_install, pip et al. used that static file to install from instead of executing setup.py, then the end user would not have required setuptools installed, and instead any number of tools could have been created that utilized that data.
Hi, On 6/21/12 5:38 PM, Donald Stufft wrote:
On Thursday, June 21, 2012 at 4:01 PM, Paul Moore wrote:
End users should not need packaging tools on their machines.
Sort of riffing on this idea, I cannot seem to find a specification for what a Python package actually is.
FWIW according to distutils[1], a package is: a module or modules inside another module[2]. So e.g.::
foo.py
is a module
and:
foo/__init__.py foo/foo.py
is a simple package containing the following modules:
import foo, foo.foo
Alex
[1] http://docs.python.org/distutils/introduction.html#general-python-terminolog...
[2] And a distribution is a compressed archive of a package, in case that's not clear.
Maybe the first effort should focus on this instead of arguing one implementation or another.
As a packager: I should not (in general) care what tool (pip, pysetup, easy_install, buildout, whatever) is used to install my package; my package should just describe what to do to install itself.
As an end user: I should not (in general) care what tool was used to create a package (setuptools, bento, distutils, whatever). My tool of choice should look at the package and perform the operations that the package says are needed for install.
Ideally the package could have some basic primitives that are enough to tell the package installer tool what to do to install it; these primitives should be enough to cover the common cases (pure python modules at the very least, maybe additionally some C modules). Now as others have remarked it would be insane to attempt to do this in every case, as it would involve writing a build system that is more advanced than anything else existing, so a required primitive would be something that allows calling out to a package-specified build system (waf, make, whatever) to handle the build configuration.
The eventual goal here is to turn a package from something that varies from implementation to implementation into a standardized format that any number of tools can build on top of. It would likely include some things defining where metadata MUST be defined.
For instance, if metadata in setuptools was "compiled" down to a static file, and easy_install, pip et al. used that static file to install from instead of executing setup.py, then the end user would not need setuptools installed, and instead any number of tools could have been created that utilized that data.
-- Alex Clark · http://pythonpackages.com
On Thursday, June 21, 2012 at 7:34 PM, Alex Clark wrote:
Hi,
On 6/21/12 5:38 PM, Donald Stufft wrote:
On Thursday, June 21, 2012 at 4:01 PM, Paul Moore wrote:
End users should not need packaging tools on their machines.
Sort of riffing on this idea, I cannot seem to find a specification for what a Python package actually is.
FWIW according to distutils[1], a package is: a module or modules inside another module[2]. So e.g.::
foo.py is a module
and:
foo/__init__.py foo/foo.py
is a simple package containing the following modules:
import foo, foo.foo
Alex
[1] http://docs.python.org/distutils/introduction.html#general-python-terminolog...
[2] And a distribution is a compressed archive of a package, in case that's not clear.
Right, I'm actually talking about distributions (as is everyone else in this thread), and a definition is not a specification. What I'm trying to get at is a standard package format where all the metadata can be read without the packaging lib (distutils/setuptools cannot get at metadata without using distutils or setuptools). It would need to be required that this serves as the one true source of metadata, and that other tools can add certain types of metadata to this format. If, say, distutils2 wrote a package that adhered to a certain standard, and wrote all the information that distutils2 knows about how to install said package (what files, names, versions, dependencies, etc.) to a file (say PKG-INFO) that contained only "common" standard information, then another tool (say bento) could take that package and install it. The idea I'm hoping for is to stop worrying about one implementation over another and instead to create a common format that all the tools can agree upon and create/install.
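For what it's worth, a PKG-INFO file is plain RFC 822-style text, so reading the "common" metadata without importing distutils or setuptools only needs the standard library; a minimal sketch (field names as in the metadata PEPs, error handling omitted):

    from email.parser import Parser

    def read_pkg_info(path="PKG-INFO"):
        # Parse the headers of the static metadata file; no packaging
        # library is imported or executed.
        with open(path) as f:
            headers = Parser().parse(f, headersonly=True)
        return {
            "name": headers["Name"],
            "version": headers["Version"],
            "requires": headers.get_all("Requires-Dist") or [],
        }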
On Fri, Jun 22, 2012 at 10:01 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
The idea I'm hoping for is to stop worrying about one implementation over another and instead to create a common format that all the tools can agree upon and create/install.
Right, and this is where it encouraged me to see in the Bento docs that David had cribbed from RPM in this regard (although I don't believe he has cribbed *enough*). A packaging system really needs to cope with two very different levels of packaging: 1. Source distributions (e.g. SRPMs). To get from this to useful software requires developer tools. 2. "Binary" distributions (e.g. RPMs). To get from this to useful software mainly requires a "file copy" utility (well, that and an archive decompressor). An SRPM is *just* a SPEC file and source tarball. That's it. To get from that to an installed product, you have a bunch of additional "BuildRequires" dependencies, along with %build and %install scripts and a %files definition that define what will be packaged up and included in the binary RPM. The exact nature of the metadata format doesn't really matter, what matters is that it's a documented standard that multiple tools can read. An RPM includes files that actually get installed on the target system. An RPM can be arch specific (if they include built binary bits) or "noarch" if they're platform neutral. distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step. I think one of the key things to learn from the SPEC file format is the configuration language it used for the various build phases: sh (technically, any shell on the system, but almost everyone just uses the default system shell) This is why you can integrate whatever build system you like with it: so long as you can invoke the build from the shell, then you can use it to make your RPM. Now, there's an obvious problem with this: it's completely useless from a *cross-platform* building point of view. Isn't it a shame there's no language we could use that would let us invoke build systems in a cross platform way? Oh, wait... So here's some sheer pie-in-the-sky speculation. If people like elements of this idea enough to run with it, great. If not... oh well: - I believe the "egg" term has way too much negative baggage (courtesy of easy_install), and find the full term Distribution to be too easily confused with "Linux distribution". However, "Python dist" is unambiguous (since the more typical abbreviation for an aggregate distribution is "distro"). Thus, I attempt to systematically refer to the objects used to distribute Python software from developers to users as "dists". In practice, this terminology is already used in many places (distutils, sdist, bdist_msi, bdist_rpm, the .dist-info format in PEP 376 etc). Thus, Python software is distributed as dists (either sdists or bdists), which may in turn be converted to distro packages (e.g. SRPMs and RPMs) for deployment to particular environments. 
- I reject setup.cfg, as I believe ini-style configuration files are not appropriate for a metadata format that needs to include file listings and code fragments
- I reject bento.info, as I think if we accept yet-another-custom-configuration-file-format into the standard library instead of just using YAML, we're even crazier than is already apparent
- I shall use "dist.yaml" as my proposed name for my "I wish I could define packages like this" format (and yes, that means adding yaml support to the standard library is part of the wish)
- many of the details below will be flawed, but I want to give a clear idea for how a concept like this might work in practice
- we need to define a clear set of build phases, and then design the dist metadata format accordingly. For example:
  - source
    - uses a "source" section in dist.yaml
    - "source/install" maps source files directly to desired install locations
      - essentially what the setup.cfg Resources section tries to do
      - used for pure Python code, documentation, etc
      - See below for example
    - "source/files" defines a list of extra files to be included
    - "source/exclude" defines the list of files to be excluded
    - "source/run" defines a Python fragment to be executed
      - serves a similar purpose to the "files" section in setup.cfg
    - creates a temporary directory (and sets it as the working directory)
    - dist.yaml is copied to the temporary directory
    - all files to be installed are copied to the temporary directory
    - all extra files are copied to the temporary directory
    - the Python fragment in "source/run" is executed (which can thus easily add more files)
    - if sdist archive creation is requested, entire contents of temporary directory are included
  - build
    - uses a "build" section in dist.yaml
    - "build/install" maps built files to desired install locations
      - like source/install, but for build artifacts
      - compiled C extensions, .pyc and .pyo files, etc would all go here
    - "build/run" defines a Python fragment to be executed
    - "build/files" defines the list of files to be included
    - "build/exclude" defines the list of files to be excluded
    - "build/requires" defines extra dependencies not needed at runtime
    - starting environment is a source directory that is either:
      - preexisting (e.g. to allow building in-place in the source tree)
      - created by running source first
      - created by unpacking an sdist archive
    - the Python fragment in "build/run" is executed to trigger the build
    - if the build succeeds (i.e. doesn't throw an exception)
      - create a temporary directory
      - copy dist.yaml
      - copy all specified files
        - this is the easiest way to exclude build artifacts from the distribution, while still keeping them around to enable incremental builds
      - if bdist_simple archive creation is requested, entire contents of temporary directory are included
    - other bdist formats (such as bdist_rpm) will have their own rules for getting from the bdist_simple format to the platform specific format
  - install
    - uses an "install" section in dist.yaml
    - "install/pre" defines a Python fragment to be executed before copying files
    - "install/post" defines a Python fragment to be executed after copying files
    - starting environment is a bdist_simple directory that is either:
      - preexisting (e.g. to allow creation by system packaging tools)
      - created by running build first
      - created by unpacking a bdist_simple archive
    - end result is a fully installed and usable piece of software
  - test
    - uses a "test" section in dist.yaml
    - "test/run" defines a Python fragment to be executed to start the tests
    - "test/requires" defines extra dependencies needed to run the test suite
- Example "source/install" based on http://alexis.notmyidea.org/distutils2/setupcfg.html#complete-example (my YAML may be a bit dodgy).
  - With this scheme, module installation is just another install category.
  - A solution for easily installing entire subtrees is desirable. I propose the recursive glob ** syntax for that purpose.
  - Unlike setup.cfg, every category would have an "-excluded" counterpart to filter unwanted files. Explicit is better than implicit.

    source:
      install:
        modules:
          example.py
          example_pkg/*.py
          example_pkg/**/*.py
          example_pkg/resource.txt
        doc:
          README
          doc/*
        doc-excluded:
          doc/man
        man:
          doc/man
        scripts:
          # Directory details are stripped automatically
          scripts/LAUNCH
          scripts/*.{sh,bat}
          # But subdirectories can be made explicit
          extras/:
            scripts/extras/*.{sh,bat}

- the goal of a dist.yaml syntax would be to be *explicit* and *comprehensive*. If this gets too verbose, then the solution would be dist.yaml generators that are less expressive, but also reduce the necessary boilerplate.
- a typical "sdist" will now just be an archive consisting of:
  - the project's dist.yaml file
  - all files created by the "source" phase
- the "bdist_simple" format will just be an archive consisting of:
  - the project's dist.yaml file
  - all files created by the "build" phase
- the source and build run hooks and install pre and post hooks become the way you integrate with arbitrary build systems. No fancy command or compiler system or anything like that, you just import whatever you need and call it with the appropriate arguments. To other tools, they will just be opaque chunks of text, but to the build system, they're executable pieces of Python code, just as RPM includes executable scripts.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
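Purely as an illustration of how a tool might consume the hypothetical dist.yaml sketched above (none of this format exists, and PyYAML is a third-party dependency), a consuming tool could look roughly like this:

    import yaml  # third-party PyYAML

    def run_phase(phase, dist_path="dist.yaml"):
        with open(dist_path) as f:
            dist = yaml.safe_load(f)
        section = dist.get(phase, {})
        # The "<phase>/run" value is an arbitrary Python fragment in this
        # proposal; executing it is the integration point with whatever
        # build system the project actually uses.
        fragment = section.get("run")
        if fragment:
            exec(compile(fragment, "<%s/run>" % phase, "exec"), {"__name__": "__dist__"})
        # The "<phase>/install" mapping tells the tool where files go.
        return section.get("install", {})

    # e.g. run_phase("build") would trigger whatever the project's build/run
    # fragment invokes (waf, make, a distutils compile step, ...).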
On Friday, June 22, 2012 at 1:05 AM, Nick Coghlan wrote:
- I reject setup.cfg, as I believe ini-style configuration files are not appropriate for a metadata format that needs to include file listings and code fragments
- I reject bento.info, as I think if we accept yet-another-custom-configuration-file-format into the standard library instead of just using YAML, we're even crazier than is already apparent
- I shall use "dist.yaml" as my proposed name for my "I wish I could define packages like this" format (and yes, that means adding yaml support to the standard library is part of the wish)
- many of the details below will be flawed, but I want to give a clear idea for how a concept like this might work in practice
- we need to define a clear set of build phases, and then design the dist metadata format accordingly. For example:
  - source
    - uses a "source" section in dist.yaml
    - "source/install" maps source files directly to desired install locations
      - essentially what the setup.cfg Resources section tries to do
      - used for pure Python code, documentation, etc
      - See below for example
    - "source/files" defines a list of extra files to be included
    - "source/exclude" defines the list of files to be excluded
    - "source/run" defines a Python fragment to be executed
      - serves a similar purpose to the "files" section in setup.cfg
    - creates a temporary directory (and sets it as the working directory)
    - dist.yaml is copied to the temporary directory
    - all files to be installed are copied to the temporary directory
    - all extra files are copied to the temporary directory
    - the Python fragment in "source/run" is executed (which can thus easily add more files)
    - if sdist archive creation is requested, entire contents of temporary directory are included
  - build
    - uses a "build" section in dist.yaml
    - "build/install" maps built files to desired install locations
      - like source/install, but for build artifacts
      - compiled C extensions, .pyc and .pyo files, etc would all go here
    - "build/run" defines a Python fragment to be executed
    - "build/files" defines the list of files to be included
    - "build/exclude" defines the list of files to be excluded
    - "build/requires" defines extra dependencies not needed at runtime
    - starting environment is a source directory that is either:
      - preexisting (e.g. to allow building in-place in the source tree)
      - created by running source first
      - created by unpacking an sdist archive
    - the Python fragment in "build/run" is executed to trigger the build
    - if the build succeeds (i.e. doesn't throw an exception)
      - create a temporary directory
      - copy dist.yaml
      - copy all specified files
        - this is the easiest way to exclude build artifacts from the distribution, while still keeping them around to enable incremental builds
      - if bdist_simple archive creation is requested, entire contents of temporary directory are included
    - other bdist formats (such as bdist_rpm) will have their own rules for getting from the bdist_simple format to the platform specific format
  - install
    - uses an "install" section in dist.yaml
    - "install/pre" defines a Python fragment to be executed before copying files
    - "install/post" defines a Python fragment to be executed after copying files
    - starting environment is a bdist_simple directory that is either:
      - preexisting (e.g. to allow creation by system packaging tools)
      - created by running build first
      - created by unpacking a bdist_simple archive
    - end result is a fully installed and usable piece of software
  - test
    - uses a "test" section in dist.yaml
    - "test/run" defines a Python fragment to be executed to start the tests
    - "test/requires" defines extra dependencies needed to run the test suite
I dislike some of the (implementation) details, but in general I think this is a good direction to go in. Less trying to force tools to work together by hijacking setup.py or something, and more "this is a package, it contains the data you need to install, and how to install it; your installation tool can use this data however it pleases to make sure it is installed." I feel like this is (one of?) the missing pieces of the puzzle to define a set of standards that _any_ package creation or installation tool can implement and gain interoperability. I don't want to argue over implementation details as I think that is premature right now, so this concept has a big +1 from me. RPM, deb, etc. have a long history and a lot of shared knowledge, so looking at them and adapting that to work cross-platform is likely to be a huge win.
On Fri, Jun 22, 2012 at 3:20 PM, Donald Stufft <donald.stufft@gmail.com> wrote:
I don't want to argue over implementation details as I think that is premature right now, so this concept has a big +1 from me. RPM, deb, etc. have a long history and a lot of shared knowledge, so looking at them and adapting that to work cross-platform is likely to be a huge win.
Right, much of what I wrote in that email should be taken as "this is one way I think it *could* work", rather than "this is the way I think it *should* work". In particular, any realistic attempt should also look at what Debian based systems do differently from RPM based systems.

I think the key elements are recognising that:
- an "sdist" contains three kinds of file:
  - package metadata
  - files to be installed directly on the target system
  - files needed to build other files
- a "bdist" also contains three kinds of file:
  - package metadata
  - files to be installed directly on the target system
  - files needed to correctly install and update other files

That means the key transformations to be defined are:
- source checkout -> sdist
  - need to define contents of sdist
  - need to define where any directly installed files are going to end up
- sdist -> bdist
  - need to define contents of bdist
  - need to define how to create the build artifacts
  - need to define where any installed build artifacts are going to end up
- bdist -> installed software
  - need to allow application developers to customise the installation process
  - need to allow system packages to customise where certain kinds of file end up

The one *anti-pattern* I think we really want to avoid is a complex registration system where customisation isn't as simple as saying either:
- run this inline piece of code; or
- invoke this named function or class that implements the appropriate interface

The other main consideration is that we want the format to be easy to read with general purpose tools, and that means something based on a configuration file standard. YAML is the obvious choice at that point.

Cheers, Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
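As a sketch of the "inline code or named callable" customisation point mentioned above (the hook convention here is invented for illustration, not part of any existing format):

    import importlib

    def resolve_hook(spec):
        # "package.module:function" style reference to a named callable...
        if ":" in spec and "\n" not in spec:
            module_name, func_name = spec.split(":", 1)
            module = importlib.import_module(module_name)
            return getattr(module, func_name)
        # ...otherwise treat the value as an inline fragment that defines
        # a callable named "hook" (an assumption made only for this sketch).
        namespace = {}
        exec(compile(spec, "<inline-hook>", "exec"), namespace)
        return namespace["hook"]

    # install_pre = resolve_hook(metadata["install"]["pre"])
    # install_pre(build_context)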
On 6/22/12 7:05 AM, Nick Coghlan wrote:
..
- I reject setup.cfg, as I believe ini-style configuration files are not appropriate for a metadata format that needs to include file listings and code fragments

I don't understand what the problem is with ini-style files; they are suitable for multi-line variables etc. (see zc.buildout)
yaml vs ini vs xxx seems to be an implementation detail, and my take on this is that we have ConfigParser in the stdlib
On Fri, Jun 22, 2012 at 4:42 PM, Tarek Ziadé <tarek@ziade.org> wrote:
On 6/22/12 7:05 AM, Nick Coghlan wrote: I don't understand what's the problem is with ini-style files, as they are suitable for multi-line variables etc. (see zc.buildout)
yaml vs ini vs xxx seems to be an implementation detail, and my take on this is that we have ConfigParser in the stdlib
You can't do more than one layer of nested data structures cleanly with an ini-style solution, and some aspects of packaging are just crying out for metadata that nests more deeply than that. The setup.cfg format for specifying installation layouts doesn't even come *close* to being intuitively readable - using a format with better nesting support has some hope of fixing that, since filesystem layouts are naturally hierarchical. A JSON based format would also be acceptable to me from a functional point of view, although in that case, asking people to edit it directly would be cruel - you would want to transform it to YAML in order to actually read it or write it. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
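The nesting point can be seen with a small example: an install layout is naturally a nested mapping, which YAML, JSON or a Python dict express directly, while ConfigParser forces the hierarchy into section and option names (the layout below is illustrative only):

    # Natural nested form (what a dict/JSON/YAML gives you directly):
    layout = {
        "source": {
            "install": {
                "modules": ["example.py", "example_pkg/*.py"],
                "doc": ["README", "doc/*"],
                "scripts": ["scripts/*.sh"],
            },
        },
    }

    # Flat ini form, where the hierarchy has to be encoded in the names:
    #
    #   [source.install]
    #   modules = example.py
    #             example_pkg/*.py
    #   doc = README
    #         doc/*
    #   scripts = scripts/*.sh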
On 6/22/12 9:11 AM, Nick Coghlan wrote:
On Fri, Jun 22, 2012 at 4:42 PM, Tarek Ziadé<tarek@ziade.org> wrote:
On 6/22/12 7:05 AM, Nick Coghlan wrote: I don't understand what's the problem is with ini-style files, as they are suitable for multi-line variables etc. (see zc.buildout)
yaml vs ini vs xxx seems to be an implementation detail, and my take on this is that we have ConfigParser in the stdlib You can't do more than one layer of nested data structures cleanly with an ini-style solution, and some aspects of packaging are just crying out for metadata that nests more deeply than that. The setup.cfg format for specifying installation layouts doesn't even come *close* to being intuitively readable - using a format with better nesting support has some hope of fixing that, since filesystem layouts are naturally hierarchical.
A JSON based format would also be acceptable to me from a functional point of view, although in that case, asking people to edit it directly would be cruel - you would want to transform it to YAML in order to actually read it or write it.
I still think this is an implementation detail, and that ini files can work here, as they have proven to work with buildout and look very clean to me. But I guess that's not important -- looking forward to your change proposals on packaging. I am now wondering why we don't have a yaml module in the stdlib, btw :)
Cheers, Nick.
On Fri, Jun 22, 2012 at 5:24 PM, Tarek Ziadé <tarek@ziade.org> wrote:
On 6/22/12 9:11 AM, Nick Coghlan wrote:
On Fri, Jun 22, 2012 at 4:42 PM, Tarek Ziadé<tarek@ziade.org> wrote:
On 6/22/12 7:05 AM, Nick Coghlan wrote: I don't understand what's the problem is with ini-style files, as they are suitable for multi-line variables etc. (see zc.buildout)
yaml vs ini vs xxx seems to be an implementation detail, and my take on this is that we have ConfigParser in the stdlib
You can't do more than one layer of nested data structures cleanly with an ini-style solution, and some aspects of packaging are just crying out for metadata that nests more deeply than that. The setup.cfg format for specifying installation layouts doesn't even come *close* to being intuitively readable - using a format with better nesting support has some hope of fixing that, since filesystem layouts are naturally hierarchical.
A JSON based format would also be acceptable to me from a functional point of view, although in that case, asking people to edit it directly would be cruel - you would want to transform it to YAML in order to actually read it or write it.
I still think this is an implementation detail, and that ini can work here, as they have proven to work with buildout and look very clean to me.
Yeah, and I later realised that RPM also uses a flat format. I think nested is potentially cleaner, but that's the kind of thing a PEP can thrash out.
I am now wondering why we don't have a yaml module in the stdlib btw :)
ini-style is often good enough, and failing that there's json. Or, you just depend on PyYAML :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
I think json probably makes the most sense: it's already part of the stdlib for 2.6+ and, while it has some issues with human editability, there's no reason why this json file couldn't be auto-generated from another data structure by the "package creation tool" that exists outside of the stdlib (or inside, but outside the scope of this proposal). Which is really part of what I like a lot about this proposal: how you arrive at the final product doesn't matter - distutils, bento, a yet-uncreated tool, manually crafting tarballs and files; you could describe your data in yaml, python, or, going towards the more magical end of things, it could be generated automatically from your filesystem. It doesn't matter; all that matters is that you create your final archive with the agreed-upon structure and the agreed-upon dist.(yml|json|ini), and any compliant installer should be able to install it. On Friday, June 22, 2012 at 3:56 AM, Vinay Sajip wrote:
Nick Coghlan <ncoghlan <at> gmail.com> writes:
ini-style is often good enough, and failing that there's json. Or, you just depend on PyYAML :)
Except when PyYAML is packaged and distributed using dist.yaml :-)
Regards,
Vinay Sajip
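A sketch of the "generate dist.json from whatever front end you like" idea above; the file name and fields are assumptions, not an agreed format:

    import json

    # Whatever authoring front end is used (setup.cfg, bento.info, plain
    # Python) ultimately reduces to a dictionary...
    dist = {
        "name": "example",
        "version": "1.0",
        "requires": ["somelib"],
        "install": {"modules": ["example.py"]},
    }

    # ...which the package creation tool serialises once; installers only
    # ever see the generated dist.json, never the tool that produced it.
    with open("dist.json", "w") as f:
        json.dump(dist, f, indent=2, sort_keys=True)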
Nick Coghlan <ncoghlan <at> gmail.com> writes:
On Fri, Jun 22, 2012 at 4:42 PM, Tarek Ziadé <tarek <at> ziade.org> wrote:
On 6/22/12 7:05 AM, Nick Coghlan wrote: I don't understand what's the problem is with ini-style files, as they are suitable for multi-line variables etc. (see zc.buildout)
yaml vs ini vs xxx seems to be an implementation detail, and my take on this is that we have ConfigParser in the stdlib
You can't do more than one layer of nested data structures cleanly with an ini-style solution, and some aspects of packaging are just crying out for metadata that nests more deeply than that. The setup.cfg format for specifying installation layouts doesn't even come *close* to being intuitively readable - using a format with better nesting support has some hope of fixing that, since filesystem layouts are naturally hierarchical.
A JSON based format would also be acceptable to me from a functional point of view, although in that case, asking people to edit it directly would be cruel - you would want to transform it to YAML in order to actually read it or write it.
The format-neutral alternative I used for logging configuration was a dictionary schema - JSON, YAML and Python code can all be mapped to that. Perhaps the relevant APIs can work at the dict layer. I agree that YAML is the human-friendliest "one obvious" format for review/edit, though. +1 to the overall approach suggested, it makes a lot of sense. Simple is better than complex, and all that :-) Regards, Vinay Sajip
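Working "at the dict layer", by analogy with logging.config.dictConfig, might look roughly like the sketch below; the function names are invented, and PyYAML is a third-party dependency:

    import json

    def load_dist_config(path):
        # JSON, YAML and Python literals are just alternative serialisations
        # of the same dictionary schema.
        if path.endswith(".json"):
            with open(path) as f:
                return json.load(f)
        if path.endswith((".yml", ".yaml")):
            import yaml  # third-party PyYAML
            with open(path) as f:
                return yaml.safe_load(f)
        raise ValueError("unsupported config format: %r" % path)

    def install_from_dict(config):
        # hypothetical API entry point that only ever sees the dict
        for category, files in config.get("install", {}).items():
            pass  # copy the files for this category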
On Jun 22, 2012, at 07:49 AM, Vinay Sajip wrote:
The format-neutral alternative I used for logging configuration was a dictionary schema - JSON, YAML and Python code can all be mapped to that. Perhaps the relevant APIs can work at the dict layer.
I don't much care whether it's ini, json, or yaml, but I do think it needs to be declarative and language neutral. I don't want to lock up all that metadata into Python data structures. There are valid use cases for being able to access the data from outside of Python. And please give some thought to test declarations. We need a standard way to declare how a package's tests should be run, and what the test dependencies are, so that we can improve the quality of all distro/binary packages by running the test suite at build time. Having to guess whether it's `python setup.py test` or `python -m unittest discover` or whether nose or py.test is required, etc. etc. is no good. -Barry
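A possible shape for such a declarative test section, with entirely hypothetical field names, just to show what a distro build script could rely on instead of guessing the test runner:

    import subprocess
    import sys

    # What the dist metadata might declare:
    metadata = {
        "test": {
            "requires": ["pytest"],
            "run": [sys.executable, "-m", "pytest", "tests/"],
        },
    }

    # What a distro/binary packager could then do at build time:
    def run_declared_tests(meta):
        return subprocess.call(meta["test"]["run"]) == 0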
On Fri, Jun 22, 2012 at 6:05 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Fri, Jun 22, 2012 at 10:01 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
The idea I'm hoping for is to stop worrying about one implementation over another and instead to create a common format that all the tools can agree upon and create/install.
Right, and this is where it encouraged me to see in the Bento docs that David had cribbed from RPM in this regard (although I don't believe he has cribbed *enough*).
A packaging system really needs to cope with two very different levels of packaging: 1. Source distributions (e.g. SRPMs). To get from this to useful software requires developer tools. 2. "Binary" distributions (e.g. RPMs). To get from this to useful software mainly requires a "file copy" utility (well, that and an archive decompressor).
An SRPM is *just* a SPEC file and source tarball. That's it. To get from that to an installed product, you have a bunch of additional "BuildRequires" dependencies, along with %build and %install scripts and a %files definition that define what will be packaged up and included in the binary RPM. The exact nature of the metadata format doesn't really matter, what matters is that it's a documented standard that multiple tools can read.
An RPM includes files that actually get installed on the target system. An RPM can be arch specific (if they include built binary bits) or "noarch" if they're platform neutral.
distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step.
I think one of the key things to learn from the SPEC file format is the configuration language it used for the various build phases: sh (technically, any shell on the system, but almost everyone just uses the default system shell)
This is why you can integrate whatever build system you like with it: so long as you can invoke the build from the shell, then you can use it to make your RPM.
Now, there's an obvious problem with this: it's completely useless from a *cross-platform* building point of view. Isn't it a shame there's no language we could use that would let us invoke build systems in a cross platform way? Oh, wait...
So here's some sheer pie-in-the-sky speculation. If people like elements of this idea enough to run with it, great. If not... oh well:
- I believe the "egg" term has way too much negative baggage (courtesy of easy_install), and find the full term Distribution to be too easily confused with "Linux distribution". However, "Python dist" is unambiguous (since the more typical abbreviation for an aggregate distribution is "distro"). Thus, I attempt to systematically refer to the objects used to distribute Python software from developers to users as "dists". In practice, this terminology is already used in many places (distutils, sdist, bdist_msi, bdist_rpm, the .dist-info format in PEP 376 etc). Thus, Python software is distributed as dists (either sdists or bdists), which may in turn be converted to distro packages (e.g. SRPMs and RPMs) for deployment to particular environments.
- I reject setup.cfg, as I believe ini-style configuration files are not appropriate for a metadata format that needs to include file listings and code fragments
- I reject bento.info, as I think if we accept yet-another-custom-configuration-file-format into the standard library instead of just using YAML, we're even crazier than is already apparent
I agree having yet another format is a bit crazy, and am actually considering changing bento.info to be yaml. I initially did go toward a cabal-like syntax instead for the following reasons:
- lack of conditionals (a must IMO, it is even more useful for cross-platform stuff than it is for RPM only)
- yaml becomes quite a bit verbose for some cases

I find JSON to be inappropriate because, beyond the above issues, it does not support comments, and it is significantly more verbose. That being said, that's just syntax, and what matters more is the features we allow:
- I like the idea of categorizing like you did better than how it works in bento, but I think one needs to be able to create one's own categories as well. A category is just a mapping from a name to an install directory (see http://cournape.github.com/Bento/html/tutorial.html#installed-data-files-dat..., but we could find another syntax of course).
- I don't find the distinction between source and build very useful in the yet-to-be-implemented description. Or maybe that's just a naming issue, and it is just the same distinction as extra files vs installed files I made in bento? See next point.
- regarding build, I don't think we want to force people to implement target locations there. I also don't see how you want to make it work for built files (you don't know the names yet). Can you give an example of how it would work for, say, an extension and built docs?
- regarding hooks: I think it is simpler to have a single file which contains all the hooks, if only to allow for easy communication between hooks and code reuse between hooks. I don't see any drawback to using only one file?
- Besides containing the file bits + metadata, I wonder if one should allow additional fields that might be tool specific. In bento, there are a couple of such additional fields that may not be very useful to others.
- do we want to allow for recursive dist.yaml? This numpy.distutils feature is used quite a bit, and I believe twisted has something similar.

David
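"A category is just a mapping from a name to an install directory" could be as simple as the sketch below; the directory variables and defaults are illustrative placeholders, not Bento's actual scheme:

    import string

    # Standard categories plus a project-defined one; values are invented.
    categories = {
        "modules": "$sitedir",
        "scripts": "$bindir",
        "doc": "$datadir/doc/$projectname",
        "plugins": "$sitedir/$projectname/plugins",  # custom category
    }

    def resolve(category, variables):
        # e.g. resolve("doc", {"datadir": "/usr/share", "projectname": "example"})
        return string.Template(categories[category]).substitute(variables)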
David Cournapeau <cournape <at> gmail.com> writes:
I agree having yet another format is a bit crazy, and am actually considering changing bento.info to be yaml. I initially did go toward a cabal-like syntax instead for the following reasons: - lack of conditionals (a must IMO, it is even more useful for cross-platform stuff than it is for RPM only)
Conditionals could perhaps be handled in different ways, e.g.
1. Markers as used in distutils2/packaging (where the condition is platform or version related)
2. A scheme to resolve variables, such as is used in PEP 391 (dictionary-based configuration for logging).
If conditionals are much more involved than this, there's a risk of introducing too much program logic - the setup.py situation all over again.
- regarding hooks: I think it is simpler to have a single file which contains all the hooks, if only to allow for easy communication between hooks and code reuse between hooks. I don't see any drawback to using only one file ?
I was assuming that the dist.yaml file would just have callable references here; I suppose having (sizable) Python fragments in dist.yaml might become unwieldy. Regards, Vinay Sajip
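For reference, the markers mentioned in point 1 above are the PEP 345-style environment markers; a deliberately simplified sketch of evaluating one (real implementations parse the expression rather than using a restricted eval):

    import os
    import platform
    import sys

    def evaluate_marker(marker):
        # The names made available mirror the environment markers defined
        # in the metadata PEPs; this restricted eval is for illustration only.
        env = {
            "os_name": os.name,
            "sys_platform": sys.platform,
            "platform_version": platform.version(),
            "python_version": "%s.%s" % sys.version_info[:2],
        }
        # e.g. evaluate_marker("sys_platform == 'win32'")
        return eval(marker, {"__builtins__": {}}, env)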
On 22 June 2012 06:05, Nick Coghlan <ncoghlan@gmail.com> wrote:
distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step.
That was essentially the key insight I was trying to communicate in my "think about the end users" comment. Thanks, Nick! Comments on the rest of your email to follow (if needed) when I've digested it... Paul
On 06/22/2012 10:40 AM, Paul Moore wrote:
On 22 June 2012 06:05, Nick Coghlan<ncoghlan@gmail.com> wrote:
distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step.
That was essentially the key insight I was trying to communicate in my "think about the end users" comment. Thanks, Nick!
The subtlety here is that there's no way to know before building the package what files should be installed. (For simple extensions, and perhaps documentation, you could get away with ad-hoc rules or special support for Sphinx and what-not, but there's no general solution that works in all cases.) What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods). Dag
On Friday, June 22, 2012 at 5:22 AM, Dag Sverre Seljebotn wrote:
What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods).
From what I understand, this dist.(yml|json|ini) would be replacing the manifest, not the bento.info. When bento builds a package compatible with the proposed format, instead of generating its own manifest it would generate the dist.(yml|json|ini).
On 06/22/2012 11:38 AM, Donald Stufft wrote:
On Friday, June 22, 2012 at 5:22 AM, Dag Sverre Seljebotn wrote:
What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods).
From what I understand, this dist.(yml|json|ini) would be replacing the manifest, not the bento.info. When bento builds a package compatible with the proposed format, instead of generating its own manifest it would generate the dist.(yml|json|ini).
Well, but I think you need to care about the whole process here. Focusing only on the "end-user case" and binary installers has the flip side that smuggling in a back door is incredibly easy in compiled binaries. You simply upload a binary that doesn't match the source. The reason PyPI isn't one big security risk is that packages are built from source, and so you can have some confidence that backdoors would be noticed and highlighted by somebody. Having a common standard for the binary installation phase would be great, sure, but security-minded users would still need to build from source in every case (or trust a 3rd-party build farm that builds from source). The reason you can trust RPMs at all is because they're built from SRPMs. Dag
Dag Sverre Seljebotn <d.s.seljebotn <at> astro.uio.no> writes:
Well, but I think you need to care about the whole process here.
Focusing only on the "end-user case" and binary installers has the flip side that smuggling in a back door is incredibly easy in compiled binaries. You simply upload a binary that doesn't match the source.
The reason PyPI isn't one big security risk is that packages are built from source, and so you can have some confidence that backdoors would be noticed and highlighted by somebody.
Having a common standard for the binary installation phase would be great, sure, but security-minded users would still need to build from source in every case (or trust a 3rd-party build farm that builds from source). The reason you can trust RPMs at all is because they're built from SRPMs.
Easy enough on Posix platforms, perhaps, but what about Windows? One can't expect a C compiler to be installed everywhere. Perhaps security against backdoors could also be provided through other mechanisms, such as signing of binary installers. Regards, Vinay Sajip
On 6/22/2012 6:09 AM, Vinay Sajip wrote:
Easy enough on Posix platforms, perhaps, but what about Windows?
Every time windows users download and install a binary, they are taking a chance. I try to use a bit more sense than some people, but I know it is not risk free. There *is* a third party site that builds installers, but should I trust it? I would prefer that (except perhaps for known and trusted authors) PyPI compile binaries, perhaps after running code through a security checker, followed by running it through one or more virus checkers.
One can't expect a C compiler to be installed everywhere.
Just having 'a C compiler' is not enough to compile on Windows. -- Terry Jan Reedy
On Friday, June 22, 2012 at 4:55 PM, Terry Reedy wrote:
Every time windows users download and install a binary, they are taking a chance. I try to use a bit more sense than some people, but I know it is not risk free. There *is* a third party site that builds installers, but should I trust it? I would prefer that (except perhaps for known and trusted authors) PyPI compile binaries, perhaps after running code through a security checker, followed by running it through one or more virus checkers.
I think you overestimate the abilities of "security checkers" and antivirus. Installing from PyPI is a risk, whether you use source or binaries. There is currently not a very good security story for installing Python packages from PyPI (not all of this falls on PyPI), but even if we get to a point where there is, PyPI can never be as safe as installing from RPMs or DEBs, and somewhat more so in the case of binaries. You _have_ to make a case-by-case choice about whether you trust the authors/maintainers of a particular package.
On Friday, June 22, 2012 at 5:52 AM, Dag Sverre Seljebotn wrote:
The reason PyPI isn't one big security risk is that packages are built from source, and so you can have some confidence that backdoors would be noticed and highlighted by somebody.
Having a common standards for binary installation phase would be great sure, but security-minded users would still need to build from source in every case (or trust a 3rt party build farm that builds from source). The reason you can trust RPMs at all is because they're built from SRPMs.
Dag
The reason you trust RPMs is not because they are built from SRPMs, but because you trust the people running the repositories. In the case of PyPI you can't make a global call to implicitly trust all packages, because there is no gatekeeper as in an RPM system, so it falls to the individual to decide which authors they trust and which they do not. But this proposal alludes to both source dists and built dists, either of which may be published and installed from. In the case of a source dist the package format would include all the metadata of the package. Included in that is a Python script that knows how to build this particular package (if special steps are required). This script could simply call out to an already existing build system or, if simple enough, work on its own. Source dists would also obviously contain the source. In the case of a binary dist the package format would include all the metadata of the package, plus the binary files.
On Fri, Jun 22, 2012 at 10:38 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
On Friday, June 22, 2012 at 5:22 AM, Dag Sverre Seljebotn wrote:
What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods).
From what I understand, this dist.(yml|json|ini) would be replacing the mainfest not the bento.info then. When bento builds a package compatible with the proposed format it would instead of generating it's own manifest it would generate the dist.(yml|json|ini).
If by manifest you mean the build manifest, then that's not desirable: the manifest contains the explicit filenames, and those are platform/environment specific. You don't want this to be user-facing.

The way it should work is:
- package description (dist.yaml, setup.cfg, bento.info, whatever)
- use this as input to the build process
- build process produces a build manifest that is platform specific. It should be extremely simple, no conditionals or anything, and should ideally be fed to both python and non-python programs.
- build manifest is then the sole input to the process building installers (besides the actual build tree, of course).

Conceptually, after the build, you can do:

    manifest = BuildManifest.from_file("build_manifest.json")
    # This is needed so as to allow the path scheme to be changed
    # depending on the installer format
    manifest.update_path(path_configuration)
    for category, source, target in manifest.iter_files():
        # the simple case is copying source to target, potentially using
        # the category label for category-specific handling
        ...

This was enough for me to do straight install, eggs, .exe and .msi windows installers and .mpkg from that with a relatively simple API. Bonus point: if you include this file inside the installers, you can actually losslessly convert from one to the other. David
On 06/22/2012 12:20 PM, David Cournapeau wrote:
On Fri, Jun 22, 2012 at 10:38 AM, Donald Stufft<donald.stufft@gmail.com> wrote:
On Friday, June 22, 2012 at 5:22 AM, Dag Sverre Seljebotn wrote:
What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods).
From what I understand, this dist.(yml|json|ini) would be replacing the mainfest not the bento.info then. When bento builds a package compatible with the proposed format it would instead of generating it's own manifest it would generate the dist.(yml|json|ini).
If by manifest you mean the build manifest, then that's not desirable: the manifest contains the explicit filenames, and those are platform/environment specific. You don't want this to be user-facing.
The way it should work is: - package description (dist.yaml, setup.cfg, bento.info, whatever) - use this as input to the build process - build process produces a build manifest that is platform specific. It should be extremely simple, no conditional or anything, and should ideally be fed to both python and non-python programs. - build manifest is then the sole input to the process building installers (besides the actual build tree, of course).
Conceptually, after the build, you can do :
    manifest = BuildManifest.from_file("build_manifest.json")
    # This is needed so as to allow the path scheme to be changed
    # depending on the installer format
    manifest.update_path(path_configuration)
    for category, source, target in manifest.iter_files():
        # the simple case is copying source to target, potentially using
        # the category label for category-specific handling
        ...
This was enough for me to do straight install, eggs, .exe and .msi windows installers and .mpkg from that with a relatively simple API. Bonus point, if you include this file inside the installers, you can actually losslessly convert from one to the other.
I think Donald's suggestion can be phrased as this: during the build, copy the dist metadata (name, version, dependencies...) to the build manifest as well. Then allow uploading only the built versions for different platforms to PyPI etc., and allow relative anarchy to reign in how you create the built dists. And I'm saying that would encourage a culture that's very dangerous from a security perspective. Even if many use binaries, it is important to encourage a culture where it is always trivial (well, as trivial as we can possibly make it, in the case of Windows) to build from source for those who wish to. Making the user-facing entry point of the dist metadata be in the source package rather than the binary package seems like a necessary (but not sufficient) condition for such a culture. Dag
On 22 June 2012 11:28, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
And I'm saying that would encourage a culture that's very dangerous from a security perspective. Even if many use binaries, it is important to encourage a culture where it is always trivial (well, as trivial as we can possibly make it, in the case of Windows) to build from source for those who wish to.
And what I am trying to say is that no matter how much effort gets put into trying to make build from source easy, it will pretty much never be even remotely trivial on Windows. There has been a lot of work done to try to achieve this, but as far as I've seen, it's always failed. One external dependency, and you're in a mess. Unless you're proposing some means of Python's packaging solution encapsulating URLs for binary libraries of external packages which will be automatically downloaded - and then all the security holes open again. You have to remember that not only do many Windows users not have a compiler, but also getting a compiler is non-trivial (not hard, just download and install VS Express, but still a pain to do just to get (say) lxml installed). And there is no standard location for external libraries in Windows, so you also need the end user to specify where everything is (or guess, or mandate a directory structure). The only easy-to-use solution that has ever really worked on Windows in my experience is downloadable binaries. Blame whoever you like, point out that it's not good practice if you must, but don't provide binaries and you lose a major part of your user base. (You may choose not to care about losing that group, that's a different question). Signed binaries may be a solution. My experience with signed binaries has not been exactly positive, but it's an option. Presumably PyPI would be the trusted authority? Would PyPI and the downloaders need to use SSL? Would developers need to have signing keys to use PyPI? And more to the point, do the people designing the packaging solutions have experience with this sort of stuff (I sure don't :-))? Paul.
Paul Moore <p.f.moore <at> gmail.com> writes:
Signed binaries may be a solution. My experience with signed binaries has not been exactly positive, but it's an option. Presumably PyPI would be the trusted authority? Would PyPI and the downloaders need to use SSL? Would developers need to have signing keys to use PyPI? And more to the point, do the people designing the packaging solutions have experience with this sort of stuff (I sure don't )?
I'm curious - what problems have you had with signed binaries? I dipped my toes in this particular pool with the Python launcher installers - I got a code signing certificate and signed my MSIs with it. The process was fairly painless. As far as I know, all signing does is to indicate that the binary package hasn't been tampered with and allows the downloader to decide whether they trust the signer not to have allowed backdoors, etc. I don't see that it mandates use of SSL, or even signing, by anyone. At least some people will require that an installer be invokable with an option that causes it to bail if any part of what's being installed can't be verified (for some value of "verified"). Regards, Vinay Sajip
On 22 June 2012 13:09, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Paul Moore <p.f.moore <at> gmail.com> writes:
Signed binaries may be a solution. My experience with signed binaries has not been exactly positive, but it's an option. Presumably PyPI would be the trusted authority? Would PyPI and the downloaders need to use SSL? Would developers need to have signing keys to use PyPI? And more to the point, do the people designing the packaging solutions have experience with this sort of stuff (I sure don't )?
I'm curious - what problems have you had with signed binaries?
As a user, I guess not that much. I may be misremembering bad experiences with different things. We've had annoyances with self-signed jars, and websites. It's generally more about annoying "can't confirm this should be trusted, please verify" messages which people end up just saying "yes" to (and so ruining any value from the check). But I don't know how often I have used them, to the extent that the only time I'm aware of them is when they don't work silently (e.g., I get a prompt asking if I want to trust this publisher - this is essentially a failure, as I always say "yes" simply because I have no idea how I would go about deciding that I do trust them, beyond what I've already done in locating and downloading the software from them!)
I dipped my toes in this particular pool with the Python launcher installers - I got a code signing certificate and signed my MSIs with it. The process was fairly painless.
OK, that's a good example, I didn't even realise those installers were signed, making it an excellent example of how easy it can be when it works. But you say "I got a code signing certificate". How? When I dabbled with signing, the only option I could find that didn't involve paying and/or having a registered domain of my own was a self-signed certificate, which from a UI point of view seems of little use "Paul Moore says you should trust him. Do you? Yes/No"... If signed binaries is the way we go, then we should be aware that we exclude people who don't have certificates from uploading to PyPI. Maybe that's OK, but without some sort of check I don't know how many current developers that would exclude, let alone how many potential developers would be put off. A Python-supported build farm, which signed code on behalf of developers, might alleviate this. But then we need to protect against malicious code being submitted to the build farm, etc.
As far as I know, all signing does is to indicate that the binary package hasn't been tampered with and allows the downloader to decide whether they trust the signer not to have allowed backdoors, etc. I don't see that it mandates use of SSL, or even signing, by anyone. At least some people will require that an installer be invokable with an option that causes it to bail if any part of what's being installed can't be verified (for some value of "verified").
Fair enough. I don't object to offering the option to verify signatures (I think I said something like that in an earlier message). I do have concerns about making signed code mandatory. (Not least over whether it'd let me install my own unsigned code!) Paul
Paul Moore <p.f.moore <at> gmail.com> writes:
As a user, I guess not that much. I may be misremembering bad experiences with different things. We've had annoyances with self-signed jars, and websites. It's generally more about annoying "can't confirm this should be trusted, please verify" messages which people end up just saying "yes" to (and so ruining any value from the check).
Like those pesky EULAs ;-)
But you say "I got a code signing certificate". How? When I dabbled with signing, the only option I could find that didn't involve paying and/or having a registered domain of my own was a self-signed certificate, which from a UI point of view seems of little use "Paul Moore says you should trust him. Do you? Yes/No"...
I got mine from Certum (certum.pl) - they offer (or at least did offer, last year) free code signing certificates for Open Source developers (you have to have "Open Source Developer" in what's being certified). See: http://www.certum.eu/certum/cert,offer_en_open_source_cs.xml
If signed binaries is the way we go, then we should be aware that we exclude people who don't have certificates from uploading to PyPI.
I don't think that any exclusion would occur. It just means that there's a mechanism for people who are picky about such things to have a slightly larger comfort zone.
Maybe that's OK, but without some sort of check I don't know how many current developers that would exclude, let alone how many potential developers would be put off.
I don't think any packager need be excluded. It would be up to individual packagers and package consumers as to whether they sign packages / stick to only using signed packages. For almost everyone, life should go on as before.
A Python-supported build farm, which signed code on behalf of developers, might alleviate this. But then we need to protect against malicious code being submitted to the build farm, etc.
There is IMO neither the will nor the resource to do any sort of policing. Caveat emptor (or caveat user, rather). Let's not forget, all of this software is without warranty of any kind.
Fair enough. I don't object to offering the option to verify signatures (I think I said something like that in an earlier message). I do have concerns about making signed code mandatory. (Not least over whether it'd let me install my own unsigned code!)
Any workable mechanism would need to be optional (the user doing the installing would be the decider as to whether to go ahead and install, with signature, or lack thereof, in mind). Regards, Vinay Sajip
On Jun 22, 2012, at 12:27 PM, Paul Moore wrote:
And what I am trying to say is that no matter how much effort gets put into trying to make build from source easy, it'll pretty much always not be even remotely trivial on Windows.
It seems to me that a "Windows build service" is something the Python infrastructure could support. This would be analogous to the types of binary build services Linux distros provide, e.g. the normal Ubuntu workflow of uploading a source package to a build daemon, which churns away for a while, and results in platform-specific binary packages which can be directly installed on an end-user system. -Barry
On 22/06/2012 13:14, Barry Warsaw wrote:
On Jun 22, 2012, at 12:27 PM, Paul Moore wrote:
And what I am trying to say is that no matter how much effort gets put into trying to make build from source easy, it'll pretty much always not be even remotely trivial on Windows.
It seems to me that a "Windows build service" is something the Python infrastructure could support. This would be analogous to the types of binary build services Linux distros provide, e.g. the normal Ubuntu workflow of uploading a source package to a build daemon, which churns away for a while, and results in platform-specific binary packages which can be directly installed on an end-user system.
The devil would be in the details. As Paul Moore pointed out earlier, building *any* extension which relies on some 3rd-party library on Windows (mysql, libxml, sdl, whatever) can be an exercise in iterative frustration as you discover build requirements on build requirements. This isn't just down to Python: try building TortoiseSvn by yourself, for example.

That's not to say that this is insurmountable. Christopher Gohlke has for a long while maintained an unofficial binary store at his site: http://www.lfd.uci.edu/~gohlke/pythonlibs/ but I've no idea how much work he's had to put in to get all the dependencies built. Someone who just turned up with a new build: "Here's a Python interface for ToastRack -- the new card-printing service" would need a way to provide the proposed build infrastructure with what was needed to build the library behind the Python extension. Little fleas have smaller fleas... and so on. TJG
On Fri, 22 Jun 2012 12:27:19 +0100 Paul Moore <p.f.moore@gmail.com> wrote:
Signed binaries may be a solution. My experience with signed binaries has not been exactly positive, but it's an option. Presumably PyPI would be the trusted authority? Would PyPI and the downloaders need to use SSL? Would developers need to have signing keys to use PyPI? And more to the point, do the people designing the packaging solutions have experience with this sort of stuff (I sure don't :-))?
The ones signing the binaries would have to be the packagers, not PyPI. Also, if packages are signed, you arguably don't need to use SSL when downloading them (but SSL can still be useful for other purposes e.g. navigating in the catalog). PyPI-signing of packages would not achieve anything, since PyPI cannot vouch for the quality and non-maliciousness of uploaded files. It would only serve as a replacement for SSL downloads. Regards Antoine.
Quoting Antoine Pitrou <solipsis@pitrou.net>:
On Fri, 22 Jun 2012 12:27:19 +0100 Paul Moore <p.f.moore@gmail.com> wrote:
Signed binaries may be a solution. My experience with signed binaries has not been exactly positive, but it's an option. Presumably PyPI would be the trusted authority? Would PyPI and the downloaders need to use SSL? Would developers need to have signing keys to use PyPI? And more to the point, do the people designing the packaging solutions have experience with this sort of stuff (I sure don't :-))?
The ones signing the binaries would have to be the packagers, not PyPI.
It depends. PyPI already signs all binaries (essentially) as part of the mirror protocol. What this proves is that the mirror has not modified the data compared to the copy on PyPI. If PyPI can be trusted not to modify the binaries, then this also proves that the binaries are the same as originally uploaded. What this doesn't prove is that the upload was really made by the declared author of the package (which could be addressed by having the original author sign the packages); it also doesn't prove that the binaries are free of malicious code (which no amount of signing can prove).
PyPI-signing of packages would not achieve anything, since PyPI cannot vouch for the quality and non-maliciousness of uploaded files.
That's just not true. It can prove that the files have not been modified by mirrors, caches, and the like, of which there are plenty in practice.
It would only serve as a replacement for SSL downloads.
See above. Also notice that such signing is already implemented, as part of PEP 381. Regards, Martin
<martin <at> v.loewis.de> writes:
See above. Also notice that such signing is already implemented, as part of PEP 381.
BTW, I notice that the certificate for https://pypi.python.org/ expired a week ago ... Regards, Vinay Sajip
Ideally authors will be signing their packages (using gpg keys). Of course how to distribute keys is an exercise left to the reader. On Friday, June 22, 2012 at 11:48 AM, Vinay Sajip wrote:
<martin <at> v.loewis.de> writes:
See above. Also notice that such signing is already implemented, as part of PEP 381.
BTW, I notice that the certificate for https://pypi.python.org/ expired a week ago ...
Regards,
Vinay Sajip
On Fri, Jun 22, 2012 at 9:35 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
Ideally authors will be signing their packages (using gpg keys). Of course how to distribute keys is an exercise left to the reader.
Key distribution is the real issue though. If there isn't a key distribution infrastructure in place, we might as well not bother with signatures. PyPI could issue x509 certs to packagers. You wouldn't be able to verify that the name given is accurate, but you would be able to verify that all packages with the same listed author are actually by that author.
On Friday, June 22, 2012 at 11:48 AM, Vinay Sajip wrote:
<martin <at> v.loewis.de> writes:
See above. Also notice that such signing is already implemented, as part of PEP 381.
BTW, I notice that the certificate for https://pypi.python.org/ expired a week ago ...
Regards,
Vinay Sajip
On Friday, June 22, 2012 at 12:54 PM, Alexandre Zani wrote:
Key distribution is the real issue though. If there isn't a key distribution infrastructure in place, we might as well not bother with signatures. PyPI could issue x509 certs to packagers. You wouldn't be able to verify that the name given is accurate, but you would be able to verify that all packages with the same listed author are actually by that author.
I've been sketching out ideas for key distribution, but it's very much a chicken and egg problem, very few people sign their packages (because nothing uses it currently), and nobody is motivated to work on infrastructure or tooling because no one signs their packages.
On Fri, Jun 22, 2012 at 9:56 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
On Friday, June 22, 2012 at 12:54 PM, Alexandre Zani wrote:
Key distribution is the real issue though. If there isn't a key distribution infrastructure in place, we might as well not bother with signatures. PyPI could issue x509 certs to packagers. You wouldn't be able to verify that the name given is accurate, but you would be able to verify that all packages with the same listed author are actually by that author.
I've been sketching out ideas for key distribution, but it's very much a chicken and egg problem, very few people sign their packages (because nothing uses it currently), and nobody is motivated to work on infrastructure or tooling because no one signs their packages.
Are those ideas available publicly? I would love to chip in.
Not at the moment, but I could gather them up and make them public later today. They are very rough drafts at the moment. On Friday, June 22, 2012 at 1:09 PM, Alexandre Zani wrote:
On Fri, Jun 22, 2012 at 9:56 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
On Friday, June 22, 2012 at 12:54 PM, Alexandre Zani wrote:
Key distribution is the real issue though. If there isn't a key distribution infrastructure in place, we might as well not bother with signatures. PyPI could issue x509 certs to packagers. You wouldn't be able to verify that the name given is accurate, but you would be able to verify that all packages with the same listed author are actually by that author.
I've been sketching out ideas for key distribution, but it's very much a chicken and egg problem, very few people sign their packages (because nothing uses it currently), and nobody is motivated to work on infrastructure or tooling because no one signs their packages.
Are those ideas available publicly? I would love to chip in.
On 22 June 2012 17:56, Donald Stufft <donald.stufft@gmail.com> wrote:
On Friday, June 22, 2012 at 12:54 PM, Alexandre Zani wrote:
Key distribution is the real issue though. If there isn't a key distribution infrastructure in place, we might as well not bother with signatures. PyPI could issue x509 certs to packagers. You wouldn't be able to verify that the name given is accurate, but you would be able to verify that all packages with the same listed author are actually by that author.
I've been sketching out ideas for key distribution, but it's very much a chicken and egg problem, very few people sign their packages (because nothing uses it currently), and nobody is motivated to work on infrastructure or tooling because no one signs their packages.
I'm surprised gpg hasn't been mentioned here. I think these are all solved problems, most free software that is signed signs it with the gpg key of the author. In that case all that is needed is that the cheeseshop allows the uploading of the signature. As for key distribution, the keyservers take care of that just fine and we'd probably see more and better attended signing parties at python conferences. Regards, Floris
Oh sorry, having read the thread this spawned from I see you're talking about MS Windows signed binaries. Something I know next to nothing about, so ignore my babbling. On 23 June 2012 11:52, Floris Bruynooghe <flub@devork.be> wrote:
On 22 June 2012 17:56, Donald Stufft <donald.stufft@gmail.com> wrote:
On Friday, June 22, 2012 at 12:54 PM, Alexandre Zani wrote:
Key distribution is the real issue though. If there isn't a key distribution infrastructure in place, we might as well not bother with signatures. PyPI could issue x509 certs to packagers. You wouldn't be able to verify that the name given is accurate, but you would be able to verify that all packages with the same listed author are actually by that author.
I've been sketching out ideas for key distribution, but it's very much a chicken and egg problem, very few people sign their packages (because nothing uses it currently), and nobody is motivated to work on infrastructure or tooling because no one signs their packages.
I'm surprised gpg hasn't been mentioned here. I think these are all solved problems, most free software that is signed signs it with the gpg key of the author. In that case all that is needed is that the cheeseshop allows the uploading of the signature. As for key distribution, the keyservers take care of that just fine and we'd probably see more and better attended signing parties at python conferences.
Regards, Floris
-- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org
I'm surprised gpg hasn't been mentioned here. I think these are all solved problems, most free software that is signed signs it with the gpg key of the author. In that case all that is needed is that the cheeseshop allows the uploading of the signature.
For the record, the cheeseshop has been supporting pgp signatures for about ten years now. Several projects have been using that for quite a while in their releases. Regards, Martin
On 23.06.12 14:03, martin@v.loewis.de wrote:
I'm surprised gpg hasn't been mentioned here. I think these are all solved problems, most free software that is signed signs it with the gpg key of the author. In that case all that is needed is that the cheeseshop allows the uploading of the signature.

For the record, the cheeseshop has been supporting pgp signatures for about ten years now. Several projects have been using that for quite a while in their releases.
Also for the record, it’s broken as of Python 3.2. See http://bugs.python.org/issue10571
Quoting Hynek Schlawack <hs@ox.cx>:
On 23.06.12 14:03, martin@v.loewis.de wrote:
I'm surprised gpg hasn't been mentioned here. I think these are all solved problems, most free software that is signed signs it with the gpg key of the author. In that case all that is needed is that the cheeseshop allows the uploading of the signature.

For the record, the cheeseshop has been supporting pgp signatures for about ten years now. Several projects have been using that for quite a while in their releases.
Also for the record, it's broken as of Python 3.2. See http://bugs.python.org/issue10571
That's different, though: PyPI continues to support it just fine. It's only distutils which has it broken. If you manually run gpg, and manually upload through the web interface, it still works. Regards, Martin
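For anyone who hasn't tried it, the manual step Martin describes is a one-liner per file; a minimal sketch, assuming GnuPG is installed and a signing key already exists (the resulting .asc files are what you then upload by hand through the web interface):

    import glob
    import subprocess

    # Produce a detached, ASCII-armoured signature (e.g. foo-1.0.tar.gz.asc)
    # next to each release archive in dist/.
    for archive in glob.glob("dist/*.tar.gz"):
        subprocess.check_call(["gpg", "--detach-sign", "--armor", archive])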
On Friday, June 22, 2012 at 6:20 AM, David Cournapeau wrote:
If by manifest you mean the build manifest, then that's not desirable: the manifest contains the explicit filenames, and those are platform/environment specific. You don't want this to be user-facing.
It appears I misunderstood the files that bento uses then ;) It is late (well, early now) and I have not used bento extensively. What I suggest mirrors RPM's approach, except that the build step (when there is indeed a build step) is handled by a Python script included in the package by the author of said package.
On Fri, Jun 22, 2012 at 5:22 AM, Dag Sverre Seljebotn < d.s.seljebotn@astro.uio.no> wrote:
On 06/22/2012 10:40 AM, Paul Moore wrote:
On 22 June 2012 06:05, Nick Coghlan<ncoghlan@gmail.com> wrote:
distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step.
That was essentially the key insight I was trying to communicate in my "think about the end users" comment. Thanks, Nick!
The subtlety here is that there's no way to know before building the package what files should be installed. (For simple extensions, and perhaps documentation, you could get away with ad-hoc rules or special support for Sphinx and what-not, but there's no general solution that works in all cases.)
What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods).
This is the right thing to do, IMO. Also, I think rather than bikeshedding the One Serialization To Rule Them All, it should only be the *built* manifest that is standardized for tool consumption, and leave source descriptions to end-user tools. setup.cfg, bento.info, or whatever... that part should NOT be the first thing designed, and should not be the part that's frozen in a spec, since it otherwise locks out the ability to enhance that format.

There's also been a huge amount of previous discussion regarding setup.cfg, which anyone proposing to alter it should probably read. setup.cfg allows hooks to external systems, so IIUC, you should be able to write a setup.cfg file that contains little besides your publication metadata (name, version, dependencies) and a hook to invoke whatever build tools you want, as long as you're willing to write a Python hook.

This means that bikeshedding the build process is totally beside the point. If people want to use distutils, bento, SCons, ... it really doesn't matter, as long as they're willing to write a hook. This is a killer app for "packaging", as it frees up the stdlib from having to do every bloody thing itself and create One Build Process To Rule Them All.

I didn't invent setup.cfg or write the "packaging" code, but I support this design approach wholeheartedly. I have only the smallest of quibbles and questions with it, but they aren't showstoppers. I've already had some discussion about these points on Distutils-SIG, and I think that should be continued. If there *is* to be any major discussion about switching directions in packaging, the place to start should be *use cases* rather than file formats.
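As a rough sketch of the kind of hook being described here -- the setup.cfg option name below is from memory of the distutils2/packaging docs and should be treated as illustrative rather than authoritative:

    # hooks.py -- referenced from setup.cfg, roughly like:
    #
    #   [global]
    #   setup_hooks = hooks.setup
    #
    # The hook is plain Python, so it can shell out to whatever build
    # system the project really uses, and/or adjust the parsed metadata
    # before packaging carries on with its own steps.
    import subprocess

    def setup(config):
        # 'config' is the parsed setup.cfg content; here we just hand the
        # heavy lifting to an external tool (purely as an example).
        subprocess.check_call(["make", "-C", "src", "all"])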
On Fri, Jun 22, 2012 at 9:11 PM, PJ Eby <pje@telecommunity.com> wrote:
On Fri, Jun 22, 2012 at 5:22 AM, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
On 06/22/2012 10:40 AM, Paul Moore wrote:
On 22 June 2012 06:05, Nick Coghlan<ncoghlan@gmail.com> wrote:
distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step.
That was essentially the key insight I was trying to communicate in my "think about the end users" comment. Thanks, Nick!
The subtlety here is that there's no way to know before building the package what files should be installed. (For simple extensions, and perhaps documentation, you could get away with ad-hoc rules or special support for Sphinx and what-not, but there's no general solution that works in all cases.)
What Bento does is have one metadata file for the source-package, and another metadata file (manifest) for the built-package. The latter is normally generated by the build process (but follows a standard nevertheless). Then that manifest is used for installation (through several available methods).
This is the right thing to do, IMO.
Also, I think rather than bikeshedding the One Serialization To Rule Them All, it should only be the *built* manifest that is standardized for tool consumption, and leave source descriptions to end-user tools. setup.cfg, bento.info, or whatever... that part should NOT be the first thing designed, and should not be the part that's frozen in a spec, since it otherwise locks out the ability to enhance that format.
agreed. I may not have been very clear before, but the bento.info format is really peripheral to what bento is about (it just happens that what would become bento was started as a 2-hour proof of concept for another packaging discussion 3 years ago :) ).

As for the build manifest, I have a few, very outdated notes there: http://cournape.github.com/Bento/html/hacking.html#build-manifest-and-buildi... I will try to update them this weekend. I do have code to install, produce eggs, msi, .exe and .mpkg from this format. The API is kind of crappy/inconsistent, but the features are there, and there are even some tests around it. I don't think it would be very difficult to hack distutils2 to produce this build manifest.

David
On Fri, 22 Jun 2012 15:05:08 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
So here's some sheer pie-in-the-sky speculation. If people like elements of this idea enough to run with it, great. If not... oh well:
Could this kind of discussion perhaps go on python-ideas? Thanks Antoine.
Hi, On 6/22/12 1:05 AM, Nick Coghlan wrote:
On Fri, Jun 22, 2012 at 10:01 AM, Donald Stufft <donald.stufft@gmail.com> wrote:
The idea i'm hoping for is to stop worrying about one implementation over another and hoping to create a common format that all the tools can agree upon and create/install.
Right, and this is where it encouraged me to see in the Bento docs that David had cribbed from RPM in this regard (although I don't believe he has cribbed *enough*).
A packaging system really needs to cope with two very different levels of packaging: 1. Source distributions (e.g. SRPMs). To get from this to useful software requires developer tools. 2. "Binary" distributions (e.g. RPMs). To get from this to useful software mainly requires a "file copy" utility (well, that and an archive decompressor).
An SRPM is *just* a SPEC file and source tarball. That's it. To get from that to an installed product, you have a bunch of additional "BuildRequires" dependencies, along with %build and %install scripts and a %files definition that define what will be packaged up and included in the binary RPM. The exact nature of the metadata format doesn't really matter, what matters is that it's a documented standard that multiple tools can read.
An RPM includes files that actually get installed on the target system. An RPM can be arch specific (if they include built binary bits) or "noarch" if they're platform neutral.
distutils really only plays at the SRPM level - there is no defined OS neutral RPM equivalent. That's why I brought up the bdist_simple discussion earlier in the thread - if we can agree on a standard bdist_simple format, then we can more cleanly decouple the "build" step from the "install" step.
I think one of the key things to learn from the SPEC file format is the configuration language it used for the various build phases: sh (technically, any shell on the system, but almost everyone just uses the default system shell)
This is why you can integrate whatever build system you like with it: so long as you can invoke the build from the shell, then you can use it to make your RPM.
Now, there's an obvious problem with this: it's completely useless from a *cross-platform* building point of view. Isn't it a shame there's no language we could use that would let us invoke build systems in a cross platform way? Oh, wait...
So here's some sheer pie-in-the-sky speculation. If people like elements of this idea enough to run with it, great. If not... oh well:
- I believe the "egg" term has way too much negative baggage (courtesy of easy_install), and find the full term Distribution to be too easily confused with "Linux distribution". However, "Python dist" is unambiguous (since the more typical abbreviation for an aggregate distribution is "distro"). Thus, I attempt to systematically refer to the objects used to distribute Python software from developers to users as "dists". In practice, this terminology is already used in many places (distutils, sdist, bdist_msi, bdist_rpm, the .dist-info format in PEP 376 etc). Thus, Python software is distributed as dists (either sdists or bdists), which may in turn be converted to distro packages (e.g. SRPMs and RPMs) for deployment to particular environments.
+0.5. There is definitely a problem with the term "egg", but I don't think negative baggage is it. Rather, I think "egg" is just plain too confusing, and perhaps too "cutsie", too. A blurb from the internet[1]:

"An egg is a bundle that contains all the package data. In the ideal case, an egg is a zip-compressed file with all the necessary package files. But in some cases, setuptools decides (or is told by switches) that a package should not be zip-compressed. In those cases, an egg is simply an uncompressed subdirectory, but with the same contents. The single file version is handy for transporting, and saves a little bit of disk space, but an egg directory is functionally and organizationally identical."

Compared to the definitions of package and distribution I posted earlier in this thread, the confusion is:

- A package is one or more modules inside another module, a distribution is a compressed archive of those modules, but an egg is either or both.
- The blurb author uses the term "package data" presumably to refer to package modules, package data (i.e. resources like templates, etc), and package metadata.

So to avoid this confusion I've personally stopped using the term "egg" in favor of "package". (Outside a computer context, everyone knows a package is something "with stuff in it".) But as Donald said, what we are all talking about is technically called a "distribution". ("Honey, a distribution arrived for you in the mail today!" :-))

I love that Nick is thinking "outside the box" re: terminology, but I'm not 100% convinced the new term should be "dist". Rather I propose:

- Change the definition of package to: a module (or modules) plus package data and package metadata inside another module.
- Refer to source dists as "source packages" i.e. packages containing source code.
- Refer to binary dists as "binary packages" i.e. packages containing byte code and executables.

I believe this is the most "human" thing we can do[2].

Alex

[1] http://www.ibm.com/developerworks/linux/library/l-cppeak3/index.html
[2] http://python-for-humans.heroku.com
- I reject setup.cfg, as I believe ini-style configuration files are not appropriate for a metadata format that needs to include file listings and code fragments
- I reject bento.info, as I think if we accept yet-another-custom-configuration-file-format into the standard library instead of just using YAML, we're even crazier than is already apparent
- I shall use "dist.yaml" as my proposed name for my "I wish I could define packages like this" format (and yes, that means adding yaml support to the standard library is part of the wish)
- many of the details below will be flawed, but I want to give a clear idea for how a concept like this might work in practice
- we need to define a clear set of build phases, and then design the dist metadata format accordingly. For example:
  - source
    - uses a "source" section in dist.yaml
    - "source/install" maps source files directly to desired install locations
      - essentially what the setup.cfg Resources section tries to do
      - used for pure Python code, documentation, etc
      - See below for example
    - "source/files" defines a list of extra files to be included
    - "source/exclude" defines the list of files to be excluded
    - "source/run" defines a Python fragment to be executed
      - serves a similar purpose to the "files" section in setup.cfg
    - creates a temporary directory (and sets it as the working directory)
    - dist.yaml is copied to the temporary directory
    - all files to be installed are copied to the temporary directory
    - all extra files are copied to the temporary directory
    - the Python fragment in "source/run" is executed (which can thus easily add more files)
    - if sdist archive creation is requested, entire contents of temporary directory are included
  - build
    - uses a "build" section in dist.yaml
    - "build/install" maps built files to desired install locations
      - like source/install, but for build artifacts
      - compiled C extensions, .pyc and .pyo files, etc would all go here
    - "build/run" defines a Python fragment to be executed
    - "build/files" defines the list of files to be included
    - "build/exclude" defines the list of files to be excluded
    - "build/requires" defines extra dependencies not needed at runtime
    - starting environment is a source directory that is either:
      - preexisting (e.g. to allow building in-place in the source tree)
      - created by running source first
      - created by unpacking an sdist archive
    - the Python fragment in "build/run" is executed to trigger the build
    - if the build succeeds (i.e. doesn't throw an exception)
      - create a temporary directory
      - copy dist.yaml
      - copy all specified files
        - this is the easiest way to exclude build artifacts from the distribution, while still keeping them around to enable incremental builds
    - if bdist_simple archive creation is requested, entire contents of temporary directory are included
    - other bdist formats (such as bdist_rpm) will have their own rules for getting from the bdist_simple format to the platform specific format
  - install
    - uses an "install" section in dist.yaml
    - "install/pre" defines a Python fragment to be executed before copying files
    - "install/post" defines a Python fragment to be executed after copying files
    - starting environment is a bdist_simple directory that is either:
      - preexisting (e.g. to allow creation by system packaging tools)
      - created by running build first
      - created by unpacking a bdist_simple archive
    - end result is a fully installed and usable piece of software
  - test
    - uses a "test" section in dist.yaml
    - "test/run" defines a Python fragment to be executed to start the tests
    - "test/requires" defines extra dependencies needed to run the test suite
- Example "source/install" based on http://alexis.notmyidea.org/distutils2/setupcfg.html#complete-example (my YAML may be a bit dodgy). - With this scheme, module installation is just another install category. - A solution for easily installing entire subtrees is desirable. I propose the recursive glob ** syntax for that purpose. - Unlike setup.cfg, every category would have an "-excluded" counterpart to filter unwanted files. Explicit is better than implicit.
    source:
      install:
        modules:
          example.py
          example_pkg/*.py
          example_pkg/**/*.py
          example_pkg/resource.txt
        doc:
          README
          doc/*
        doc-excluded:
          doc/man
        man:
          doc/man
        scripts:
          # Directory details are stripped automatically
          scripts/LAUNCH
          scripts/*.{sh,bat}
          # But subdirectories can be made explicit
          extras/:
            scripts/extras/*.{sh,bat}
- the goal of a dist.yaml syntax would be to be *explicit* and *comprehensive*. If this gets too verbose, then the solution would be dist.yaml generators that are less expressive, but also reduce the necessary boilerplate.
- a typical "sdist" will now just be an archive consisting of: - the project's dist.yaml file - all files created by the "source" phase
- the "bdist_simple" format will just be an archive consisting of: - the project's dist.yaml file - all files created by the "build" phase
- the source and build run hooks and install pre and post hooks become the way you integrate with arbitrary build systems. No fancy command or compiler system or anything like that, you just import whatever you need and call it with the appropriate arguments. To other tools, they will just be opaque chunks of text, but to the build system, they're executable pieces of Python code, just as RPM includes executable scripts.
Cheers, Nick.
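To make the run-hook idea in the quoted proposal a little more concrete, here is a purely illustrative sketch (the dist.yaml mechanism above is only a proposal, so nothing below corresponds to an existing tool) of what a "build/run" fragment might contain:

    # Hypothetical body of a "build/run" fragment in the proposed dist.yaml:
    # the fragment is plain Python, so it can drive any build system it
    # likes; an uncaught exception is how it reports a failed build.
    import subprocess

    subprocess.check_call(["make", "all"])   # or waf, scons, cmake, bento ...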
-- Alex Clark · http://pythonpackages.com
Paul Moore writes:
End users should not need packaging tools on their machines.
I think this desideratum is close to obsolete these days, with webapps in "the cloud" downloading resources (including, but not limited to, code) on an as-needed basis. If you're *not* obtaining resources as-needed, but instead installing an everything-you-could-ever-need SUMO, I don't see the problem with including packaging tools as well. Not to mention that "end user" isn't a permanent property of a person, but rather a role that they can change at will and sometimes may be forced to. What is desirable is that such tools be kept in the back of a closet where people currently in the "end user" role don't need to see them at all, but developers can get them immediately when needed.
On Fri, Jun 22, 2012 at 4:25 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Paul Moore writes:
> End users should not need packaging tools on their machines.
I think this desideratum is close to obsolete these days, with webapps in "the cloud" downloading resources (including, but not limited to, code) on an as-needed basis.
There's still a lot more to the software world than what happens on the public internet. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan writes:
On Fri, Jun 22, 2012 at 4:25 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Paul Moore writes:
> End users should not need packaging tools on their machines.
I think this desideratum is close to obsolete these days, with webapps in "the cloud" downloading resources (including, but not limited to, code) on an as-needed basis.
There's still a lot more to the software world than what happens on the public internet.
That's taking just one extreme out of context. The other extreme I mentioned is a whole (virtual) Python environment to go with your app. And I don't really see a middle ground, unless you're delivering a non-standard stdlib anyway, with all the stuff that end users don't need stripped out of it. They'll get the debugger and the profiler with Python; should we excise them from the stdlib just because end users don't need them? How about packaging diagnostic tools, especially in the early days of the new module? I agreed that end users should not need to download the packaging tools separately or in advance. But that's rather different from having a *requirement* that the tools not be included, or that installers should have no dependencies on the toolset outside of a minimal and opaque runtime module.
On 22 June 2012 13:39, Stephen J. Turnbull <stephen@xemacs.org> wrote:
Nick Coghlan writes:
> On Fri, Jun 22, 2012 at 4:25 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
> > Paul Moore writes:
> >
> > > End users should not need packaging tools on their machines.
> >
> > I think this desideratum is close to obsolete these days, with webapps
> > in "the cloud" downloading resources (including, but not limited to,
> > code) on an as-needed basis.
>
> There's still a lot more to the software world than what happens on
> the public internet.
That's taking just one extreme out of context. The other extreme I mentioned is a whole (virtual) Python environment to go with your app.
And I don't really see a middle ground, unless you're delivering a non-standard stdlib anyway, with all the stuff that end users don't need stripped out of it. They'll get the debugger and the profiler with Python; should we excise them from the stdlib just because end users don't need them? How about packaging diagnostic tools, especially in the early days of the new module?
I agreed that end users should not need to download the packaging tools separately or in advance. But that's rather different from having a *requirement* that the tools not be included, or that installers should have no dependencies on the toolset outside of a minimal and opaque runtime module.
I suppose if you're saying that "pip install lxml" should download and install for me Visual Studio, libxml2 sources and any dependencies, and run all the builds, then you're right. But I assume you're not. So why should I need to install Visual Studio just to *use* lxml? On the other hand, I concede that there are some grey areas between the 2 extremes. I don't know enough to do a proper review of the various cases. But I do think that there's a risk that the discussion, because it is necessarily driven by developers, forgets that "end users" really don't have some tools that a developer would consider "trivial" to have. Paul.
On Fri, Jun 22, 2012 at 2:24 PM, Paul Moore <p.f.moore@gmail.com> wrote:
I suppose if you're saying that "pip install lxml" should download and install for me Visual Studio, libxml2 sources and any dependencies, and run all the builds, then you're right. But I assume you're not. So why should I need to install Visual Studio just to *use* lxml?
On the other hand, I concede that there are some grey areas between the 2 extremes. I don't know enough to do a proper review of the various cases. But I do think that there's a risk that the discussion, because it is necessarily driven by developers, forgets that "end users" really don't have some tools that a developer would consider "trivial" to have.
Binary installers are important: if you think lxml is hard on windows, think about what it means to build fortran libraries and link them with visual studio for scipy :) That's one of the reasons virtualenv + pip is not that useful for numpy/scipy end users.

Bento has code to build basic binary installers in all the formats supported by distutils except for RPM, and the code is by design mostly independent of the rest. I would be happy to clean up that code to make it more reusable (most of it is extracted from distutils/setuptools anyway). But it should be completely orthogonal to the issue of package description: if there is one thing that distutils got horribly wrong, it's tying everything together. The uncoupling is the key, because otherwise one keeps discussing all the issues together, which is part of what makes the discussion so hard. Different people have different needs.

David
Paul Moore writes:
I suppose if you're saying that "pip install lxml" should download and install for me Visual Studio, libxml2 sources and any dependencies, and run all the builds, then you're right. But I assume you're not.
Indeed, if only a source package is available, it should. That's precisely what happens for source builds that depend on a non-system compiler on say Gentoo or MacPorts. What I'm saying is that the packaging system should *always be prepared* to offer that service (perhaps via plugins to OS distros' PMSes). Whether a particular package does, or not, presumably is up to the package's maintainer. Even if a binary package is available, it may only be partial. Indeed, by some definitions it always will be (it will depend on an OS being installed!) Such a package should *also* offer the ability to fix up an incomplete system, by building from source if needed. Why can I say "should" here? Because if the packaging standard is decent and appropriate tools provided, there will be a source package because it's the easiest way to create a distribution! Such tools will surely be able to search the system for a preinstalled dependency, or a binary package cache for an installable package. A "complete" binary package would just provide the package cache on its distribution medium.
So why should I need to install Visual Studio just to *use* lxml?
Because the packaging standard cannot mandate "high quality" packages from any given user's perspective, it can only provide the necessary features to implement them. If the lxml maintainer chooses to depend on a pre-installed libxml2, AFAICS you're SOL -- you need to go elsewhere for the library. VS + libxml2 source is just the most reliable way to go elsewhere in some sense (prebuilt binaries have a habit of showing up late or built with incompatible compiler options or the like).
But I do think that there's a risk that the discussion, because it is necessarily driven by developers, forgets that "end users" really don't have some tools that a developer would consider "trivial" to have.
I don't understand the problem. As long as binary packages have a default spec to depend on nothing but a browser to download the MSI, all you need is a buildbot that has a minimal Windows installation, and it won't be forgotten. The developers may forget to check, but the bot will remember! I certainly agree that the *default* spec should be that if you can get your hands on binary installer, that's all you need -- it will do *all* the work needed. Heck, it's not clear to me what else the default spec might be for a binary package. OTOH, if the developers make a conscious decision to depend on a given library, and/or a compiler, being pre-installed on the target system, what can Python or the packaging standard do about that?
Why do I get the feeling that most people who hate distutils and want to replace it have transferred those feelings to distutils2/packaging, mainly because of the name?

In the end, I think this discussion is very similar to all previous packaging/building/installing discussions: There is a lot of emotions, and a lot of willingness to declare that "X sucks" but very little concrete explanation of *why* X sucks and why it can't be fixed. //Lennart
On 06/23/2012 12:37 PM, Lennart Regebro wrote:
Why do I get the feeling that most people who hate distutils and want to replace it have transferred those feelings to distutils2/packaging, mainly because of the name?
In the end, I think this discussion is very similar to all previous packaging/building/installing discussions: There is a lot of emotions, and a lot of willingness to declare that "X sucks" but very little concrete explanation of *why* X sucks and why it can't be fixed.
I think David has been pretty concrete in a lot of his criticism too (though he has refrained from repeating himself too much in this thread). Some of the criticism is spelled out here: http://bento.readthedocs.org/en/latest/faq.html This blog post is even more concrete (but perhaps outdated?): http://cournape.wordpress.com/2010/10/13/271/

As for me, I believe I've been rather blunt and direct in my criticism in this thread: it's been said by Tarek that the distutils2 authors don't know anything about compilers. Therefore it's almost inconceivable to me that much good can come from distutils2 *for my needs*. Even if packaging and building isn't the same, the two issues do tangle at a fundamental level, *and* most existing solutions already out there (RPM, MSI..) distribute compiled software and therefore one needs a solid understanding of build processes to also understand these tools fully and draw on their experiences and avoid reinventing the wheel.

Somebody with a deep understanding of 3-4 existing build systems and long experience in cross-platform builds and cross-architecture builds would need to be on board for me to take it seriously (even the packaging parts). As per Tarek's comments, I'm therefore pessimistic about the distutils2 efforts.

(You can always tell me that I shouldn't criticise unless I'm willing to join and do something about it. That's fair. I'm just saying that my unwillingness to cheer for distutils2 is NOT based on the name only!)

Dag
Dag Sverre Seljebotn <d.s.seljebotn <at> astro.uio.no> writes:
As for me, I believe I've been rather blunt and direct in my criticism in this thread: it's been said by Tarek that the distutils2 authors don't know anything about compilers. Therefore it's almost inconceivable to me that much good can come from distutils2 *for my needs*. Even if packaging and building isn't the same, the two issues do tangle at a fundamental level, *and* most existing solutions already out there (RPM, MSI..) distribute compiled software and therefore one needs a solid understanding of build processes to also understand these tools fully and draw on their experiences and avoid reinventing the wheel.
But packaging/distutils2 contains functionality for hooks, which can be used to implement custom builds using tools that packaging/distutils2 doesn't need to know or care about (a hook will do that). One can imagine that a set of commonly used templates would become available over time, so that some problems wouldn't need to have solutions re-invented.
Somebody with a deep understanding of 3-4 existing build systems and long experience in cross-platform builds and cross-architecture builds would need to be on board for me to take it seriously (even the packaging parts). As per Tarek's comments, I'm therefore pessimistic about the distutils2 efforts.
This deep understanding is not essential in the packaging/distutils2 team, AFAICT. They just need to make sure that the hook APIs are sufficiently flexible, that the hooks are invoked at the appropriate time, and that they are adequately documented with appropriate examples.

For me, the bigger problem with the present distutils2/packaging implementation is that it propagates the command-class style of design which IMO caused so much pain in extending distutils. Perhaps some of the dafter limitations have been removed, and no doubt the rationale was to get to something usable more quickly, but it seems a bit like papering over cracks. The basic low-level building blocks like versioning, metadata and markers should be fine, but I'm not convinced that the command-class paradigm is appropriate in this case. The whole intricate "initialize_options"/"finalize_options"/"set_undefined_options"/"get_finalized_command"/"reinitialize_command" dance just makes me say, "Seriously?".

Regards, Vinay Sajip
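For readers who haven't extended distutils, the boilerplate being criticised looks roughly like this minimal, do-nothing command (the class and option names are made up purely for illustration):

    from distutils.core import Command

    class my_command(Command):
        description = "example command showing the option dance"
        user_options = [('flavour=', None, "an example option")]

        def initialize_options(self):
            # every option must be pre-declared here...
            self.flavour = None

        def finalize_options(self):
            # ...and resolved here, possibly by pulling values from other
            # commands via set_undefined_options()/get_finalized_command()
            if self.flavour is None:
                self.flavour = 'plain'

        def run(self):
            self.announce("flavour is %s" % self.flavour)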
On Sat, 23 Jun 2012 12:27:52 +0000 (UTC) Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
For me, the bigger problem with the present distutils2/packaging implementation is that it propagates the command-class style of design which IMO caused so much pain in extending distutils. Perhaps some of the dafter limitations have been removed, and no doubt the rationale was to get to something usable more quickly, but it seems a bit like papering over cracks.
Remember that distutils2 was at first distutils. It was only decided to be forked as a "new" package when some people complained. This explains a lot of the methodology. Also, forking distutils helped maintain a strong level of compatibility. Apparently people now think it's time to redesign it all. That's fine, but it won't work without a huge amount of man-hours. It's not like you can write a couple of PEPs and call it done. Regards Antoine.
Antoine Pitrou <solipsis <at> pitrou.net> writes:
Remember that distutils2 was at first distutils. It was only decided to be forked as a "new" package when some people complained. This explains a lot of the methodology. Also, forking distutils helped maintain a strong level of compatibility.
Right, but distutils was hard to extend for a reason, even though designed with extensibility in mind; hence the rise of setuptools. I understand the pragmatic nature of the design decisions in packaging, but in this case a little too much purity was sacrificed for practicality. Compatibility at a command-line level should be possible to achieve even with a quite different internal design.
Apparently people now think it's time to redesign it all. That's fine, but it won't work without a huge amount of man-hours. It's not like you can write a couple of PEPs and call it done.
Surely. But more than implementation man-hours, it requires that people are willing to devote some time and expertise in firming up the requirements, use cases etc. to go into the PEPs. It's classic chicken-and-egg; no-one wants to invest that time until they know a project's going somewhere and will have widespread backing, but the project won't go anywhere quickly unless they step up and invest the time up front. Kudos to Tarek, Éric and others for taking this particular world on their shoulders and re-energizing the discussion and development work to date, but it seems the net needs to be spread even wider to ensure that all constituencies are represented (for example, features needed only on Windows, such as binary distributions and executable scripts, have lagged a little bit behind). Regards, Vinay Sajip
On Sat, 23 Jun 2012 13:14:42 +0000 (UTC) Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Kudos to Tarek, Éric and others for taking this particular world on their shoulders and re-energizing the discussion and development work to date, but it seems the net needs to be spread even wider to ensure that all constituencies are represented (for example, features needed only on Windows, such as binary distributions and executable scripts, have lagged a little bit behind).
But what makes you think that redesigning everything would make those Windows features magically available? This isn't about "representing" "constituencies". python-dev is not a bureaucracy, it needs people doing actual work. People could have proposed patches for these features and they didn't do it (AFAIK). Like WSGI2 and other similar things, this is the kind of discussion that will peter out in a few weeks and fall into oblivion. Regards Antoine.
Antoine Pitrou <solipsis <at> pitrou.net> writes:
But what makes you think that redesigning everything would make those Windows features magically available?
Nothing at all.
This isn't about "representing" "constituencies". python-dev is not a bureaucracy, it needs people doing actual work. People could have
Well, for example, PEP 397 was proposed by Mark Hammond to satisfy a particular constituency (people using multiple Python versions on Windows). Interested parties added their input. Then it got implemented and integrated. You can see from some of the Numpy/Scipy developer comments that some in that constituency feel that their needs aren't/weren't being addressed.
proposed patches for these features and they didn't do it (AFAIK).
Sure they did - for example, I implemented Windows executable script handling as part of my work on testing venv operation with pysetup, and linked to it on the relevant ticket. It wasn't exactly rejected, though perhaps it wasn't reviewed because of lack of time, other priorities etc. and fell between the cracks. But I'm not making any complaint about this; there were definitely bigger issues to work on. I only bring it up in response to the "don't just talk about it; code doesn't write itself, you know" I read in your comment.
Like WSGI2 and other similar things, this is the kind of discussion that will peter out in a few weeks and fall into oblivion.
Quite possibly, but that'll be the chicken-and-egg thing I mentioned. Some projects can be worked on in comparative isolation; other things, like packaging, need inputs from a wider range of people to gain the necessary credibility. Regards, Vinay Sajip
On Sat, 23 Jun 2012 14:14:46 +0000 (UTC) Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Some projects can be worked on in comparative isolation; other things, like packaging, need inputs from a wider range of people to gain the necessary credibility.
packaging already improves a lot over distutils. I don't see where there is a credibility problem, except for people who think "distutils is sh*t". Regards Antoine.
Antoine Pitrou <solipsis <at> pitrou.net> writes:
packaging already improves a lot over distutils. I don't see where
I don't dispute that.
there is a credibility problem, except for people who think "distutils is sh*t".
I don't think you have to take such an extreme position in order to suggest that there might be problems with its basic design. Regards, Vinay Sajip
On 06/23/2012 02:27 PM, Vinay Sajip wrote:
Dag Sverre Seljebotn<d.s.seljebotn<at> astro.uio.no> writes:
As for me, I believe I've been rather blunt and direct in my criticism in this thread: it's been said by Tarek that the distutils2 authors don't know anything about compilers. Therefore it's almost inconceivable to me that much good can come from distutils2 *for my needs*. Even if packaging and building isn't the same, the two issues do tangle at a fundamental level, *and* most existing solutions already out there (RPM, MSI..) distribute compiled software and therefore one needs a solid understanding of build processes to also understand these tools fully and draw on their experiences and avoid reinventing the wheel.
But packaging/distutils2 contains functionality for hooks, which can be used to implement custom builds using tools that packaging/distutils2 doesn't need to know or care about (a hook will do that). One can imagine that a set of commonly used templates would become available over time, so that some problems wouldn't need to have solutions re-invented.
Of course you can always do anything, as numpy.distutils is a living proof of. Question is if it is good design. Can I be confident that the hooks are well-designed for my purposes? I think Bento's hook concept was redesigned 2 or 3 times to make sure it fit well...
Somebody with a deep understanding of 3-4 existing build systems and long experience in cross-platform builds and cross-architecture builds would need to be on board for me to take it seriously (even the packaging parts). As per Tarek's comments, I'm therefore pessimistic about the distutils2 efforts.
This deep understanding is not essential in the packaging/distutils2 team, AFAICT. They just need to make sure that the hook APIs are sufficiently flexible, that the hooks are invoked at the appropriate time, and that they are adequately documented with appropriate examples.
For me, the bigger problem with the present distutils2/packaging implementation is that it propagates the command-class style of design which IMO caused so much pain in extending distutils. Perhaps some of the dafter limitations have been removed, and no doubt the rationale was to get to something usable more quickly, but it seems a bit like papering over cracks. The basic low-level building blocks like versioning, metadata and markers should be fine, but I'm not convinced that the command-class paradigm is appropriate in this case. The whole intricate "initialize_options"/"finalize_options"/"set_undefined_options"/"get_finalized_command"/"reinitialize_command" dance just makes me say, "Seriously?".
And of course, propagating compilation options/configuration, and auto-detecting configuration options, is one of the most important parts of a complex build. Thus this seems to contradict what you say above. (Sorry all, I felt like I should answer to a direct challenge. This will be my last post in this thread; I've subscribed to distutils-sig.) Dag
Dag Sverre Seljebotn <d.s.seljebotn <at> astro.uio.no> writes:
Of course you can always do anything, as numpy.distutils is a living proof of. Question is if it is good design. Can I be confident that the hooks are well-designed for my purposes?
Only you can look at the design to determine that.
And of course, propagating compilation options/configuration, and auto-detecting configuration options, is one of the most important parts of a complex build. Thus this seems to contradict what you say above.
I'm not talking about those needs being invalid; just that distutils' way of fulfilling those needs is too fragile - else, why do you need things like Bento? Regards, Vinay Sajip
On 06/23/2012 03:20 PM, Vinay Sajip wrote:
Dag Sverre Seljebotn<d.s.seljebotn<at> astro.uio.no> writes:
Of course you can always do anything, as numpy.distutils is a living proof of. Question is if it is good design. Can I be confident that the hooks are well-designed for my purposes?
Only you can look at the design to determine that.
But the point is I can't! I don't trust myself to do that. I've many times wasted days and weeks on starting to use tools that looked quite nice to me, but suddenly I get bit by needing to do something that turns out to be almost impossible to do cleanly due to design constraints. That's why it's so important to me to rely on experts that have *more* experience than I do (such as David). (That's of course also why it's so important to copy designs from what works elsewhere. And have a really deep knowledge of those designs and their rationale.)

On 06/23/2012 02:27 PM, Vinay Sajip wrote:

This deep understanding is not essential in the packaging/distutils2 team, AFAICT. They just need to make sure that the hook APIs are sufficiently flexible, that the hooks are invoked at the appropriate time, and that they are adequately documented with appropriate examples.
All I'm doing is expressing my doubts that "making the hook API sufficiently flexible" and "invoked at the appropriate time" (and I'll add "has the right form") can be achieved at all without having subject expertise covering all the relevant use cases. Of course I can't mathematically prove this, it's just my experience as a software developer. Dag
On Sat, Jun 23, 2012 at 8:37 PM, Lennart Regebro <regebro@gmail.com> wrote:
In the end, I think this discussion is very similar to all previous packaging/building/installing discussions: There is a lot of emotions, and a lot of willingness to declare that "X sucks" but very little concrete explanation of *why* X sucks and why it can't be fixed.
If you think that, you haven't read the whole thread. Thanks to this discussion, I now have a *much* clearer idea of what's broken, and a few ideas on what can be done to fix it. However, distutils-sig and python-ideas will be the place to post about those. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, Jun 23, 2012 at 12:25 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Sat, Jun 23, 2012 at 8:37 PM, Lennart Regebro <regebro@gmail.com> wrote:
In the end, I think this discussion is very similar to all previous packaging/building/installing discussions: There is a lot of emotions, and a lot of willingness to declare that "X sucks" but very little concrete explanation of *why* X sucks and why it can't be fixed.
If you think that, you haven't read the whole thread. Thanks to this discussion, I now have a *much* clearer idea of what's broken, and a few ideas on what can be done to fix it.
However, distutils-sig and python-ideas will be the place to post about those.
Nick, I am unfamiliar with python-ideas rules: should we continue discussion in distutils-sig entirely, or are there some specific topics that are more appropriate for python-ideas ? David
On Sat, Jun 23, 2012 at 9:53 PM, David Cournapeau <cournape@gmail.com> wrote:
Nick, I am unfamiliar with python-ideas rules: should we continue discussion in distutils-sig entirely, or are there some specific topics that are more appropriate for python-ideas ?
No, I think I just need to join distutils-sig. python-ideas is more for ideas that don't have an appropriate SIG list. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia