Improving distutils vs redesigning it (was people want CPAN)
On Thu, Nov 12, 2009 at 6:59 AM, Tarek Ziadé
And let's drop the backward compat issues in these discussions, so we don't burn out in details.
That's the part I don't understand. If backward compatibility is not a concern, why keeping distutils ? If you change the command and Distribution class design, what remains of the original code ? You are changing the API and the implementation (which are quite tangled with each other in distutils case), almost none of the original code would remain.
It really feels to me like you are getting the pain of backward compatibility without the gains. What am I missing ?
David
On 2009-11-11 17:18 PM, David Cournapeau wrote:
On Thu, Nov 12, 2009 at 6:59 AM, Tarek Ziadé wrote:
And let's drop the backward compat issues in these discussions, so we don't burn out in details.
That's the part I don't understand. If backward compatibility is not a concern, why keeping distutils ? If you change the command and Distribution class design, what remains of the original code ? You are changing the API and the implementation (which are quite tangled with each other in distutils case), almost none of the original code would remain.
It really feels to me like you are getting the pain of backward compatibility without the gains. What am I missing ?
I think Tarek wants to avoid the Second System Effect and related problems.
http://en.wikipedia.org/wiki/Second-system_effect
http://www.joelonsoftware.com/articles/fog0000000069.html
While that is usually a good habit to cultivate and a good default position, it's not an unyielding law or anything. You have to think deeply about whether the code is the way it is because it contains useful knowledge or if it is just constrained by ossified decisions from the past. I tend to think that the useful knowledge can be extracted from distutils and applied well in a rewrite. The most important useful knowledge is the extension building flags, and I think you have done a good job of transplanting that information into the entirely different build system of SCons.
-- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Thu, Nov 12, 2009 at 12:18 AM, David Cournapeau
On Thu, Nov 12, 2009 at 6:59 AM, Tarek Ziadé wrote:
And let's drop the backward compat issues in these discussions, so we don't burn out in details.
That's the part I don't understand. If backward compatibility is not a concern, why keeping distutils ? If you change the command and Distribution class design, what remains of the original code ? You are changing the API and the implementation (which are quite tangled with each other in distutils case), almost none of the original code would remain.
It really feels to me like you are getting the pain of backward compatibility without the gains. What am I missing ?
What you are missing is that :
- you are convinced that distutils should be written from scratch. I am not for many reasons. Some others are not either. it won't happen. the only thing that could make it happen is the replacement of distutils by another tool that is used by the majority of the community for several years.
- you are convinced that the design is flawed and should be changed. I partially agree. And I also think this can happen in distutils, at a slow pace. but not in the way you've described it.
So, instead of jumping to *your* conclusions (==let's drop distutils) or to *mine*, and in order to make some progress together, I am suggesting that we discuss the design flaws you've mentioned. And see what we come up with, then refocus on the big picture later (that is backward compat etc..)
Working on the build_ext and Extensions part with your use case is where we can share some knowledge imho.
Tarek
On 2009-11-11 18:04 PM, Tarek Ziadé wrote:
On Thu, Nov 12, 2009 at 12:18 AM, David Cournapeau wrote:
On Thu, Nov 12, 2009 at 6:59 AM, Tarek Ziadé wrote:
And let's drop the backward compat issues in these discussions, so we don't burn out in details.
That's the part I don't understand. If backward compatibility is not a concern, why keeping distutils ? If you change the command and Distribution class design, what remains of the original code ? You are changing the API and the implementation (which are quite tangled with each other in distutils case), almost none of the original code would remain.
It really feels to me like you are getting the pain of backward compatibility without the gains. What am I missing ?
What you are missing is that :
- you are convinced that distutils should be written from scratch. I am not for many reasons. Some others are not either. it won't happen. the only thing that could make it happen is the replacement of distutils by another tool that is used by the majority of the community for several years.
That's basically what David is suggesting when he says "rewrite". He doesn't mean to replace distutils with another package named "distutils".
-- Robert Kern
On Thu, Nov 12, 2009 at 1:08 AM, Robert Kern
- you are convinced that distutils should be written from scratch. I am not for many reasons. Some others are not either. it won't happen. the only thing that could make it happen is the replacement of distutils by another tool that is used by the majority of the community for several years.
That's basically what David is suggesting when he says "rewrite". He doesn't mean to replace distutils with another package named "distutils".
Well, good luck with that new project then. I am here to improve the existing Distutils. And that doesn't forbid us to do some redesign in it. e.g. unlike this thread title implies, improvement of an existing system is not opposed to its partial redesign so it answers to some use cases it was not working well with before.
Now the partial redesign of the build_ext/Extension part is something we can work on together. The problem of command extensibility is also a topic we discussed last year at Pycon, and if you read the wiki you'll see that I have proposed some changes so any command could be extended in a pluggable way. But this has to be added as new features, on the top of something that already works.
You have to consider the PoV of many, many developers that happily use Distutils every day. They don't care about the edge cases we are discussing right now. They wouldn't mind having some improvements that would solve some edge cases of course. But they will not understand why Distutils is dropped in favor of a tool we write from scratch, because its design doesn't work well when the FooBar compiler cannot get the YuYu option at the right time. And even if it happened, that would mean waiting for its maturity outside the stdlib for some years.
I want to improve Distutils, but not for the price of a complete drop. I don't think the edge cases we are discussing are worth it, and I still fail to see why we can't work them out in the context of the existing tool. Now, maybe that's just because I didn't spend a decade on the monster like you did. I have +6 more years to go ;)
Tarek
On 2009-11-11 18:48 PM, Tarek Ziadé wrote:
I want to improve Distutils, but not for the price of a complete drop. I don't think the edge cases we are discussing are worth it, and I still fail to see why we can't work them out in the context of the existing tool.
Mostly because I'm entirely uninterested in helping you make incremental improvements that are going to break all the hard work we've already done just to get things working as it is. I find that prospect incredibly frustrating.
-- Robert Kern
On Thu, Nov 12, 2009 at 1:59 AM, Robert Kern
On 2009-11-11 18:48 PM, Tarek Ziadé wrote:
I want to improve Distutils, but not for the price of a complete drop. I don't think the edge cases we are discussing are worth it, and I still fail to see why we can't work them out in the context of the existing tool.
Mostly because I'm entirely uninterested in helping you make incremental improvements that are going to break all the hard work we've already done just to get things working as it is. I find that prospect incredibly frustrating.
I can understand that. I am very frustrated too because in the last threads, whenever we are speaking about design, it seems that on your side and David's side, dropping Distutils seems like a post-condition to everything. Even if I have made some proposals on some concrete design changes.
Who's "we" by the way ? The Scons project ? or the numpy.distutils project ?
Tarek
Tarek Ziadé wrote:
On Thu, Nov 12, 2009 at 1:59 AM, Robert Kern wrote:
On 2009-11-11 18:48 PM, Tarek Ziadé wrote:
I want to improve Distutils, but not for the price of a complete drop. I don't think the edge cases we are discussing are worth it, and I still fail to see why we can't work them out in the context of the existing tool.
Mostly because I'm entirely uninterested in helping you make incremental improvements that are going to break all the hard work we've already done just to get things working as it is. I find that prospect incredibly frustrating.
I can understand that. I am very frustrated too because in the last threads, whenever we are speaking about design, it seems that on your side and David's side, dropping Distutils seems like a post-condition to everything. Even if I have made some proposals on some concrete design changes.
In our considered opinion, piecemeal changes probably aren't going to solve the significant problems that we face. At best, they simply aren't going to help; we wouldn't be able to use the new features until we can drop support for Python 2.6. numpy and scipy still need to support Python 2.4. At worst, they would introduce incompatibilities that we will have to work around somehow.
Who's "we" by the way ? The Scons project ? or the numpy.distutils project ?
numpy and scipy. While I hesitate to speak for an entire community, I must say that David's and my opinions on distutils are shared by a good portion of our community that has to deal with building and packaging.
-- Robert Kern
On Nov 11, 2009, at 10:04 PM, Robert Kern wrote:
In our considered opinion, piecemeal changes probably aren't going to solve the significant problems that we face. At best, they simply aren't going to help; we wouldn't be able to use the new features until we can drop support for Python 2.6.
While I can understand your frustration, it's important to step back and think about whether your problems are really impossible to solve. In particular, I take issue with this idea that you can't work on stuff that's distributed with Python but depend on newer versions. Isn't the whole point of much of setuptools' complexity supposed to be the fact that you can have side-by-side multi-version installations?
Even assuming that this functionality doesn't work at _all_, who is to say that you can't ask users to upgrade distutils? or do a --prefix installation of distutils into a different directory? or have a build-time option that installs the 'distutils' package as 'distutils_plus_plus' and rewrites things as necessary? or implement an alternative to require() which *does* work? or, assuming require() works for some cases but not yours, adapt it to your needs?
There are probably a dozen other ways that you *could* work on distutils and benefit more immediately from your efforts than the next Python release. To think otherwise is simply a failure of imagination. Now, if you think it's *too hard* to do that, it might be interesting to hear why you think that, and what exactly the effort would be; a nebulous assertion that it's just too hard and we should throw our hands up (while I can definitely understand the impulse to make such an assertion) serves only to discourage everyone.
The fact that a package is in the standard library is not a death sentence. Releases can be made separately. Heck, if you are doing good work on trunk but the release cycles are taking too long, quite frequently distributors will make packages out of your code at some revision of trunk rather than a release. I maintained software for _years_ that required a more recent version of pysqlite bindings than were available in the standard library's 'sqlite3' module. The 'pysqlite2' project is alive and well, and we didn't have any significant problems.
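For illustration, the side-by-side arrangement described here usually comes down to a guarded import. This is only a sketch, assuming the separately released bindings are importable as `pysqlite2.dbapi2` and that falling back to the standard library's `sqlite3` is acceptable; the `sqlite` alias is an arbitrary name chosen for the example.

```python
# Prefer the separately released bindings when they are installed;
# otherwise fall back to the copy shipped in the standard library.
try:
    from pysqlite2 import dbapi2 as sqlite  # external release (may be absent)
except ImportError:
    import sqlite3 as sqlite  # standard library fallback

# Both modules follow the DB-API, so the calling code does not change.
connection = sqlite.connect(":memory:")
row = connection.execute("SELECT 1 + 1").fetchone()
connection.close()
```

The calling code never needs to know which of the two modules was actually imported.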
Now, as Tarek suggests, it would be more worthwhile to discuss the *specifics* of the problems that you assert require blowing up the world, as more detailed understanding of those specifics will allow both people who want rewrites *and* people who want incremental improvements to proceed better informed. Any language environment's package/distribute/build/install/run pipeline is complicated enough that one can have a lot of productive discussion just nailing down exactly what is wrong with it, before even talking about solutions, and Python is no exception.
Who's "we" by the way ? The Scons project ? or the numpy.distutils project ?
numpy and scipy. While I hesitate to speak for an entire community, I must say that David's and my opinions on distutils are shared by a good portion of our community that has to deal with building and packaging.
+1
Glyph Lefkowitz wrote:
There are probably a dozen other ways that you *could* work on distutils and benefit more immediately from your efforts than the next Python release. To think otherwise is simply a failure of imagination. Now, if you think it's *too hard* to do that, it might be interesting to hear why you think that, and what exactly the effort would be; a nebulous assertion that it's just too hard and we should throw our hands up (while I can definitely understand the impulse to make such an assertion) serves only to discourage everyone.
I am trying to understand what is 'nebulous' about our claims. We have given plenty of hard and concrete examples of things which are problematic in distutils. The major progress in our build issues has been achieved by dropping distutils. Up to now, the only people who have claimed that distutils can solve our problems are the people who are not involved at all with our projects, and the people who claim distutils cannot solve our problems are the people involved with it. That's odd to say the least.
Now, I am ready to accept that we are missing the big picture and the rest of the community knows more about it. But I certainly have not seen strong arguments to believe it so far.
cheers, David
Tarek Ziadé wrote:
I want to improve Distutils, but not for the price of a complete drop. I don't think the edge cases we are discussing are worth it, and I still fail to see why we can't work them out in the context of the existing tool.
Well, the scientific python edge case is what Guido asked about in the original thread. That's our answer: we don't think something which is not fundamentally different than distutils will solve our problems. If you think that what we considered as fundamental issues are corner cases and implementation details, we can only agree to disagree, I guess.
Now, it is legitimate to dismiss our use-cases as edge cases and not worthwhile to worry about. We are certainly not as big as say the web development community. But the question was directed at us, as I understood it :)
cheers, David
2009/11/12 David Cournapeau
I am trying to understand what is 'nebulous' about our claims. We have given plenty of hard and concrete examples of things which are problematic in distutils. The major progress in our build issues has been achieved by dropping distutils. Up to now, the only people who have claimed that distutils can solve our problems are the people who are not involved at all with our projects, and the people who claim distutils cannot solve our problems are the people involved with it. That's odd to say the least.
Now, I am ready to accept that we are missing the big picture and the rest of the community knows more about it. But I certainly have not seen strong arguments to believe it so far.
Look, there is only one way this argument can be solved, and that is by building something better than distutils. Honestly. Now, I'm sure as heck not going to spend time on that, and Tarek doesn't think it's a good idea, so it's up to you guys.
So I again ask:
1. Do you think the new PEPs in development should be followed? In that case, what is the benefit of rewriting, instead of fixing?
2. When are you done?
Bitching that distutils needs to be scratched and rewritten is not going to help. You need to DO it.
-- Lennart Regebro: Python, Zope, Plone, Grok http://regebro.wordpress.com/ +33 661 58 14 64
Lennart Regebro wrote:
2009/11/12 David Cournapeau:
I am trying to understand what is 'nebulous' about our claims. We have given plenty of hard and concrete examples of things which are problematic in distutils. The major progress in our build issues has been achieved by dropping distutils. Up to now, the only people who have claimed that distutils can solve our problems are the people who are not involved at all with our projects, and the people who claim distutils cannot solve our problems are the people involved with it. That's odd to say the least.
Now, I am ready to accept that we are missing the big picture and the rest of the community knows more about it. But I certainly have not seen strong arguments to believe it so far.
Look, there is only one way this argument can be solved, and that is by building something better than distutils. Honestly. Now, I'm sure as heck not going to spend time on that, and Tarek doesn't think it's a good idea, so it's up to you guys.
I think you're missing the point of our statements. We're not asking you to work or not work on anything. Guido asked, in response to a comment from our community: "Is the work on distutils-sig going to be enough?" And our answer is that no, it's probably not. It's not addressing our most significant problems. David's work, e.g. on numscons, is helping, but it is still constrained by the requirement of working within distutils' framework.
Bitching that distutils needs to be scratched and rewritten is not going to help. You need to DO it.
Pardon us for sincerely answering the questions asked of us.
-- Robert Kern
On Thu, Nov 12, 2009 at 2:28 PM, Lennart Regebro
Bitching that distutils needs to be scratched and rewritten is not going to help. You need to DO it.
Nobody asked you (or anyone else) to do anything. Guido asked a question, we answered with our rationale. There is no need to be rude. David
2009/11/12 David Cournapeau
On Thu, Nov 12, 2009 at 2:28 PM, Lennart Regebro wrote:
Bitching that distutils needs to be scratched and rewritten is not going to help. You need to DO it.
Nobody asked you (or anyone else) to do anything. Guido asked a question, we answered with our rationale.
There is no need to be rude.
It was not my intention to be rude, and I don't believe I was. This discussion has been ongoing for a long time, and I have just recently asked questions (which of course have been ignored) about this. This is a discussion that has been going on for a long time, more than a year, at least, yet no one who says distutils needs to be scratched and something else written does something about it.
I've pointed this out repeatedly, and been ignored. Perhaps I used too strong language this time, but on the other hand, I wasn't ignored this time.
The discussion is pointless. If you want something better than distutils, do it.
-- Lennart Regebro
Lennart Regebro wrote:
2009/11/12 David Cournapeau:
On Thu, Nov 12, 2009 at 2:28 PM, Lennart Regebro wrote:
Bitching that distutils needs to be scratched and rewritten is not going to help. You need to DO it.
Nobody asked you (or anyone else) to do anything. Guido asked a question, we answered with our rationale.
There is no need to be rude.
It was not my intention to be rude, and I don't believe I was. This discussion has been ongoing for a long time, and I have just recently asked questions (which of course have been ignored) about this. This is a discussion that has been going on for a long time, more than a year, at least, yet no one who says distutils needs to be scratched and something else written does something about it.
I've pointed this out repeatedly, and been ignored. Perhaps I used too strong language this time, but on the other hand, I wasn't ignored this time.
The discussion is pointless. If you want something better than distutils, do it.
David's working on it: http://pypi.python.org/pypi/numscons
-- Robert Kern
2009/11/12 Robert Kern
The discussion is pointless. If you want something better than distutils, do it.
David's working on it:
"Enable to use scons within distutils to build extensions"
I'm confused now.
-- Lennart Regebro
Lennart Regebro wrote:
2009/11/12 Robert Kern:
The discussion is pointless. If you want something better than distutils, do it.
David's working on it:
"Enable to use scons within distutils to build extensions"
I'm confused now.
It's a start.
-- Robert Kern
Lennart Regebro wrote:
2009/11/12 Robert Kern:
The discussion is pointless. If you want something better than distutils, do it.
David's working on it:
"Enable to use scons within distutils to build extensions"
I'm confused now.
In fact, David wouldn't have bothered with the distutils bits at all if I hadn't been so insistent that the standard "python setup.py install", etc. would continue to work.
-- Robert Kern
On Nov 12, 2009, at 12:02 AM, David Cournapeau wrote:
Glyph Lefkowitz wrote:
There are probably a dozen other ways that you *could* work on distutils and benefit more immediately from your efforts than the next Python release. To think otherwise is simply a failure of imagination. Now, if you think it's *too hard* to do that, it might be interesting to hear why you think that, and what exactly the effort would be; a nebulous assertion that it's just too hard and we should throw our hands up (while I can definitely understand the impulse to make such an assertion) serves only to discourage everyone.
I am trying to understand what is 'nebulous' about our claims. We have given plenty of hard and concrete examples of things which are problematic in distutils.
I'm sorry if I gave the impression that I was contesting that particular assertion. We all agree that distutils has deep problems. And, I don't think that everything that has been said is overgeneral or unhelpful. Before I dive into more criticism, let me just say that I agree 100% with Robert Kern's message where he says:
In order to integrate this with setuptools' develop command (...) we need to create a subclass of setuptools' develop command that will reinitialize build_src with the appropriate option. Then we need to conditionally place the develop command into the set of command classes so as not to introduce a setuptools dependency on those people who don't want to use it.
This is nuts.
This is completely correct. I've done stuff like this, we've all probably done stuff like this. Conditional monkeypatching and dynamic subclassing is all over the place in distutils extension code, and it is *completely* nuts.
Still, it would have been more helpful to point out how exactly this problem could be solved, and to present (for example) a description of similar objects politely interacting and delegating responsibility to one another to accomplish the same task.
I would definitely characterize these assertions from Robert as "nebulous", given that the prior messages in the thread (as far as I can tell) do not describe the kind of massive-overhaul changes which would fix things, only the problems that currently exist:
In our considered opinion, piecemeal changes probably aren't going to solve the significant problems that we face.
Why not? The whole of computer history is the story of piecemeal improvements of one kind or another; despite perennial claims that, for example, hierarchical filesystems or bit-mapped displays "fundamentally" cannot support one type of data or another, here we are. Or this one, also from Robert:
Mostly because I'm entirely uninterested in helping you make incremental improvements that are going to break all the hard work we've already done just to get things working as it is.
Why do incremental improvements have to break all the hard work that has already been done? Surely this is what a compatibility policy is about. Or this one, from you, which is more specific to a topic, but still doesn't really say anything useful that I can discern:
I think the compiler class and the likes should simply be removed ... There should not be any objects / classes for compilers, it is not flexible enough ... You cannot obtain this with classes and objects (especially when you start talking about performance: ...).
It's clear to me from the responses in this thread that I'm not the only one who is only getting vague hints of what you're actually talking about from language like this. "classes and objects" have been used in many high-performance systems. Personally I find "classes and objects" fairly flexible as well. In fact, if *I* were to make a nebulous claim about distutils' design structure, it would be that the parsimony with creating whole new classes and instantiating multiple objects is the problem; there should be more classes, more objects, less inheritance and fewer methods. So why can't
The major progress in our build issues has been achieved by dropping distutils. Up to now, the only people who have claimed that distutils can solve our problems are the people who are not involved at all with our projects, and the people who claim distutils cannot solve our problems are the people involved with it. That's odd to say the least.
I'm not asserting that distutils can fix your problems; I don't know enough about your problems to say for sure. Certainly it seems clear that present-day distutils cannot. I just know that there are many people on this list who are committed to a particular approach to evolving distutils, and while there is a lot of value in clearly explaining problems with that approach so they can be addressed, it's unhelpful to keep asserting (as has been done many times in this thread) that incremental evolution cannot address these problems. It's a religious belief either way: my experience suggests that rewrites are never, ever a good idea, but others' experience may differ. However, I feel compelled to repeat that it is a matter of historical fact and, I suspect, a corollary of the Church-Turing thesis that pretty much any software system *can* be changed into just about any other software system through a series of evolutionary steps where the system does something useful at each step; it is a question of whether you believe this approach requires an unreasonable amount of effort, and how big the steps need to be. If you believe the effort required would be unreasonable, then let's see if we can find a radical, incompatible change to distutils that we all agree would be an improvement, and see if we also agree that the effort would be impractical. Right now I can't find any such change looking through the recent history of this discussion, just a smattering of half-expressed ideas about how certain things in distutils are crummy and should look totally different than they are.
2009/11/12 Robert Kern
It's a start.
OK, good enough.
-- Lennart Regebro
Glyph Lefkowitz wrote:
On Nov 12, 2009, at 12:02 AM, David Cournapeau wrote:
Glyph Lefkowitz wrote:
There are probably a dozen other ways that you *could* work on distutils and benefit more immediately from your efforts than the next Python release. To think otherwise is simply a failure of imagination. Now, if you think it's *too hard* to do that, it might be interesting to hear why you think that, and what exactly the effort would be; a nebulous assertion that it's just too hard and we should throw our hands up (while I can definitely understand the impulse to make such an assertion) serves only to discourage everyone.
I am trying to understand what is 'nebulous' about our claims. We have given plenty of hard and concrete examples of things which are problematic in distutils.
I'm sorry if I gave the impression that I was contesting that particular assertion. We all agree that distutils has deep problems.
And, I don't think that everything that has been said is overgeneral or unhelpful. Before I dive into more criticism, let me just say that I agree 100% with Robert Kern's message where he says:
In order to integrate this with setuptools' develop command (...) we need to create a subclass of setuptools' develop command that will reinitialize build_src with the appropriate option. Then we need to conditionally place the develop command into the set of command classes so as not to introduce a setuptools dependency on those people who don't want to use it.
This is nuts.
This is completely correct. I've done stuff like this, we've all probably done stuff like this. Conditional monkeypatching and dynamic subclassing is all over the place in distutils extension code, and it is *completely* nuts.
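For readers who have not written such code, the pattern being called nuts looks roughly like the following. It is a minimal sketch, not the actual numpy.distutils code: the helper name `make_cmdclass` is hypothetical, and it assumes a `build_src` command with an `inplace` option has been registered by the project, as described in the quoted message.

```python
def make_cmdclass():
    """Return extra command classes, adding 'develop' only if setuptools exists."""
    cmdclass = {}
    try:
        from setuptools.command.develop import develop as _develop
    except ImportError:
        # Plain distutils users get no setuptools dependency.
        return cmdclass

    class develop(_develop):
        def run(self):
            # Assumption: the project registers a 'build_src' command that
            # accepts an 'inplace' option; reconfigure it before developing.
            build_src = self.reinitialize_command("build_src")
            build_src.inplace = True
            _develop.run(self)

    cmdclass["develop"] = develop
    return cmdclass


cmdclass = make_cmdclass()
```

The resulting dictionary would then be passed as the `cmdclass` argument of `setup()`. The subclass exists only to flip one option on another command, and it has to be defined inside a function so that the import can fail quietly.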
Still, it would have been more helpful to point out how exactly this problem could be solved, and to present (for example) a description of similar objects politely interacting and delegating responsibility to one another to accomplish the same task.
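As a rough idea of what "objects politely interacting and delegating" might mean here, consider the sketch below. It is purely illustrative: `BuildSrc` and `Develop` are hypothetical stand-ins, not distutils or setuptools classes, and nobody in this thread has proposed this exact design.

```python
class BuildSrc:
    """Hypothetical source-generation step with one option."""

    def __init__(self, inplace=False):
        self.inplace = inplace

    def run(self):
        return "build_src(inplace=%s)" % self.inplace


class Develop:
    """Hypothetical develop step that delegates instead of subclassing."""

    def __init__(self, build_src):
        # The collaborator is passed in, so nothing is patched or re-initialized.
        self.build_src = build_src

    def run(self):
        # Run the configured build step first, then do the develop work.
        return [self.build_src.run(), "develop"]


steps = Develop(BuildSrc(inplace=True)).run()
```

Here the option is set where the collaborator is constructed, so no command needs to subclass or reinitialize another.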
I would definitely characterize these assertions from Robert as "nebulous", given that the prior messages in the thread (as far as I can tell) do not describe the kind of massive-overhaul changes which would fix things, only the problems that currently exist:
In our considered opinion, piecemeal changes probably aren't going to solve the significant problems that we face.
Why not? The whole of computer history is the story of piecemeal improvements of one kind or another; despite perennial claims that, for example, hierarchical filesystems or bit-mapped displays "fundamentally" cannot support one type of data or another, here we are.
I think Robert meant piecemeal changes from an implementation POV, not that we should ignore any history or existing design solutions. Actually, I think that distutils is a victim of a lot of NIH; it is totally different from any other build system I have seen.
Or this one, also from Robert:
Mostly because I'm entirely uninterested in helping you make incremental improvements that are going to break all the hard work we've already done just to get things working as it is.
Why do incremental improvements have to break all the hard work that has already been done? Surely this is what a compatibility policy is about.
Here is what *I* mean by distutils compatibility, so that we are sure to talk about the same thing:
- existing setup.py should run without problem, and produce the same software when installed/produced (bdist_wininst, sdist, etc...) under the same conditions as distutils (actually, setuptools, since distribute is a fork of setuptools).
- existing usage of distutils API should remain compatible.
I asked in a previous email what is meant by distutils API; Tarek answered anything which does not start with an underscore. But what does that mean ? For example, in numscons, I rely on the build directory to have a certain structure (implementation-defined in distutils: you can't retrieve it through the public API). Is this part of the API ? Is using copied commands to get some characteristics considered part of the API ? Is the order of commands, or their attributes considered public (they all start without an underscore, but they are not documented anywhere) ?
During a more precise discussion, I think we have more or less agreed with Tarek that build_ext needs a significant overhaul. Although we did not discuss concretely about other commands, I think the same kind of arguments apply to almost any command. There is then the issue of communicating between commands, through the Distribution class. Let's assume for the argument's sake that we manage to convince the community as a whole that both commands and distribution classes need to be redesigned. At that point, what's different from a new distribution tool, with a totally different API ? (but which could reuse parts of the distutils implementation, of course). Certainly, unless you keep both the current code and the new one, you will break almost every distutils API user out there. Python distutils (the one included in python) has broken our extensions countless times already, even though no significant feature has been added. Setuptools itself already breaks a lot of them out there.
That's why I am not convinced that you can improve distutils without causing the exact same issues as a new distribution system. That comes from my experience of extensively writing extensions around it.
"classes and objects" have been used in many high-performance systems. Personally I find "classes and objects" fairly flexible as well. In fact, if *I* were to make a nebulous claim about distutils' design structure, it would be that the parsimony with creating whole new classes and instantiating multiple objects is the problem; there should be more classes, more objects, less inheritance and fewer methods.
I agree this claim was vague. I was only talking about using classes and objects for building, not objects in a general sense. The problem with compilation is that you need almost total flexibility: you simply cannot foresee how to use tools and their interaction with the launched commands. Neither make, nor waf, nor scons does that. Instead, they provide the fundamental abstraction source -> "action" -> target, where action can really be anything, and should be decoupled from any tool definition (what's common between a C compiler and a code generator, for example ?).
Concerning classes, I don't think you can have a hierarchy for compilers: they behave so differently depending on the platforms that they share very little. This is true for any tool, actually. What is common between (most) C compilers is that they produce object files from sources, and link object files together.
About the performance claim: scons has speed issues (the issue keeps coming up on the user and dev ML), in part because it uses too many objects. Waf manages to be much faster (it is basically as fast as make for reasonably sized projects, and it does automatic dependency handling). They manage to do so thanks to aggressive optimizations: they care about the number of attributes of the fundamental classes, and they compile string commands into functions to avoid useless substitutions (another scons speed issue).
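To make the source -> "action" -> target abstraction above a bit more concrete, here is a minimal sketch in Python. It is not how make, waf or scons are implemented; the Rule class, its method names and the timestamp-based staleness check are assumptions chosen only for illustration. The point is that the action is an arbitrary callable and knows nothing about compilers.

```python
import os
import tempfile


class Rule:
    """One build step: sources -> action -> targets (hypothetical names)."""

    def __init__(self, sources, targets, action):
        self.sources = list(sources)
        self.targets = list(targets)
        self.action = action  # any callable(sources, targets)

    def is_stale(self):
        # Rebuild if a target is missing or older than the newest source.
        if not all(os.path.exists(t) for t in self.targets):
            return True
        newest_source = max(os.path.getmtime(s) for s in self.sources)
        oldest_target = min(os.path.getmtime(t) for t in self.targets)
        return newest_source > oldest_target

    def run(self):
        # Returns True when the action was actually executed.
        if self.is_stale():
            self.action(self.sources, self.targets)
            return True
        return False


def concatenate(sources, targets):
    # Stand-in for a compiler or a code generator: the rule does not care.
    with open(targets[0], "w") as out:
        for name in sources:
            with open(name) as src:
                out.write(src.read())


def demo():
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "foo.in")
    dst = os.path.join(workdir, "foo.out")
    with open(src, "w") as f:
        f.write("hello\n")
    rule = Rule([src], [dst], concatenate)
    first = rule.run()   # target missing, so the action runs
    second = rule.run()  # target is up to date, so nothing happens
    return first, second
```

A real tool would track content signatures and command lines rather than timestamps alone, but the decoupling between the rule and the action is the same.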
However, I feel compelled to repeat that it is a matter of historical fact and, I suspect, a corollary of the Church-Turing thesis that pretty much any software system *can* be changed into just about any other software system through a series of evolutionary steps where the system does something useful at each step; it is a question of whether you believe this approach requires an unreasonable amount of effort, and how big the steps need to be.
Yes, obviously you *can* go from distutils to a perfect system in piecemeal changes. The question is how long and how much effort does it take. So when I say "I don't think you can significantly improve distutils", it is to be understood as "it will take less time to go to a better system without bothering with keeping the same architecture".
If you believe the effort required would be unreasonable, then let's see if we can find a radical, incompatible change to distutils that we all agree would be an improvement, and see if we also agree that the effort would be impractical.
A few examples which are a problem for us (us being numpy/scipy/etc... developers/users here):
- automatic dependency handling. If I change one header, only the files which include this header will be rebuilt; if a fortran compiler flag is changed, only fortran source files are recompiled, etc...
- package description so that simple packages could be built automatically without running untrusted code
- reliable parallel builds
- integration with 3rd party tools
To be honest, the only one I really care about is the last one: if we could find a solution so that I can build the C code from say make or scons, and build/install in a distutils compatible way through simple and stable API, it would solve all the other parts through 3rd party code. Right now, numscons depends too much on distutils implementation details, and cannot produce all the non build goodies from distutils (sdist, bdist_wininst, etc...).
cheers,
David
Glyph Lefkowitz wrote:
Still, it would have been more helpful to point out how exactly this problem could be solved, and to present (for example) a description of similar objects politely interacting and delegating responsibility to one another to accomplish the same task.
Sorry, I edited out the bit at the last minute where I explained that it would be great to have a centralized option-managing object such that any command can ask what options were set on any other regardless of the dependencies between commands.
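As a rough illustration of the centralized option-managing object mentioned above, here is a minimal sketch. Nothing like this exists in distutils; OptionRegistry, BuildClib and the option names are hypothetical and only show the shape of the idea: a command reads another command's option from a shared store instead of instantiating and running that command.

```python
class OptionRegistry:
    """Hypothetical central store of per-command options."""

    def __init__(self):
        self._options = {}

    def set(self, command, name, value):
        # Record an option under the name of the command that owns it.
        self._options.setdefault(command, {})[name] = value

    def get(self, command, name, default=None):
        # Any command may read any other command's option.
        return self._options.get(command, {}).get(name, default)


class BuildClib:
    """Hypothetical command that needs an option owned by 'install'."""

    def __init__(self, registry):
        self.registry = registry

    def install_dir(self):
        # Ask the registry instead of finalizing the 'install' command.
        return self.registry.get("install", "prefix", "/usr/local")


registry = OptionRegistry()
registry.set("install", "prefix", "/opt/demo")
```

With such an object, the dependency between commands becomes a lookup rather than an execution order.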
I would definitely characterize these assertions from Robert as "nebulous", given that the prior messages in the thread (as far as I can tell) do not describe the kind of massive-overhaul changes which would fix things, only the problems that currently exist:
In our considered opinion, piecemeal changes probably aren't going to solve the significant problems that we face.
Why not? The whole of computer history is the story of piecemeal improvements of one kind or another; despite perennial claims that, for example, hierarchical filesystems or bit-mapped displays "fundamentally" cannot support one type of data or another, here we are.
Perhaps in my head the analogy with biological evolution is unjustifiably strong. Species can't always get from point A to point B while making viable intermediates with incremental changes. Evolutionary deadends happen frequently. In software, design decisions early on affect how much change the software can tolerate (which is why we are told to "design for change"). I think that distutils exemplifies a design that is particularly adverse to evolution.
Or this one, also from Robert:
Mostly because I'm entirely uninterested in helping you make incremental improvements that are going to break all the hard work we've already done just to get things working as it is.
Why do incremental improvements have to break all the hard work that has already been done? Surely this is what a compatibility policy is about.
Since Tarek keeps asking us to make proposals without thinking about compatibility, I wonder what policy is being kept in mind. My comment stems from my worry about that attitude. In any case, I think that keeping compatibility while making improvements to the code in situ is going to be quite difficult. The distutils API is not clean. Using distutils beyond setup() and Extension() involves too much intimate knowledge of the detailed implementation of distutils as-is. That's why we got a 2.6.4 release. This is why I think the piecemeal evolution of distutils is not going to work. The act of fixing distutils to make a cleaner API for modification and extension *triggers* the problem that the fix is supposed to address. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Nov 12, 2009, at 1:36 AM, Robert Kern wrote:
Glyph Lefkowitz wrote:
Still, it would have been more helpful to point out how exactly this problem could be solved, (...)
Sorry, I edited out the bit at the last minute where I explained that it would be great to have a centralized option-managing object such that any command can ask what options were set on any other regardless of the dependencies between commands.
I'm familiar with the hazards of over-editing :). I am often asked to shorten my messages, and when I try too hard to do so, I leave out important elements. One of the things I'm asking for, actually, is fewer, longer messages, with more substantive points in them. I know some people don't like that, but discussions about big, complex topics like this that try to address them one little conversational point at a time tend to get circular quickly.
Perhaps in my head the analogy with biological evolution is unjustifiably strong. Species can't always get from point A to point B while making viable intermediates with incremental changes. Evolutionary deadends happen frequently.
This is explicitly _not_ biological evolution. For example, when you wonder about this:
Since Tarek keeps asking us to make proposals without thinking about compatibility, I wonder what policy is being kept in mind. My comment stems from my worry about that attitude.
The way I'm interpreting Tarek's comments - and he can correct me if I'm wrong - is that the strategy is to short-circuit evolution. We should decide where we want to go - which may be an apparently discontinuous place - then path-find there. The path-finding is rarely as hard as it seems like it's going to be.
The brute-force approach, which also happens to be an approximation of the Twisted framework compatibility baseline, is:
release -1: import old; old.crummy_api()
release 0: old.crummy_api() emits a PendingDeprecationWarning, new.good_api() introduced
release 1: old.crummy_api emits DeprecationWarning pointing at new.good_api(), new.good_api() improved based on feedback from r0
release 2: old.crummy_api raises DeprecationError pointing at new.good_api()
release 3: old.crummy_api removed
This is almost a straw-man: you can do this for any old.crummy_api and new.good_api, regardless of whether the new thing actually satisfies the old thing's requirements at all. It is often possible to do much better. But the point is, if you have a clear new.good_api to get to, it's possible to do all *kinds* of crazy stuff in Python to emulate and re-direct the behavior of crummy_api, discover how it's being used, provide useful hints to the developer, etc. It's a matter of how much effort you want to put into it.
For many distutils use-cases, it sounds to me like the path forward is to avoid using any API at all, and just ask most projects to provide static metadata. Build-heavy projects like numpy will require new integration points to do custom heavy lifting, but it will be easier to define those integration points if they're not something that every project under the sun will potentially need to interact with.
But, during the creative design process of good_api, it's often helpful to pretend crummy_api doesn't even exist, so you can design something good that solves its problems well, and address the translation as a truly separate issue.
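For readers who have not seen this kind of deprecation path in practice, here is a minimal sketch of what "release 1" in the list above might look like, using only the standard warnings module. The functions crummy_api and good_api are the placeholder names from the message; their bodies are invented for the example, and this is not Twisted's actual machinery.

```python
import warnings


def good_api(x):
    """Hypothetical replacement API."""
    return x * 2


def crummy_api(x):
    """Old entry point, kept as a thin shim over good_api()."""
    warnings.warn(
        "crummy_api() is deprecated; use good_api() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not the shim
    )
    return good_api(x)


def call_and_capture():
    # Record warnings so the caller can inspect what was emitted.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = crummy_api(21)
    return result, [w.category for w in caught]
```

The shim keeps old callers working while telling them where to go, which is the "emulate and re-direct" step; removing the shim is then a separate, later release.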
In software, design decisions early on affect how much change the software can tolerate (which is why we are told to "design for change").
Who's "we", kimosabe? *We* are told "you aren't gonna need it"; maybe some other people are told to design for change :). In fact, I think that distutils is over-designed for change. It has altogether too many different extension mechanisms, which often interfere with each other: subclassing, configuration files, including random bits of code in setup.py. And then of course there's the monkey-patching for the cases that weren't covered :).
Glyph Lefkowitz wrote:
In software, design decisions early on affect how much change the software can tolerate (which is why we are told to "design for change").
Who's "we", kimosabe?
It pops up in the Design Patterns literature. I didn't make this up. :-) http://www.google.com/search?q=%22design+for+change%22+%22design+patterns%22
In fact, I think that distutils is over-designed for change. It has altogether too many different extension mechanisms, which often interfere with each other: subclassing, configuration files, including random bits of code in setup.py. And then of course there's the monkey-patching for the cases that weren't covered :).
That's not the kind of change I'm talking about. I'm talking about the evolution of distutils itself, not the configuration and extension of distutils when it is used. Those particular mechanisms are the antithesis of designing for change because their use hampers the change of distutils itself. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
On Thu, Nov 12, 2009 at 3:04 PM, Lennart Regebro
2009/11/12 Robert Kern
: The discussion is pointless. If you want something better than distutils, do it.
David's working on it:
"Enable to use scons within distutils to build extensions"
I'm confused now.
The idea is that numpy.distutils adds a scons command, and if your setup.py contains something like:
    add_sconscript('SConscript')
it will call scons with enough options from distutils so that you can build extensions from scons instead of distutils (it replaces all the build_* commands except build_py, basically). Numscons itself adds things like builders to handle python extensions, so that you can do things like:
    env.DistutilsPythonExtension('foo', source=['foo.c'])
    env.DistutilsCtypesExtension('c_foo', source=['c_foo.c'])
Those CamelCase functions are called builders in scons parlance, and any builder starting with Distutils* will put things where distutils expects them (which is surprisingly difficult). If you call python setup.py scons, you then get your extensions in e.g. build/lib.win32-2.6/, and then all the bdist_wininst, etc... work as expected.
It solves our build issues beautifully:
- adding support for new compiler, new builder, new configuration check, etc... is more pleasant than with distutils
- the configuration framework is much more robust: you can check things like type size, available header, available libraries and functions in a cross platform way.
- parallel builds work well
- any change in one source file automatically triggers a new build of the concerned files only (it goes as far as only relinking if you change your library path).
- changing compiler flags with CFLAGS works as expected.
But it has significant drawbacks, which means it is only useful as a build/developer tool. This is mainly because scons and distutils cannot interact 'bidirectionally' (the fault of both scons and distutils). In particular:
- source files added in scons are not automatically added to the FileList object used for sdist production
- changes in python files are not handled
- we cannot integrate things like building a full .dmg or .msi from scons
- we cannot build compatible eggs from scons because there is no public API to do so
Most of the hard work in numscons is a 1000 LOC subpackage if I don't count all the fortran stuff. Of course, it reuses scons, so that's cheating. But note that scons has no knowledge whatsoever of python extensions, and numscons can build extensions on windows (both 32 and 64 bits), linux, mac os x, solaris, freebsd, including debug extensions. That's one of the reasons why I find the claim of distutils having a deep and large knowledge hard to believe.
David
2009/11/12 David Cournapeau
On Thu, Nov 12, 2009 at 3:04 PM, Lennart Regebro
wrote: I'm confused now.
The idea is that numpy.distutils adds a scons command, and if your setup.py contains something like:
What I'm confused about is that you say that distutils should be scrapped and not incrementally improved, yet your effort of doing it "the right way" builds on distutils.
That's one of the reasons why I find the claim of distutils having a deep and large knowledge hard to believe.
Me too, but I think the community does. And currently that knowledge is being focused into a bunch of PEP's. Which is the reason for my previous (and so far ignored) question if that work should be ignored, or if a new solution done The Right Way should build on those PEPs. And if it should, what the difference then is to the current effort of fixing distutils. If scons is the Right Way then it seems to me yet again that the whole discussion is moot, and that what should be done is to build extensions for scons to build/install/upload Python modules, and then after doing that, just saying "Use scons instead". If the community then agrees, scons will be used. Obviously you should then here on this list say that you are building these extensions and asking for help with it. Less talk, more hockey, as we say in Sweden. -- Lennart Regebro: Python, Zope, Plone, Grok http://regebro.wordpress.com/ +33 661 58 14 64
On Thu, Nov 12, 2009 at 5:01 PM, Lennart Regebro
What I'm confused about is that you say that distutils should be scrapped and not incrementally improved, yet your effort of doing it "the right way" builds on distutils.
Maybe we do not mean the same by rewriting a distribution tool then. Numscons works by totally bypassing distutils and using its own mechanism for anything build-related. It used to use compiler options from distutils, but I have added a mode to bypass this entirely so that you can use new compilers by writing scons tools without having to care about distutils (e.g. to build numpy/scipy with Intel compilers on mac os x or windows). Now, I am trying to see how far I can go from a purely static file for package description, plus tools to convert existing setup.py to it. The hope is that I can soon get a tool which does build and packaging without using distutils at all.
That's one of the reasons why I find the claim of distutils having a deep and large knowledge hard to believe.
Me too, but I think the community does. And currently that knowledge is being focused into a bunch of PEP's. Which is the reason for my previous (and so far ignored) question if that work should be ignored, or if a new solution done The Right Way should build on those PEPs.
Here are the PEPs I see being discussed related to anything distribute/setuptools-related. Feel free to point out the ones I am missing.
PEP 382: namespace packages
Mostly independent from the distribution tool, I have not followed the discussion on those ones. The current setuptools implementation causes deployment issues because of file sharing, which is why I mention it here.
PEP 345, 390, etc...: Metadata
I think a new format ala Cabal is needed for full package description. The current format is ok for distribute evolution, but retrofitting a full description feels very hackish. The one I am working on in toydist is such that many packages only require this file, no setup at all (the format is purely straw-man, to see what kind of problems need to be solved). Right now, I have enough code so that I can almost convert several simple (but not trivial, e.g. I consider sphinx to be a simple but not trivial package) packages from pypi. This is done through a distutils command, i.e. if you do:
    python setup.py gen_toyinfo
with a setup like
"""
setup(name='foo',
      version='1.0',
      packages=['foo'],
      ext_modules=[Extension('_foo', sources=['bar.c', 'foo.c'])]
)
"""
you get a toysetup.info description file:
"""
Name: foo
Version: 1.0
PackageAdditionalData: data/*.dat test/*.py
Library:
    Extension: _foo
        src: bar.c, foo.c
    Packages: foo
"""
I have also code to build from this file alone using distutils and scons at will: this file can be fed back to a trivial setup.py, so here I get backward (and even forward if needed) compatibility.
One major problem is package data: the current way to use MANIFEST and co is too complicated (and fragile when MANIFEST files are not updated correctly). I don't want to retrofit it because I strongly believe distribution files should be explicitly added, but I guess it would make sense to support it at conversion time only.
PEP 376: egg-info
I don't personally care about the egg and egg-info format details, only that it is specified (and versioned), like ruby gems, for interoperability. So that I can build eggs from scons, for example. Enthought has some nice open source tools which give you reliable install/upgrade/removal/listing of software using eggs as a format (ala dpkg and apt-get, using eggs as the basic format). One of the major problems with eggs is that on windows at least, some people prefer using .exe/.msi because of its integration with add/remove. If the egg format is rich enough, it should be possible to get tools to convert from one format to the other.
And if it should, what the difference then is to the current effort of fixing distutils.
If scons is the Right Way then it seems to me yet again that the whole discussion is moot, and that what should be done is to build extensions for scons to build/install/upload Python modules, and then after doing that, just saying "Use scons instead".
Clearly, requiring distutils, distribute or whatever to reimplement scons or waf is insane. But today, numscons only works because it uses many implementation details of distutils, and cannot be used as it is because all the distribution/installing/packaging part from distutils is not reusable. If distribute could be improved so that it could make this easy, it would be great. But I believe doing so without being disruptive is hard - and the only way to understand why is to try doing it. David
On Thu, Nov 12, 2009 at 7:36 AM, Robert Kern
Why do incremental improvements have to break all the hard work that has already been done? Surely this is what a compatibility policy is about.
Since Tarek keeps asking us to make proposals without thinking about compatibility, I wonder what policy is being kept in mind. My comment stems from my worry about that attitude.
Come on, I answered on that already. This is getting nowhere now. I keep asking you that because, after the few mails we had, I think that's the only way we can work on your use cases efficiently. My policy is to discuss a design for a particular use case you have, before we put it back in the big picture. But that is impossible if in every answer you guys make, you just say that "distutils must be replaced".
For three days now, I have made concrete proposals on some changes, but you guys seem not to hear them. It seems that the discussion turns into some political discussion where you just say that "distutils must be replaced", "you can't fix it", or "you can't make small steps".
Shall we focus on the problems you have ?
Tarek
On Thu, Nov 12, 2009 at 7:34 AM, David Cournapeau
- existing usage of distutils API should remain compatible. I asked in a previous email what is meant by distutils API; Tarek answered anything which does not start with an underscore. But what does that mean ? For example, in numscons, I rely on the build directory to have a certain structure (implementation-defined in distutils: you can't retrieve it through the public API). Is this part of the API ? Is using copied commands to get some characteristics considered part of the API ? Is the order of commands, or their attributes considered public (they all start without an underscore, but they are not documented anywhere) ?
The build and installation scheme should be part of the API. If you look in sysconfig, you get most of the install schemes via the API. If anything is missing, tell us. Now for the build I am not sure what you are trying to perform here. I'll ask you for a precise example here. But are you willing to finish the discussion of this precise example ? I mean, if I give you an answer, *or* a proposal for a change in Distutils, is your answer going to be "distutils must be replaced anyway" ? Or are you willing to hear my proposal and work it out with me ? Because Distutils is not going to be replaced. [..]
During a more precise discussion, I think we have more or less agreed with Tarek that build_ext needs a significant overhaul. Although we did not discuss concretely about other commands, I think the same kind of arguments apply to almost any command. There is then the issue of communicating between commands, through the Distribution class.
And I made a proposal for this last one too in this thread. But, like the Extensions proposal I have made, you seem not to hear them. I don't get it. It seems to me that you just don't want to see Distutils improved, since every time you mention a problem, and I start to make proposals, we don't discuss it further and you jump back on the 'distutils must be replaced' line. But what is going to happen is that Distutils will not be replaced. We want to make it evolve. [..]
Python distutils (the one included in python) has broken our extensions countless times already, even though no significant feature has been added.
Sorry, but that's pretty vague. Are you mentioning the changes in build_ext, or something else ? Tarek.
Tarek Ziadé wrote:
The build and installation scheme should be part of the API. If you look in sysconfig, you get most of the install schemes via the API. If anything is missing, tell us.
(I am reading the code there: http://svn.python.org/projects/python/branches/tarek_sysconfig/Lib/distutils...)
I have not tried running the code, but here is what I see missing:
- anything related to build (where are libraries put in the build directory, where are the extensions put - necessary if you are interested in inter-operating with make)
- I don't see the equivalent of distutils.command.install.install_libbase and similar prefixes
- How can we retrieve the full customized install path ? I have not found how I could use the code in sysconfig to solve the use case I have written about before (related to retrieving install prefix).
Now for the build I am not sure what you are trying to perform here.
This is always the same use-case: assuming you push the build to a 3rd party tool like make, how can you make it work within distutils: build directories, compiler options, etc... If you are familiar with autotools, the idea could be similar in the sense that you would have a Makefile.in, and a distutils command build_make would generate a makefile, filling up the variables in Makefile.in. You could also imagine putting those variables in a separate file, which could be queried from an API (using a Makefile.in-like process for scons is not very practical for example).
So the list of variables needed:
- everything in sysconfig (but available for any compiler, including MS ones)
- any distutils command option. What is needed here is a way to query the options given to commands globally, that is the possibility to access at any point in your setup.py every option of every command, without running them first as we have to do currently (like running install_cmd within build_clib).
To be even more down to earth, here is what a scons call looks like on linux for numscons:
    /usr/bin/python "/home/david/local/lib/python2.6/site-packages/numscons/scons-local/scons.py" \
        -f numpy/core/SConstruct -I. scons_tool_path="" src_dir="numpy/core" \
        pkg_path="numpy/core" pkg_name="numpy.core" log_level=50 \
        distutils_libdir="../../../../build/lib.linux-i686-2.6" \
        distutils_clibdir="../../../../build/temp.linux-i686-2.6" \
        distutils_install_prefix="/usr/local/lib/python2.6/dist-packages/numpy/core" \
        cc_opt=gcc cc_opt_path="/usr/bin" debug=0 \
        f77_opt=gfortran f77_opt_path="/usr/bin" \
        cxx_opt=g++ cxx_opt_path="/usr/bin" \
        include_bootstrap=../../../../numpy/core/include \
        bypass=0 import_env=0 silent=0 bootstrapping=1
To understand the weird relative paths, you need to be aware that scons (and waf) consider all the paths relatively to a build directory. Scons copies the source files into the build directory (this is an oversimplification, but should be enough for this discussion).
For example, here, we build the package numpy.core, so it looks like (relatively to the top setup.py file):
    cwd/setup.py
       /numpy/core/setup.py
       /build/lib.linux-i686-2.6/   # where distutils would put all the compiled code if build_ext were used
       /build/scons/                # any code generated by scons is put here
              numpy/core            # the build directory for the current package numpy.core
We pass the following information:
* distutils_libdir: where distutils puts/expects python extensions (which is replaced by current directory + the src_dir in case of in-place build)
* distutils_install_prefix: install prefix + subpackage path. Where the extensions would be installed
* distutils_clibdir: where distutils puts/expects pure C libraries (the ones built through build_clib)
* numscons also encodes compilation options in .ini-like files (regularly auto-generated from the info at sysconfig). The idea is that if someone wants to modify the flags, there is no need to touch any python code.
Right now, I have to reimplement the distutils code to find those, as the logic is not available through API. Generally, the build directories are only known through the corresponding build_* commands, but they become available only after running the commands, which we want to avoid.
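The Makefile.in idea described above could be sketched roughly as follows. There is no build_make command in distutils; render_makefile, the template contents and the variable names (cc, ext_suffix, libdir) are assumptions made for this example. Only sysconfig.get_config_var and string.Template are real standard-library calls.

```python
import string
import sysconfig

# Hypothetical Makefile.in: $name placeholders are filled by Python,
# "$$" is string.Template's escape for a literal "$" left for make itself.
MAKEFILE_IN = """\
CC = $cc
EXT_SUFFIX = $ext_suffix
LIBDIR = $libdir

all:
\t@echo "extensions go to $$(LIBDIR)"
"""


def render_makefile(template_text, libdir):
    # Values come from sysconfig where available, with plain fallbacks
    # for platforms where the variable is not defined.
    values = {
        "cc": sysconfig.get_config_var("CC") or "cc",
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX") or ".so",
        "libdir": libdir,
    }
    return string.Template(template_text).substitute(values)


if __name__ == "__main__":
    print(render_makefile(MAKEFILE_IN, "build/lib.demo"))
```

The hard part discussed in this thread is not the templating but knowing the right value for libdir and similar directories without first running the build_* commands.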
And I made a proposal for this last one too in this thread. But, like the Extensions proposal I have made, you seem not to hear them.
Because I genuinely don't understand why it is required not to replace distutils but it is ok to change its behavior in a fundamental way. For some reason (which may well just be my own fault), I fail to see any reason to start from existing code, publicly and widely used, if it is ok to break it and change its behavior, especially if it will be done several times. I was not at Pycon, so there may have been some other requirements I have missed.
Sorry, but that's pretty vague. Are you mentioning the changes in build_ext, or something else ?
For 2.6.3, yes. For python 2.6, the mingw support was broken because of VS manifests; our fortran tools were broken because the msvc compiler was changed (to msvc9compiler). There were also some other changes related to install directories and the --local option in 2.6 which caused issues. Several setuptools releases have also broken various parts of our own extensions in the past. If you really need to know the details for the last ones, I can look back in our svn log.
cheers,
David
On Thu, Nov 12, 2009 at 11:25 AM, David Cournapeau
Tarek Ziadé wrote:
The build and installation scheme should be part of the API. If you look in sysconfig, you get most of the install schemes via the API. If anything is missing, tell us.
(I am reading the code there: http://svn.python.org/projects/python/branches/tarek_sysconfig/Lib/distutils...)
I have not tried running the code, but here is what I see missing: - anything related to build (where are libraries put in the build directory, where are the extensions put - necessary if you are interested in inter-operating with make)
The build part will not be in sysconfig, and will stay in distutils, because I don't think the rest of the stdlib cares about it. Now, there's no reason at all that we can't add an API in distutils.util, for example, that lets you get the build paths, and that will be used by all build_ commands, so you don't have to use a build_ command to get them. I propose that you think about its signature, and that I add it after we agree on it.
- I don't see the equivalent of distutils.command.install.install_libbase and similar prefixes
That's platlib or purelib, depending on whether you have an extension or not. Those are the default schemes, and every $VARIABLE in there can be overridden by the end user; that's why you have this expansion mechanism.
- How can we retrieve the full customized install path ? I have not found how I could use the code in sysconfig to solve the use case I have written about before (related to retrieving install prefix).
by calling "get_paths(scheme, vars)", where scheme is the installation scheme you want (unix, nt, etc) and vars your variables if you want to override any $VARIABLE.
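As an illustration of the call described above, here is a minimal sketch using the sysconfig module as it later shipped in the Python standard library; the branch under discussion in this thread may have differed in details. The scheme name 'posix_prefix' and the '/opt/demo' prefix are illustrative choices, not values taken from this thread.

```python
import sysconfig

# Override the base/platbase variables that the scheme templates expand.
# '/opt/demo' is an arbitrary prefix chosen for this example.
overrides = {"base": "/opt/demo", "platbase": "/opt/demo"}

# 'posix_prefix' is one of the scheme names known to the stdlib sysconfig;
# sysconfig.get_scheme_names() lists the others.
paths = sysconfig.get_paths(scheme="posix_prefix", vars=overrides)

for key in ("purelib", "platlib", "scripts", "data"):
    print(key, "->", paths[key])
```

Any variable not listed in the overrides keeps the value computed for the running interpreter.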
Now for the build I am not sure what you are trying to perform here.
[...]
So the list of variables needed:
- everything in sysconfig (but available for any compiler, including MS ones)
- any distutils command option. What is needed here is a way to query the options given to commands globally, that is, the possibility to access at any point in your setup.py every option of every command, without running them first as we have to do currently (like running install_cmd within build_clib).
You mean, accessing other commands' command line options ? Or the result of their finalizations ?

For me, the install_cmd <-> build_clib problem shows that the "install" command does some work that is needed by other commands, and that this work has to be taken out of it. A command performs something, and should not be used to set up an *environment*. What are the other commands you run to get information on the environment ?

[..]
Right now, I have to reimplement the distutils code to find those, as the logic is not available through API. Generally, the build directories are only known through the corresponding build_* commands, but they become available only after running the commands, which we want to avoid.
I'd like to change that then. That's one thing we can change without any problem. Refactoring the build commands so they use some APIs that live outside them makes sense, and will allow you to get the information without having to go through some madness of command calls.
And I made a proposal for this last one too in this thread. But, like the Extensions proposal I have made, you seem not to hear them.
Because I genuinely don't understand why it is required not to replace distutils but it is ok to change its behavior in a fundamental way.
It seems that some of Distutils knowledge is located in some commands, and that this knowledge could be pushed out in simple APIs, that could be used in the build_ commands, but also in your commands. Do you agree that this change will help you, and is possible without replacing Distutils ?
Sorry, but that's pretty vague. Are you mentioning the changes in build_ext, or something else ?
For 2.6.3, yes. For python 2.6, the mingw support was broken because of VS manifests; our fortran tools were broken because the msvc compiler was changed (to msvc9compiler). There were also some other changes related to install directories and the --local option in 2.6 which caused issues. Several setuptools releases have also broken various parts of our own extensions in the past. If you really need to know the details for the last ones, I can look back in our svn log.
I fully agree that some commands like build_ext are leading to some of these problems. For that I have a partial solution until it's redesigned: buildbot. I remember I broke Numpy once, so I've worked on a buildbot that tries to prevent regressions. This one: http://buildbot.ziade.org/waterfall , as a matter of fact, grabs Numpy and builds it with 2.5, 2.6, 2.7 etc. to make sure the "sdist" command still works and that the produced tarball is the same every time (on the 1.3.0 tag). I am pretty sure it can be enhanced to protect more things on your side. Help is welcome.

Tarek
On Thu, Nov 12, 2009 at 8:14 PM, Tarek Ziadé
Now, there's no reason at all that we can't add an API in distutils.util, for example, that lets you get the build paths, and that will be used by all build_ commands, so you don't have to use a build_ command to get them.
I propose that you think about its signature, and that I add it after we agree on it.
The signature does not really matter I think. Since those parameters depend on user customization (through config files or command line flags), it could be something like:

    class FinalizedOptions:
        def get_build_dir_library(self):
            pass

        def get_build_dir_src(self):
            pass

        def get_prefix(self):
            ...

And a function like

    def get_finalized_options():
        # compute the finalized options
        # Cache it if expensive
        ...
        return FinalizedOptions()

could be used to obtain those options. What really matters is:

- it should be callable from anywhere, at any time: inside setup.py, in any commands, etc...
- ideally, it should work for any python >= 2.4 (not every python version has the exact same conventions IIRC).
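To make the FinalizedOptions idea above concrete, here is one possible sketch of how such an object could be computed once and then queried from anywhere. Everything beyond the names quoted in the message (the defaults, the layering order, the build_base/build_lib/build_src keys) is an assumption made for illustration and does not describe how distutils works.

```python
import os


class FinalizedOptions:
    """Read-only view of the options once every customization layer is merged."""

    def __init__(self, values):
        self._values = dict(values)

    def get_build_dir_library(self):
        return self._values["build_lib"]

    def get_build_dir_src(self):
        return self._values["build_src"]

    def get_prefix(self):
        return self._values["prefix"]


# Hypothetical defaults, used only for this sketch.
_DEFAULTS = {"build_base": "build", "prefix": "/usr/local"}
_cache = None


def get_finalized_options(config=None, cmdline=None):
    """Merge defaults < config file < command line, once, then reuse the result."""
    global _cache
    if _cache is None:
        merged = dict(_DEFAULTS)
        merged.update(config or {})
        merged.update(cmdline or {})
        # Derived directories are computed only if the user did not set them.
        base = merged["build_base"]
        merged.setdefault("build_lib", os.path.join(base, "lib"))
        merged.setdefault("build_src", os.path.join(base, "src"))
        _cache = FinalizedOptions(merged)
    return _cache


opts = get_finalized_options(config={"prefix": "/opt/sci"},
                             cmdline={"build_base": "out"})
print(opts.get_prefix(), opts.get_build_dir_library())
```

Because the result is cached, a later call with no arguments (from setup.py or from inside any command) returns the same object without running any command first.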
by calling "get_paths(scheme, vars)", where scheme is the installation scheme you want (unix, nt, etc) and vars your variables if you want to override any $VARIABLE.
Assuming the above FinalizedOptions is available, I could obtain the needed options.
You mean, accessing other commands command line options ? or the result of their finalizations ?
The result after finalization.
For me, the install_cmd <-> build_clib problem shows that the "install" command does some work that is needed by other commands, and that this work has to be taken out of it.
A command performs something, and should not be used to set up an *environment*.
I am not sure I understand *environment* in that context. If you mean options, then each command has its set of options, and yes, that's one of the fundamental issues I often complain about in distutils. The build_clib/install interaction I described is typical, and we do this kind of stuff in pretty much every command.
Do you agree that this change will help you, and is possible without replacing Distutils ?
It would definitely be helpful to get access to any finalized command option from anywhere at any time, especially if it is supported for all the python versions we support (2.4 and above - I don't know what version distribute intends to support).

Ideally, it would solve a fair share of issues if this was done at the UI level as well, that is something like

    python setup.py build --compiler=mingw -> use mingw compiler for every compiled code

But that's one of the things which does not make much sense changing in distutils, though: it would break almost anyone's workflow.

David
On Thu, Nov 12, 2009 at 1:57 PM, David Cournapeau
The signature does not really matter I think. Since those parameters depend on user customization (through config files or command line flags), it could be something like:

    class FinalizedOptions:
        def get_build_dir_library(self):
            pass

        def get_build_dir_src(self):
            pass

        def get_prefix(self):
            ...

And a function like

    def get_finalized_options():
        # compute the finalized options
        # Cache it if expensive
        ...
        return FinalizedOptions()

could be used to obtain those options.
I was not referring to the finalized options of any commands, but to the build paths you are trying to get. E.g., like for the install paths, we can extract from the build commands some schemes that can be queried the same way:

    _BUILD_SCHEMES = {
        'unix' : {'buildlib': '$base/xxx' ...}
        ..
    }

and then:

    get_build_paths(scheme, vars)

[...]
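The build-scheme proposal above could be sketched roughly as follows. This is not an existing distutils API: the scheme keys, the path templates and the default variables are all invented for the example (the platform string mirrors the lib.linux-i686-2.6 directory mentioned earlier in the thread), and string.Template stands in for the $VARIABLE expansion mechanism.

```python
from string import Template

# Hypothetical build schemes; keys and layouts are illustrative only.
_BUILD_SCHEMES = {
    "unix": {
        "buildlib": "$base/lib.$plat",
        "buildtemp": "$base/temp.$plat",
        "buildscripts": "$base/scripts",
    },
}

# Hypothetical default variables, overridable by the caller.
_DEFAULT_VARS = {"base": "build", "plat": "linux-i686-2.6"}


def get_build_paths(scheme, vars=None):
    """Expand every path template of `scheme` with defaults plus overrides."""
    variables = dict(_DEFAULT_VARS)
    variables.update(vars or {})
    return {
        name: Template(template).substitute(variables)
        for name, template in _BUILD_SCHEMES[scheme].items()
    }


print(get_build_paths("unix"))
print(get_build_paths("unix", {"base": "/tmp/b"})["buildtemp"])
```

With a function of this shape, a third-party command could ask for the build directories directly instead of instantiating and finalizing a build_* command.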
A command performs something, and should not be used to set up an *environment*.
I am not sure I understand *environment* in that context. If you mean options, then each command has its set of options, and yes, that's one of the fundamental issues I often complain about in distutils. The build_clib/install interaction I described is typical, and we do this kind of stuff in pretty much every command.
By environment I mean that a command should be a standalone element that does something, and is used by distutils with this pattern:
    cmd = commandA(distribution)
    cmd.ensure_finalized()
    cmd.run()
If at some point you need to look at some of its finalized options in another commandB, it means that
- either commandA is doing more than it is supposed to do: it sets an *environment* that is reused by others
- or commandB should be a subcommand of commandA
- or commandA doesn't let you do what you want, and you are building a competing command
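The create / ensure_finalized / run pattern described above can be shown with a stripped-down stand-in. The method names follow the distutils Command convention (initialize_options, finalize_options, ensure_finalized, run), but the classes below are hypothetical, import nothing from distutils, and use a plain dict in place of the Distribution object.

```python
class Command:
    """Minimal stand-in for a distutils-style command (illustration only)."""

    def __init__(self, distribution):
        self.distribution = distribution
        self.finalized = False
        self.initialize_options()

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def ensure_finalized(self):
        # Finalization runs at most once, however often it is requested.
        if not self.finalized:
            self.finalize_options()
            self.finalized = True

    def run(self):
        raise NotImplementedError


class BuildLib(Command):
    def initialize_options(self):
        self.build_dir = None  # unset until finalization

    def finalize_options(self):
        if self.build_dir is None:
            base = self.distribution.get("build_base", "build")
            self.build_dir = base + "/lib"

    def run(self):
        log = self.distribution.setdefault("log", [])
        log.append("built into " + self.build_dir)


distribution = {"build_base": "out"}  # a dict standing in for Distribution
cmd = BuildLib(distribution)
cmd.ensure_finalized()
cmd.run()
print(distribution["log"])
```

Note that build_dir only gets its value inside finalize_options, which is why another command that wants it has to finalize this one first; that coupling is the problem being debated in the thread.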
Do you agree that this change will help you, and is possible without replacing Distutils ?
It would definitely be helpful to get access to any finalized command option from anywhere at anytime, especially if it is supported for all the python versions we support (2.4 and above - I don't know what version distribute intends to support).
The schemes currently, go as far as Python 2.2.
Ideally, it would solve a fair share of issues if this was done at the UI level as well, that is something like
python setup.py build --compiler=mingw -> use mingw compiler for every compiled code
As a matter of fact, having a "compiler" option in build, that would be reused by build_ext and other build_ commands, was discussed lately in an issue.
But that's one of the things which does not make much sense changing in distutils, though: it would break almost anyone's workflow.
Why that ? How would this break someone's workflow, since all the changes we have discussed so far are about moving code out of commands to have new APIs (and have those commands call them), and modifying some command options ?

Now there's another pattern emerging in this discussion: path options like --prefix can obviously be shared by several commands. So, based on the change that makes it possible to get some install and build schemes, what about having global options that will be used to build a global environment dictionary containing paths:

    $ python setup.py --var1 --var2 --var3 cmd1 cmd2 cmd3

Then var1, var2, etc. are put in a "vars" dict that is passed to get_paths(scheme, vars), and a global "options" dict is filled in the Distribution object (where scheme is a key for the current platform):

    Distribution.options = get_paths(scheme, vars)

A further improvement would be to be able to register a function that is called when the options dict is filled, allowing third-party code to work on the options.

The existing commands in distutils would be changed accordingly: they'll look into the global options dict for finalized options, and would fall back to the old pattern (local options + finalization).

Tarek
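A rough sketch of the three pieces of this proposal (a global options dict, a registration hook, and commands that fall back to their local options) might look like the following. All names here are hypothetical: the Distribution class, register_options_hook, fill_options, the local get_paths stand-in and the single 'unix' scheme are invented for illustration and are not existing distutils code.

```python
from string import Template

# Hypothetical install scheme used only by this sketch.
_SCHEMES = {
    "unix": {
        "purelib": "$prefix/lib/python/site-packages",
        "scripts": "$prefix/bin",
    },
}


def get_paths(scheme, variables):
    """Local stand-in for the get_paths(scheme, vars) call in the message."""
    return {key: Template(value).substitute(variables)
            for key, value in _SCHEMES[scheme].items()}


class Distribution:
    def __init__(self):
        self.options = {}
        self._hooks = []

    def register_options_hook(self, func):
        # Third-party code registers a callable that may adjust the options.
        self._hooks.append(func)

    def fill_options(self, scheme, variables):
        self.options = get_paths(scheme, variables)
        for hook in self._hooks:
            hook(self.options)


class InstallScripts:
    def __init__(self, distribution, install_dir=None):
        self.distribution = distribution
        self.install_dir = install_dir

    def finalize_options(self):
        # Prefer the shared dict; otherwise keep the old local-option pattern.
        shared = self.distribution.options.get("scripts")
        if shared is not None:
            self.install_dir = shared
        elif self.install_dir is None:
            self.install_dir = "/usr/local/bin"


dist = Distribution()
dist.register_options_hook(lambda opts: opts.update(extra="added by a hook"))
dist.fill_options("unix", {"prefix": "/opt/sci"})

cmd = InstallScripts(dist)
cmd.finalize_options()
print(cmd.install_dir, dist.options["extra"])
```

A command created against a Distribution whose options dict was never filled simply takes the fallback branch, which is how existing behavior would be preserved under these assumptions.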
On Thu, Nov 12, 2009 at 10:36 PM, Tarek Ziadé
On Thu, Nov 12, 2009 at 1:57 PM, David Cournapeau wrote:
[...]
The signature does not really matter I think. Since those parameters depend on user customization (through config files or command line flags), it could be something like:

    class FinalizedOptions:
        def get_build_dir_library(self):
            pass

        def get_build_dir_src(self):
            pass

        def get_prefix(self):
            ...

And a function like

    def get_finalized_options():
        # compute the finalized options
        # Cache it if expensive
        ...
        return FinalizedOptions()

could be used to obtain those options.
I was not referring to the finalized options of any commands, but to the build paths you are trying to get.
But those are linked. Build directories are customizable through pretty much any build_command (I see at least build_clib, build_ext and build on python 2.6). So :
get_build_paths(scheme, vars)
This should return different values depending on whether the in-place build is set up, or the develop command is called, etc...
If at some point you need to look at some of its finalized options in another commandB, it means that
- either commandA is doing more than it is supposed to do: it sets an *environment* that is reused by others
- or commandB should be a subcommand of commandA
- or commandA doesn't let you do what you want, and you are building a competing command
Those differences do not make much sense. If distutils wants command foo to be a subcommand of bar, and we need a different ordering, how can we do it ?

Concerning different options from different commands, I quickly looked at the build, install and build_ext command options (the ones from straight distutils). Very few options concern only the command they belong to. For example, the install command list is:

    --prefix            installation prefix
    --exec-prefix       (Unix only) prefix for platform-specific files
    --home              (Unix only) home directory to install under
    --user              install in user site-package '/Users/david/.local/lib/python2.6/site-packages'
    --install-base      base installation directory (instead of --prefix or --home)
    --install-platbase  base installation directory for platform-specific files (instead of --exec-prefix or --home)
    --root              install everything relative to this alternate root directory
    --install-purelib   installation directory for pure Python module distributions
    --install-platlib   installation directory for non-pure module distributions
    --install-lib       installation directory for all module distributions (overrides --install-purelib and --install-platlib)
    --install-headers   installation directory for C/C++ headers
    --install-scripts   installation directory for Python scripts
    --install-data      installation directory for data files
    --compile (-c)      compile .py to .pyc [default]
    --no-compile        don't compile .py files
    --optimize (-O)     also compile with optimization: -O1 for "python -O", -O2 for "python -OO", and -O0 to disable [default: -O0]
    --force (-f)        force installation (overwrite any existing files)
    --skip-build        skip rebuilding everything (for testing/debugging)
    --record            filename in which to record list of installed files

Maybe --record is not needed, but I can see a usage for all the other ones. For example, if I have a scons builder to compile/optimize python files, I need to know it in my scons command.
The schemes currently, go as far as Python 2.2.
cool
Ideally, it would solve a fair share of issues if this was done at the UI level as well, that is something like
python setup.py build --compiler=mingw -> use mingw compiler for every compiled code
As a matter of fact, having a "compiler" option in build, that would be reused by build_ext and other build_ commands, was discussed lately in an issue.
When I was considering this example, I had in mind that any option which can be customized in any build_ command should be customizable in build. Compiler, compiler options, in-place vs non in-place, etc... Pretty much any option.
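One way to picture "anything customizable in a build_ command should be customizable in build" is a propagation step in which sub-commands inherit every option they did not set themselves. The sketch below is hypothetical and self-contained; the Options class and the inherit_undefined helper are invented names, and the option values are only examples.

```python
class Options:
    """Holds one command's options; None means 'not set by the user'."""

    def __init__(self, **values):
        self.__dict__.update(values)


def inherit_undefined(child, parent, names):
    """Copy each named option from parent to child unless child already set it."""
    for name in names:
        if getattr(child, name, None) is None:
            setattr(child, name, getattr(parent, name))


# 'build' carries the user's global choices.
build = Options(compiler="mingw", inplace=False)

# build_ext set nothing; build_clib explicitly asked for another compiler.
build_ext = Options(compiler=None, inplace=None)
build_clib = Options(compiler="msvc", inplace=None)

for sub in (build_ext, build_clib):
    inherit_undefined(sub, build, ("compiler", "inplace"))

print(build_ext.compiler, build_clib.compiler, build_clib.inplace)
```

The explicit per-command value wins over the inherited one, so existing command lines that set options on a specific build_ command would keep their meaning under this scheme.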
Why that ? How this would break someone's workflow, since all the changes we have discussed so far is moving code out of commands to have new APIs (and have those commands call it), and modifying some command options.
One of the problems is dealing with inconsistencies between subcommands. Modifying their behavior has many corner cases. One of the big wins with numscons was that the build was handled by only one command, so the inconsistencies disappear.
Now there's another pattern emerging in this discussion : paths options like --prefix can obviously be shared by several commands.
So, based on the change that makes it possible to get some install and build schemes, what about having global options that will be used to build a global environment dictionary containing paths:
$ python setup.py --var1 --var2 --var3 cmd1 cmd2 cmd3
then, var1, and var2 etc.. are put in a "vars" dict, that is passed to get_paths(scheme, vars).
As mentioned above, please keep in mind that many of those options see their behavior modified (directly or indirectly) through command options. Think about the inplace option, the develop command, etc...

David
On Thu, Nov 12, 2009 at 3:06 PM, David Cournapeau
I was not referring to the finalized options of any commands, but to the build paths you are trying to get.
But those are linked. Build directories are customizable through pretty much any build_command (I see at least build_clib, build_ext and build on python 2.6).
Yes, like install can customize the install paths with --prefix. But in the end, you just give them root paths, or things like that, and they return build paths. So, as we said for the install command, this mechanism can be extracted and put in an API that takes variables to generate build paths, and that can be used by those commands. Meaning that you won't need them anymore to get your build paths.
So :
get_build_paths(scheme, vars)
This should return different values depending on whether the in-place build is set up, or the develop command is called, etc...
yes, like platlib and purelib. Build schemes will have several paths indeed. we need to list them.
If at some point you need to look at some of its finalized options in another commandB, it means that
- either commandA is doing more than it is supposed to do: it sets an *environment* that is reused by others
- or commandB should be a subcommand of commandA
- or commandA doesn't let you do what you want, and you are building a competing command
Those differences do not make much sense. If distutils wants command foo to be a subcommand of bar, and we need a different ordering, how can we do it ?
I don't understand this question. Why do you need to order two commands ? This pattern is useful only if a command uses what another command has produced. For example:

    $ python setup.py sdist upload

-> upload requires sdist to be run so it can upload an archive

If ordering means that you need a command to be run so you can get its finalized options for your own command, something is wrong. It means that you are unable to build a command that is standalone, and that instead of using an API or the Distribution class on its own, you need to work with another command. That is what we need to change, by moving that code out of the commands so you can reuse it.

[..]
$ python setup.py --var1 --var2 --var3 cmd1 cmd2 cmd3
then, var1, and var2 etc.. are put in a "vars" dict, that is passed to get_paths(scheme, vars).
As mentioned above, please keep in mind that many of those options see their behavior modified (directly or indirectly) through commands options. Think about inplace option, develop command, etc...
What's an option behavior ? You mean the value is changed ? The develop command is not part of Distutils and it's doing a big hack on calling the build_ext command and changing the inplace option on the fly like this. It should instead, like what the upload command does, leave it up to any build command to build the extensions in place. And just work on adding a .pth file for example, that adds the source to the path. That would prevent troubles.
David Cournapeau wrote:
Tarek Ziadé wrote:
I want to improve Distutils, but not for the price of a complete drop. I don't think the edge cases we are discussing are worth it, and I still fail to see why we can't work them out in the context of the existing tool.
Well, the scientific python edge case is what Guido asked about in the original thread. That's our answer: we don't think something which is not fundamentally different than distutils will solve our problems. If you think that what we considered as fundamental issues are corner cases and implementation details, we can only agree to disagree, I guess.
Now, it is legitimate to dismiss our use-cases as edge cases and not worthwhile to worry about. We are certainly not as big as, say, the web development community. But the question was directed at us, as I understood it :)
FWIW, I am a web developer who agrees wholeheartedly that trying to fix distutils incrementally is going to be way harder / more painful, and take *much* longer, than writing something which abandons the "100% backward compatibility" goal completely.

The phantom "distutils API" doesn't exist, which means that almost any improvement (and I don't deny that such improvements are possible) potentially breaks substantial installed codebases. In this sense, fixing distutils is like working without tests, because you don't know until too late that you've broken something.

If we just don't call the new thing "distutils", we get to keep "100% backward compatibility" for free: people can begin switching to the new thing incrementally / opportunistically, but their existing code doesn't blow up.

Tres.
--
Tres Seaver +1 540-429-0999 tseaver@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
On Thu, Nov 12, 2009 at 6:55 PM, Tres Seaver
The phantom "distutils API" doesn't exist, which means that almost any improvement (and I don't deny that such improvements are possible) potentially breaks substantial installed codebases. In this sense, fixing distutils is like working without tests, because you don't know until too late that you've broken something.
First of all, I have not said that Distutils will stay backward compatible. If compatibility is broken at some point, there will be a deprecation process, that's all. And a backport will be provided for the previous Python versions. But we can't talk about breaking compatibility until we have a new design in our hands that fixes some problems and requires breaking compatibility.

Next, the distutils code base now has around 80% test coverage (depending on your platform). Maybe you are not aware of it, but I have been working on raising the test coverage during the last year (it went from 20% to 80%).

Next, there are buildbots now that build projects like NumPy to make sure the trunk still works. Moving some code from the commands to some new APIs will not blow up those commands, because they are tested: the old tests will still have to pass once some code has been moved. That's the whole point of test coverage, and a basic rule in refactoring code. So I don't really understand why you are saying that we will go blind in refactoring some stuff.

Last, I am not listening anymore to these "distutils must be fully replaced" answers. They are going nowhere, because Distutils will not be dropped, for the reasons I've already mentioned earlier. I'll focus on the design work we are doing with David now to enhance Distutils for his use cases. As a matter of fact, we have made progress in sharing knowledge and trying to find designs that fit our brains, and that's the only thing that matters to me at this point. And if it doesn't make it into Distutils 2.7/3.2, I expect Distribute to be the incubator of those particular changes.

If some people want to build a brand new shiny tool from scratch, good luck then. If some others want to help with Distutils, welcome then.

Tarek
At 12:36 AM 11/12/2009 -0600, Robert Kern wrote:
Sorry, I edited out the bit at the last minute where I explained that it would be great to have a centralized option-managing object such that any command can ask what options were set on any other regardless of the dependencies between commands.
Actually, that such a thing is needed in the first place is evidence of one of the deepest design flaws in the distutils -- the fact that things which are fundamentally system or project-level configuration values are defined in terms of options to commands! The distutils is a definite case of "superficial design flaws being so annoying as to keep most people from noticing the fundamental design flaws". ;-)
"P.J. Eby"
The distutils is a definite case of "superficial design flaws being so annoying as to keep most people from noticing the fundamental design flaws". ;-)
Douglas Adams, RIP.

--
“Nature hath given men one tongue but two ears, that we may hear from others twice as much as we speak.”
-- Epictetus, _Fragments_

Ben Finney
participants (10)
- Ben Finney
- Brian Granger
- David Cournapeau
- David Cournapeau
- Glyph Lefkowitz
- Lennart Regebro
- P.J. Eby
- Robert Kern
- Tarek Ziadé
- Tres Seaver