I originally added CVS and Subversion support to setuptools in order to get past the pain of the distutils' MANIFEST system. I used to use MANIFEST.in, but it was a royal pain to get right, and I pretty much always forgot to add stuff to it. The most common problem when I shipped a source distribution was that the MANIFEST was screwed up, such that a CVS checkout worked fine but a source distribution would break. Ugh.
So, the CVS/Subversion support for setuptools automatically makes your MANIFEST include anything under revision control, whether you have a MANIFEST.in or not. If you don't have a MANIFEST.in, the MANIFEST is built every time you run an sdist, so it's always up-to-date. Ah, bliss!
But all is not happy in MANIFESTville. It turns out that the bdist_rpm command expects to build an sdist, for reasons impenetrable to me. If you build an RPM from a source checkout, everything is fine because setuptools can auto-discover your files and build the MANIFEST for the new sdist. But if you build an RPM *from* an sdist, it's a no-go.
In addition, many folks have been asking for this autodetection to cover package data files as well. Why, they reasonably ask, must I specify each and every file to be included in a package, when the system already knows what files I have in revision control, or which is covered by my MANIFEST.in?
The reason I've been avoiding adding this feature, however, is because of the first issue; when you make an sdist, you lose that additional metadata, so it would become impossible to build *any* binary from an sdist, not just RPMs. Until recently, that issue seemed insurmountable.
So today, after looking over the issue a bit, I think I have a plan for dealing with MANIFEST:
* Change the MANIFEST format to be platform-independent (currently it contains OS-specific path separators)
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
* Disable all the options that allow user control over MANIFEST generation, including pruning, defaults, changing the filenames, etc.
* Use the MANIFEST data (along with revision control info) not only for producing source distributions, but also to determine what files should be considered "package data", if the user passes an 'include_package_data=True' keyword to setup().
The net result would be a single source for what constitutes "the distribution contents", in the sense of files that are not directly part of the distutils build process. For files that are built automatically in some way but should be included in source distributions or as package data, you would still have to put them in MANIFEST.in. But anything that was under CVS or Subversion would be handled automatically, and you wouldn't have to duplicate data between MANIFEST.in and setup(package_data={...}).
I'm also thinking that most of the MANIFEST logic could and should move to the Distribution class, since the data will be used by multiple commands. Thus, the sdist command could just ask the Distribution for the MANIFEST and get it, as would the commands that copy package data files to the build directory.
I suspect the most controversial parts of this idea are:
* Disabling all user control of MANIFEST
* Forcibly including MANIFEST and MANIFEST.in in source distributions
* Making MANIFEST be always platform-independent
When Googling the issues around MANIFEST, I noticed that the idea of having MANIFEST or MANIFEST.in included automatically has been repeatedly shot down here over the years. However, if I followed the logic put forth on those occasions, I would never have implemented revision control support in the first place, so I guess if I'm in for a penny, I might as well be in for a pound, as they say.
I couldn't find any argument one way or the other about the manifest-generation options, nor any reasons why MANIFEST needs to remain platform-specific, so I presume the options are just YAGNI and the format was just an implementation accident.
Likewise, as far as I can tell there is no reason for *not* regenerating a MANIFEST whenever you need one, so the current behavior of only building one when MANIFEST.in changes or you use --force-manifest, seems like a premature optimization. Or maybe it wasn't an excessive optimization when the distutils were created, but it's not as if it's going to save you much time compared to the actual archive building process today.
I'm thinking that basically --force-manifest would become a no-op in setuptools, in the sense that you won't be able to *stop* the MANIFEST from being built every single time. --manifest-only would still be possible. --manifest and --template would have to be rejected, however, because the standard name is needed for MANIFEST to be re-read when you build stuff from the produced sdist.
--no-defaults would be ignored, except for a warning. If you don't want the defaults, you can always start your MANIFEST.in with an exclude pattern to exclude absolutely everything already included. There shouldn't be two ways to do the same thing, especially not one that you can use on the command line to mess things up in a non-repeatable fashion! Likewise --no-prune, because that's a similar recipe for disaster.
A lot of these ideas are potential backward compatibility problems, so we'll have to see how they play out in setuptools before considering them for addition to the distutils. My guess, however, is that most prolific Python developers want to spend their time writing code, not writing and debugging MANIFEST.in files, and that fact has been responsible for a lot of setuptools uptake so far. I've been seeing a lot of projects that use setuptools for no apparent reason other than it makes writing the setup script a little easier, due to find_packages(), package_data, and the lack of need for a MANIFEST when source control is involved. These are qualities I'd like to extend further, even at the cost of some flexibility.
Heck, most of the distutils' flaws lie in their extreme versatility. You can tell each individual command that it's using different build or distribution directories, for example, and in the process completely foul up your builds. What's more, every distutils tutorial may well end up giving people different instructions as to the "best" way to lay out a project directory. If there's ever a "distutils 2", it needs to become dictator-ware and tell you exactly what the One Obvious Way is. If everything *had* to be a particular way, then changing how the distutils work would actually be possible, whereas now, it's bloody hard to even figure out which of the nine billion ways to do it are actually in use.
Okay, off the soapbox now. :) Does anybody see any issues with this that I'm missing, with respect to using the MANIFEST/FileList machinery to control sdist and package data, or my implementation plans for doing so? Thanks.
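For readers unfamiliar with the revision-control discovery this message relies on, here is a minimal sketch of the idea for CVS only. It is not setuptools' actual code; the function name `cvs_controlled_files` is hypothetical, and the sketch assumes the usual `CVS/Entries` layout where file lines look like `/name/revision/timestamp/options/tag` and directory lines start with `D/`.

```python
import os


def cvs_controlled_files(top):
    """Yield '/'-separated paths of files listed in CVS/Entries under top."""
    pending = [""]  # directories (relative to top) still to be scanned
    while pending:
        rel_dir = pending.pop()
        entries = os.path.join(top, rel_dir, "CVS", "Entries")
        if not os.path.isfile(entries):
            continue  # this directory is not under CVS control
        with open(entries) as handle:
            for line in handle:
                fields = line.rstrip("\n").split("/")
                if len(fields) < 2 or not fields[1]:
                    continue  # e.g. a bare "D" marker line
                name = fields[1]
                rel_path = f"{rel_dir}/{name}" if rel_dir else name
                if fields[0] == "D":
                    pending.append(rel_path)  # subdirectory: scan it too
                else:
                    yield rel_path
```

A MANIFEST built this way lists exactly what the checkout knows about, which is also why the information is lost once the files are copied into an sdist without the `CVS` directories.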
Hi Phillip, you asked for it, so I'm giving you some flaming ;-) ... Phillip J. Eby wrote:
I originally added CVS and Subversion support to setuptools in order to get past the pain of the distutils' MANIFEST system. I used to use MANIFEST.in, but it was a royal pain to get right, and I pretty much always forgot to add stuff to it. The most common problem when I shipped a source distribution was that the MANIFEST was screwed up, such that a CVS checkout worked fine but a source distribution would break. Ugh.
So, the CVS/Subversion support for setuptools automatically makes your MANIFEST include anything under revision control, whether you have a MANIFEST.in or not. If you don't have a MANIFEST.in, the MANIFEST is built every time you run an sdist, so it's always up-to-date. Ah, bliss!
But all is not happy in MANIFESTville. It turns out that the bdist_rpm command expects to build an sdist, for reasons impenetrable to me. If you build an RPM from a source checkout, everything is fine because setuptools can auto-discover your files and build the MANIFEST for the new sdist. But if you build an RPM *from* an sdist, it's a no-go.
I don't understand what you mean with "no-go" - the current system works just fine if you include the MANIFEST file in the sdist.
In addition, many folks have been asking for this autodetection to cover package data files as well. Why, they reasonably ask, must I specify each and every file to be included in a package, when the system already knows what files I have in revision control, or which is covered by my MANIFEST.in?
The reason I've been avoiding adding this feature, however, is because of the first issue; when you make an sdist, you lose that additional metadata, so it would become impossible to build *any* binary from an sdist, not just RPMs. Until recently, that issue seemed insurmountable.
That's simply not true. You have to include the MANIFEST file in the sdist and then everything is fine.
So today, after looking over the issue a bit, I think I have a plan for dealing with MANIFEST:
* Change the MANIFEST format to be platform-independent (currently it contains OS-specific path separators)
-0.5
You are missing an important point: MANIFEST files can be built using tools outside distutils and external package building tools may require these to be platform dependent. Distutils itself is happy with posix style separators on all platforms.
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
-1 on always building MANIFEST. This would miss the point of managing MANIFEST files independently of your package files, e.g. using Makefiles or other tools dealing with file dependencies, checkouts, etc.
* Disable all the options that allow user control over MANIFEST generation, including pruning, defaults, changing the filenames, etc.
-1
Again, you are forgetting that MANIFEST files serve a purpose and are external to the distutils process for a reason. You are free to have distutils build your MANIFEST files from MANIFEST.in files, have distutils command auto-generate them, or use external programs triggered by Makefiles or similar distribution building processes to generate them.
In your world, everything is done within distutils, so it's understandable that you'd like to get rid of the external nature of MANIFEST files, but please keep in mind that these features are being used and removing the logic would seriously break things for packagers relying on other mechanisms to build their MANIFEST files. Simply overwriting the MANIFEST file every time you run the sdist command would break such use.
* Use the MANIFEST data (along with revision control info) not only for producing source distributions, but also to determine what files should be considered "package data", if the user passes an 'include_package_data=True' keyword to setup().
Isn't that already the case ? I mean you can put anything you like into MANIFEST and it will be included in the sdist.
The net result would be a single source for what constitutes "the distribution contents", in the sense of files that are not directly part of the distutils build process. For files that are built automatically in some way but should be included in source distributions or as package data, you would still have to put them in MANIFEST.in. But anything that was under CVS or Subversion would be handled automatically, and you wouldn't have to duplicate data between MANIFEST.in and setup(package_data={...}).
Again, not everybody is using distribution processes built around CVS or Subversion. Leaving aside that there are quite a few other SCM tools out there, you also have the case where you create distributions from plain directories (which is what MANIFEST.in and MANIFEST are targeting).
I'm also thinking that most of the MANIFEST logic could and should move to the Distribution class, since the data will be used by multiple commands. Thus, the sdist command could just ask the Distribution for the MANIFEST and get it, as would the commands that copy package data files to the build directory.
Wait: MANIFEST defines what goes into the sdist - not an arbitrary (binary) distribution.
I suspect the most controversial parts of this idea are:
* Disabling all user control of MANIFEST
* Forcibly including MANIFEST and MANIFEST.in in source distributions
* Making MANIFEST be always platform-independent
When Googling the issues around MANIFEST, I noticed that the idea of having MANIFEST or MANIFEST.in included automatically has been repeatedly shot down here over the years. However, if I followed the logic put forth on those occasions, I would never have implemented revision control support in the first place, so I guess if I'm in for a penny, I might as well be in for a pound, as they say.
I'm not sure what you mean by "having MANIFEST[.in] included". It would certainly make sense to have the MANIFEST[.in] files automatically be added as default in sdist.py and I'd be +1 on that (even though it never was an issue for me as I always include them in the MANIFEST file).
I couldn't find any argument one way or the other about the manifest-generation options, nor any reasons why MANIFEST needs to remain platform-specific, so I presume the options are just YAGNI and the format was just an implementation accident.
See above.
Likewise, as far as I can tell there is no reason for *not* regenerating a MANIFEST whenever you need one, so the current behavior of only building one when MANIFEST.in changes or you use --force-manifest, seems like a premature optimization. Or maybe it wasn't an excessive optimization when the distutils were created, but it's not as if it's going to save you much time compared to the actual archive building process today.
See above.
I'm thinking that basically --force-manifest would become a no-op in setuptools, in the sense that you won't be able to *stop* the MANIFEST from being built every single time. --manifest-only would still be possible. --manifest and --template would have to be rejected, however, because the standard name is needed for MANIFEST to be re-read when you build stuff from the produced sdist.
--no-defaults would be ignored, except for a warning. If you don't want the defaults, you can always start your MANIFEST.in with an exclude pattern to exclude absolutely everything already included. There shouldn't be two ways to do the same thing, especially not one that you can use on the command line to mess things up in a non-repeatable fashion! Likewise --no-prune, because that's a similar recipe for disaster.
These options are meant for people who don't have a MANIFEST.in file to begin with or just quickly want to build an sdist with parts of the whole distribution or an extended version (e.g. for testing or upgrading).
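The "exclude everything, then include what you want" approach mentioned in the quoted text can be illustrated with a small sketch. This is not the distutils FileList implementation: `apply_template` is a hypothetical helper that understands only the `include`, `exclude` and `global-exclude` commands, and it assumes '/'-separated paths.

```python
import fnmatch
import posixpath


def _matches_anchored(path, pattern):
    """Match segment by segment so that '*' never crosses a '/'."""
    path_parts, pattern_parts = path.split("/"), pattern.split("/")
    return len(path_parts) == len(pattern_parts) and all(
        fnmatch.fnmatchcase(part, pat)
        for part, pat in zip(path_parts, pattern_parts)
    )


def apply_template(default_files, all_files, template_lines):
    """Apply a tiny subset of MANIFEST.in commands to a starting file list."""
    selected = list(default_files)
    for line in template_lines:
        words = line.split()
        if not words or words[0].startswith("#"):
            continue  # blank line or comment
        command, patterns = words[0], words[1:]
        for pattern in patterns:
            if command == "include":
                selected += [f for f in all_files
                             if _matches_anchored(f, pattern)
                             and f not in selected]
            elif command == "exclude":
                selected = [f for f in selected
                            if not _matches_anchored(f, pattern)]
            elif command == "global-exclude":
                # Unanchored: compare against the file name in any directory.
                selected = [f for f in selected
                            if not fnmatch.fnmatchcase(posixpath.basename(f),
                                                       pattern)]
            else:
                raise ValueError(f"unsupported in this sketch: {command}")
    return selected
```

With a template that begins with `global-exclude *`, the defaults are emptied before any `include` line runs, which is the effect --no-defaults gives from the command line.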
A lot of these ideas are potential backward compatibility problems, so we'll have to see how they play out in setuptools before considering them for addition to the distutils. My guess, however, is that most prolific Python developers want to spend their time writing code, not writing and debugging MANIFEST.in files, and that fact has been responsible for a lot of setuptools uptake so far. I've been seeing a lot of projects that use setuptools for no apparent reason other than it makes writing the setup script a little easier, due to find_packages(), package_data, and the lack of need for a MANIFEST when source control is involved. These are qualities I'd like to extend further, even at the cost of some flexibility.
Heck, most of the distutils' flaws lie in their extreme versatility.
That comment is just silly: distutils is so powerful because of its versatility. You wouldn't have been able to build setuptools without this versatility. Just because you don't like some of this flexibility doesn't mean that distutils is broken in some way.
You can tell each individual command that it's using different build or distribution directories, for example, and in the process completely foul up your builds. What's more, every distutils tutorial may well end up giving people different instructions as to the "best" way to lay out a project directory. If there's ever a "distutils 2", it needs to become dictator-ware and tell you exactly what the One Obvious Way is. If everything *had* to be a particular way, then changing how the distutils work would actually be possible, whereas now, it's bloody hard to even figure out which of the nine billion ways to do it are actually in use.
That's your point of view - I've never had a hard time adjusting distutils to whatever I wanted it to do. After you get used to the way things are handled in distutils, extending it is often enough really easy and would be much harder in your One Obvious Way to do it (unless you had a time-machine, zoom to 2042 and then take all possible ways to build distributions into account on your way back to 2005 ;-). You are free to develop setuptools into your own little vision of distutils 2 - and that's one of distutils' strengths !
Okay, off the soapbox now. :) Does anybody see any issues with this that I'm missing, with respect to using the MANIFEST/FileList machinery to control sdist and package data, or my implementation plans for doing so? Thanks.
I think I gave you some more hints as to why MANIFEST[.in] works the way it does. Adding these files as defaults to the set of sdist files sounds like a good idea (I don't remember discussions about this, so I may be wrong).
Cheers,
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Nov 16 2005)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::
M.-A. Lemburg wrote:
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
-1 on always building MANIFEST.
This would miss the point of managing MANIFEST files independently of your package files, e.g. using Makefiles or other tools dealing with file dependencies, checkouts, etc.
How about always build it, automatically, if some particular option is not passed to setup()? This would only apply to packages using setuptools, not distutils, and there's other things that existing distutils packages might have to tweak to move to setuptools. Maybe it would be as simple as a keyword argument that refers to a routine to build the MANIFEST file; and if you build MANIFEST by hand, then you just make the function a no-op.
Also, disable all command-line options that control when/if/how MANIFEST is generated; these seem peculiar to me. MANIFEST generation seems like package metadata, not something particular to your build, and command-line options seem like a bad way to control that. It's a little fuzzy, though, since command-line options and setup.cfg are equivalent, and setup.cfg options often feel like package data. So... I don't know what the middle ground is there. Either way, these aren't things that seem like they should be regularly tweaked. I don't know, I don't have any packages with complex (or even simple) build processes, so I only know that the current system is mysterious to me, and almost but doesn't always work without my intervention.
I think Phillip also mentioned using MANIFEST for package_data? At least insofar as the packages and the MANIFEST intersect, this seems like a good idea.
--
Ian Bicking / ianb@colorstudy.com / http://blog.ianbicking.org
Ian Bicking wrote:
M.-A. Lemburg wrote:
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
-1 on always building MANIFEST.
This would miss the point of managing MANIFEST files independently of your package files, e.g. using Makefiles or other tools dealing with file dependencies, checkouts, etc.
How about always build it, automatically, if some particular option is not passed to setup()? This would only apply to packages using setuptools, not distutils, and there's other things that existing distutils packages might have to tweak to move to setuptools. Maybe it would be as simple as a keyword argument that refers to a routine to build the MANIFEST file; and if you build MANIFEST by hand, then you just make the function a no-op.
Why can't Phillip just add the explicit build command (--force-manifest) to his setuptools ?
Also, disable all command-line options that control when/if/how MANIFEST is generated; these seem peculiar to me. MANIFEST generation seems like package metadata, not something particular to your build, and command-line options seem like a bad way to control that. It's a little fuzzy, though, since command-line options and setup.cfg are equivalent, and setup.cfg options often feel like package data. So... I don't know what the middle ground is there. Either way, these aren't things that seem like they should be regularly tweaked. I don't know, I don't have any packages with complex (or even simple) build processes, so I only know that the current system is mysterious to me, and almost but doesn't always work without my intervention.
Phillip can do the same to his setuptools - he'd just have to subclass the sdist command.
I think Phillip also mentioned using MANIFEST for package_data? At least insofar as the packages and the MANIFEST intersect, this seems like a good idea.
Again, this depends on the distribution format. It may make a lot of sense for setuptools, but there are other formats where the source distribution file list and the files to be included in a binary distribution are two completely separate things.
For example, bdist_rpm actually installs the package to a temporary directory and then takes the list of installed files as basis for the binary distribution file list. Since the installation can depend on other external factors (e.g. libs being available or not), the file list is not necessarily static, nor does it have to match the file layout you have in MANIFEST (since the install commands may move files to different target directories).
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Nov 16 2005)
At 10:29 AM 11/16/2005 -0600, Ian Bicking wrote:
M.-A. Lemburg wrote:
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
-1 on always building MANIFEST. This would miss the point of managing MANIFEST files independently of your package files, e.g. using Makefiles or other tools dealing with file dependencies, checkouts, etc.
How about always build it, automatically, if some particular option is not passed to setup()? This would only apply to packages using setuptools, not distutils, and there's other things that existing distutils packages might have to tweak to move to setuptools.
Especially in cases where they have complex custom builds. People with stuff as sophisticated as SciPy, mx* stuff, Twisted, Zope, are not people I expect to move to setuptools any time soon. (I've heard rumors about Zope using it for Products, but reasonably speaking I think it's going to take some time before the Zope core could be moved to it, probably by adapting the zpkg tools to generate setuptools-based setup scripts.) My thinking here is more about the long term, and mostly setuptools is "disruptive technology" anyway, because it's shifting the balance of distribution power towards smaller and simpler tools. One of the reasons complex packages are dominant today is because they reduce the number of things you have to install. Setuptools makes the number of things you have to install a lot less relevant, so in the "new world", smaller packages will tend to dominate. In order to effect that shift, it's got to become a lot easier for a complete distutils novice to turn out a package.
Maybe it would be as simple as a keyword argument that refers to a routine to build the MANIFEST file; and if you build MANIFEST by hand, then you just make the function a no-op.
Now *that's* an interesting idea. I'll have to give that one some thought. Thanks!
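To make the suggestion concrete, here is one way such a hook could look. Nothing like this exists in setuptools or the distutils as far as this thread says; `refresh_manifest`, `default_manifest_builder` and `keep_existing_manifest` are hypothetical names, and the default builder simply walks the project directory instead of consulting revision control.

```python
import os


def default_manifest_builder(project_dir):
    """Hypothetical default: list every regular file under project_dir."""
    found = []
    for dirpath, dirnames, filenames in os.walk(project_dir):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), project_dir)
            if rel != "MANIFEST":  # the manifest does not list itself here
                found.append(rel.replace(os.sep, "/"))
    return found


def keep_existing_manifest(project_dir):
    """No-op builder for projects that maintain MANIFEST by other means."""
    return None


def refresh_manifest(project_dir, manifest_builder=default_manifest_builder):
    """Rewrite MANIFEST unless the builder opts out by returning None."""
    manifest_path = os.path.join(project_dir, "MANIFEST")
    files = manifest_builder(project_dir)
    if files is not None:
        with open(manifest_path, "w") as handle:
            handle.write("\n".join(files) + "\n")
    # Either way, the file on disk is the single source of truth.
    with open(manifest_path) as handle:
        return handle.read().splitlines()
```

A project that generates MANIFEST from a Makefile would pass `keep_existing_manifest`, so the default of "always rebuild" never overwrites a hand-managed file.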
Also, disable all command-line options that control when/if/how MANIFEST is generated; these seem peculiar to me. MANIFEST generation seems like package metadata, not something particular to your build, and command-line options seem like a bad way to control that. It's a little fuzzy, though, since command-line options and setup.cfg are equivalent, and setup.cfg options often feel like package data. So... I don't know what the middle ground is there. Either way, these aren't things that seem like they should be regularly tweaked. I don't know, I don't have any packages with complex (or even simple) build processes, so I only know that the current system is mysterious to me, and almost but doesn't always work without my intervention.
Marc's explanation of these options is actually pretty reasonable as a projected use case. If there are any *actual* field uses, I'll consider trying to support them. Of course, if those use cases are ones where there's no reasonable likelihood of the project moving to setuptools, it doesn't make sense to support them in setuptools. It would be more a matter of, if some of the relevant setuptools stuff makes it back into the distutils, it would need that additional degree of support.
I think Phillip also mentioned using MANIFEST for package_data? At least insofar as the packages and the MANIFEST intersect, this seems like a good idea.
Yeah, it would make your find_package_data() less necessary. The reason I haven't put it in yet, is because I've been pondering the issue of how to integrate these notions of "the distribution contents". While I can envision scenarios where you'd want certain data files to get built on the target user's machine, there's still the option of manually or programmatically specifying those files in package_data. (That is, I intend to make include_package_data be supplemental to the existing package_data option, not a replacement for it, except in the sense that for simple cases you won't need package_data any more.)
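As a rough illustration of "supplemental, not a replacement": the sketch below derives package data from a list of manifest paths and merges it with an explicitly supplied package_data mapping. The function name and the rule "any non-.py file inside a package directory counts as data" are assumptions made for the example, not a description of how setuptools implements include_package_data.

```python
def package_data_from_manifest(manifest_files, packages, explicit=None):
    """Map each package to its non-Python files listed in the manifest.

    manifest_files: '/'-separated paths relative to the project root.
    packages: dotted package names, e.g. ["pkg", "pkg.sub"].
    explicit: optional package_data-style dict that is kept as-is.
    """
    result = {pkg: list(pats) for pkg, pats in (explicit or {}).items()}
    # Longest directory first, so a file is credited to the innermost package.
    by_dir = sorted(((pkg.replace(".", "/"), pkg) for pkg in packages),
                    key=lambda item: len(item[0]), reverse=True)
    for path in manifest_files:
        if path.endswith(".py"):
            continue  # modules are handled by the normal build, not as data
        for pkg_dir, pkg in by_dir:
            if path.startswith(pkg_dir + "/"):
                rel = path[len(pkg_dir) + 1:]
                entries = result.setdefault(pkg, [])
                if rel not in entries:
                    entries.append(rel)
                break
    return result
```

Files that only exist after a build step on the target machine never appear in the manifest, so they would still have to come in through the `explicit` mapping.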
At 11:49 AM 11/16/2005 +0100, M.-A. Lemburg wrote:
I don't understand what you mean with "no-go" - the current system works just fine if you include the MANIFEST file in the sdist.
But it's not included, and you have to know to include it - and people who previously requested here that MANIFEST and MANIFEST.in be included in the manifest were shot down under the claim that including these defaults would be "too much magic".
The reason I've been avoiding adding this feature, however, is because of the first issue; when you make an sdist, you lose that additional metadata, so it would become impossible to build *any* binary from an sdist, not just RPMs. Until recently, that issue seemed insurmountable.
That's simply not true. You have to include the MANIFEST file in the sdist and then everything is fine.
I'm speaking about the context of setuptools, where the typical user has no MANIFEST.in, because they're using Subversion or CVS and therefore don't need one.
* Change the MANIFEST format to be platform-independent (currently it contains OS-specific path separators)
-0.5
You are missing an important point: MANIFEST files can be built using tools outside distutils and external package building tools may require these to be platform dependent.
One reason I posted was to find out what specific uses people actually had for such things. Do the uses actually exist? What are they used for?
One important point: on POSIX-y platforms (i.e. virtually everything but Windows), there's no difference between distutils paths and system paths. So, what external tools build or read MANIFEST files on Windows, or any other platform that doesn't accept '/' as a separator? Also, '/' *is* a valid path separator on Windows, so if your MANIFEST processing is done in Python, a platform-independent format won't have any effect. (Indeed, the only reason to actually *change* the format is so that sdist files built *on Windows* would be usable on other platforms! Other platforms would never see a difference.)
I know that in principle, somebody somewhere may have some tool that would break if MANIFEST changed its format. My question is, who and how many? If there's some widely-used tool that does this, then it's reasonable to take it into account. If it's just a theoretical possibility, it's not worth much concern. If there are a handful of people with a hard-to-change setup, it's somewhere in between.
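The format change under discussion amounts to very little code. The following sketch uses two hypothetical helpers, `to_portable` and `to_native`, and assumes that a backslash in a MANIFEST entry is always a Windows separator (on POSIX a backslash is technically a legal file name character, so this is a simplification).

```python
import os


def to_portable(manifest_line):
    """Rewrite a MANIFEST entry with '/' separators, whatever OS wrote it."""
    return manifest_line.replace("\\", "/")


def to_native(portable_line):
    """Turn a '/'-separated entry back into a path for the current OS."""
    return os.path.join(*portable_line.split("/"))
```

On POSIX systems both functions leave an ordinary entry unchanged, which matches the observation above that only sdists built on Windows would look any different.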
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
-1 on always building MANIFEST.
This would miss the point of managing MANIFEST files independently of your package files, e.g. using Makefiles or other tools dealing with file dependencies, checkouts, etc.
Please point me to some examples of this, especially one that can't simply generate a MANIFEST.in instead.
* Disable all the options that allow user control over MANIFEST generation, including pruning, defaults, changing the filenames, etc.
-1
Again, you are forgetting that MANIFEST files serve a purpose and are external to the distutils process for a reason. You are free to have distutils build your MANIFEST files from MANIFEST.in files, have distutils command auto-generate them, or use external programs triggered by Makefiles or similar distribution building processes to generate them.
Examples? You keep saying that people *can* do these things, but that's not anything like the same as saying that any significant number of people *actually* do them. Frankly, most people I've encountered who are doing Python software development don't know how to get a basic setup.py to work right, and feel the distutils are way too complicated, underdocumented, or just plain broken, because they don't feel like they can control them. OSAF, for example, has some developers who are very smart about creating build processes. Smart enough to be *able* to do the kinds of things you're describing. But they sure as heck don't use the distutils to *actually* do them, because from their perspective the distutils is a big pile of broken undocumentedness. If the distutils are so frustrating to such smart developers, there's something wrong. Thus, I find these super-custom processes you're talking about highly implausible, because the only people who could implement them are the people with a strong knowledge of the distutils -- an incredibly rare breed of person, in other words. Most people just want this stuff to work, and they don't want to have to learn *how* it works. They have better things to do with their time. They want a build *tool*, not a build library or build framework. Most people don't know MANIFEST exists, until they get bitten by the need to have one, or by it being out of date. Hell, look at how many packages on PyPI and the Vaults aren't even packaged with distutils at all! For many people, it's clearly easier to just tarball your source directory than to have to learn about this MANIFEST stuff.
In your world, everything is done within distutils, so it's understandable that you'd like to get rid of the external nature of MANIFEST files, but please keep in mind that these features are being used and removing the logic would seriously break things for packagers relying on other mechanisms to build their MANIFEST files.
Please point me to these developers, and show me one that couldn't just spend a few minutes making their tools generate a MANIFEST.in instead. I'm suggesting that we present a very small number of highly-capable people with a truly minor inconvenience, in order to make an extremely large group of people happier by taking away something that invariably bites them.
* Use the MANIFEST data (along with revision control info) not only for producing source distributions, but also to determine what files should be considered "package data", if the user passes an 'include_package_data=True' keyword to setup().
Isn't that already the case ? I mean you can put anything you like into MANIFEST and it will be included in the sdist.
I'm talking about package data - a feature that was pioneered in setuptools and added to the distutils in Python 2.4. The ability to specify data files that are *installed* in a package directory.
Specifically, I'm suggesting that users be able to replace the package_data setup keyword with a simple include_package_data flag, so that the MANIFEST data can be used to determine what data files to install. This has nothing to do with sdist; I'm talking about being able to unify the distutils' idea of what files are part of the distribution, because for the simple cases that are 90% of projects, that's a very useful thing to have.
For every SciPy and mxODBC and Twisted, there are easily a hundred packages without anything like their level of complex build needs. I'm talking about streamlining things for the simple packages, but there isn't anything in what I'm proposing that keeps anybody from doing complex things. I'm just saying we should remove *multiple* ways to do the *same* complex thing. We should pick One Obvious Way to *customize*, replacing multiple hooks with single hooks that allow the same degree of customizability.
For example, if you have tools that generate something, right now you could choose to generate either MANIFEST or MANIFEST.in. I'm suggesting that we should choose for you, and say it's MANIFEST.in that you should generate, since it's the more expressive and flexible of the two formats. Similarly, you can currently choose what filenames MANIFEST and MANIFEST.in have, and I'm suggesting that those be their only names. If you need other files you certainly have the option of copying, renaming, and using other existing file manipulation tools.
These additional degrees of built-in "freedom" just give you *meaningless* choices. They don't make any *real* difference to your ability to get the job done. Instead, they raise a *barrier* to creating tools. If somebody creates a tool to do something with manifest files, you can't use it unless you and they have already agreed to make the *same* choices about these superfluous options, or else the tool maker has to support all possible options - and sometimes, a useful tool that supports all the options just isn't *possible*.
See, it wouldn't matter if the arbitrary choice was that you have to generate MANIFEST instead of MANIFEST.in. You'd still have the same flexibility. The important thing in these arbitrary choices is that *we pick one*. That's why we have a BDFL for the language - we all argue for a particular surface syntax, and then he picks one, and we all move on. The distutils needs a BDFL to pick between all the mutually incompatible but semantically indistinguishable surface syntaxes for how to build and package something.
Again, not everybody is using distribution processes built around CVS or Subversion. Left aside that there are quite a few other SCM tools out there, you also have the case where you create distributions from plain directories (which is what MANIFEST.in and MANIFEST are targeting).
The source control is a supplement to the "add_defaults" of the sdist process, not a replacement for MANIFEST.in. Some users need to add non-source-controlled files, for example. But *simple things should be simple*, and treating source control info as part of the defaults makes them work simply for most people. And if you don't want the defaults, there should be only one way to turn them off - by excluding files in MANIFEST.in, not by a command line option.
I'm also thinking that most of the MANIFEST logic could and should move to the Distribution class, since the data will be used by multiple commands. Thus, the sdist command could just ask the Distribution for the MANIFEST and get it, as would the commands that copy package data files to the build directory.
Wait: MANIFEST defines what goes into the sdist - not an arbitrary (binary) distribution.
I'm talking about the include_package_data option.
It would certainly make sense to have the MANIFEST[.in] files automatically be added as default in sdist.py and I'd be +1 on that (even though it never was an issue for me as I always include them in the MANIFEST file).
Interesting; I could've sworn that you were one of the people who told somebody it would be "too much magic" to include this. But whatever the case, I'm glad you don't oppose it now.
--no-defaults would be ignored, except for a warning. If you don't want the defaults, you can always start your MANIFEST.in with an exclude pattern to exclude absolutely everything already included. There shouldn't be two ways to do the same thing, especially not one that you can use on the command line to mess things up in a non-repeatable fashion! Likewise --no-prune, because that's a similar recipe for disaster.
These options are meant for people who don't have a MANIFEST.in file to begin with or just quickly want to build an sdist with parts of the whole distribution or an extended version (e.g. for testing or upgrading).
Sure - and there are plenty of things I can leave some room for play in. For example, I could simply revert to old behaviors when you use any non-default options. I could make a separate MANIFEST.setuptools file, etc. But these things add complexity, so I want to know who *actually* needs them, and can't trivially work around their absence. I'd rather briefly inconvenience distutils mavens like you, than continue to stump and frustrate the hundreds of people who just don't get why it's all so damn complicated.
Heck, most of the distutils' flaws lie in their extreme versatility.
That comment is just silly: distutils is so powerful because of its versatility. You wouldn't have been able to build setuptools without this versatility.
You're confusing a well-factored framework with user-level versatility. Using variables instead of hardcoding filenames internally is a very good idea. Exposing those variables for users to change (in the absence of concrete use cases), however, is just bad UI design and a lack of social awareness.
That's your point of view - I've never had a hard time adjusting distutils to whatever I wanted it to do. After you get used to the way things are handled in distutils, extending it is often enough really easy and would be much harder in your One Obvious Way to do it (unless you had a time-machine, zoom to 2042 and then take all possible ways to build distributions into account on your way back to 2005 ;-).
And your point of view is missing the part where everybody else isn't a distutils expert like you or I, and unlike you or I, has *no interest whatsoever in becoming one*. Simple things should be simple, and distributing most packages shouldn't be rocket science. In particular, there should be a gentle learning curve from "distribute one module" to "complex distribution with autogenerated bits not in source control". And, the path for *how* you do those things should be laid out. There are plenty of things that *are* Obvious use cases, but for which the Distutils Way is not obvious. It's always *possible* to customize via subclassing, and I'm not suggesting that be disallowed. But it shouldn't be necessary for the Obvious Way, and should be *required* for any deviation from the Obvious Way. If you're going to deviate, you should be *aware* that you're on your own, and parting ways with the larger community. You should be aware that you are potentially isolating yourself from the use of community tools based on that Way. Currently, you can never be sure, because there *is* no Way. Everybody has their own, and the result is chaos. Ironically, although the Perl community's language philosophy is "more than one way to do it", their build and distribution philosophy seems to be that there's not merely one obvious way to do it, there's *exactly* one way to do it. And *that* is the real reason why Perl has always been ahead of Python in readily-available libraries. The Perl distribution culture reflects the idea that build tools are for sharing software with the community, not a framework for creating private build systems.
Phillip J. Eby wrote: <major snippage>
And your point of view is missing the part where everybody else isn't a distutils expert like you or I, and unlike you or I, has *no interest whatsoever in becoming one*. Simple things should be simple, and distributing most packages shouldn't be rocket science. In particular, there should be a gentle learning curve from "distribute one module" to "complex distribution with autogenerated bits not in source control". And, the path for *how* you do those things should be laid out.
+1 to everything you said in this post, the above snippet being a pretty good synopsis. And thank you, Phillip, for championing this effort to make Python package distribution simpler and easier for those of us who really don't care how it all works, just that it do so painlessly.
Ironically, although the Perl community's language philosophy is "more than one way to do it", their build and distribution philosophy seems to be that there's not merely one obvious way to do it, there's *exactly* one way to do it. And *that* is the real reason why Perl has always been ahead of Python in readily-available libraries. The Perl distribution culture reflects the idea that build tools are for sharing software with the community, not a framework for creating private build systems.
Sadly, I think there is a lot of truth in this observation. -- Patrick K. O'Brien Orbtech http://www.orbtech.com Schevo http://www.schevo.org
Am 16.11.2005 um 18:53 schrieb Phillip J. Eby:
Again, not everybody is using distribution processes built around CVS or Subversion. Left aside that there are quite a few other SCM tools out there, you also have the case where you create distributions from plain directories (which is what MANIFEST.in and MANIFEST are targeting).
The source control is a supplement to the "add_defaults" of the sdist process, not a replacement for MANIFEST.in. Some users need to add non-source-controlled files, for example. But *simple things should be simple*, and treating source control info as part of the defaults makes them work simply for most people. And if you don't want the defaults, there should be only one way to turn them off - by excluding files in MANIFEST.in, not by a command line option.
I'm not sure I understand this proposal completely, but I don't think it's a good idea to have a project's build/setup process rely on having version control meta-data around. Wouldn't that mean that the built package would be incorrect when the setup is run from an `svn export`ed (or otherwise cleaned up) copy? Cheers, Chris -- Christopher Lenz cmlenz at gmx.de http://www.cmlenz.net/
At 08:21 PM 11/16/2005 +0100, Christopher Lenz wrote:
Am 16.11.2005 um 18:53 schrieb Phillip J. Eby:
Again, not everybody is using distribution processes built around CVS or Subversion. Left aside that there are quite a few other SCM tools out there, you also have the case where you create distributions from plain directories (which is what MANIFEST.in and MANIFEST are targeting).
The source control is a supplement to the "add_defaults" of the sdist process, not a replacement for MANIFEST.in. Some users need to add non-source-controlled files, for example. But *simple things should be simple*, and treating source control info as part of the defaults makes them work simply for most people. And if you don't want the defaults, there should be only one way to turn them off - by excluding files in MANIFEST.in, not by a command line option.
I'm not sure I understand this proposal completely, but I don't think it's a good idea to have a project's build/setup process rely on having version control meta-data around. Wouldn't that mean that the built package would be incorrect when the setup is run from an `svn export`ed (or otherwise cleaned up) copy?
The MANIFEST file would be generated from that data, and then source distributions would contain the MANIFEST, which would then be read in place of the source control metadata when it's not available. There's a similar feature in setuptools now for tagging releases with SVN revision numbers; if you don't have the metadata available, setuptools looks in PKG-INFO to get the revision number that the source release was built from. You're right, however, in that 'svn export' doesn't work with either scheme, in that setuptools assumes the One Obvious Way to distribute source is by running "sdist" on your code base, not by making an export from your revision control. If you don't have that option (or don't use CVS or SVN), you don't get to use setuptools' convenience features, and have to do things the distutils way, with an explicit MANIFEST.in file.
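The fallback described above can be pictured roughly as follows. This is only a sketch, not setuptools' actual code: `resolve_file_list` and both of its parameters are hypothetical names, and it assumes the caller has already tried to read the revision control metadata (passing `None` when there is none) and has read the shipped MANIFEST as text.

```python
def resolve_file_list(versioned_files, manifest_text):
    """Pick the distribution's file list.

    versioned_files: paths reported by revision control, or None when
        no metadata is available (e.g. when building from an sdist).
    manifest_text: contents of a shipped MANIFEST file, or None.
    Both parameters are hypothetical inputs for this sketch.
    """
    # Prefer live revision control metadata when it exists.
    if versioned_files is not None:
        return sorted(versioned_files)
    # Otherwise fall back to the MANIFEST that the sdist carried along.
    if manifest_text is None:
        raise ValueError("no revision control metadata and no MANIFEST")
    # One path per line; blank lines are skipped.
    return sorted(
        line.strip() for line in manifest_text.splitlines() if line.strip()
    )
```

An `svn export` without a MANIFEST would hit the `ValueError` branch here, which is the case where an explicit MANIFEST.in is still needed.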
Hi Phillip, In general, I think you are having a different focus here than what distutils is trying to be and that's perfectly OK - you can implement all these nice strategies and automated decisions into your setuptools. I just don't see a benefit in stripping down the framework distutils itself. If people choose setuptools as front-end to distutils that's a perfectly good choice and one I'd like to encourage. Note that distutils would benefit a lot from more support for e.g. InnoSetup, NSIS, native packages for Solaris, HP-UX, Debian. There are a few projects out there trying to add this support, but few have stepped forward to suggest integration with the core framework. The issues around MANIFEST[.in] that you present are really minor compared to not being able to build e.g. Debian packages out of the box and without too much user interaction. Phillip J. Eby wrote:
At 11:49 AM 11/16/2005 +0100, M.-A. Lemburg wrote:
I don't understand what you mean with "no-go" - the current system works just fine if you include the MANIFEST file in the sdist.
But it's not included, and you have to know to include it - and people who previously requested here that MANIFEST and MANIFEST.in be included in the manifest were shot down under the claim that including these defaults would be "too much magic".
I don't see that as a big problem. Why not include it per default like README and the others ?!
The reason I've been avoiding adding this feature, however, is because of the first issue; when you make an sdist, you lose that additional metadata, so it would become impossible to build *any* binary from an sdist, not just RPMs. Until recently, that issue seemed insurmountable.
That's simply not true. You have to include the MANIFEST file in the sdist and then everything is fine.
I'm speaking about the context of setuptools, where the typical user has no MANIFEST.in, because they're using Subversion or CVS and therefore don't need one.
Right, and that's a different context than the one needed for the core framework distutils itself.
* Change the MANIFEST format to be platform-independent (currently it contains OS-specific path separators)
-0.5
You are missing an important point: MANIFEST files can be built using tools outside distutils and external package building tools may require these to be platform dependent.
One reason I posted to find out what specific uses people actually had for such things. Do the uses actually exist? What are they used for?
eGenix for one uses its own file selection mechanism (mostly for historical reasons because we had our own packaging system before we switched to distutils). The MANIFEST.in format is not everybody's favorite, so I expect others to use more common tools such as e.g. Unix find, sed or even just a plain text editor. We also manage the MANIFEST files using Makefiles which take care of the build process, do the checkouts, copies, rsyncs, etc. needed for remote builds.
One important point: on POSIX-y platforms (i.e. virtually everything but Windows), there's no difference between distutils paths and system paths. So, what external tools build or read MANIFEST files on Windows, or any other platform that doesn't accept '/' as a separator?
Also, '/' *is* a valid path separator on Windows, so if your MANIFEST processing is done in Python, a platform-independent format won't have any effect. (Indeed, the only reason to actually *change* the format is so that sdist files built *on Windows* would be usable on other platforms! Other platforms would never see a difference.)
I know that in principle, somebody somewhere may have some tool that would break if MANIFEST changed its format. My question is, who and how many? If there's some widely-used tool that does this, then it's reasonable to take it into account. If it's just a theoretical possibility, it's not worth much concern. If there are a handful of people with a hard-to-change setup, it's somewhere in between.
The question should not be: how many setups can I break ? It should be: what do we gain by using a single format, e.g. the posix one and how can we avoid breakage ? Note that distutils knows how to transform posix file names into platform dependent ones.
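To make the two directions of that transformation concrete, here is a small sketch using the standard `pathlib` module. The function names and the `windows` flag are hypothetical, introduced only for illustration; distutils has its own helper for the posix-to-native direction (`distutils.util.convert_path`), which this sketch does not use.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_manifest_path(native_path, windows=False):
    """Convert a native relative path to the '/'-separated form."""
    flavour = PureWindowsPath if windows else PurePosixPath
    return flavour(native_path).as_posix()


def from_manifest_path(manifest_path, windows=False):
    """Convert a '/'-separated MANIFEST entry to a native relative path."""
    flavour = PureWindowsPath if windows else PurePosixPath
    # Split on '/' and let the path class rejoin with its own separator.
    return str(flavour(*manifest_path.split("/")))
```

On POSIX both functions return their input unchanged, which matches the point that only MANIFEST files written on Windows would look any different.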
* Always, always, always build MANIFEST, and always include both the MANIFEST file and MANIFEST.in (if present) in the source distribution.
-1 on always building MANIFEST.
This would miss the point of managing MANIFEST files independently of your package files, e.g. using Makefiles or other tools dealing with file dependencies, checkouts, etc.
Please point me to some examples of this, especially one that can't simply generate a MANIFEST.in instead.
See above. Tools like "find" are simply much more complete in terms of file selection. It is also sometimes necessary to massage the paths a bit, using e.g. sed.
* Disable all the options that allow user control over MANIFEST generation, including pruning, defaults, changing the filenames, etc.
-1
Again, you are forgetting that MANIFEST files serve a purpose and are external to the distutils process for a reason. You are free to have distutils build your MANIFEST files from MANIFEST.in files, have distutils command auto-generate them, or use external programs triggered by Makefiles or similar distribution building processes to generate them.
Examples? You keep saying that people *can* do these things, but that's not anything like the same as saying that any significant number of people *actually* do them. Frankly, most people I've encountered who are doing Python software development don't know how to get a basic setup.py to work right, and feel the distutils are way too complicated, underdocumented, or just plain broken, because they don't feel like they can control them.
It's underdocumented, yes, but getting a simple setup.py to work is really not all that complicated - and this is underlined by the fact that most Python packages nowadays are distributed as distutils-based packages.
OSAF, for example, has some developers who are very smart about creating build processes. Smart enough to be *able* to do the kinds of things you're describing. But they sure as heck don't use the distutils to *actually* do them, because from their perspective the distutils is a big pile of broken undocumentedness. If the distutils are so frustrating to such smart developers, there's something wrong.
The code itself is well documented and easy to read. It should be well within range of every average Python programmer. Furthermore, you only need to dig into distutils if you plan to extend or modify its default functionality in some way. The casual user does not have to read the sources.
Thus, I find these super-custom processes you're talking about highly implausible, because the only people who could implement them are the people with a strong knowledge of the distutils -- an incredibly rare breed of person, in other words.
I was only talking about special ways to build the MANIFEST files, not "super-custom" processes. No idea where you got that impression from.
Most people just want this stuff to work, and they don't want to have to learn *how* it works. They have better things to do with their time. They want a build *tool*, not a build library or build framework.
distutils does work for these people. The many existing packages using distutils is proof enough, IMHO. Of course, you can always do better if you have more specific requirements such as your CVS/Subversion integration. Those people should then use your setuptools front-end. I don't see that as a problem.
Most people don't know MANIFEST exists, until they get bitten by the need to have one, or by it being out of date. Hell, look at how many packages on PyPI and the Vaults aren't even packaged with distutils at all!
Not that many... :-)
For many people, it's clearly easier to just tarball your source directory than to have to learn about this MANIFEST stuff.
I agree that this feature is underdocumented, but changing the framework won't help with this: documentation patches is what we *really* need ! BTW, not many people need to have these MANIFEST files at all - distutils uses a built-in file finder based on these defaults:
- README or README.txt
- setup.py
- test/test*.py
- all pure Python modules mentioned in setup script
- all C sources listed as part of extensions or C libraries in the setup script (doesn't catch C headers!)
In your world, everything is done within distutils, so it's understandable that you'd like to get rid of the external nature of MANIFEST files, but please keep in mind that these features are being used and removing the logic would seriously break things for packagers relying on other mechanisms to build their MANIFEST files.
Please point me to these developers, and show me one that couldn't just spend a few minutes making their tools generate a MANIFEST.in instead. I'm suggesting that we present a very small number of highly-capable people with a truly minor inconvenience, in order to make an extremely large group of people happier by taking away something that invariably bites them.
See above. Carelessly overwriting hand-edited or otherwise generated files in a build process is simply bad design. If a MANIFEST file exists it should be left untouched. If no such file exists, but a MANIFEST.in exists, it should be rebuilt. If there's no MANIFEST.in, use a set of sane defaults determined by introspection of the setup.py details. This is what distutils does.
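The three-way rule in the paragraph above can be written down as a tiny decision function. This is a sketch of the rule as stated in this message only; the function name and return labels are hypothetical, and the real sdist command may take further conditions (command-line options, for instance) into account.

```python
def manifest_action(has_manifest, has_template):
    """Decide how the file list is obtained, per the rule stated above.

    has_manifest: True if a MANIFEST file already exists.
    has_template: True if a MANIFEST.in file exists.
    The returned labels are hypothetical names for this sketch.
    """
    if has_manifest:
        # An existing MANIFEST is left untouched and used as-is.
        return "use-existing"
    if has_template:
        # No MANIFEST, but a template: build MANIFEST from MANIFEST.in.
        return "rebuild-from-template"
    # Neither file: fall back to defaults derived from setup.py.
    return "use-defaults"
```

The proposal under discussion would effectively remove the first branch, since MANIFEST would always be regenerated.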
* Use the MANIFEST data (along with revision control info) not only for producing source distributions, but also to determine what files should be considered "package data", if the user passes an 'include_package_data=True' keyword to setup().
Isn't that already the case ? I mean you can put anything you like into MANIFEST and it will be included in the sdist.
I'm talking about package data - a feature that was pioneered in setuptools and added to the distutils in Python 2.4. The ability to specify data files that are *installed* in a package directory.
Specifically, I'm suggesting that users be able to replace the package_data setup keyword with a simple include_package_data flag, so that the MANIFEST data can be used to determine what data files to install. This has nothing to do with sdist; I'm talking about being able to unify the distutils' idea of what files are part of the distribution, because for the simple cases that are 90% of projects, that's a very useful thing to have.
Perhaps you should then enhance the sdist way of finding suitable defaults - it currently does not take package_data files into account. MANIFEST is only used for source code distributions. I don't see how you can use it for anything else. See e.g. the way bdist_rpm works: it actually installs the package to find out which files are actually installed and then records all the files copied during that process - that's a very smart, future proof and flexible design.
For every SciPy and mxODBC and Twisted, there are easily a hundred packages without anything like their level of complex build needs. I'm talking about streamlining things for the simple packages, but there isn't anything in what I'm proposing that keeps anybody from doing complex things. I'm just saying we should remove *multiple* ways to do the *same* complex thing. We should pick One Obvious Way to *customize*, replacing multiple hooks with single hooks that allow the same degree of customizability.
I don't buy this: on one hand you are talking about simple packages (which don't need the MANIFEST files in the first place), on the other about hooks to adjust distutils' build process, something I'd group under more complex setups.
For example, if you have tools that generate something, right now you could choose to generate either MANIFEST or MANIFEST.in. I'm suggesting that we should choose for you, and say it's MANIFEST.in that you should generate, since it's the more expressive and flexible of the two formats. Similarly, you can currently choose what filenames MANIFEST and MANIFEST.in have, and I'm suggesting that those be their only names. If you need other files you certainly have the option of copying, renaming, and using other existing file manipulation tools.
These additional degrees of built-in "freedom" just give you *meaningless* choices. They don't make any *real* difference to your ability to get the job done. Instead, they raise a *barrier* to creating tools. If somebody creates a tool to do something with manifest files, you can't use it unless you and they have already agreed to make the *same* choices about these superfluous options, or else the tool maker has to support all possible options - and sometimes, a useful tool that supports all the options just isn't *possible*.
So your point is to make your life as setuptools author easier ? Why don't you just disable all these options in your setuptools front-end and hard-code the MANIFEST file names ?
See, it wouldn't matter if the arbitrary choice was that you have to generate MANIFEST instead of MANIFEST.in. You'd still have the same flexibility. The important thing in these arbitrary choices is that *we pick one*. That's why we have a BDFL for the language - we all argue for a particular surface syntax, and then he picks one, and we all move on. The distutils needs a BDFL to pick between all the mutually incompatible but semantically indistinguishable surface syntaxes for how to build and package something.
distutils is a loosely coupled framework of components. In such a framework, a basic design principle is to be able to decouple and recouple existing components. The only way to implement this is by making the components suitably independent and this is what was done in distutils. Note that adding user options to change certain assumptions or defaults does not count towards having "multiple ways to get something done" - it just gives the user a possibility to adapt the framework to a particular need and on a case-by-case basis. Also note that it's not hard for setuptools or any other front-end to access these user options - just ask the component for them.
Again, not everybody is using distribution processes built around CVS or Subversion. Left aside that there are quite a few other SCM tools out there, you also have the case where you create distributions from plain directories (which is what MANIFEST.in and MANIFEST are targeting).
The source control is a supplement to the "add_defaults" of the sdist process, not a replacement for MANIFEST.in. Some users need to add non-source-controlled files, for example. But *simple things should be simple*, and treating source control info as part of the defaults makes them work simply for most people. And if you don't want the defaults, there should be only one way to turn them off - by excluding files in MANIFEST.in, not by a command line option.
If you don't want the defaults added, you are requesting a change in the way distutils works. Such a change should be done using the command line switch --no-defaults (or added to setup.cfg). MANIFEST.in OTOH is really only needed in case you plan to add non-standard files to your source distribution. You are not changing the way distutils itself works - just tell it to add a few more things that you might need or that you might not want in the distribution.
I'm also thinking that most of the MANIFEST logic could and should move to the Distribution class, since the data will be used by multiple commands. Thus, the sdist command could just ask the Distribution for the MANIFEST and get it, as would the commands that copy package data files to the build directory.
Wait: MANIFEST defines what goes into the sdist - not an arbitrary (binary) distribution.
I'm talking about the include_package_data option.
It would certainly make sense to have the MANIFEST[.in] files automatically be added as default in sdist.py and I'd be +1 on that (even though it never was an issue for me as I always include them in the MANIFEST file).
Interesting; I could've sworn that you were one of the people who told somebody it would be "too much magic" to include this. But whatever the case, I'm glad you don't oppose it now.
--no-defaults would be ignored, except for a warning. If you don't want the defaults, you can always start your MANIFEST.in with an exclude pattern to exclude absolutely everything already included. There shouldn't be two ways to do the same thing, especially not one that you can use on the command line to mess things up in a non-repeatable fashion! Likewise --no-prune, because that's a similar recipe for disaster.
These options are meant for people who don't have a MANIFEST.in file to begin with or just quickly want to build an sdist with parts of the whole distribution or an extended version (e.g. for testing or upgrading).
Sure - and there are plenty of things I can leave some room for play in. For example, I could simply revert to old behaviors when you use any non-default options. I could make a separate MANIFEST.setuptools file, etc. But these things add complexity, so I want to know who *actually* needs them, and can't trivially work around their absence. I'd rather briefly inconvenience distutils mavens like you, than continue to stump and frustrate the hundreds of people who just don't get why it's all so damn complicated.
Look, nobody stops you from removing all these features in your front-end. distutils lets you do all this and that's what so great about it. My point is that you shouldn't try to strip down distutils itself just because you think it's hard work to support all these features in setuptools. It's not needed to strip down distutils for this reason as you can easily disable these options for anyone using your setuptools. As a result, both users of setuptools and straight distutils are happy. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2005)
At 12:19 AM 11/17/2005 +0100, M.-A. Lemburg wrote:
Thus, I find these super-custom processes you're talking about highly implausible, because the only people who could implement them are the people with a strong knowledge of the distutils -- an incredibly rare breed of person, in other words.
I was only talking about special ways to build the MANIFEST files, not "super-custom" processes. No idea where you got that impression from.
Anything that involves a user knowing about MANIFEST at *all* (let alone changing its name) is a "super custom" process. Which unfortunately means that even for a fairly basic sdist, it's an issue.
I agree that this feature is underdocumented, but changing the framework won't help with this: documentation patches is what we *really* need !
That's certainly part of it; but it's certainly not all of it. Mostly, it's the difference between being a tool and a framework. The distutils provides all the mechanism you like, and none of the policy. That was a brilliant political move, as it dodges all the infighting that otherwise would have stopped it getting off the ground. However, there's no longer any reason *not* to have One Official Python Way To Do It, and to streamline the tools around that philosophy. One minor example of the problems with the distutils is that their flexibility and inconsistent documentation cause people to create wildly different project source layouts, just because they can. You call that flexibility, I call that bad design for the developers who don't want to have to think about all these options (i.e., most of them).
BTW, not many people need to have these MANIFEST files at all - distutils uses a built-in file finder based on these defaults:
- README or README.txt
- setup.py
- test/test*.py
- all pure Python modules mentioned in setup script
- all C sources listed as part of extensions or C libraries in the setup script (doesn't catch C headers!)
It also leaves out documentation and package data, for starters.
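To make the gap concrete, here is a rough sketch of the default file set listed above. It is not the distutils implementation: `default_sdist_files` and its parameters are hypothetical names, and the modules and C sources are simply passed in as lists rather than read from a setup script.

```python
import glob
import os


def default_sdist_files(root, py_modules=(), c_sources=()):
    """Approximate the default file set described above (hypothetical helper)."""
    found = []
    # README or README.txt: take the first one that exists.
    for name in ("README", "README.txt"):
        if os.path.isfile(os.path.join(root, name)):
            found.append(name)
            break
    if os.path.isfile(os.path.join(root, "setup.py")):
        found.append("setup.py")
    # test/test*.py, reported relative to the project root with "/" separators.
    pattern = os.path.join(root, "test", "test*.py")
    for path in sorted(glob.glob(pattern)):
        found.append(os.path.relpath(path, root).replace(os.sep, "/"))
    # Modules and C sources named in the setup script (supplied by the caller here).
    found.extend(py_modules)
    found.extend(c_sources)
    return found
```

Run against a project that also has a docs/ directory or data files inside a package, the result contains neither, which is the omission being pointed out.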
Carelessly overwriting hand-edited or otherwise generated files in a build process is simply bad design.
Not if it's a file *owned* by the distutils. If MANIFEST is assumed to be some sort of possibly user-generated file, the distutils should've picked a different name for *its* manifest file, and maybe provided the ability to read other files to generate its manifest. This is my point: there should be a manifest file that is wholly owned and controlled by the distutils with *no user-serviceable parts inside*. Having a file that is *partly* owned by the distutils and partly by the user is a recipe for disaster and the real bad design in this case.
MANIFEST is only used for source code distributions. I don't see how you can use it for anything else.
Because if you have package data files as part of your *source*, then the MANIFEST has to list them in order to have a valid source distribution. So, for some projects, it is equally valid to say, "be sure to install any package data files listed in the MANIFEST", and that is what the 'include_package_data' option would do. (Or maybe 'use_manifest_package_data' would be a better name, but you get the general idea.)
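As a minimal sketch of that idea (not setuptools code), the function below derives a package-data mapping from MANIFEST entries. The name `package_data_from_manifest` is hypothetical, and it assumes the MANIFEST uses "/" separators, that packages sit directly under the project root, and that any non-.py file inside a package directory counts as package data.

```python
import posixpath


def package_data_from_manifest(manifest_lines, packages):
    """Map each package to the non-.py MANIFEST entries inside its directory."""
    # Package "a.b" is assumed to live in directory "a/b" under the project root.
    pkg_dirs = {pkg.replace(".", "/"): pkg for pkg in packages}
    result = {pkg: [] for pkg in packages}
    for line in manifest_lines:
        path = line.strip()
        # Skip blanks, comments, and Python modules (those are not "data").
        if not path or path.startswith("#") or path.endswith(".py"):
            continue
        # Walk upward until we hit the innermost enclosing package directory.
        directory = posixpath.dirname(path)
        while directory and directory not in pkg_dirs:
            directory = posixpath.dirname(directory)
        if directory:
            # Store the path relative to the package directory.
            result[pkg_dirs[directory]].append(path[len(directory) + 1:])
    return result
```

Files outside any package, such as a top-level README or a docs/ tree, are ignored here; they would still belong in the source distribution but would not be installed as package data.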
I don't buy this: on one hand you are talking about simple packages (which don't need the MANIFEST files in the first place),
Not true! You run into a problem the moment you have any documentation besides README, or if you have any package data files.
on the other about hooks to adjust distutils' build process, something I'd group under more complex setups.
My point is that a well-designed system that offers customization hooks should not offer you *meaningless* choices between different ways to customize the same thing.
So your point is to make your life as setuptools author easier ?
No, if that was my point I'd have just done whatever the heck I wanted and not bothered posting to the list. :) My point is to get the *community* better tools, and part of that is lowering the entry barriers to creating them. If it was only me I cared about, I'd not have embarked on the entire venture to begin with. Sheesh.
Why don't you just disable all these options in your setuptools front-end and hard-code the MANIFEST file names ?
I plan to! The point of my posting was to find out whether I was going to be introducing any meaningful barriers to adoption of setuptools. For example, if there were some bdist* command out there in widespread use that would break as a result of these changes. Yes, I ranted a little about making any "distutils 2" less flexible, but my practical present-day assumption is that setuptools actually modifying the distutils is still a long time coming. I don't even plan to *propose* that idea myself, certainly not any time soon.
distutils is a loosely coupled framework of components. In such a framework, a basic design principle is to be able to decouple and recouple existing components. The only way to implement this is by making the components suitably independent and this is what was done in distutils.
You're confusing a usability issue with a technical design issue. One calls for TMTOWTDI (There's More Than One Way To Do It), the other for TSBO-APOO-OWTDI (There Should Be One - And Preferably Only One - Obvious Way To Do It).
My point is that you shouldn't try to strip down distutils itself just because you think it's hard work to support all these features in setuptools. It's not needed to strip down distutils for this reason as you can easily disable these options for anyone using your setuptools.
As a result, both users of setuptools and straight distutils are happy.
This was always my plan; the question was to find out whether there were any holes in that plan, for *setuptools*. I'm sorry if that wasn't clear; I only talked about removing these things in distutils under a hypothetical "distutils 2" scenario. *None* of these changes were being proposed for a "pure" distutils today. That doesn't mean I don't think that the distutils design is too flexible in terms of offering meaningless choices, or more precisely, offering lots of ways to shoot yourself in the foot. :) It just means that I realistically understand that today's distutils are unlikely to be changeable. Rather, I'm looking at the eventual transmutation of setuptools into "distutils 2", and want to make sure that late adopters moving to it in the distant future don't run into issues that haven't been accommodated in some way.
On 11/17/05, Phillip J. Eby <pje@telecommunity.com> wrote:
That's certainly part of it; but it's certainly not all of it. Mostly, it's the difference between being a tool and a framework. The distutils provides all the mechanism you like, and none of the policy. That was a brilliant political move, as it dodges all the infighting that otherwise would have stopped it getting off the ground. However, there's no longer any reason *not* to have One Official Python Way To Do It, and to streamline the tools around that philosophy.
This seems to me to be a good point. However, wouldn't a relatively simple way of addressing it be just to *document* the One Official Way (ideally, in the Python documentation, but maybe hosted elsewhere as well for users of current and older versions)? You're right in saying (elsewhere, I think - I can't find the exact quote now) that the distutils documentation is a bit lacking in "how should I do it" help like this. You could add a footnote somewhere that setuptools is limited to support of projects following this approach. Paul.
M.-A. Lemburg wrote:
Hi Phillip,
In general, I think you are having a different focus here than what distutils is trying to be and that's perfectly OK - you can implement all these nice strategies and automated decisions into your setuptools.
<major snippage>
Look, nobody stops you from removing all these features in your front-end. distutils lets you do all this and that's what's so great about it.
My point is that you shouldn't try to strip down distutils itself just because you think it's hard work to support all these features in setuptools. It's not needed to strip down distutils for this reason as you can easily disable these options for anyone using your setuptools.
As a result, both users of setuptools and straight distutils are happy.
Since I commented on Phillip's post, I figured I'd better comment on this one as well. In a nutshell, this has been one of the more passionate yet even-handed and rational threads I've seen in some time. I'm impressed by the quality of this dialog and am hopeful that good things will come from it. Thank you both, for what it's worth. :-)

--
Patrick K. O'Brien
Orbtech http://www.orbtech.com
Schevo http://www.schevo.org
participants (6): Christopher Lenz, Ian Bicking, M.-A. Lemburg, Patrick K. O'Brien, Paul Moore, Phillip J. Eby