Builders vs Installers
There's a longer-term issue that occurred to me when thinking about pip's role as a "builder" or an "installer" (to use Nick's terminology). As I understand Nick's vision for the future, installers (like pip) will locate built wheels and download and install them, and builders (like distutils and bento) will be responsible for building wheels.

But there's an intermediate role which shouldn't get forgotten in the transition - the role that pip currently handles with the "pip wheel" command. This is where I specify a list of distributions, and pip locates sdists, downloads them, checks dependencies, and ultimately builds all of the wheels. I'm not sure whether the current idea of builders includes this "locate, download and resolve dependencies" function (distutils and bento certainly don't have that capability).

I imagine that pip will retain some form of the current "pip wheel" capability that covers this requirement, but maybe as the overall picture of the new design gets clarified, this role should be captured.

Paul
Paul Moore <p.f.moore <at> gmail.com> writes:
I imagine that pip will retain some form of the current "pip wheel" capability that covers this requirement, but maybe as the overall picture of the new design gets clarified, this role should be captured.
Strictly speaking I would have thought "pip wheel" was a builder function which is only in pip as a transitional step, to get wheels more exposure. Is that an incorrect assumption on my part? Regards, Vinay Sajip
On 25 March 2013 22:46, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Paul Moore <p.f.moore <at> gmail.com> writes:
I imagine that pip will retain some form of the current "pip wheel" capability that covers this requirement, but maybe as the overall picture of the new design gets clarified, this role should be captured.
Strictly speaking I would have thought "pip wheel" was a builder function which is only in pip as a transitional step, to get wheels more exposure. Is that an incorrect assumption on my part?
If some other tool provides the same functionality, I can see the possibility that pip will drop it (assuming that pip takes the route of becoming a "pure installer"). But I can't imagine pip dropping that functionality *until* some other tool is available which does the equivalent. Paul
Unix users will always want to compile their own. Pip wheel is not going away, but we will definitely evolve the implementation.
_______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
On Wed, Mar 27, 2013 at 9:39 AM, Lennart Regebro <regebro@gmail.com> wrote:
On Tue, Mar 26, 2013 at 12:07 AM, Daniel Holth <dholth@gmail.com> wrote:
Unix users will always want to compile their own.
Yup.
Pip wheel is not going away
I don't see how that follows.
//Lennart
Is it too convenient? The tool knows how to find sources, compile them, and install them. It will delegate all the work to the actual build system. If pip were a pure installer without a way to invoke a build system, then it wouldn't be able to install from sdist at all. It would help if you'd describe the alternative workflow again. The proposed workflow is "pip wheel package"; "pip install --find-links wheelhouse --no-index package". We don't suggest uploading the wheels used to cache compilation to PyPI, especially since most of them are probably other people's packages.
On Wed, Mar 27, 2013 at 2:57 PM, Daniel Holth <dholth@gmail.com> wrote:
Is it too convenient? The tool knows how to find sources, compile them, and install them. It will delegate all the work to the actual build system. If pip was a pure installer without a way to invoke a build system then it wouldn't be able to install from sdist at all.
All of that should be implemented in a library that pip can use. So this is only a question of a conceptual difference between different tools. It makes no sense to have a tool for developers that does everything, including building, running tests and packaging, and another tool that does nothing but install and create wheel packages. Making wheels should be a part of the tool used for packaging, not the tool used for installing. //Lennart
On 27 March 2013 14:04, Lennart Regebro <regebro@gmail.com> wrote:
On Wed, Mar 27, 2013 at 2:57 PM, Daniel Holth <dholth@gmail.com> wrote:
Is it too convenient? The tool knows how to find sources, compile them, and install them. It will delegate all the work to the actual build system. If pip was a pure installer without a way to invoke a build system then it wouldn't be able to install from sdist at all.
All of that should be implemented in a library that pip can use. So this is only a question of a conceptual difference between different tools.
It makes no sense to have a tool for developers that does everything, including building, running tests and packaging, and another tool that does nothing but install and create wheel packages.
Making wheels should be a part of the tool used for packaging, not the tool used for installing.
But sometimes practicality beats purity. As an end user who wants to just install packages, but who knows that not everything will be available as wheels, I need to be able to build my own wheels. But I don't want a full development tool. Having the install tool able to do a download and build from sdist is a huge convenience to me.

Of course, if someone built a "wheelmaker" tool that did precisely what "pip wheel" does, I would have no objections to using that. But even then, the mere existence of another tool doesn't seem to me to be enough justification for removing functionality from pip. If pip wheel didn't exist, and someone had written wheelmaker, I would not be arguing to *add* pip wheel. But it's there already, and there's a much higher bar for removing useful functionality.

Paul
On Wed, Mar 27, 2013 at 3:16 PM, Paul Moore <p.f.moore@gmail.com> wrote:
But sometimes practicality beats purity. As an end user who wants to just install packages, but who knows that not everything will be available as wheels, I need to be able to build my own wheels.
Can you explain to me why you as an end user can not just install the packages? Why do you need to first build wheels? //Lennart
Lennart Regebro <regebro <at> gmail.com> writes:
Can you explain to me why you as an end user can not just install the packages? Why do you need to first build wheels?
One likely scenario on Windows is that you have a compiler and can install from sdists or wheels, but want to distribute packages to people who don't have a compiler, so can only install from wheels. Regards, Vinay Sajip
On Wed, Mar 27, 2013 at 3:47 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
One likely scenario on Windows is that you have a compiler and can install from sdists or wheels, but want to distribute packages to people who don't have a compiler, so can only install from wheels.
Which means you are actually not just a simple end user, but ops or devops who want to build packages. And then the question arises why we can't have documentation, usable by that user, explaining how to build packages with the packaging tools. Fine, as a stop-gap measure pip wheel might be useful, as this mythical packaging tool doesn't really exist yet (except as bdist_wheel, but I suspect pip wheel does more than that?). But in the long run I don't see the point, and I think it muddles what pip is and does. //Lennart
On Wed, Mar 27, 2013 at 11:22 AM, Lennart Regebro <regebro@gmail.com> wrote:
On Wed, Mar 27, 2013 at 3:47 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
One likely scenario on Windows is that you have a compiler and can install from sdists or wheels, but want to distribute packages to people who don't have a compiler, so can only install from wheels.
Which means you are actually not just a simple end user, but ops or devops who want to build packages. And then the question arises why we can't have documentation, usable by that user, explaining how to build packages with the packaging tools.
Fine, as a stop-gap measure pip wheel might be useful, as this mythical packaging tool doesn't really exist yet (except as bdist_wheel, but I suspect pip wheel does more than that?) But in the long run I don't see the point, and I think it muddles what pip is and does.
Then you are also in favor of removing sdist support from the "pip install" command, in the same way that rpm doesn't automatically compile srpm. Pip wheel does nothing more than run bdist_wheel on each package in a requirements set. It's kind of a stopgap measure, but it's also a firm foundation for the more decoupled way packaging should work.
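The "run bdist_wheel on each package in a requirements set" behaviour is easy to sketch. A minimal, hypothetical version might look like the following; the function name and directory layout are invented here, and pip's real implementation also handles download and dependency resolution:

```python
import os
import subprocess
import sys

def build_wheels(sdist_dirs, wheelhouse):
    """Run "setup.py bdist_wheel" in each unpacked sdist directory and
    collect the resulting .whl files into a single wheelhouse directory."""
    os.makedirs(wheelhouse, exist_ok=True)
    for src in sdist_dirs:
        # Delegate the actual build to the package's own build system.
        subprocess.check_call([sys.executable, "setup.py", "bdist_wheel"],
                              cwd=src)
        dist = os.path.join(src, "dist")
        for name in os.listdir(dist):
            if name.endswith(".whl"):
                os.rename(os.path.join(dist, name),
                          os.path.join(wheelhouse, name))
```

The point of the sketch is the division of labour: the loop only orchestrates; each package's build system does the compiling.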
On Wed, Mar 27, 2013 at 12:18 PM, Lennart Regebro <regebro@gmail.com> wrote:
On Wed, Mar 27, 2013 at 4:49 PM, Daniel Holth <dholth@gmail.com> wrote:
Then you are also in favor of removing sdist support from the "pip install" command, in the same way that rpm doesn't automatically compile srpm.
I was not aware that pip could create sdists.
In my view the fact that pip creates an installation as an artifact of installing from a source package is equivalent to creating a wheel, given that wheel is a format defined as a zip file containing one installation of a distribution. Both operations equally ruin pip's reputation as being an installer instead of a build tool. Instead all installation should have an intermediate, static, documented binary representation created by the build tool that is later moved into place by the install tool. I would be pleased if "pip install" lost the ability to natively install sdists without that intermediate step.
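Daniel's definition of the format (a zip file containing one installation of a distribution) can be illustrated with the standard library alone. The wheel below is synthetic and its metadata is abbreviated, but the layout matches the idea:

```python
import zipfile

# Build a tiny synthetic wheel: a zip whose contents are laid out
# exactly as they would land in site-packages, plus a .dist-info
# directory carrying the metadata. No code runs at install time;
# an installer merely unpacks it into place.
wheel_name = "demo-1.0-py3-none-any.whl"
with zipfile.ZipFile(wheel_name, "w") as zf:
    zf.writestr("demo/__init__.py", "__version__ = '1.0'\n")
    zf.writestr("demo-1.0.dist-info/METADATA",
                "Metadata-Version: 1.2\nName: demo\nVersion: 1.0\n")
    zf.writestr("demo-1.0.dist-info/WHEEL",
                "Wheel-Version: 1.0\nRoot-Is-Purelib: true\n")
    zf.writestr("demo-1.0.dist-info/RECORD", "")

names = zipfile.ZipFile(wheel_name).namelist()
```

This is the "static, documented binary representation" the message argues for: the build tool produces the archive, and the install tool just moves its contents into place.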
On Wed, Mar 27, 2013 at 1:09 PM, Daniel Holth <dholth@gmail.com> wrote:
I would be pleased if "pip install" lost the ability to natively install sdists without that intermediate step.
At that point, it would be giving easy_install (or any other tool that did) a comparative advantage. So that's probably not going to fly. (Unless of course you meant that the intermediate step remains transparent to the user.) easy_install (and pip) became popular because they get code from developers to users with the fewest possible steps for people on either end of the distribution channel. Adding pointless steps is both bad UI design and poor marketing.
On Wed, Mar 27, 2013 at 6:09 PM, Daniel Holth <dholth@gmail.com> wrote:
In my view the fact that pip creates an installation as an artifact of installing from a source package is equivalent to creating a wheel, given that wheel is a format defined as a zip file containing one installation of a distribution. Both operations equally ruin pip's reputation as being an installer instead of a build tool.
How installing something can ruin the reputation as an installer is beyond me.
Instead all installation should have an intermediate, static, documented binary representation created by the build tool that is later moved into place by the install tool. I would be pleased if "pip install" lost the ability to natively install sdists without that intermediate step.
That's a separate issue, but I disagree with that as well. //Lennart
On Wed, Mar 27, 2013 at 2:50 PM, Lennart Regebro <regebro@gmail.com> wrote:
How installing something can ruin the reputation as an installer is beyond me.
We have a different definition of build tools if installing an sdist that has a C extension doesn't make pip a build tool already. Clearly we're just going to disagree on this one.
On Wed, Mar 27, 2013 at 8:08 PM, Daniel Holth <dholth@gmail.com> wrote:
We have a different definition of build tools if installing an sdist that has a C extension doesn't make pip a build tool already.
Then the word "build tool" is irrelevant, and the whole discussion of builders vs installers is pointless, since installers, in the sense most people use the word within Python, are by necessity also builders. The point is still that pip IMO should be a tool to *install* distributions, not *make* distributions. That's what is relevant. If the word "builder" does not describe the tool that builds distributions, then let's not use that word. //Lennart
Lennart Regebro <regebro <at> gmail.com> writes:
Fine, as a stop-gap measure pip wheel might be useful, as this mythical packaging tool doesn't really exist yet (except as bdist_wheel, but I suspect pip wheel does more than that?)
Well the distil tool does exist, and though I'm not claiming that it's ready for prime-time yet, it seems well on the way to being useful for this and other purposes. Regards, Vinay Sajip
On Wed, Mar 27, 2013 at 6:44 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Well the distil tool does exist, and though I'm not claiming that it's ready for prime-time yet, it seems well on the way to being useful for this and other purposes.
Exactly. //Lennart
On Wed, Mar 27, 2013 at 10:41 AM, Lennart Regebro <regebro@gmail.com> wrote:
Can you explain to me why you as an end user can not just install the packages? Why do you need to first build wheels?
//Lennart
It's because when you install lots of the same packages repeatedly, you might want it to be lightning fast the second time. The pip wheel workflow also gives you a useful local copy of all the packages you need, insulating yourself from PyPI outages. This is the practical side.

The long term / bigger picture use case is that the wheel format, or an equivalent manifest, serves as a sort of packaging WSGI analogue: a static interface between builds and installs. We would remove the "setup.py install" command entirely. In that world pip would have to build the wheel because it couldn't "just install" the package.

The first convenient wheel tool was more like wheeler.py. It was just a shell script that called "pip install --no-install" and then ran "setup.py bdist_wheel" for each subdirectory in the build directory.
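The speedup Daniel describes comes from checking a local wheelhouse before touching the network. A hypothetical sketch follows; the function name and the filename-matching rule are simplified inventions (real wheel filename parsing is more involved):

```python
import os

def find_local_wheel(wheelhouse, project):
    """Return a cached wheel for "project" if one exists in the local
    wheelhouse, so a repeat install can skip download and compilation.
    This is roughly the effect of
    "pip install --find-links wheelhouse --no-index"."""
    if not os.path.isdir(wheelhouse):
        return None
    wanted = project.replace("-", "_").lower()
    for name in os.listdir(wheelhouse):
        # Wheel filenames begin with the (escaped) project name.
        if name.endswith(".whl") and name.split("-")[0].lower() == wanted:
            return os.path.join(wheelhouse, name)
    return None
```

The second install never compiles anything: either the wheel is in the wheelhouse, or it isn't and must be built once.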
Paul Moore <p.f.moore <at> gmail.com> writes:
Of course if someone builds a "wheelmaker" tool that did precisely what "pip wheel" did, I would have no objections to using that. But
I already have made one, it's called wheeler.py [1]. It uses vanilla pip (not the variant which provides pip wheel) to build wheels from sdists on PyPI. The distil tool builds wheels with or without using vanilla pip as a helper; the vanilla pip helper is needed where you *have* to run setup.py to get a correct build (not always the case). With wheeler.py you need to install distlib, while with distil it's included.
even then, the mere existence of another tool doesn't seem to me to be enough justification for removing functionality from pip. If pip wheel didn't exist, and someone had written wheelmaker, I would not be arguing to *add* pip wheel. But it's there already and there's a much higher bar for removing useful functionality.
I personally have no problem with "pip wheel" staying, but it does muddy pip's original intent as denoted by pip standing for "pip installs packages". While "pip wheel" was added as a pragmatic way of getting wheels out there for people to work with, pip's wheel functionality has only recently been added and is unlikely to be widespread, e.g. in distro packages for pip. So it could be reverted (since there are alternatives), and ISTM that the likely impact would only be on a few early adopters. Note that I'm not arguing for reversion at all - it makes sense for there to be multiple implementations of wheel building and usage, so that interoperability wrinkles can be ironed out. Regards, Vinay Sajip [1] https://gist.github.com/vsajip/4988471
On 27 March 2013 14:41, Lennart Regebro <regebro@gmail.com> wrote:
Can you explain to me why you as an end user can not just install the packages? Why do you need to first build wheels?
Mainly, just as Daniel said, for convenience of repeat installs (in virtualenvs). But also, I think there are a *lot* of different workflows out there and we need to avoid focusing on any one exclusively (the strict builder/installer split is more focused on production installs than on developers installing into virtualenvs, for instance). On 27 March 2013 14:44, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
I personally have no problem with "pip wheel" staying, but it does muddy pip's original intent as denoted by pip standing for "pip installs packages".
I think we have to remember that pip is a reasonably mature tool with a large existing user base. I don't want the idea that pip is now the "official Python installer" to be at odds with continued support of those users and their backward compatibility needs. Refactoring pip's internals and moving towards support of the new standards and workflow models is one thing, and I'm 100% in favour of that, but I don't see major changes in fundamentals like "pip install foo" (working seamlessly even if there are no wheels available for foo) being on the cards.

Having a "pip wheel" command fits into that model simply as a way of saying "stop the pip install process just before the actual final install, so that I can run that final step over and over without redoing the first part". Think of it as "pip install --all-but-install" if you like :-)

Paul
On Wed, Mar 27, 2013 at 10:04 AM, Lennart Regebro <regebro@gmail.com> wrote:
Making wheels should be a part of the tool used for packaging, not the tool used for installing.
It kind of works this way already. Pip doesn't include any of the actual wheel building logic; it just collects the necessary sources, then calls out to the "build one wheel" tool for each downloaded source archive. The developer's tool has some overlap in functionality, but is focused on dealing with one first-party package at a time for upload to the index, rather than many packages at a time for download and install. What will change is that pip will include the install logic itself, instead of delegating it to packaging's worst shortcoming, "setup.py install".
Lennart Regebro <regebro <at> gmail.com> writes:
It makes no sense to have a tool for developers that does everything, including building, running tests and packaging, and another tool that does nothing but install and create wheel packages.
Making wheels should be a part of the tool used for packaging, not the tool used for installing.
Don't forget that developers are users too - they consume packages as well as developing them. I see no *conceptual* harm in a tool that can do archive/build/install, as long as it can do them well (harder to do than to say, I know). And I see that there is a place for just-installation functionality which does not require the presence of a build environment. But a single tool could have multiple guises, just as some Unix tools of old behaved differently according to which link they were invoked from (the linked-to executable being the same). Isn't our present antagonism to the idea of having one ring to bind them all due to the qualities specific to that ring (setup.py, calls to setup())? Regards, Vinay Sajip
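Vinay's "multiple guises" idea, a single executable dispatching on the name it was invoked under, can be sketched like this; the guise names here are invented for illustration:

```python
import os
import sys

def main(argv=None):
    """Behave differently depending on the name we were invoked as,
    the way some Unix tools dispatch on argv[0] via symlinks to a
    single executable."""
    argv = sys.argv if argv is None else argv
    guise = os.path.basename(argv[0])
    if guise == "pyinstall":
        return "installing " + " ".join(argv[1:])
    if guise == "pybuild":
        return "building wheels for " + " ".join(argv[1:])
    return "usage: invoke as pyinstall or pybuild"
```

The point is that one codebase can present a pure-installer face and a builder face without forcing one workflow on all users.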
On Wed, Mar 27, 2013 at 10:57 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Isn't our present antagonism to the idea of having one ring to bind them all due to the qualities specific to that ring (setup.py, calls to setup())?
I really think so. distutils is a bad implementation. This has a lot more to do with how it works internally than how its command line interface looks. We can have new tools that do everything with a single command but really delegate the work out to separate decoupled and hopefully pluggable pieces underneath.
On Mar 27, 2013, at 10:57 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Isn't our present antagonism to the idea of having one ring to bind them all due to the qualities specific to that ring (setup.py, calls to setup())?
Basically this. There's no need to *enforce* that the toolchain be separate pieces, but rather to ensure that it *can* be. The current status quo means setuptools (or distutils) is the only name in the game; if you want to do anything else you have to pretend you are setuptools. In short, setuptools owns the entire process. The goal here is to break it up so no one tool owns the entire process, but still allow tools to act as more than one part of the process when it makes sense. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Mon, Mar 25, 2013 at 5:08 PM, Paul Moore <p.f.moore@gmail.com> wrote:
There's a longer-term issue that occurred to me when thinking about pip's role as a "builder" or an "installer" (to use Nick's terminology).
As I understand Nick's vision for the future, installers (like pip) will locate built wheels and download and install them, and builders (like distutils and bento) will be responsible for building wheels. But there's an intermediate role which shouldn't get forgotten in the transition - the role that pip currently handles with the "pip wheel" command. This is where I specify a list of distributions, and pip locates sdists, downloads them, checks dependencies, and ultimately builds all of the wheels. I'm not sure whether the current idea of builders includes this "locate, download and resolve dependencies" function (distutils and bento certainly don't have that capability).
Yes, and to make things even more interesting, consider the cases where there are build-time dependencies. ;-)

I would guess that installing from sdists (and revision control) is probably here to stay, along with the inherent coupling between "build" and "fetch" functions. Right now, the "build" side of setuptools fetches build-time dependencies, but in the New World, ISTM that a top-level build tool would just be something that reads the package metadata, finds build-time dependencies, and then runs some entry points to ask for a wheel to be spit out (or to get back a data structure describing the wheel, anyway). This part could be standardized and indeed could be just something that pip does.

Another piece that hasn't seemed well-specified so far is what Nick called "archiving" - creating the sdist. Or perhaps more precisely, generating an sdist PKG-INFO and putting the stuff together. IMO, a good install tool needs to be able to run the archiving and building steps as well as installing directly from wheels. However, although metadata 2.0 provides us with a good basis for running build and install steps, there really isn't anything yet in the way of a standard for sdist generation.

Of course, we have "setup.py sdist", and the previous work on setup.cfg by the distutils2 team. We could either build on setup.cfg, or perhaps start over with a simple spec to say how the package is to be built: i.e., just a simple set of entry points saying what archiver, builder, etc. are used. Such a file wouldn't change much over the life of the package, and would avoid the need for all the dynamic hooks provided by distutils2's setup.cfg. In the degenerate case, I suppose, it could just be "pyarchiver.cfg" and contain a couple lines saying what tool is used to generate PKG-INFO.

On the other hand, we could draw the line at saying pip only ever installs from sdists, no source checkouts or tarballs. I'm not sure that's a reasonable limitation, though.
On the *other* other hand, perhaps it would be best to just use the setup.cfg work, updated to handle the full metadata 2.0 spec. As I recall, the setup.cfg format handled a ridiculously large number of use cases in a very static format, and IIRC still included the possibility for dynamic hooks to affect the metadata-generation and archive content selection processes, let alone the build and other stages. But then, is that biasing against e.g. bento.info? Argh. Packaging is hard, let's go shopping. ;-)

On balance, I think I lean towards just having a simple way to specify your chosen archiver, so that installing from source checkouts and dumps is possible. I just find it annoying that you have to have *two* files in your checkout, one to say what tool you're using, and another one to configure it.

(What'd be nice is if you could just somehow detect files like bento.info and setup.cfg and thereby detect what archiver to use. But that would have limited extensibility unless there was a standard naming convention for the files, or a standardized format for at least the first line in the file or something like that, so you could identify the needed tool.)
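The detection idea in that last parenthetical might look like the sketch below. The marker-file-to-tool mapping is purely hypothetical, standing in for whatever naming convention or plugin registry would actually be agreed:

```python
import os

# Hypothetical mapping from well-known marker files to archiving tools.
# First match wins; a real scheme would need a registration convention
# so new tools could be detected without editing this table.
MARKERS = [
    ("bento.info", "bento"),
    ("setup.cfg", "distutils2"),
    ("setup.py", "setuptools"),
]

def detect_archiver(checkout_dir):
    """Guess which archiving tool a source checkout expects by looking
    for its marker file; return None if nothing is recognized."""
    for filename, tool in MARKERS:
        if os.path.exists(os.path.join(checkout_dir, filename)):
            return tool
    return None
```

This shows exactly the extensibility limit PJ Eby raises: the table is closed, so an unknown tool's checkout is simply unrecognized.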
On Mon, Mar 25, 2013 at 7:15 PM, PJ Eby <pje@telecommunity.com> wrote:
On Mon, Mar 25, 2013 at 5:08 PM, Paul Moore <p.f.moore@gmail.com> wrote:
There's a longer-term issue that occurred to me when thinking about pip's role as a "builder" or an "installer" (to use Nick's terminology).
As I understand Nick's vision for the future, installers (like pip) will locate built wheels and download and install them, and builders (like distutils and bento) will be responsible for building wheels. But there's an intermediate role which shouldn't get forgotten in the transition - the role that pip currently handles with the "pip wheel" command. This is where I specify a list of distributions, and pip locates sdists, downloads them, checks dependencies, and ultimately builds all of the wheels. I'm not sure whether the current idea of builders includes this "locate, download and resolve dependencies" function (distutils and bento certainly don't have that capability).
Yes, and to make things even more interesting, consider the cases where there are build-time dependencies. ;-)
I would guess that installing from sdists (and revision control) is probably here to stay, along with the inherent coupling between "build" and "fetch" functions.
Right now, the "build" side of setuptools fetches build-time dependencies, but in the New World, ISTM that a top-level build tool would just be something that reads the package metadata, finds build-time dependencies, and then runs some entry points to ask for a wheel to be spit out (or to get back a data structure describing the wheel, anyway).
This part could be standardized and indeed could be just something that pip does.
Another piece that hasn't seemed well-specified so far is what Nick called "archiving" - creating the sdist. Or perhaps more precisely, generating an sdist PKG-INFO and putting the stuff together.
IMO, a good install tool needs to be able to run the archiving and building steps as well as installing directly from wheels. However, although metadata 2.0 provides us with a good basis for running build and install steps, there really isn't anything yet in the way of a standard for sdist generation.
Of course, we have "setup.py sdist", and the previous work on setup.cfg by the distutil2 team. We could either build on setup.cfg, or perhaps start over with a simple spec to say how the package is to be built: i.e., just a simple set of entry points saying what archiver, builder, etc. are used. Such a file wouldn't change much over the life of the package, and would avoid the need for all the dynamic hooks provided by distutils2's setup.cfg.
In the degenerate case, I suppose, it could just be "pyarchiver.cfg" and contain a couple lines saying what tool is used to generate PKG-INFO.
On the other hand, we could draw the line at saying, pip only ever installs from sdists, no source checkouts or tarballs. I'm not sure that's a reasonable limitation, though.
On the *other* other hand, perhaps it would be best to just use the setup.cfg work, updated to handle the full metadata 2.0 spec. As I recall, the setup.cfg format handled a ridiculously large number of use cases in a very static format, and IIRC still included the possibility for dynamic hooks to affect the metadata-generation and archive content selection processes, let alone the build and other stages.
But then, is that biasing against e.g. bento.info? Argh. Packaging is hard, let's go shopping. ;-)
On balance, I think I lean towards just having a simple way to specify your chosen archiver, so that installing from source checkouts and dumps is possible. I just find it annoying that you have to have *two* files in your checkout, one to say what tool you're using, and another one to configure it.
(What'd be nice is if you could just somehow detect files like bento.info and setup.cfg and thereby detect what archiver to use. But that would have limited extensibility unless there was a standard naming convention for the files, or a standardized format for at least the first line in the file or something like that, so you could identify the needed tool.)
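A sketch of that detection idea, showing why it needs a convention: absent one, the installer can only recognize marker files it already knows about (the mapping below is illustrative, not a registry that exists anywhere):

```python
# Guess the archiver from well-known marker files in a source tree.
# A freshly-invented build tool would not appear in this table, which is
# exactly the extensibility problem described above.
import os
import tempfile

KNOWN_MARKERS = [
    ("bento.info", "bento"),
    ("setup.cfg", "distutils2"),
]

def guess_archiver(source_dir):
    for filename, tool in KNOWN_MARKERS:
        if os.path.exists(os.path.join(source_dir, filename)):
            return tool
    return None  # unrecognized: no way to know what tool is needed

checkout = tempfile.mkdtemp()
open(os.path.join(checkout, "bento.info"), "w").close()
print(guess_archiver(checkout))            # bento
print(guess_archiver(tempfile.mkdtemp()))  # None
```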
The problem we are solving first is not "setuptools", it is simply that right now Python programs are usually installed by downloading and then running a program that performs the actual install. Instead, we're going to let the installer do the actual install.

The funny thing is that all the same things happen. We just move the responsibilities around a little bit, things like Bento or distutils2 don't have to implement their own installer code, and we don't have to worry about that code doing bad things like messing up our systems or accessing the Internet. The installer might do new clever things now that it really controls the install.

There are a tremendous number of things you can do from there, mostly undeveloped, including decoupling the rest of the packaging pipeline all the way down to the humble sdist, but we don't really need to change the pip user interface. All the same things will continue to happen, just rearranged in a more modular way.

The MEBS "ultimate punt" design is that you would iterate over build plugins to recognize an sdist. The sdist would be defined as anything recognized by a plugin.

It's probably more practical to at least name the preferred build system in a very minimal setup.cfg. This is still a little bit ugly. In a normal "new style" sdist, you may be able to trust the PKG-INFO file, but when building from source control you can't, would need to inspect setup.cfg, and ask the build system to refresh the metadata.
On Mon, Mar 25, 2013 at 11:35 PM, Daniel Holth <dholth@gmail.com> wrote:
The problem we are solving first is not "setuptools", it is simply that right now Python programs are usually installed by downloading and then running a program that performs the actual install. Instead, we're going to let the installer do the actual install.
The funny thing is that all the same things happen. We just move the responsibilities around a little bit, things like Bento or distutils2 don't have to implement their own installer code, and we don't have to worry about that code doing bad things like messing up our systems or accessing the Internet. The installer might do new clever things now that it really controls the install.
There are a tremendous number of things you can do from there, mostly undeveloped, including decoupling the rest of the packaging pipeline all the way down to the humble sdist, but we don't really need to change the pip user interface. All the same things will continue to happen, just rearranged in a more modular way.
I'm not sure what you're trying to say here; all the above was assumed in my post, as I assumed was implied in Paul's post. I was talking about the *how* we're going to accomplish all this while still supporting building from pure source (vs. an sdist with a 2.0 PKG-INFO). More specifically, I was hoping to move the discussion forward on nailing down some of the details that still need specifying in a PEP somewhere, to finish out what the "new world" of packaging will look like.
The MEBS "ultimate punt" design is that you would iterate over build plugins to recognize an sdist. The sdist would be defined as anything recognized by a plugin.
I'm only calling it an "sdist" if it includes a PKG-INFO, since right now that's the only real difference between a source checkout and an sdist. (That, and the sdist might have fewer files.) So, source checkouts and github tarballs aren't "sdists" in this sense. (Perhaps we should call them "raw sources" to distinguish them from sdists.) Anyway, the downside to a plugin approach like this is that it would be a chicken-and-egg problem: you would have to install the plugin before you could build a package from a raw source, and there'd be no way to know that you needed the plugin until *after* you obtained the raw source, if it was using a freshly-invented build tool.
It's probably more practical to at least name the preferred build system in a very minimal setup.cfg. This is still a little bit ugly. In a normal "new style" sdist, you may be able to trust the PKG-INFO file, but when building from source control you can't, would need to inspect setup.cfg, and ask the build system to refresh the metadata.
Right. Or technically, the "archiver" in Nick's terminology. Probably we'll need to standardize on a config file, even if it makes life a little more difficult for users of other archiving tools. OTOH, I suppose a case could be made for checking PKG-INFO into source control along with the rest of your code, in which case the problem disappears entirely: there'd be no such thing as a "raw" source in that case. The downside, though, is that there's a small but vocal contingent that believes checking generated files into source control is a sign of ultimate evil, so it probably won't be a *popular* choice. But, if we support "either you have a setup.cfg specifying your archiver, or a PKG-INFO so an archiver isn't needed", then that would probably cover all the bases, actually.
On Tue, Mar 26, 2013 at 3:15 PM, PJ Eby <pje@telecommunity.com> wrote:
More specifically, I was hoping to move the discussion forward on nailing down some of the details that still need specifying in a PEP somewhere, to finish out what the "new world" of packaging will look like.
I'm deliberately trying to postpone some of those decisions - one of the reasons distutils2 foundered is because it tried to solve everything at once, and that's just too big a topic.

So, *right now*, my focus is on making it possible to systematically decouple building from installing, so that running "setup.py install" on a production system becomes as bizarre an idea as running "make install".

As we move further back in the tool chain, I want to follow the lead of the most widely deployed package management systems (i.e. Debian control files and RPM SPEC files) and provide appropriate configurable hooks for *invoking* archivers and builders, allowing developers to choose their own tools, so long as those tools correctly emit standardised formats understood by the rest of the Python packaging ecosystem.

In the near term, however, these hooks will still be based on setup.py (specifically setuptools rather than raw distutils, so we can update older versions of Python).
OTOH, I suppose a case could be made for checking PKG-INFO into source control along with the rest of your code, in which case the problem disappears entirely: there'd be no such thing as a "raw" source in that case.
The downside, though, is that there's a small but vocal contingent that believes checking generated files into source control is a sign of ultimate evil, so it probably won't be a *popular* choice.
But, if we support "either you have a setup.cfg specifying your archiver, or a PKG-INFO so an archiver isn't needed", then that would probably cover all the bases, actually.
Yeah, you're probably right that we will need to support something else in addition to the PKG-INFO file. A PKG-INFO.in could work, though, rather than a completely independent format like setup.cfg. That way we could easily separate a source checkout/tarball (with PKG-INFO.in) from an sdist (with PKG-INFO) from a wheel (with a named .dist-info directory).

(For consistency, we may want to rename PKG-INFO to DIST-INFO in sdist 2.0, though)

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
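The distinction Nick describes is mechanical enough to sketch directly (the marker filenames come from the proposal above; the function itself is only an illustration):

```python
def classify_tree(names):
    """Classify a distribution tree by which metadata marker it carries."""
    names = set(names)
    if any(name.endswith(".dist-info") for name in names):
        return "wheel"            # named .dist-info directory
    if "PKG-INFO" in names:
        return "sdist"            # generated metadata present
    if "PKG-INFO.in" in names:
        return "source checkout"  # metadata template only
    return "unknown"

print(classify_tree(["setup.py", "PKG-INFO.in"]))           # source checkout
print(classify_tree(["setup.py", "PKG-INFO"]))              # sdist
print(classify_tree(["example-1.0.dist-info", "example"]))  # wheel
```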
With Metadata 2.0 it's pretty feasible that sdists will have a trustworthy PKG-INFO at their root, since there can be a lot less changing of requires-dist based on whether you are on win32 (perhaps occasionally still changing based on things we forgot to put into environment marker variables). It would not be surprising to see them also grow a full .dist-info directory (with an unfortunate copy of PKG-INFO, named METADATA), just like sdists tend to contain .egg-info directories.

You might always regenerate the file anyway as long as you're running the package's build system. I think PKG-INFO is a highly human-editable format. My hypothetical sdist archiver would validate PKG-INFO instead of regenerating it.

It should be clear that I am also in the "deliberately postpone as much as possible" camp.

Daniel
On Tue, Mar 26, 2013 at 3:03 PM, Daniel Holth <dholth@gmail.com> wrote:
I think PKG-INFO is a highly human-editable format.
That doesn't mean you necessarily want to edit it yourself; notably, there will likely be some redundancy between the description in the file and other files like the README.

Also, today one of the key use cases people have for custom code in setup.py is to pull the package version from a __version__ attribute in a module. (Which is evil, of course, but people do it anyway.)

But it might be worth adding a setuptools feature to pull metadata from PKG-INFO (or DIST-INFO) instead of generating a new one, to see what people think of using PKG-INFO first, other files second. In principle, one could reduce a setup.py to just "from setuptools import setup_distinfo; setup_distinfo()" or some such.

No matter what, though, there's going to be some redundancy with the rest of the project. Some people use revision control tags or other automated tags in their versioning, and it's *precisely* these projects that most need raw source builds.

Maybe DIST-INFO shouldn't strictly be a PEP 426-conformant file, but rather, a file that allows some additional metadata to be specified via hooks. That way, you could list your version hook, your readme-generation hook, etc. in it, and then the output gets used to generate the final PKG-INFO.

So, call it PKG-INFO.in (as Nick said), or BUILD-INFO, or something like that, add a list of "metadata hooks", and presto: no redundancy in the file, so people can check it into source control, and minimal duplication with your build tool. (Presumably, if you use Bento, your BUILD-INFO file would just list the Bento hook and nothing else, if all the other data comes from Bento's .info file.)

Heck, in the minimalist case, you could pretend that a missing BUILD-INFO was there and contained a hook that runs setup.py to troll for the metadata, stopping once setup() is called. ;-) And now it's (mostly) backward compatible.
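A sketch of how those metadata hooks might compose (the hook names, the file's keys, and the hook bodies are all made up - the point is only that the static file never has to contain the version or description itself):

```python
# Each hook fills in one dynamically-derived field; the static part of
# the hypothetical BUILD-INFO carries only what never changes.

def version_hook(metadata):
    # Stands in for pulling __version__ from a module or a VCS tag.
    metadata["Version"] = "1.0"

def readme_hook(metadata):
    # Stands in for generating the description from a README file.
    metadata["Description"] = "contents of README"

HOOKS = {
    "example.version": version_hook,
    "example.readme": readme_hook,
}

def build_pkg_info(build_info):
    # Start from the static fields, then let each listed hook add its own.
    metadata = dict(build_info["static"])
    for name in build_info["metadata-hooks"]:
        HOOKS[name](metadata)
    return metadata

build_info = {
    "static": {"Name": "example"},
    "metadata-hooks": ["example.version", "example.readme"],
}
print(build_pkg_info(build_info))
```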
On Tue, Mar 26, 2013 at 4:08 PM, PJ Eby <pje@telecommunity.com> wrote:
But it might be worth adding a setuptools feature to pull metadata from PKG-INFO (or DIST-INFO) instead of generating a new one, to see what people think of using PKG-INFO first, other files second. In principle, one could reduce a setup.py to just "from setuptools import setup_distinfo; setup_distinfo()" or some such.
In other words, using d2to1 and only for `setup.py egg_info` (only not egg_info but whatever we're doing instead to generate the metadata ;) Erik
I am -1 on renaming anything unless it solves a technical problem. Forever after we will have to explain "well, it used to be called X, now it's called Y..." On Tue, Mar 26, 2013 at 5:01 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
In other words, using d2to1 and only for `setup.py egg_info` (only not egg_info but whatever we're doing instead to generate the metadata ;)
Erik
On Mar 26, 2013, at 5:12 PM, Daniel Holth <dholth@gmail.com> wrote:
I am -1 on renaming anything unless it solves a technical problem. Forever after we will have to explain "well, it used to be called X, now it's called Y..."
Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Rename it and make it JSON instead of the homebrew* format!

* Yes, technically it's based on a real format, but that format doesn't support all the things it needs, so extensions have been hackishly added to it.

----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
I want to poke myself in the eye every time I have to edit JSON by hand. Especially the description field.
On Mar 26, 2013, at 5:21 PM, Daniel Holth <dholth@gmail.com> wrote:
I want to poke myself in the eye every time I have to edit json by hand. Especially the description field.
So don't edit it by hand; nobody edits PKG-INFO by hand. PKG-INFO (and its would-be replacement) are for tools. The archiver can create it however the package author wants: could be setup.py sdist, could be bentomaker sdist, could be totallyradpackagemaker create. It's a data exchange format, not the API for developers or end users.
We will have a standard json version of metadata 2.
On Mar 26, 2013, at 5:56 PM, Daniel Holth <dholth@gmail.com> wrote:
We will have a standard json version of metadata 2.
Hopefully this will be included in .dist-info and in every package so we* can pretend PKG-INFO doesn't exist ;)

* The proverbial we.
On Wed, Mar 27, 2013 at 8:01 AM, Donald Stufft <donald@stufft.io> wrote:
Hopefully this will be included in .dist-info and in every package so we* can pretend PKG-INFO doesn't exist ;)
The key-value format is actually easier for hand editing and covers most cases. The extension format allows embedded JSON for more complex cases. As an on-disk format, it's isomorphic to JSON, so I don't actually plan to propose changing it.

Where we *do* need JSON-compatible metadata, though, is as an easy-to-pass-around in-memory data structure for use in APIs. In particular, metadata 2.0 will be defining this format (and how to convert it to/from the key/value format) so that the signature of the post-install hook can be:

def post_install_hook(installed, previous=None): ...

"installed" will be a string-keyed metadata dictionary for the distribution that was just installed, containing only dicts, lists and strings as values. "previous" will be the metadata for the version of the distribution that was previously installed, if any.

Cheers, Nick.

P.S. And now I'm leaving for the airport to fly home to Australia - no more replies from me for a couple of days :)
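Using the hook signature as given above, a hook might look like the following. The key names inside the metadata dicts are assumptions (the spec under discussion hadn't pinned them down), and the hook returns a string here only to make the example checkable:

```python
def post_install_hook(installed, previous=None):
    # `installed` is a plain dict of the just-installed distribution's
    # metadata; `previous` is the metadata of the version it replaced.
    name = installed["name"]
    if previous is None:
        return "installed %s %s" % (name, installed["version"])
    return "upgraded %s %s -> %s" % (
        name, previous["version"], installed["version"])

print(post_install_hook({"name": "example", "version": "2.0"},
                        {"name": "example", "version": "1.0"}))
# upgraded example 1.0 -> 2.0
```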
On Mar 26, 2013, at 7:50 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
The key-value format is actually easier for hand editing and covers most cases. The extension format allows embedded JSON for more complex cases. As an on-disk format, it's isomorphic to JSON, so I don't actually plan to propose changing it.
I disagree.

- These files are used for tools to exchange data, so "hand editable" shouldn't be a primary concern.
- There are a number of current fields where the format is *not* enough and one-off pseudo formats have had to be added:
  - `Keywords: dog puppy voting election` - a list masquerading as a string; this one needs field.split() to actually parse it.
  - `Project-URL: Bug, Issue Tracker, http://bitbucket.org/tarek/distribute/issues/` - a dictionary masquerading as a list of strings; this one needs {key.strip(): value.strip() for key, value in [x.rsplit(", ", 1) for x in field]}.
- Any of the fields can contain arbitrary content; previously Description had specialized handling for this, which has now been moved to the payload section, but the same issues affect the other fields.
- The Extension field name uses ExtensionName/ActualKey to kludge a nested dictionary.
- The ExtensionName/json field is a horrible kludge - why are we nesting a format inside of a format instead of just using a format that supports everything we could want?

As far as I can tell the only things that even use PKG-INFO are setuptools/distribute, and we want to phase them out of existence anyways. The only other thing I can think of is Wheel, which can either a) be updated to a different format - it's new enough that there's not much need to worry about legacy support - or b) generate the METADATA file just for Wheels.

TBH I'd like it if my name was removed as author of the PEP; I only briefly touched the versioning section, and I do not agree with the decision to continue using PKG-INFO and do not want my name attached to a PEP that advocates it.
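The two parsing kludges described above, run literally, make the complaint concrete:

```python
# Keywords: a list masquerading as a string.
keywords = "dog puppy voting election"
print(keywords.split())
# ['dog', 'puppy', 'voting', 'election']

# Project-URL: a dictionary masquerading as a list of strings; the
# label itself may contain commas, hence splitting once from the right.
field = ["Bug, Issue Tracker, http://bitbucket.org/tarek/distribute/issues/"]
urls = {key.strip(): value.strip()
        for key, value in (x.rsplit(", ", 1) for x in field)}
print(urls)
# {'Bug, Issue Tracker': 'http://bitbucket.org/tarek/distribute/issues/'}
```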
On Tue, Mar 26, 2013 at 8:33 PM, Donald Stufft <donald@stufft.io> wrote:
As far as I can tell the only things that even use PKG-INFO is setuptools/distribute and we want to phase them out of existence anyways.
The only thing setuptools uses it for is to find out the version of a package in the case where an .egg-info directory or filename doesn't have a version in its filename... which normally only happens in the "setup.py develop" case. So no need to keep it around on my account. ;-) (Some tools do check for the *existence* of a PKG-INFO, like PyPI's sdist upload validation, and the various egg formats require a file *named* PKG-INFO, but AFAIK nothing commonly used out there actually *reads* PKG-INFO or gives a darn about its contents, except for that version use case mentioned above.)
On Tue, Mar 26, 2013 at 9:12 PM, PJ Eby <pje@telecommunity.com> wrote:
On Tue, Mar 26, 2013 at 8:33 PM, Donald Stufft <donald@stufft.io> wrote:
As far as I can tell the only things that even use PKG-INFO is setuptools/distribute and we want to phase them out of existence anyways.
The only thing setuptools uses it for is to find out the version of a package in the case where an .egg-info directory or filename doesn't have a version in its filename... which normally only happens in the "setup.py develop" case. So no need to keep it around on my account. ;-)
(Some tools do check for the *existence* of a PKG-INFO, like PyPI's sdist upload validation, and the various egg formats require a file *named* PKG-INFO, but AFAIK nothing commonly used out there actually *reads* PKG-INFO or gives a darn about its contents, except for that version usecase mentioned above.) _______________________________________________ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
It will be OK. Take a deep breath and laugh at the idea that string.rsplit(', ', 1) on a useless field that's probably already posted as a dict to pypi should be considered a serious threat to the future of packaging. If you didn't laugh you can write Metadata 3.0 / define the JSON serialization and we'll write metadata.json into the .dist-info directory. It's not the end of the world, it is the beginning.
On Mar 26, 2013, at 10:49 PM, Daniel Holth <dholth@gmail.com> wrote:
On Tue, Mar 26, 2013 at 9:12 PM, PJ Eby <pje@telecommunity.com> wrote:
On Tue, Mar 26, 2013 at 8:33 PM, Donald Stufft <donald@stufft.io> wrote:
As far as I can tell the only things that even use PKG-INFO is setuptools/distribute and we want to phase them out of existence anyways.
The only thing setuptools uses it for is to find out the version of a package in the case where an .egg-info directory or filename doesn't have a version in its filename... which normally only happens in the "setup.py develop" case. So no need to keep it around on my account. ;-)
(Some tools do check for the *existence* of a PKG-INFO, like PyPI's sdist upload validation, and the various egg formats require a file *named* PKG-INFO, but AFAIK nothing commonly used out there actually *reads* PKG-INFO or gives a darn about its contents, except for that version usecase mentioned above.)
It will be OK. Take a deep breath and laugh at the idea that string.rsplit(', ', 1) on a useless field that's probably already posted as a dict to pypi should be considered a serious threat to the future of packaging. If you didn't laugh you can write Metadata 3.0 / define the JSON serialization and we'll write metadata.json into the .dist-info directory. It's not the end of the world, it is the beginning.
Yea, it's totally about keywords, and that's just not an example of a larger problem (like embedding little mini json documents), and what we need is another competing standard, all because of a legacy file format for a file that barely anything uses right now (which makes it the ideal time _to_ replace it, before it starts being actively used in a widespread fashion).
Yea, it's totally about keywords and that's just not an example of a larger problem (like embedding little mini json documents) and what we need is another competing standard all because of a legacy file format for a file that barely anything uses right now (which makes it the ideal time _to_ replace it, before it starts being actively used in a widespread fashion).
We need approximately five fields:

Name
Version
Provides-Extra
Requires-Dist
Setup-Requires-Dist

the rest are useless, never need to be parsed by anyone, or are already sent to PyPI as a dict.

We need the environment markers language.

We need the requirements specifiers >= 4.0.0, < 9.

Define the JSON serialization and we'll have this format converted in 50 lines of code or less. It's that easy.
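For a sense of scale, a conversion along those lines really is small. A rough sketch, where the handling of single-use vs. multi-use fields and the output key names are assumptions for illustration, not a defined serialization:

```python
import json
from email.parser import Parser

# Rough sketch of converting key/value metadata to a JSON-compatible dict,
# covering roughly the five fields listed above. Output key names are
# illustrative, not a standard.
SINGLE_USE = ("Name", "Version")
MULTI_USE = ("Provides-Extra", "Requires-Dist", "Setup-Requires-Dist")

def pkg_info_to_dict(text):
    msg = Parser().parsestr(text)
    result = {}
    for key in SINGLE_USE:
        if key in msg:
            result[key.lower()] = msg[key]
    for key in MULTI_USE:
        values = msg.get_all(key)  # None if the field is absent
        if values:
            result[key.lower().replace("-", "_")] = values
    return result

meta = pkg_info_to_dict(
    "Name: example\n"
    "Version: 1.0\n"
    "Provides-Extra: test\n"
    "Requires-Dist: requests (>=2.0)\n"
)
print(json.dumps(meta, sort_keys=True))
```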
On 26 Mar 2013 21:08, "Daniel Holth" <dholth@gmail.com> wrote:
Yea, it's totally about keywords and that's just not an example of a
larger problem (like embedding little mini json documents) and what we need is another competing standard all because of a legacy file format for a file that barely anything uses right now (which makes it the ideal time _to_ replace it, before it starts being actively used in a widespread fashion).
We need approximately five fields:
Name Version Provides-Extra Requires-Dist Setup-Requires-Dist
the rest are useless, never need to be parsed by anyone, or are already sent to pypi as a dict.
We need the environment markers language.
We need the requirements specifiers >= 4.0.0, < 9.
Define the JSON serialization and we'll have this format converted in 50 lines of code or less. It's that easy.
I've already defined it for the post install hook design, and will now be rewriting the PEP to use that as the base format. As added bonuses, it will allow 2.0 metadata to live alongside 1.1 metadata (due to a different file name), be easier to explain to readers of the PEP, and allow us to fix some clumsy legacy naming.

When we last considered this question, we were still trying to keep the metadata 1.3 changes minimal to avoid delaying the addition of wheel support to pip. That issue has since been solved more expediently by allowing metadata 1.1 in wheel files. The addition of the post install hook is the other major relevant change, and that's the one which means we need to define a structured metadata format regardless of the on-disk format.

It all adds up to it making far more sense to just switch the format to JSON for 2.0 rather than persisting with ad hoc attempts to use a key-value multidict for structured data storage.

Cheers, Nick.

P.S. I forgot LAX has free wi-fi now :)
Nick Coghlan <ncoghlan <at> gmail.com> writes:
It all adds up to it making far more sense to just switch the format to JSON for 2.0 rather than persisting with ad hoc attempts to use a key-value multidict for structured data storage.
+1, and I sincerely hope you will take a look at the JSON metadata used in distlib/distil to good advantage in dependency resolution, and installing, archiving and building distributions. Regards, Vinay Sajip
Daniel Holth <dholth <at> gmail.com> writes:
We need approximately five fields:
Name Version Provides-Extra Requires-Dist Setup-Requires-Dist
the rest are useless, never need to be parsed by anyone, or are already sent to pypi as a dict.
We need the environment markers language.
We need the requirements specifiers >= 4.0.0, < 9.
You're taking a disappointingly narrow view of the metadata, it seems to me. If you look at the totality of metadata which describes distributions in the here-and-now world of setuptools, it's a lot more than that - just look at any of the metadata files I've pointed to in the "distil" documentation.

You're only talking about *installation* metadata, but even there your coverage is incomplete. I won't go into any more details now, but suffice it to say that as I am working on "distil", I am coming across decisions about installation which I either hard-code into distil (thus making it quite likely that another tool will give different results) or enshrine in installation metadata (constraining all compliant tools to adhere to the developer's and/or user's wishes in that area).

Regards,

Vinay Sajip
On Wed, Mar 27, 2013 at 8:18 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Daniel Holth <dholth <at> gmail.com> writes:
We need approximately five fields:
Name Version Provides-Extra Requires-Dist Setup-Requires-Dist
the rest are useless, never need to be parsed by anyone, or are already sent to pypi as a dict.
We need the environment markers language.
We need the requirements specifiers >= 4.0.0, < 9.
You're taking a disappointingly narrow view of the metadata, it seems to me. If you look at the totality of metadata which describes distributions in the here-and-now world of setuptools, it's a lot more than that - just look at any of my metadata files which I've pointed to in the "distil" documentation.
You're only talking about *installation* metadata, but even there your coverage is incomplete. I won't go into any more details now, but suffice to say that as I am working on "distil", I am coming across decisions about installation which either I hard-code into distil (thus making it quite likely that another tool will give different results), or enshrine in installation metadata (constraining all compliant tools to adhere to the developer's and/or user's wishes in that area).
Hooray for JSON. I actually liked the separation and viewed it as a de-coupling feature, but that will probably be less important as we avoid setup.py generating different metadata for each execution.
On 03/26/2013 09:12 PM, PJ Eby wrote:
(Some tools do check for the *existence* of a PKG-INFO, like PyPI's sdist upload validation, and the various egg formats require a file *named* PKG-INFO, but AFAIK nothing commonly used out there actually *reads* PKG-INFO or gives a darn about its contents, except for that version usecase mentioned above.)
I do have a tool (named 'pkginfo', funnily enough) which does parse them:

https://pypi.python.org/pypi/pkginfo
http://pythonhosted.org/pkginfo/

I use it in another tool, 'compoze', which allows me to build "curated" indexes from versions installed locally (e.g., after testing in a virtualenv):

https://pypi.python.org/pypi/compoze/
http://docs.repoze.org/compoze/

Tres.

--
Tres Seaver +1 540-429-0999 tseaver@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
Donald Stufft <donald <at> stufft.io> writes:
I disagree.
- These files are used for tools to exchange data, so "hand editable" shouldn't be a primary concern.

Right. Nobody hand-edits PKG-INFO now, do they?

- There are a number of current fields where the current format is *not* enough and one-off pseudo-formats have had to be added:
  - `Keywords: dog puppy voting election` - a list masquerading as a string; this one needs field.split() to actually parse it.
  - `Project-URL: Bug, Issue Tracker, http://bitbucket.org/tarek/distribute/issues/` - a dictionary masquerading as a list of strings; this one needs {key.strip(): value.strip() for key, value in [x.rsplit(", ", 1) for x in field]}.
- Any of the fields can contain arbitrary content; previously Description had specialized handling for this, which has now been moved to the payload section, but all the same issues affect other fields.
- The Extension field name uses ExtensionName/ActualKey to kludge a nested dictionary.
- ExtensionName/json is a horrible kludge: why are we nesting a format inside of a format instead of just using a format that supports everything we could want?
As far as I can tell the only things that even use PKG-INFO is setuptools/distribute and we want to phase them out of existence anyways. The only other thing I can think of is Wheel which can either a) be updated to a different format it's new enough there's not much need to worry about legacy support or b) generate the METADATA file just for Wheels.
Please note that:

* I already have a system working fairly well *now* (though it's early days, and it needs more testing) where JSON is used for metadata.
* The metadata covers not just the index metadata (PKG-INFO) but also metadata covering how to build, install and test distributions.
* The metadata already exists for the vast bulk of distributions on PyPI and is derived from the setup.py in those distributions, so migration is not much of an issue.
* The "distil" tool demonstrates each of the Archiver, Builder and Installer roles reasonably well for its stage of development.

Donald's analysis above resonates with me: it seems pretty kludgy trying to shoe-horn stuff into a key-value format which doesn't fit it well. There don't seem to be any valid technical arguments for keeping the key-value format, other than "please let's not try to change too many things at once". If that's really going to be accepted as the reason, it strikes me as being a little timid (given what "distil" shows is possible). And that would be enough of a shame as it is, without making things worse by introducing something like ExtensionName/json.

To those people who would balk at editing JSON by hand - who's asking you to? Why not just get the data into an appropriate dict, using any tools you like, and then serialise it to JSON? That approach seems to be what JSON was designed for. If any tools need PKG-INFO style metadata, that's easy enough to generate from a JSON format, as distil's wheel building support demonstrates.

Regards,

Vinay Sajip
On Wed, Mar 27, 2013 at 8:02 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
To those people who would balk at editing JSON by hand - who's asking you to? Why not just get the data into an appropriate dict, using any tools you like, and then serialise it to JSON?
The challenge here is again the distinction between raw source and sdist, and the interaction with revision control. Either there has to be some way to tell MEBS (i.e. the overall build system) what tool you're using to generate that JSON, or you have to check a generated file into revision control, and make sure you've updated it. (Which is error prone, even if you don't mind checking generated files into revision control.) This strongly suggests there needs to be *some* human-editable way to at *least* specify what tool you're using to generate the JSON with.
On Mar 27, 2013, at 1:12 PM, PJ Eby <pje@telecommunity.com> wrote:
On Wed, Mar 27, 2013 at 8:02 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
To those people who would balk at editing JSON by hand - who's asking you to? Why not just get the data into an appropriate dict, using any tools you like, and then serialise it to JSON?
The challenge here is again the distinction between raw source and sdist, and the interaction with revision control. Either there has to be some way to tell MEBS (i.e. the overall build system) what tool you're using to generate that JSON, or you have to check a generated file into revision control, and make sure you've updated it. (Which is error prone, even if you don't mind checking generated files into revision control.)
This strongly suggests there needs to be *some* human-editable way to at *least* specify what tool you're using to generate the JSON with.
I don't actually think packaging needs to solve this. But there are a number of solutions that come to mind (mostly either expecting a standard command ala setup.py develop to work).

If I want to install a development version of, say, libsodium (just an example C lib), I download it and run:

./autogen.sh && make
make install

but once it's packaged I can install it using the packaging tools.

So this issue is really sort of parallel to builders, archivers and even the JSON, and it comes down to: how does an unpackaged directory of code (the VCS checkout portion isn't really that important here) signal to an installer how to install a development version of it? Personally I think a common entry point (ala make install) is the way forward for this. When you leave the realm of package formats (ala sdist, wheel, etc.) you start needing to get much more freeform.
On Wed, Mar 27, 2013 at 1:30 PM, Donald Stufft <donald@stufft.io> wrote:
On Mar 27, 2013, at 1:12 PM, PJ Eby <pje@telecommunity.com> wrote:
On Wed, Mar 27, 2013 at 8:02 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
To those people who would balk at editing JSON by hand - who's asking you to? Why not just get the data into an appropriate dict, using any tools you like, and then serialise it to JSON?
The challenge here is again the distinction between raw source and sdist, and the interaction with revision control. Either there has to be some way to tell MEBS (i.e. the overall build system) what tool you're using to generate that JSON, or you have to check a generated file into revision control, and make sure you've updated it. (Which is error prone, even if you don't mind checking generated files into revision control.)
This strongly suggests there needs to be *some* human-editable way to at *least* specify what tool you're using to generate the JSON with.
I don't actually think packaging needs to solve this. But there are a number of solutions that come to mind (mostly either expecting a standard command ala setup.py develop to work).
If I want to install a development version of say libsodium (just an example C lib) I download it and run:

./autogen.sh && make
make install

but once it's packaged I can install it using the packaging tools.
So this issue is really sort of parallel to builders, archivers and even the JSON and it comes down to how does an unpackaged directory of code (the VCS checkout portion isn't really that important here) signal to an installer how to install a development version of it. Personally I think a common entrypoint (ala make install) is the way forward for this. When you leave the realm of package formats (ala sdist, wheel, etc) you start needing to get much more freeform.
It does get a little murky.

nothing: the file in a source checkout
PKG-INFO: the file in an sdist
PKG-INFO: the re-generated file
PKG-INFO: the installed file

(we will probably call it metadata.json soon, but the confusion is the same).

I think it might make sense to expect only a stub PKG-INFO[.in] at the root of a VCS checkout, have a 100% generated and hopefully trustworthy .dist-info directory in an sdist, and not bother regenerating the root PKG-INFO.
PJ Eby <pje <at> telecommunity.com> writes:
The challenge here is again the distinction between raw source and sdist, and the interaction with revision control. Either there has to be some way to tell MEBS (i.e. the overall build system) what tool you're using to generate that JSON, or you have to check a generated file into revision control, and make sure you've updated it. (Which is error prone, even if you don't mind checking generated files into revision control.)
This strongly suggests there needs to be *some* human-editable way to at *least* specify what tool you're using to generate the JSON with.
There are no doubt many possible workflows, but one such is:

metadata input files - any number, hand-edited, checked into source control
metadata merge tool - creates JSON metadata from the input files
JSON metadata - produced by the tool, so not checked in

If the "merge tool" (which could be a simple Python script) is custom to a project, it can be checked into source control in that project. If it is used across multiple projects, it is maintained as a separate tool in its own repo and, if you are just using it but not maintaining it, it becomes part of your build toolset (like sphinx-build). Actually, the doc tools seem to be a good analogy - create a useful format which is a pain to edit by hand (HTML that looks nice in a browser) from some checked-in sources which are reasonable to edit by hand (.rst) + a merge tool (Sphinx).

The merge tool seems similar in kind to the release.py script that many projects have, which creates release distribution files, bumps version numbers, registers and uploads to PyPI.

Regards,

Vinay Sajip
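A toy version of the "merge tool" step in that workflow. The file layout is elided and the merge policy (later fragments win) is an assumption for illustration:

```python
import json

# Toy merge tool: several hand-edited fragments (checked into source
# control) are merged into one generated JSON metadata document (which is
# NOT checked in). "Later fragment wins" is an assumed policy here.
def merge_metadata(fragments):
    merged = {}
    for fragment in fragments:
        merged.update(json.loads(fragment))
    return merged

core = '{"name": "example", "version": "1.0"}'
deps = '{"requires": ["requests (>=2.0)"]}'

metadata = merge_metadata([core, deps])
print(json.dumps(metadata, sort_keys=True))
```

A real tool would read the fragments from files and write metadata.json, but the shape of the step is the same.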
On Mar 27, 2013, at 1:41 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
PJ Eby <pje <at> telecommunity.com> writes:
The challenge here is again the distinction between raw source and sdist, and the interaction with revision control. Either there has to be some way to tell MEBS (i.e. the overall build system) what tool you're using to generate that JSON, or you have to check a generated file into revision control, and make sure you've updated it. (Which is error prone, even if you don't mind checking generated files into revision control.)
This strongly suggests there needs to be *some* human-editable way to at *least* specify what tool you're using to generate the JSON with.
There are no doubt many possible workflows, but one such is:
metadata input files - any number, hand-edited, checked into source control metadata merge tool - creates JSON metadata from input files JSON metadata - produced by tool, so not checked in
If the "merge tool" (which could be a simple Python script) is custom to a project, it can be checked into source control in that project. If it is used across multiple projects, it is maintained as a separate tool in its own repo and, if you are just using it but not maintaining it, it becomes part of your build toolset (like sphinx-build). Actually, the doc tools seem to be a good analogy - create a useful format which is a pain to edit by hand (HTML that looks nice in a browser) from some checked in sources which are reasonable to edit by hand (.rst) + a merge tool (Sphinx).
The merge tool seems similar in kind to the release.py script that many projects have, which creates release distribution files, bumps version numbers, registers and uploads to PyPI.
Regards,
Vinay Sajip
I don't think the packaging formats should dictate the development flow at all; .IN files and such would dictate how that should be. To me this is an installer issue, not a packaging issue, and it's best solved in the installers. Obviously there is some benefit to a "standard" way for installers to treat these, but I don't think it should be defined in terms of the packaging formats. Hence my off-the-cuff suggestion of keeping setup.py develop, or develop.py, or some such script whose express purpose is for use with development checkouts - but development checkouts should be discouraged unless you're actively working on that project.
Agreed that python is a fine language for build scripts. On Wed, Mar 27, 2013 at 1:51 PM, Donald Stufft <donald@stufft.io> wrote:
On Mar 27, 2013, at 1:41 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
PJ Eby <pje <at> telecommunity.com> writes:
The challenge here is again the distinction between raw source and sdist, and the interaction with revision control. Either there has to be some way to tell MEBS (i.e. the overall build system) what tool you're using to generate that JSON, or you have to check a generated file into revision control, and make sure you've updated it. (Which is error prone, even if you don't mind checking generated files into revision control.)
This strongly suggests there needs to be *some* human-editable way to at *least* specify what tool you're using to generate the JSON with.
There are no doubt many possible workflows, but one such is:
metadata input files - any number, hand-edited, checked into source control metadata merge tool - creates JSON metadata from input files JSON metadata - produced by tool, so not checked in
If the "merge tool" (which could be a simple Python script) is custom to a project, it can be checked into source control in that project. If it is used across multiple projects, it is maintained as a separate tool in its own repo and, if you are just using it but not maintaining it, it becomes part of your build toolset (like sphinx-build). Actually, the doc tools seem to be a good analogy - create a useful format which is a pain to edit by hand (HTML that looks nice in a browser) from some checked in sources which are reasonable to edit by hand (.rst) + a merge tool (Sphinx).
The merge tool seems similar in kind to the release.py script that many projects have, which creates release distribution files, bumps version numbers, registers and uploads to PyPI.
Regards,
Vinay Sajip
I don't think the packaging formats should dictate the development flow at all. .IN files and such all dictate how that should be. To me this is an installer issue not a packaging issue and it's best solved in the installers. Obviously there is some benefit to a "standard" way for installers to treat these but I don't think it should be defined in terms of the packaging formats. Hence my off the cuff suggestion of keeping setup.py develop, or develop.py or some such script that express purpose is in use for development checkouts, but that development checkouts should be discouraged unless you're actively working on that project.
Donald Stufft <donald <at> stufft.io> writes:
I don't think the packaging formats should dictate the development flow at all.
We might be at cross purposes here. If we posit that packaging metadata is in JSON format (which I think we both favour), I was addressing Daniel's objection to it on the grounds that he doesn't like editing JSON, to suggest an alternative for people with that objection. It doesn't follow that they *have* to use any particular workflow or tool, or that packaging formats are dictating it (other than the bare fact that they are JSON). Regards, Vinay Sajip
On Mar 27, 2013, at 2:04 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
Donald Stufft <donald <at> stufft.io> writes:
I don't think the packaging formats should dictate the development flow at all.
We might be at cross purposes here. If we posit that packaging metadata is in JSON format (which I think we both favour), I was addressing Daniel's objection to it on the grounds that he doesn't like editing JSON, to suggest an alternative for people with that objection. It doesn't follow that they *have* to use any particular workflow or tool, or that packaging formats are dictating it (other than the bare fact that they are JSON).
Regards,
Vinay Sajip
Gotcha, yea in my mind the JSON is generated by the archiver tool and added to the various types of dists, wheels, etc. What the users actually edit/use is totally up to the archiver tool. It could be .in files, it could be a Python file, it could be YAML, it could pull from a SQLite database. Packaging shouldn't care, as long as it gets its sdists, bdists, wheels, etc. in the proper format with the proper metadata files.
On Wed, Mar 27, 2013 at 1:51 PM, Donald Stufft <donald@stufft.io> wrote:
I don't think the packaging formats should dictate the development flow at all. .IN files and such all dictate how that should be.
You can't *not* dictate the flow. If you don't have something to generate the file with, then you're dictating that developers must ship an sdist and users must manually run build tools to make wheels. (Or, more likely, you're creating the motivation for somebody to create a meta-meta-build system that solves the problem, probably by having a bunch of plugins to detect what build system a raw source directory is using.)
To me this is an installer issue not a packaging issue and it's best solved in the installers. Obviously there is some benefit to a "standard" way for installers to treat these but I don't think it should be defined in terms of the packaging formats.
It definitely doesn't have to be. distutils2's setup.cfg isn't actually a bad human-readable format, but it's not a *packaging* format.

In any case, the only thing an installer needs is a way to get the setup-requires-dist, or the portion of it that pertains to identifying the metadata hooks. The rest could be handled with entry points registered for configuration file names. For example, Bento could expose an entry point like:

[mebs.metadata.generators]
bento.info = some.module.in.bento:hookfunction

And then an installer builds a list of these hooks that are in the setup-requires-dists, and runs them based on the filenames found in the project directory. All done.

Basically, the only thing we need is a way to avoid having to either:

1. Make every user install Bento/etc. before trying to install a package from source that uses it, or
2. Embed a registry of every possible build configuration file name into every installer.

And this can be done in any of several ways:

* Have a standardized naming pattern like pybuild.*, and make the .* part indicate the build tool
* Have a standardized single name (like pybuild.cfg), and encode the build tool in a string that can be read regardless of file format, so it can be embedded in whatever format the build tool itself uses
* Have a separate file that *only* lists the build tool or setup-requires-dists (and maybe can be extended to contain other information for use with a stdlib-supplied build tool)

I personally lean towards the last one, especially if it reuses setup.cfg, because setup.cfg already exists and is fairly standardized. There are even tools that work today to let you do a metadata-free setup.py and specify everything needed in setup.cfg, with environment markers and everything. Heck, IIUC, there's at least one library you can use today with *setuptools* to do that -- it doesn't need distutils2 or any of that, it just translates setup.cfg to setup.py arguments.
But an even more important reason to standardize is that there should be one, and preferably only one, obvious way to do it. AFAIK, the distutils2 effort didn't fail because of setup.cfg -- heck, setup.cfg was the main *benefit* I saw in the distutils2 work; everything else about it AFAIK was just setuptools warmed over -- it failed because of trying to boil the ocean and *implement* everything, rather than just standardizing on interfaces.

A minimal boilerplate setup.cfg could be something like:

[build]
builder = bento >1.6

And leave it at that. The disadvantage is that it's a dumb boilerplate file for tools that don't use setup.cfg for their configuration -- i.e., it's a minor disadvantage to users of those tools. However, if your preferred build tool generates the file for you, it's no big deal, as long as the generated file doesn't change all the time and you check it into source control. Such a usage pattern is teachable and provides what's needed, without dictating anything about the development workflow, other than that you need to tell installers how to make an sdist if you want people to install stuff you shipped without an sdist or a wheel, or if you want to use any generic build-running tools that need to know your build hook(s).
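To illustrate how little an installer would need to read from such a stub. The [build] section and builder option follow the example above; none of this is a defined standard:

```python
from configparser import ConfigParser

# Sketch of an installer reading the minimal boilerplate described above:
# the [build] section names the build tool (with a version constraint)
# that the installer would install before invoking its metadata hooks.
SETUP_CFG = """\
[build]
builder = bento >1.6
"""

config = ConfigParser()
config.read_string(SETUP_CFG)

# The requirement specifier an installer would resolve first, before
# handing the rest of the build over to the named tool.
builder_spec = config["build"]["builder"]
print(builder_spec)
```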
development checkouts should be discouraged unless you're actively working on that project.
Perhaps Jim can chime in on this point, but when you work with a whole bunch of people developing a whole bunch of libraries making up a larger project (e.g. Zope), it doesn't seem very sensible to expect that everybody manually check out and manage all the dependencies they're using. Maybe you could mitigate that somewhat with some sort automated continuous build/release system, but that's not always a practical option.
On Mar 27, 2013, at 6:41 PM, PJ Eby <pje@telecommunity.com> wrote:
On Wed, Mar 27, 2013 at 1:51 PM, Donald Stufft <donald@stufft.io> wrote:
I don't think the packaging formats should dictate the development flow at all. .IN files and such all dictate how that should be.
You can't *not* dictate the flow. If you don't have something to generate the file with, then you're dictating that developers must ship an sdist and users must manually run build tools to make wheels.
(Or, more likely, you're creating the motivation for somebody to create a meta-meta-build system that solves the problem, probably by having a bunch of plugins to detect what build system a raw source directory is using.)
To me this is an installer issue not a packaging issue and it's best solved in the installers. Obviously there is some benefit to a "standard" way for installers to treat these but I don't think it should be defined in terms of the packaging formats.
It definitely doesn't have to be. distutils2's setup.cfg isn't actually a bad human-readable format, but it's not a *packaging* format.
In any case, the only thing an installer needs is a way to get the setup-requires-dist, or the portion of it that pertains to identifying the metadata hooks. The rest could be handled with entry points registered for configuration file names. For example, Bento could expose an entry point like:
[mebs.metadata.generators]
bento.info = some.module.in.bento:hookfunction
And then an installer builds a list of these hooks that are in the setup-requires-dists, and runs them based on the filenames found in the project directory. All done.
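A minimal sketch of what that lookup might look like on the installer side. The registry shape and hook signature are invented for illustration; a real installer would populate the registry from the "mebs.metadata.generators" entry points of the setup-requires-dists rather than hard-coding it:

```python
import os
import tempfile

# Hypothetical hook standing in for some.module.in.bento:hookfunction
def bento_hook(project_dir):
    return {"generator": "bento"}

# Hypothetical registry mapping config file names to metadata hooks; in
# practice this would be built from the entry points declared by the
# distributions listed in setup-requires-dist.
REGISTRY = {"bento.info": bento_hook}

def find_metadata_hooks(project_dir, registry=REGISTRY):
    """Return the hooks whose config file is present in the project directory."""
    return [hook for name, hook in sorted(registry.items())
            if os.path.exists(os.path.join(project_dir, name))]

# Demo: a project directory containing a bento.info file
project = tempfile.mkdtemp()
open(os.path.join(project, "bento.info"), "w").close()
hooks = find_metadata_hooks(project)
```

The installer then simply runs each matched hook against the project directory; no tool-specific knowledge is baked in.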
Basically, the only thing we need is a way to avoid having to either:
1. Make every user install Bento/etc. before trying to install a package from source that uses it, or
2. Embed a registry of every possible build configuration file name into every installer.
And this can be done in any of several ways:
* Have a standardized naming pattern like pybuild.*, and make the .* part indicate the build tool
* Have a standardized single name (like pybuild.cfg), and encode the build tool in a string that can be read regardless of file format, so it can be embedded in whatever format the build tool itself uses
* Have a separate file that *only* lists the build tool or setup-requires-dists (and maybe can be extended to contain other information for use with a stdlib-supplied build tool)
I personally lean towards the last one, especially if it reuses setup.cfg, because setup.cfg already exists and is fairly standardized. There are even tools that work today to let you do a metadata-free setup.py and specify everything needed in setup.cfg, with environment markers and everything.
Heck, IIUC, there's at least one library you can use today with *setuptools* to do that -- it doesn't need distutils2 or any of that, it just translates setup.cfg to setup.py arguments.
But an even more important reason to standardize is that there should be one, and preferably only one, obvious way to do it. AFAIK, the distutils2 effort didn't fail because of setup.cfg -- heck, setup.cfg was the main *benefit* I saw in the distutils2 work, everything else about it AFAIK was just setuptools warmed over -- it failed because of trying to boil the ocean and *implement* everything, rather than just standardizing on interfaces. A minimal boilerplate setup.cfg could be something like
[build]
builder = bento >1.6
And leave it at that. Disadvantage is that it's a dumb boilerplate file for tools that don't use setup.cfg for their configuration -- i.e., it's a minor disadvantage to users of those tools. However, if your preferred build tool generates the file for you, it's no big deal, as long as the generated file doesn't change all the time and you check it into source control.
Such a usage pattern is teachable and provides what's needed, without dictating anything about the development workflow, other than that you need to tell installers how to make an sdist if you want people to install stuff you shipped without an sdist or a wheel, or if you want to use any generic build-running tools that need to know your build hook(s).
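The two-line [build] boilerplate in that message is trivially machine-readable. A sketch using the stdlib ConfigParser; note that the section and "builder" key follow the example in the email, not any ratified spec, and the name/constraint split is one plausible convention:

```python
import re
from configparser import ConfigParser

# Hypothetical minimal setup.cfg boilerplate, as suggested in the message
SETUP_CFG = """\
[build]
builder = bento >1.6
"""

parser = ConfigParser()
parser.read_string(SETUP_CFG)
spec = parser.get("build", "builder")   # "bento >1.6"

# Split the builder's project name from its version constraint
m = re.match(r"\s*([A-Za-z0-9_.\-]+)\s*(.*)$", spec)
builder_name, constraint = m.group(1), m.group(2).strip()
```

An installer that finds this file knows exactly one thing: which distribution to install (setup-requires style) before asking it to build.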
Repurposing a single line of setup.cfg for this use case wouldn't be the worst thing in the world. I don't like setup.cfg, and I especially don't like it as the format to exchange the _actual_ metadata, but as a configuration format (configuring which build system to use) it's ok. I still think I prefer a setup.py develop or develop.py to invoke the build system for development builds, but atm the difference between something like echo "[build]\nbuilder = bento > 1.6" > setup.cfg and develop.py is not a hill I care to die on. Maybe Nick has different ideas for how VCS/install from an unpacked directory (e.g. explicitly not a package) should look, I don't know.
development checkouts should be discouraged unless you're actively working on that project.
Perhaps Jim can chime in on this point, but when you work with a whole bunch of people developing a whole bunch of libraries making up a larger project (e.g. Zope), it doesn't seem very sensible to expect that everybody manually check out and manage all the dependencies they're using. Maybe you could mitigate that somewhat with some sort automated continuous build/release system, but that's not always a practical option.
Sorry, my statement was a bit unclear: those people would all fall under actively working on that project (Zope in this case). I mean installs from VCSs should be discouraged for end users.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
On Wed, Mar 27, 2013 at 6:55 PM, Donald Stufft <donald@stufft.io> wrote:
I still think I prefer a setup.py develop or develop.py to invoke the build system for development builds
It's possible we're not talking about the same thing -- I notice you keep mentioning "setup.py develop", which is entirely unrelated to the scenarios I'm talking about. "setup.py develop" is for installing something *you* are working on/developing. But depending on raw source doesn't imply that you would be editing or developing that source; it just means that you have a bleeding-edge dependency (which might in turn have others), adding to your management overhead if you have to know how to build an sdist for each of those dependencies whenever you need a refresh.

So, I'm not talking about scenarios where a user obtains a source checkout and does something with it, I'm talking about scenarios where the developer of a package wants to declare a *dependency* on *another* package that currently has to be fetched from revision control. So, in order to install *their* package (e.g. to their staging/test server), the install system has to be able to fetch and build from raw sources.
Sorry my statement was a bit unclear, those people would all fall under actively working on that project (Zope in this case). I mean installs from VCS's should be discouraged for end users.
Define "end users". ;-)

Here's a different example: there was a point at which I was actively developing PEAK-Rules and somebody else was actively developing something that used it. That person wasn't developing PEAK-Rules, and I wasn't part of their project, but they wanted up-to-the-minute versions because I was making changes based on their use cases, which they needed right away. Are they an "end user"? ;-)

You could argue that, well, that's just one project, except that what if somebody *else* depends on *their* project, because they're also doing bleeding edge development?

Well, that happened, too, because the consumer of PEAK-Rules was doing a bleeding-edge library that *other* people were doing bleeding-edge development against. So now there were two levels of dependency on raw sources.

If you don't support these kinds of scenarios, you slow the community's development velocity. Not too long ago, Richard Jones posted a graph on r/Python showing how package registration took off exponentially around the time easy_install was released. I think that this is in large part due to the increased development velocity afforded by being able to depend on other packages at both development *and* deployment time. Even though most packages don't depend on the bleeding edge (because they're not themselves the bleeding edge), for individual development it's a godsend to be able to depend on your *own* packages from revision control, without needing all kinds of manual rigamarole to use them.

(This is also really relevant for private and corporate-internal development scenarios.)
On Mar 27, 2013, at 9:03 PM, PJ Eby <pje@telecommunity.com> wrote:
On Wed, Mar 27, 2013 at 6:55 PM, Donald Stufft <donald@stufft.io> wrote:
I still think I prefer a setup.py develop or develop.py to invoke the build system for development builds
It's possible we're not talking about the same thing -- I notice you keep mentioning "setup.py develop", which is entirely unrelated to the scenarios I'm talking about. "setup.py develop" is for installing something *you* are working on/developing. But depending on raw source doesn't imply that you would be editing or developing that source; it just means that you have a bleeding-edge dependency (which might in turn have others), adding to your management overhead if you have to know how to build an sdist for each of those dependencies whenever you need a refresh.
So, I'm not talking about scenarios where a user obtains a source checkout and does something with it, I'm talking about scenarios where the developer of a package wants to declare a *dependency* on *another* package that currently has to be fetched from revision control. So, in order to install *their* package (e.g. to their staging/test server), the install system has to be able to fetch and build from raw sources.
I don't think you can, nor should you be able to, explicitly depend on something that is a VCS checkout. Declared dependencies in the metadata should be "abstract": they are a name, possibly with a version specifier, but they are explicitly _not_ where you get that dependency from. They only become "concrete" when you resolve the abstract dependencies via an index (this index could be PyPI, it could be a directory on your machine, etc.). This fits in very well with the idea of "Provides" as well: I do not depend on https://pypi.python.org/packages/source/s/setuptools/setuptools-0.6c11.tar.g... I depend on something that claims to be setuptools, which could be https://bitbucket.org/tarek/distribute. The point being that I, as the theoretical author of the setuptools package, can't dictate where to install my package from.
Sorry my statement was a bit unclear, those people would all fall under actively working on that project (Zope in this case). I mean installs from VCS's should be discouraged for end users.
Define "end users". ;-)
Here's a different example: there was a point at which I was actively developing PEAK-Rules and somebody else was actively developing something that used it. That person wasn't developing PEAK-Rules, and I wasn't part of their project, but they wanted up-to-the-minute versions because I was making changes based on their use cases, which they needed right away. Are they an "end user"? ;-)
You could argue that, well, that's just one project, except that what if somebody *else* depends on *their* project, because they're also doing bleeding edge development?
Well, that happened, too, because the consumer of PEAK-Rules was doing a bleeding-edge library that *other* people were doing bleeding-edge development against. So now there were two levels of dependency on raw sources.
I see, I was misunderstanding the use case. Like I said above, though, I don't think that a package author should dictate where you install X from. I don't know easy_install or buildout very well, but when I need a bleeding-edge release of something in my dependency graph I use pip's requirements.txt file and add a ``-e <…>`` (which pip internally translates to checking the repo out and running setup.py develop). This is where, in my mind, this belongs, because requirements.txt lists "concrete" dependencies (it is paired with an index url, defaulting to PyPI) and so I'm just listing another "concrete" dependency. This does mean dependency graphs involving pre-release unpackaged dependencies are less friendly, but I think that's ok because:

* Users should opt into development releases - this thought is reflected in PEP 426, where it instructs installers to default to stable releases only unless the end user requests otherwise, either via a flag or by explicitly including it in the version spec.

* This is already outside of the packaging infrastructure/ecosystem. Sometimes I need system libraries installed too and I need to manually make sure they are installed. Python packaging can't solve every problem.

* I think this is an edge case (one I have hit myself) and I don't think it's a large enough use case to break the "abstract"-ness of the PKG-INFO/JSON/whatever metadata.
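For concreteness, the requirements.txt approach described here looks something like the following (the repository URL is illustrative only, not a real project location):

```
# requirements.txt -- "concrete" dependencies, resolved against an index
# (defaulting to PyPI). The -e line is checked out and installed via
# "setup.py develop"; the URL here is made up for illustration.
-e git+https://example.com/peak/PEAK-Rules.git#egg=PEAK-Rules
some-stable-dependency>=1.0
```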
If you don't support these kinds of scenarios, you slow the community's development velocity. Not too long ago, Richard Jones posted a graph on r/Python showing how package registration took off exponentially around the time easy_install was released. I think that this is in large part due to the increased development velocity afforded by being able to depend on other packages at both development *and* deployment time. Even though most packages don't depend on the bleeding edge (because they're not themselves the bleeding edge), for individual development it's a godsend to be able to depend on your *own* packages from revision control, without needing all kinds of manual rigamarole to use them.
(This is also really relevant for private and corporate-internal development scenarios.)
This was a hard email to write because I totally understand the motivation behind it, and I think it's a very attractive mis-feature. It sounds really good, but I do not think its benefit is large enough to include it inside of "packaging" (the formats and minimum toolchain) given the negative qualities behind it. I do think this is a great value add-on for an *installer*, but it should remain in the realm of installer-specific features (ala requirements.txt).

* I can't think of better terms than "abstract" and "concrete" and they don't perfectly describe the difference.
On Thu, Mar 28, 2013 at 11:43 AM, Donald Stufft <donald@stufft.io> wrote:
I don't think you can, nor should you be able to, explicitly depend on something that is a VCS checkout.
I find it more useful to think of the issue as whether or not you allow publication of source tarballs to satisfy a dependency, or *require* publication of a fully populated sdist. If you allow raw source tarballs, then you effectively allow VCS checkouts as well. I prefer requiring an explicit publication step, but we also need to acknowledge that the installer ecosystem we're trying to replace allows them, and some people are relying on that feature.

However, as I've said elsewhere, for metadata 2.0, I *do not* plan to migrate the archiving or build steps away from setup.py. So "give me an sdist" will be spelled "python setup.py sdist" and "give me a wheel file" will be spelled "python setup.py bdist_wheel".

There's also an interesting migration problem for pre-2.0 sdists, where we can't assume that "python setup.py bdist_wheel && pip install <created wheel>" is equivalent to "python setup.py install": projects like Twisted that run a post-install hook won't install properly if you build a wheel first, since the existing post-install hook won't run.

It's an interesting problem, but one where my near term plans amount to "document the status quo".

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mar 27, 2013, at 10:05 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Thu, Mar 28, 2013 at 11:43 AM, Donald Stufft <donald@stufft.io> wrote:
I don't think you can, nor should you be able to, explicitly depend on something that is a VCS checkout.
I find it more useful to think of the issue as whether or not you allow publication of source tarballs to satisfy a dependency, or *require* publication of a fully populated sdist. If you allow raw source tarballs, then you effectively allow VCS checkouts as well. I prefer requiring an explicit publication step, but we also need to acknowledge that the installer ecosystem we're trying to replace allows them, and some people are relying on that feature.
Right, which is why I think the ability to install from a raw source is a good feature for an installer, but not for the dependency metadata. Following that, we just need a standard way for a raw source tarball to declare what its builder is, either via some sort of file that tells you that, or a build script, or something along those lines.
However, as I've said elsewhere, for metadata 2.0, I *do not* plan to migrate the archiving or build steps away from setup.py. So "give me an sdist" will be spelled "python setup.py sdist" and "give me a wheel file" will be spelled "python setup.py bdist_wheel".
There's also an interesting migration problem for pre-2.0 sdists, where we can't assume that "python setup.py bdist_wheel && pip install <created wheel>" is equivalent to "python setup.py install": projects like Twisted that run a post-install hook won't install properly if you build a wheel first, since the existing post-install hook won't run.
It's an interesting problem, but one where my near term plans amount to "document the status quo".
Cheers, Nick.
On Wed, Mar 27, 2013 at 10:09 PM, Donald Stufft <donald@stufft.io> wrote:
Right, which is why I think the ability to install from a raw source is a good feature for an installer, but not for the dependency metadata.
Sure - I never said the dependency metadata had to be able to say *where* you get the raw source from, just that the tool for resolving dependencies needed to be able to process raw source into something installable.
Following that we just need a standard way for a raw source tarball to declare what its builder is, either via some sort of file that tells you that, or a build script, or something along those lines.
Yep. Static configuration is a *must*, here, though, as we want to move away from arbitrary setup script writing by package authors: in general they are really bad at it. A lot of setuptools' odd build-time features (like sandboxing) exist specifically because people write whatever crap they want in setup.py and have zero idea how to actually use/integrate with distutils.

One interesting feature that would be possible under a configuration-based system is that you could actually have an installer with a whitelist or blacklist for build tools and setup-requires, in order to prevent or limit untrusted code execution by the overall build system. This would make it slightly more practical to have, say, servers that build wheels, such that only tools the servers' owners know won't import or run arbitrary code are allowed to do the compiling. (Not that that should be the only security involved, but it'd be a cool first-tier sanity check.)

(Interestingly, this is also an argument for having a separate "tests-require-dist" in metadata 2.0, since testing tools *have* to run arbitrary code from the package, but archivers and builders do not.)

Nick wrote:
However, as I've said elsewhere, for metadata 2.0, I *do not* plan to migrate the archiving or build steps away from setup.py. So "give me an sdist" will be spelled "python setup.py sdist" and "give me a wheel file" will be spelled "python setup.py bdist_wheel".
Works for me. Well, sort of. In principle, it means you can grow next generation build systems that use a dummy setup.py. In practice, it means you're still gonna be relying on setuptools. (Presumably 0.7 post-merge, w/bdist_wheel support baked in.) At some point, there has to be a new way to do it, because the pain of creating a functional dummy setup.py is a really high barrier to entry for a build tool to meet, until all the current tools that run setup.py files go away.

IMO it'd be better to standardize this bit *now*, so that it'd be practical to start shipping projects without a setup.py, or perhaps make a "one dummy setup.py to rule them all" implementation that delegates everything to the new build interface.

I can certainly understand that there are more urgent priorities in the short run; I just hope that a standard for this part lands concurrent with, say, PEP 439 and distlib ending up in the stdlib, so we don't have to wait another couple years to begin phasing out setuptools/distutils as the only build game in town.

I mean, it basically amounts to defining some parameters to programmatically call a pair of sdist() and bdist_wheel() functions with, and a configuration syntax to say what distributions and modules to import those functions from. So it's not like it's going to be a huge time drain. (Maybe not even as much has already been consumed by this thread so far. ;-) )

Nick also wrote:
There's also an interesting migration problem for pre-2.0 sdists, where we can't assume that "python setup.py bdist_wheel && pip install <created wheel>" is equivalent to "python setup.py install": projects like Twisted that run a post-install hook won't install properly if you build a wheel first, since the existing post-install hook won't run.
It's an interesting problem, but one where my near term plans amount to "document the status quo".
Yeah, it's already broken and the new world order isn't going to break it any further. Same goes for allowing pip to convert eggs; the ones that don't work right due to bad platform tags, etc. *already* don't work, so documenting the status quo as a transitional measure is sufficient. Heck, in general, supporting backward compatible stuff that suffers from the same problems as the stuff it's being backward compatible with is a no-brainer if it lets people get on the new so we can phase out the old.

(Which is why I love that Vinay is looking into how to make wheels more usable for some of eggs' less-frequent but still important use cases: it makes it that much easier to tell someone they don't need to stay on setuptools to do the same stuff.)
On Mar 27, 2013, at 11:44 PM, PJ Eby <pje@telecommunity.com> wrote:
On Wed, Mar 27, 2013 at 10:09 PM, Donald Stufft <donald@stufft.io> wrote:
Right, which is why I think the ability to install from a raw source is a good feature for an installer, but not for the dependency metadata.
Sure - I never said the dependency metadata had to be able to say *where* you get the raw source from, just that the tool for resolving dependencies needed to be able to process raw source into something installable.
Following that we just need a standard way for a raw source tarball to declare what its builder is, either via some sort of file that tells you that, or a build script, or something along those lines.
Yep. Static configuration is a *must*, here, though, as we want to move away from arbitrary setup script writing by package authors: in general they are really bad at it. A lot of setuptools' odd build-time features (like sandboxing) exist specifically because people write whatever crap they want in setup.py and have zero idea how to actually use/integrate with distutils.
One interesting feature that would be possible under a configuration-based system is that you could actually have an installer with a whitelist or blacklist for build tools and setup-requires, in order to prevent or limit untrusted code execution by the overall build system. This would make it slightly more practical to have, say, servers that build wheels, such that only tools the servers' owners know won't import or run arbitrary code are allowed to do the compiling. (Not that that should be the only security involved, but it'd be a cool first-tier sanity check.)
(Interestingly, this is also an argument for having a separate "tests-require-dist" in metadata 2.0, since testing tools *have* to run arbitrary code from the package, but archivers and builders do not.)
catalog-sig without an argument? Is this a first? ;)
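The whitelist idea quoted above could be sketched as follows. The trusted-set contents and the requirement parsing are illustrative only; a real build server would check the full setup-requires-dist list before running any build hooks:

```python
import re

# Illustrative whitelist of build tools a wheel-building server trusts;
# the contents are examples, not a recommendation.
TRUSTED_BUILD_TOOLS = {"bento", "setuptools"}

def rejected_setup_requires(setup_requires, trusted=TRUSTED_BUILD_TOOLS):
    """Return the build-time requirements whose project name is not trusted."""
    bad = []
    for req in setup_requires:
        # Take the leading project name, ignoring any version constraint
        name = re.match(r"[A-Za-z0-9_.\-]+", req.strip()).group(0)
        if name.lower() not in trusted:
            bad.append(req)
    return bad

# A server would refuse to build when this list is non-empty
flagged = rejected_setup_requires(["bento >1.6", "mystery-tool ==0.1"])
```

As the message says, this would only be a first-tier sanity check, not the whole security story.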
Nick wrote:
However, as I've said elsewhere, for metadata 2.0, I *do not* plan to migrate the archiving or build steps away from setup.py. So "give me an sdist" will be spelled "python setup.py sdist" and "give me a wheel file" will be spelled "python setup.py bdist_wheel".
Works for me. Well, sort of. In principle, it means you can grow next generation build systems that use a dummy setup.py.
In practice, it means you're still gonna be relying on setuptools. (Presumably 0.7 post-merge, w/bdist_wheel support baked in.) At some point, there has to be a new way to do it, because the pain of creating a functional dummy setup.py is a really high barrier to entry for a build tool to meet, until all the current tools that run setup.py files go away.
IMO it'd be better to standardize this bit *now*, so that it'd be practical to start shipping projects without a setup.py, or perhaps make a "one dummy setup.py to rule them all" implementation that delegates everything to the new build interface.
I can certainly understand that there are more urgent priorities in the short run; I just hope that a standard for this part lands concurrent with, say, PEP 439 and distlib ending up in the stdlib, so we don't have to wait another couple years to begin phasing out setuptools/distutils as the only build game in town.
I mean, it basically amounts to defining some parameters to programmatically call a pair of sdist() and bdist_wheel() functions with, and a configuration syntax to say what distributions and modules to import those functions from. So it's not like it's going to be a huge time drain. (Maybe not even as much has already been consumed by this thread so far. ;-) )
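The "pair of functions" interface described here might be sketched like this. The "module:function" hook path borrows the entry-point convention; the sdist/bdist_wheel signatures are invented for illustration, and the demo substitutes a stdlib function for a real build hook:

```python
# Sketch of a generic front end that resolves and calls a build hook
# given a "package.module:function" path.
def run_build_hook(hook_path, *args):
    """Resolve 'package.module:function' and call it with *args."""
    module_name, func_name = hook_path.split(":")
    module = __import__(module_name, fromlist=[func_name])
    return getattr(module, func_name)(*args)

# A build tool would expose callables with an agreed (hypothetical)
# signature, e.g.:
#   def sdist(source_dir, output_dir): ...        # returns archive path
#   def bdist_wheel(source_dir, output_dir): ...  # returns wheel path

# Demo with a stdlib function standing in for a real build hook:
result = run_build_hook("posixpath:join", "src", "dist")
```

The configuration syntax then only has to say which distribution provides the hooks and where to import them from.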
Nick also wrote:
There's also an interesting migration problem for pre-2.0 sdists, where we can't assume that "python setup.py bdist_wheel && pip install <created wheel>" is equivalent to "python setup.py install": projects like Twisted that run a post-install hook won't install properly if you build a wheel first, since the existing post-install hook won't run.
It's an interesting problem, but one where my near term plans amount to "document the status quo".
Yeah, it's already broken and the new world order isn't going to break it any further. Same goes for allowing pip to convert eggs; the ones that don't work right due to bad platform tags, etc. *already* don't work, so documenting the status quo as a transitional measure is sufficient. Heck, in general, supporting backward compatible stuff that suffers from the same problems as the stuff it's being backward compatible with is a no-brainer if it lets people get on the new so we can phase out the old.
(Which is why I love that Vinay is looking into how to make wheels more usable for some of eggs' less-frequent but still important use cases: it makes it that much easier to tell someone they don't need to stay on setuptools to do the same stuff.)
On 28 March 2013 02:05, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Thu, Mar 28, 2013 at 11:43 AM, Donald Stufft <donald@stufft.io> wrote:
I don't think you can, nor should you be able to, explicitly depend on something that is a VCS checkout.
I find it more useful to think of the issue as whether or not you allow publication of source tarballs to satisfy a dependency, or *require* publication of a fully populated sdist. If you allow raw source tarballs, then you effectively allow VCS checkouts as well. I prefer requiring an explicit publication step, but we also need to acknowledge that the installer ecosystem we're trying to replace allows them, and some people are relying on that feature.
To give a real-life example of this issue, on Windows IPython depends on PyReadline. But the released version (1.7.x) of PyReadline is Python 2 only. So if you are using IPython on Python 3, you have to also depend on PyReadline from git.

Now IPython doesn't declare a dependency on the VCS version (it just depends on "pyreadline"). And pyreadline is sufficiently stagnant that it hasn't declared anything much. But as an *end user* I have to make sure I force pip to install pyreadline from VCS if I want a working system.

Paul.
On Tue, Mar 26, 2013 at 5:21 PM, Daniel Holth <dholth@gmail.com> wrote:
I want to poke myself in the eye every time I have to edit json by hand. Especially the description field.
I'm with you on that--I much prefer YAML (which is a superset of JSON!) but we don't even have that in the stdlib and it's not worth bikeshedding over to me.
On Mar 26, 2013 5:17 PM, "Donald Stufft" <donald@stufft.io> wrote:
On Mar 26, 2013, at 5:12 PM, Daniel Holth <dholth@gmail.com> wrote:
I am -1 on renaming anything unless it solves a technical problem. Forever after we will have to explain "well, it used to be called X, now it's called Y..."
On Tue, Mar 26, 2013 at 5:01 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
On Tue, Mar 26, 2013 at 4:08 PM, PJ Eby <pje@telecommunity.com> wrote:
On Tue, Mar 26, 2013 at 3:03 PM, Daniel Holth <dholth@gmail.com> wrote:
I think PKG-INFO is a highly human-editable format.
That doesn't mean you necessarily want to edit it yourself; notably, there will likely be some redundancy between the description in the file and other files like the README.
Also, today one of the key use cases people have for custom code in setup.py is to pull the package version from a __version__ attribute in a module. (Which is evil, of course, but people do it anyway.)
But it might be worth adding a setuptools feature to pull metadata from PKG-INFO (or DIST-INFO) instead of generating a new one, to see what people think of using PKG-INFO first, other files second. In principle, one could reduce a setup.py to just "from setuptools import setup_distinfo; setup_distinfo()" or some such.
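Since PKG-INFO is a set of RFC 822-style headers, pulling metadata back out of it only needs the stdlib. The helper name below echoes the hypothetical setup_distinfo() floated in this message; it is not a real setuptools API:

```python
from email.parser import Parser

# A minimal PKG-INFO body (invented example distribution)
PKG_INFO = """\
Metadata-Version: 1.1
Name: example-dist
Version: 0.1
Summary: An example distribution
"""

def read_pkg_info(text):
    """Parse RFC 822-style PKG-INFO headers into a plain dict."""
    msg = Parser().parsestr(text)
    return {"name": msg["Name"], "version": msg["Version"],
            "summary": msg["Summary"]}

metadata = read_pkg_info(PKG_INFO)
```

A setup.py that used something like this would treat PKG-INFO as the source of truth rather than generating it.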
In other words, using d2to1 and only for `setup.py egg_info` (only not egg_info but whatever we're doing instead to generate the metadata ;)
Erik
Rename it and make it JSON instead of the homebrew* format!
* Yes, technically it's based on a real format, but that format doesn't support all the things it needs, so there are hackishly added extensions to it.
On Mar 26, 2013, at 6:47 PM, Erik Bray <erik.m.bray@gmail.com> wrote:
I'm with you on that--I much prefer YAML (which is a superset of JSON!) but we don't even have that in the stdlib and it's not worth bikeshedding over to me.
YAML is great for human-editable formats. I don't think it has much value over JSON for machine-oriented data.
On Tue, Mar 26, 2013 at 2:48 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Tue, Mar 26, 2013 at 3:15 PM, PJ Eby <pje@telecommunity.com> wrote:
More specifically, I was hoping to move the discussion forward on nailing down some of the details that still need specifying in a PEP somewhere, to finish out what the "new world" of packaging will look like.
I'm deliberately trying to postpone some of those decisions - one of the reasons distutils2 foundered is because it tried to solve everything at once, and that's just too big a topic.
So, *right now*, my focus is on making it possible to systematically decouple building from installing, so that running "setup.py install" on a production system becomes as bizarre an idea as running "make install".
As we move further back in the tool chain, I want to follow the lead of the most widely deployed package management systems (i.e. Debian control files and RPM SPEC files) and provide appropriate configurable hooks for *invoking* archivers and builders, allowing developers to choose their own tools, so long as those tools correctly emit standardised formats understood by the rest of the Python packaging ecosystem.
Right--what we really need here is something akin to the debian/rules file, only not a shell script :) I like the hook idea. It's the "so long as those tools correctly emit standardised formats" that's the problem.
In the near term, however, these hooks will still be based on setup.py (specifically setuptools rather than raw distutils, so we can update older versions of Python).
That pretty much eases the concerns I brought up in the "backwards compat" thread.
But, if we support "either you have a setup.cfg specifying your archiver, or a PKG-INFO so an archiver isn't needed", then that would probably cover all the bases, actually.
Yeah, you're probably right that we will need to support something else in addition to the PKG-INFO file. A PKG-INFO.in could work, though, rather than a completely independent format like setup.cfg. That way we could easily separate a source checkout/tarball (with PKG-INFO.in) from an sdist (with PKG-INFO) from a wheel (with a named .dist-info directory).
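A PKG-INFO.in along the lines Nick suggests would presumably just be PKG-INFO with placeholders for the mutable fields. A minimal sketch using stdlib string templates; the placeholder syntax and file names are assumptions for illustration only:

```python
import string

def render_pkg_info(template_path, out_path, **values):
    """Render a hypothetical PKG-INFO.in template into PKG-INFO,
    substituting mutable fields such as a VCS revision number."""
    with open(template_path) as f:
        template = string.Template(f.read())
    with open(out_path, "w") as f:
        f.write(template.substitute(values))
```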
I'm partly in favor of just saying, "there should be a PKG-INFO in your version control to be considered a valid python distribution". Intermediate formats like a setup.cfg or Nick's JSON format seem kind of unnecessary to me--why have two different formats to describe the same thing?

In cases where the metadata needs to be mutated somehow--such as attaching revision numbers to the version--some sort of PKG-INFO.in like you suggest would be great. But I don't see why it should have a different format from PKG-INFO itself. I'd think it would just be a superset of the metadata format but with support for hooks. Basically akin to what d2to1 does with setup.cfg, but without the unnecessarily different-looking intermediate format (I do agree that a JSON format would allow a much greater degree of expression and flexibility, but I'm not entirely sure how I feel about having one file format that generates an entirely different file format).
(For consistency, we may want to rename PKG-INFO to DIST-INFO in sdist 2.0, though)
+1 Thanks, Erik
On Tue, Mar 26, 2013 at 9:15 AM, PJ Eby <pje@telecommunity.com> wrote:
On balance, I think I lean towards just having a simple way to specify your chosen archiver, so that installing from source checkouts and dumps is possible. I just find it annoying that you have to have *two* files in your checkout, one to say what tool you're using, and another one to configure it.
Ah, you have uncovered part of the cunning plan behind Setup-Requires-Dist and the metadata extension system in 2.0+: once we have the basic hooks in place, then we should be able to embed the config settings for the archivers and builders in the main metadata, without the installer needing to understand the details. Allowing embedded JSON also supports almost arbitrarily complex config options.

For metadata 2.0, however, I'm thinking we should retain the distutils-based status quo for the archive hook and the build hook:

Archive hook: python setup.py sdist
  Environment: current working directory = root dir of source checkout/unpacked tarball

Build hook: python setup.py bdist_wheel
  Environment: current working directory = root dir of unpacked sdist

The install tool would then pick up the files from their default output locations. Installing from a checkout/tarball would go through the full daisy chain (make sdist, make wheel from sdist, install the wheel), and installing from an sdist would also build the intermediate wheel file.

The only entry-points-inspired hook in 2.0 would be the post-install hook I have discussed previously (and will write up properly in PEP 426 later this week).

In theory, we could have separate dependencies for the "make sdist" and "make wheel" parts of the chain, but that seems to add complexity without adequate justification to me. The runtime vs setup split is necessary so that you don't need a build chain on your deployment targets, but it seems comparatively harmless to install the archiver onto a dedicated build system even if you're only building from sdists (particularly when I expect most Python-specific tools to continue to follow the model of handling both archiving and building, rather than having completely separate tools for the two steps).
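The sequencing of the two distutils-based hooks can be sketched as a small driver. The command runner is injected so the ordering can be shown without a real project; the setup.py invocations are the ones Nick formalises above, but the directory handling is an assumption:

```python
def daisy_chain(run, checkout_dir, sdist_dir):
    """Drive the archive hook and then the build hook in sequence.

    `run` is an injected command runner (e.g. subprocess.check_call),
    so the daisy chain can be exercised or tested without building
    anything for real.
    """
    # Archive hook: cwd is the root of the source checkout/tarball
    run(["python", "setup.py", "sdist"], cwd=checkout_dir)
    # Build hook: cwd is the root of the unpacked sdist
    run(["python", "setup.py", "bdist_wheel"], cwd=sdist_dir)
```

An installer would then pick up the wheel from its default output location and install that, never running `setup.py install` directly.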
(What'd be nice is if you could just somehow detect files like bento.info and setup.cfg and thereby detect what archiver to use. But that would have limited extensibility unless there was a standard naming convention for the files, or a standardized format for at least the first line in the file or something like that, so you could identify the needed tool.)
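The marker-file detection PJE describes might look like the following. The file-to-tool mapping is purely illustrative, and it demonstrates exactly the extensibility limit he points out: tools not on the known list can never be detected.

```python
import os

# Hypothetical mapping from marker file to the archiver that owns it.
# Any tool not listed here is invisible to the detection scheme.
ARCHIVER_MARKERS = [
    ("bento.info", "bentomaker"),
    ("setup.cfg", "d2to1"),
    ("setup.py", "setuptools"),
]

def detect_archiver(source_dir):
    """Guess which archiver to invoke by probing for known marker files.

    Returns the first match in priority order, or None if nothing
    recognisable is present.
    """
    for marker, tool in ARCHIVER_MARKERS:
        if os.path.exists(os.path.join(source_dir, marker)):
            return tool
    return None
```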
Yeah, I plan to use future releases of the 2.x metadata to define hooks for this. We can also start experimenting in 2.0 through entry points and the structured metadata format I will be defining for the post-install hook. (Daniel has an entry points extension PEP mostly written, we just haven't got around to publishing it yet...)

In the meantime, formalising the "setup.py sdist" and "setup.py bdist_wheel" invocations should provide a useful stepping stone to a setup.py-is-optional future.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Mon, Mar 25, 2013 at 10:08 PM, Paul Moore <p.f.moore@gmail.com> wrote:
There's a longer-term issue that occurred to me when thinking about pip's role as a "builder" or an "installer" (to use Nick's terminology).
As I understand Nick's vision for the future, installers (like pip) will locate built wheels and download and install them, and builders (like distutils and bento) will be responsible for building wheels. But there's an intermediate role which shouldn't get forgotten in the transition - the role that pip currently handles with the "pip wheel" command. This is where I specify a list of distributions, and pip locates sdists, downloads them, checks dependencies, and ultimately builds all of the wheels. I'm not sure whether the current idea of builders includes this "locate, download and resolve dependencies" function (distutils and bento certainly don't have that capability).
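The "resolve dependencies, then build everything" half of what Paul describes amounts to computing a dependency closure in build order. A toy sketch, with a plain name-to-dependencies mapping standing in for real index lookups and metadata resolution:

```python
def wheel_build_order(requested, dependencies):
    """Sketch of the resolve step of a `pip wheel`-style workflow.

    Given requested distribution names and a name -> [deps] mapping
    (an assumed, pre-resolved stand-in for real sdist metadata),
    return every distribution to build, dependencies first.
    """
    seen, order = set(), []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in dependencies.get(name, []):
            visit(dep)
        order.append(name)  # a dist is built after its dependencies

    for name in requested:
        visit(name)
    return order
```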
Personally I don't see that as an intermediate role at all. That for me is a builder.
I imagine that pip will retain some form of the current "pip wheel"
I hope it will not. //Lennart
participants (9): Daniel Holth, Donald Stufft, Erik Bray, Lennart Regebro, Nick Coghlan, Paul Moore, PJ Eby, Tres Seaver, Vinay Sajip