[Distutils] Towards a simple and standard sdist format that isn't intertwined with distutils

Daniel Holth dholth at gmail.com
Fri Oct 2 21:48:47 CEST 2015


The MEBS idea is inspired by Heroku buildpacks, where you just ask a list of
tools whether they can build something:
https://devcenter.heroku.com/articles/buildpacks . The idea would be that
pip would use MEBS instead of its setup.py-focused builder. The first
applicable MEBS plugin would notice setup.py and do what pip does now
(force setuptools, build in a subprocess).
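
To make that dispatch concrete, here is a minimal Python sketch of how pip
might ask a list of build plugins in turn; the plugin class and the build()
entry point are hypothetical, not an existing API:

    import os
    import subprocess
    import sys

    class SetuptoolsPlugin:
        """Hypothetical MEBS plugin that handles any tree with a setup.py."""

        def can_build(self, source_dir):
            return os.path.exists(os.path.join(source_dir, 'setup.py'))

        def build_wheel(self, source_dir, output_dir):
            # Roughly what pip does today: run setup.py in a subprocess
            # and drop the resulting wheel into output_dir.
            subprocess.check_call(
                [sys.executable, 'setup.py', 'bdist_wheel', '-d', output_dir],
                cwd=source_dir)

    def build(source_dir, output_dir, plugins):
        # Ask each registered plugin, in order, whether it can build this
        # tree; the first one that says yes wins, buildpack-style.
        for plugin in plugins:
            if plugin.can_build(source_dir):
                return plugin.build_wheel(source_dir, output_dir)
        raise RuntimeError('no build plugin recognized %s' % source_dir)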

You should know about flit https://github.com/takluyver/flit and Bento
http://cournape.github.io/Bento/, which have their own lightweight metadata
formats that the respective tools transform into the standard Python
formats.

requirements.txt is popular, but I'm not a fan of it; it seems like it was
invented by people who didn't want to have a proper setup.py for their
project.

We have to come up with something simpler than setup.py if we want to win
over some of the people who don't understand how to write a setup.py.
Ideally any required new user-editable "which build system" metadata could
be boiled down to a single line in setup.cfg. There would be three stages:
VCS checkout (minimal metadata, plus a "generate machine-readable metadata"
step equivalent to "setup.py egg_info") -> new sdist (PEP 376 style static
metadata that can be trusted) -> wheel.
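
For illustration, that single line might look something like this in
setup.cfg; the section and key names here are invented for the sake of
example, not a proposed spec:

    [build]
    build-system = flit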

(How pip builds a package from source today: 1. download the sdist; an
.egg-info directory is almost always already present; 2. run setup.py
egg_info to get the dependencies, because the static metadata cannot be
trusted, since too many requirements lists have 'if' statements; 3. compile.)
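
In command terms, steps 2 and 3 are roughly the following (the exact
invocations pip uses differ in detail; this is just the shape of it):

    python setup.py egg_info      # regenerate *.egg-info and read requires.txt
    python setup.py bdist_wheel   # build, now that dependencies are known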

For all the talk about static metadata, the build script in general needs
to remain a Turing-complete script. Build systems everywhere are programs
to build other programs.

I really like your idea about returning a list of built artifacts. Python
packaging is strictly 1:1 (one source package -> one output package), but
rpm and deb can generate many packages from a single source package.
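
A build hook that reports its outputs could be as simple as this sketch
(the hook name and signature are invented here, purely to illustrate the
one-to-many idea; the actual compile work is elided):

    import os

    def build(source_dir, output_dir):
        # Build everything, then report what was produced.  An rpm/deb-style
        # split of the same source tree might yield a runtime wheel plus a
        # separate package of headers or debug symbols.
        artifacts = [
            os.path.join(output_dir, 'mypkg-1.0-py3-none-any.whl'),
            os.path.join(output_dir, 'mypkg-devel-1.0-py3-none-any.whl'),
        ]
        return artifacts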

I don't think we have to worry that much about Debian & RHEL. They will get
over it if setup.py is no longer there. Change brings work but stagnation
brings death.

On Fri, Oct 2, 2015 at 2:41 PM Brett Cannon <brett at python.org> wrote:

> On Fri, 2 Oct 2015 at 05:08 Donald Stufft <donald at stufft.io> wrote:
>
>> On October 2, 2015 at 12:54:03 AM, Nathaniel Smith (njs at pobox.com) wrote:
>> > We realized that actually, as far as we could tell, it wouldn't be
>> > that hard at this point to clean up how sdists work so that it would
>> > be possible to migrate away from distutils. So we wrote up a little
>> > draft proposal.
>> >
>> > The main question is, does this approach seem sound?
>>
>> I've just read over your proposal, but I've also just woken up so I might
>> be
>> a little slow still! After reading what you have, I don't think that this
>> proposal is the right way to go about improving sdists.
>>
>> The first thing that immediately stood out to me is that it's recommending
>> that downstream redistributors like Debian, Fedora, etc. utilize Wheels
>> instead of the sdist to build their packages from. However, that is not
>> really going to fly with most (all?) of the downstream redistributors.
>> Debian, for instance, has policy that requires building all of its packages
>> from source, not from anything else, and Wheels are not a source package.
>> While it can theoretically work for pure Python packages, it quickly
>> devolves into a mess when you factor in packages that have any C code
>> whatsoever.
>>
>
> So wouldn't they then download the sdist, build a wheel as an
> intermediate, and then generate the .deb file? I mean as long as people
> upload an sdist for those that want to build from source and a wheel for
> convenience -- which is probably what most people providing wheels do
> anyway -- then I don't see the problem.
>
>
>>
>> Overall, this feels more like a sidegrade than an upgrade. One major theme
>> throughout the PEP is that we're going to push to rely heavily on wheels as
>> the primary format of installation. While that works well for things like
>> Debian, I don't think it's going to work as well for us. If we were only
>> distributing pure Python packages, then yes, absolutely; however, given
>> that we are not, we have to worry about ABI issues. Given that there are so
>> many different environments that a particular package might be installed
>> into, all with different ABIs, we have to assume that installing from
>> source is still going to be a primary path for end users and that we are
>> never going to have a world where we can assume a Wheel in a repository.
>>
>> One of the problems with the current system is that we have no mechanism by
>> which to determine the dependencies of a source distribution without
>> downloading the file and executing some potentially untrusted code. This
>> makes dependency resolution harder and much, much slower than if we could
>> read that information statically from a source distribution. This PEP
>> doesn't offer anything in the way of solving this problem.
>>
>
> Isn't that what the requirements and requirements-file fields in the
> _pypackage file provide? Only if you use the requirements-dynamic field
> would it require executing arbitrary code to gather dependency information,
> or am I missing something?
>
>
>>
>> To a similar tune, this PEP also doesn't make it possible to really get at
>> any other metadata without executing software. This makes it practically
>> impossible to safely inspect an unknown or untrusted package to determine
>> what it is and to get information about it. Right now PyPI relies on the
>> uploading tool to send that information alongside the file it is uploading,
>> but honestly what it should be doing is extracting that information from
>> within the file. This is sort of possible right now, since distutils and
>> setuptools both create a static metadata file within the source
>> distribution, but we don't rely on that within PyPI because that
>> information may or may not be accurate and may or may not exist. However,
>> the twine uploading tool *does* rely on that, and this PEP would break the
>> ability for twine to upload a package without executing arbitrary code.
>>
>
> Isn't that only if you use the dynamic fields?
>
>
>>
>> Overall, I don't think that this really solves most of the foundational
>> problems with the current format. Largely it feels that what it achieves is
>> shuffling around some logic (you need to create a hook that you reference
>> from within a .cfg file instead of creating a setuptools extension, say)
>> but without fixing most of the problems. The largest benefit I see to
>> switching to this right now is that it would enable us to have build-time
>> dependencies that were controlled by pip rather than installed implicitly
>> via the execution of the setup.py. That doesn't feel like a big enough
>> benefit to me to justify a mass shakeup of what we recommend and tell
>> people to do. Having people adjust and change and do something new requires
>> effort, and we need something to justify that effort to other people, and I
>> don't think that this PEP has something we can really use to justify it.
>>
>
> From my naive perspective, this proposal seems to help push forward
> decoupling Python builds from distutils/setuptools as the only way you can
> properly build Python projects (which is what I think we are all after) and
> will hopefully eventually free pip up to simply do orchestration.
>
>
>>
>> I *do* think that there is a core of some ideas here that are valuable,
>> and in fact are similar to some ideas I've had. The main flaw I see here
>> is that it doesn't really fix sdists; it takes a solution that would work
>> for VCS checkouts and then reuses it for sdists. In my mind, the supported
>> flow for package installation would be:
>>
>>     VCS/Bare Directory -> Source Distribution -> Wheel
>>
>> This would (eventually) be the only path that was supported for
>> installation, but you could "enter" the path at any stage. For example, if
>> there is a Wheel already available, then you jump right on at the end and
>> just install that; if there is an sdist available, then pip first builds it
>> into a wheel and then installs that, etc.
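>>
>> In rough Python terms, entering the path at whatever stage is available
>> looks something like this (a sketch with invented helper names, not pip's
>> actual logic):
>>
>>     def plan_install(wheel_url=None, sdist_url=None, vcs_url=None):
>>         # Prefer the latest possible entry point into the
>>         # VCS -> sdist -> Wheel path.
>>         if wheel_url:
>>             return ['download wheel', 'install']
>>         if sdist_url:
>>             return ['download sdist', 'build wheel', 'install']
>>         if vcs_url:
>>             return ['checkout', 'build sdist', 'build wheel', 'install']
>>         raise ValueError('nothing to install from')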
>>
>> I think your PEP is something like what the VCS/Bare Directory to sdist
>> tooling
>> could look like, but I don't think it's what the sdist to wheel path
>> should
>> look like.
>>
>
> Is there another proposal I'm unaware of for the sdist -> wheel step that
> is build tool-agnostic? I'm all for going with the best solution, but there
> has to be an actual alternative to compare against, and I don't know of any
> others right now; this proposal does seem to move things forward in a
> reasonable fashion.