[Distutils] Provisionally accepting PEP 517's declarative build system interface

Nathaniel Smith njs at pobox.com
Mon May 29 16:04:39 EDT 2017


Ugh, sorry, fat-fingered that. Actual reply below...

On Mon, May 29, 2017 at 12:56 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Mon, May 29, 2017 at 12:50 PM, Donald Stufft <donald at stufft.io> wrote:
>>
>> To be honest, I’m hardly going to feel particularly bad if one of the
>> most compilation-heavy packages that exist takes a whole 10 seconds to
>> install from a VCS checkout.

Rebuild latency is *really* important. People get really cranky at me
when I argue that we should get rid of "editable installs", which
create much greater problems for maintaining consistent environments,
and that's only saving like 1 second of latency. I think I'm entitled
to be cranky if your response is "well suck it up and maybe rewrite
all your build tools".

NumPy really isn't that compilation-heavy either... it's all C, which
is pretty quick. SciPy is *much* slower, for example, as is pretty
much any project using C++.

>> Particularly when I assume that the build tool
>> can be even smarter here than ccache is able to be to reduce the setup.py
>> build step back down to the no-op incremental build case.
>>
>> I mean, unless numpy is doing something different, the default distutils
>> incremental build stuff is incredibly dumb, it just stores the build output
>> in a directory (by default it’s located in ./build/) and compares the mtime
>> of a list of source files with the mtime of the target file, and if the
>> source files are newer, it recompiles them. If you replace mtime with blake2
>> (or similar) then you can trivially support the exact same thing just
>> storing the built target files in some user directory cache instead.
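For reference, the two staleness checks being contrasted above can be sketched roughly like this. It is only an illustration of the idea, not distutils' actual code: the function names, the `cache_root` layout, and the choice to hash file contents alone are all assumptions made for the example. A real build also depends on things this ignores (compiler flags, headers, and so on).

```python
import hashlib
import os


def is_stale_by_mtime(sources, target):
    """mtime-style check: rebuild if the target is missing or any source is newer."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def sources_digest(sources):
    """Content-based key: a blake2b digest over the bytes of every source file."""
    h = hashlib.blake2b()
    for src in sources:
        with open(src, "rb") as f:
            data = f.read()
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


def cache_path_for(cache_root, project, sources, target_name):
    """Hypothetical layout: <cache_root>/<project>/<digest>/<target_name>."""
    return os.path.join(cache_root, project, sources_digest(sources), target_name)
```

With the hash-based key, touching a file without changing it does not invalidate the cached output, whereas the mtime check would trigger a rebuild.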

Cache management is not a trivial problem.

And it actually doesn't matter, because we definitely can't silently
dump stuff into some user directory. An important feature of storing
temporary artifacts in the source tree is that it means that if
someone downloads the source, plays around with it a bit, and deletes
it, then it's actually gone. We can't squirrel away a few hundred
megabytes of data in some hidden directory that will hang around for
years after the user stops using numpy.

>> Hell,
>> we *might* even be able to preserve mtime (if we’re not already… we might
>> be! But I’d need to dig into it) so literally the only thing that would need
>> to change is instead of storing the built artifacts in ./build/ you store
>> them in ~/.cache/my-cool-build-tool/{project-name}. Bonus points: this means
>> you get incremental speeds even when building from a sdist from PyPI that
>> doesn’t have wheels and hasn’t changed those files either.
>>
>> I’m of the opinion that first you need to make it *correct*, then you can
>> try to make it *fast*. It is my opinion that an installer that shits random
>> debris into your current directory is not correct. It’s kind of silly that
>> we have to have a “random pip/distutils/setuptools” crap chunk of stuff to
>> add to .gitignore in basically every Python package in existence. Never mind
>> the random stuff that doesn’t currently get written there, but will if we
>> stop copying files out of the path and into a temporary location (I’m sure
>> everyone wants a pip-egg-info directory in their current directory).
>>
>> I’m also of the opinion that avoiding foot guns is more important than
>> shooting for the fastest operation possible. I regularly (sometimes multiple
>> times a week, but often every week or two) see people tripping up on the
>> fact that ``git clone … && pip install .`` does something different than
>> ``git clone … && python setup.py sdist && pip install dist/*``. Files
>> suddenly go missing and they have no idea why. If they’re lucky, they’ll
>> figure out they need to modify some combination of package_data, data_files,
>> and MANIFEST.in to make it work; if they’re not lucky, they just sit there
>> dumbfounded at it.
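To make the "files suddenly go missing" problem concrete, here is a small sketch that lists which files in a source tree did not make it into a built sdist. The helper name and the set of skipped directories are assumptions for illustration; this is not something pip or setuptools provides.

```python
import os
import tarfile

# Directories assumed (for this example) to be irrelevant to the comparison.
SKIPPED_DIRS = {".git", "build", "dist", "__pycache__"}


def files_missing_from_sdist(source_dir, sdist_path):
    """Return files present under source_dir but absent from the sdist tarball."""
    in_tree = set()
    for root, dirs, files in os.walk(source_dir):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if d not in SKIPPED_DIRS]
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), source_dir)
            in_tree.add(rel.replace(os.sep, "/"))

    in_sdist = set()
    with tarfile.open(sdist_path) as tf:
        for member in tf.getmembers():
            if member.isfile():
                # Strip the leading "<name>-<version>/" directory.
                parts = member.name.split("/", 1)
                if len(parts) == 2:
                    in_sdist.add(parts[1])

    return sorted(in_tree - in_sdist)
```

Running it against a checkout and the tarball produced by `python setup.py sdist` shows the candidates for package_data or MANIFEST.in entries.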

Yeah, setuptools is kinda sucky this way. But this is fixable with
better build systems. And before we can get better build systems, we
need buy-in from devs. And saying "sorry, we're unilaterally screwing
up your recompile times because we don't care" is not a good way to
get there :-(

>>
>>
>> Also also, notice elsewhere in the thread where Thomas notes that flit
>> can't build an sdist from an unpacked sdist. It seems like 'pip
>> install unpacked-sdist/' is an important use case to support…
>>
>>
>> If the build tool gives us a mechanism to determine if something is an
>> unpacked sdist or not so we can fallback to just copying in that case, that
>> is fine with me. The bad case is generally only going to be hit on VCS
>> checkouts or other not sdist kinds of source trees.

I guess numpy could just claim that all VCS checkouts are actually
unpacked sdists...?
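As a sketch of what such a mechanism might look like: sdists produced by setuptools carry a top-level PKG-INFO file, so one hypothetical heuristic is to check for it. The function name is made up for this example, and nothing in PEP 517 defines this check; a project could trivially satisfy it from a VCS checkout, which is the point of the quip above.

```python
import os


def looks_like_unpacked_sdist(source_dir):
    """Heuristic only: treat a tree with a top-level PKG-INFO as an unpacked sdist."""
    return os.path.isfile(os.path.join(source_dir, "PKG-INFO"))
```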

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

