[Distutils] PEP 517: Open questions around artifact export directories

Nick Coghlan ncoghlan at gmail.com
Tue Jun 13 21:53:45 EDT 2017

On 13 June 2017 at 19:44, Thomas Kluyver <thomas at kluyver.me.uk> wrote:
> On Tue, Jun 13, 2017, at 02:27 AM, Nick Coghlan wrote:
>> Despite being the one to originally propose standardisation on passing
>> directory paths around, I'm starting to lean back towards this
>> approach.
>> My rationale for this doesn't really have a lot to do with topics
>> we've discussed so far, and instead asks the question: what would work
>> best for an installation frontend that wanted to keep the actual build
>> tools off the system doing the installation, while still allowing for
>> transparent "from sdist" installations?
> I think that we're discovering a variety of reasons why an unpacked
> distribution may not be clearly preferable to an archive. In the absence
> of a clear benefit, I think it's advantageous to say that the archive is
> the canonical interchange format which different tools produce and
> consume. This is more compelling for wheels than for sdists, since the
> wheel format is more precisely specified, but it seems more internally
> consistent to say that they both build archives.

Agreed. As part of an unrelated discussion [1], I also realised that
PEP 517 may actually provide a way for Linux-only projects to add
venv-compatible shims to their projects without having to fully
decouple themselves from their distro packaging: defining backends
based on projects like dirtbike and rewheel that bundle up the
project's pure Python pieces into a wheel file.

It would definitely be a hack, but may help with cases like RPM's
Python bindings (which are currently hard to test on Python versions
other than the system Python, since they can't readily be built
independently of the system RPM package).

[1] https://bugs.python.org/issue30628#msg295970

> I've updated the PR to specify zip archives for build_wheel and .tar.gz
> archives for build_sdist.


I've added one suggestion, which is to explicitly require PAX_FORMAT
for the sdist tarballs produced this way. That's a POSIX format
standardised in 2001, supported by the tarfile module in both Python
2.7 and 3.x, and it specifically requires that paths be encoded as
UTF-8. While the standard library does still default to GNU_FORMAT in
general, the stated rationale for doing so (it being more widely
supported than PAX_FORMAT) was last updated more than 10 years ago,
and I don't think it applies here.
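
As a rough illustration of what requiring PAX_FORMAT means in practice,
the sketch below writes a gzipped tarball in memory with
format=tarfile.PAX_FORMAT and reads the member names back. The helper
names make_pax_tarball and list_names are hypothetical and not part of
the PEP or the PR; only the tarfile calls are standard library API.

```python
import io
import tarfile


def make_pax_tarball(members):
    """Return the bytes of a .tar.gz written in PAX format.

    `members` maps archive member names to their byte contents.
    (Hypothetical helper, for illustration only.)
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz",
                      format=tarfile.PAX_FORMAT) as archive:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_names(tarball_bytes):
    """Return the member names stored in a gzipped tarball."""
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes),
                      mode="r:gz") as archive:
        return archive.getnames()
```

With PAX_FORMAT, a member name containing non-ASCII characters is
stored in a UTF-8 extended header record, so it should round-trip
regardless of the locale of the machine that built the archive.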

>>>    2) Specify that the wheel generation hook, metadata hook, and sdist
>>>    hook return the name of the path they created as a unicode string.
>>>    Rationale: fixes a point of ambiguity in the current spec.
>> And still leaves the door open to supporting multiple wheels in the
>> future by returning a list instead of string.
> This makes sense to me, and I've added that to the PR.
> I have currently specified that they should return only the basename of
> the file/directory they create, not the full path. I don't think there's
> any particular reason to prefer one or the other, but it needs to be
> specified. Does anyone think there's a concrete reason they should be
> full paths?

+1 from me for relative paths, since the frontend is specifying the
base path anyway. It's an affordance that encourages backends to use
the supplied storage location, rather than returning arbitrary paths
(e.g. to files in temp directories).
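
A frontend could lean on that convention roughly as follows. This is
only a sketch: resolve_artifact is a hypothetical helper name, and the
decision to reject anything that is not a bare basename is an
assumption about how strict a frontend might choose to be, not
something the PEP specifies.

```python
import os


def resolve_artifact(output_dir, returned_name):
    """Join a hook's return value onto the directory the frontend supplied.

    Raises ValueError if the backend returned anything other than a
    bare file or directory name (hypothetical strictness, see above).
    """
    is_bare_name = (
        returned_name not in ("", ".", "..")
        and os.path.basename(returned_name) == returned_name
    )
    if not is_bare_name:
        raise ValueError(
            "expected a basename relative to %r, got %r"
            % (output_dir, returned_name)
        )
    return os.path.join(output_dir, returned_name)
```

A backend that returned a path into a temp directory would fail this
check immediately, rather than having the frontend pick up a file from
an unexpected location.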

> Finally, I noticed while updating the PEP that there's a section showing
> how simple a PEP 517 backend can be, with a tiny working example.
> Following our discussions, that example is incomplete (it's missing at
> least the sdist hook). Do we want to:
> 1. Make the example complete again, which would make it much less
> simple.
> 2. Remove the whole section about how easy it is to implement a backend
> (it's not a very convincing example anyway, because it requires the user
> to manually do most of the work in preparing a wheel).
> (3. Try to actually make it simple to implement a backend ;-)

I think it would be worthwhile to make the example complete again,
using the "archive everything except dot-prefixed files and
directories" sdist construction strategy.

Something like:

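    import os
    import pathlib
    import tarfile

    def _exclude_hidden_and_special_files(archive_entry):
        """Tarfile filter to exclude hidden and special files from the archive."""
        # Keep only regular files and directories; skip symlinks, devices, etc.
        if archive_entry.isfile() or archive_entry.isdir():
            if not os.path.basename(archive_entry.name).startswith("."):
                return archive_entry
        return None

    def build_sdist(sdist_dir, config_settings):
        sdist_subdir = "mypackage-0.1"
        sdist_name = sdist_subdir + ".tar.gz"
        sdist_path = pathlib.Path(sdist_dir) / sdist_name
        with tarfile.open(sdist_path, "w:gz", format=tarfile.PAX_FORMAT) as sdist:
            sdist.add(os.getcwd(), arcname=sdist_subdir,
                      filter=_exclude_hidden_and_special_files)
        # Return only the basename; the frontend already knows sdist_dir.
        return sdist_name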

(I haven't actually tested that code, but I believe it should be
reasonably close to doing the right thing)
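
One way to check that strategy without touching a real project is to
build a throwaway source tree and inspect what ends up in the archive.
The sketch below is self-contained and uses hypothetical names
(exclude_hidden, archived_names, demo). It relies on the tarfile
behaviour that returning None from the filter for a directory also
skips everything beneath it.

```python
import os
import tarfile
import tempfile


def exclude_hidden(entry):
    """Keep regular files and directories whose basename is not dot-prefixed."""
    if entry.isfile() or entry.isdir():
        if not os.path.basename(entry.name).startswith("."):
            return entry
    return None


def archived_names(source_dir, arcname):
    """Archive source_dir through the filter and return sorted member names."""
    with tempfile.TemporaryDirectory() as out_dir:
        archive_path = os.path.join(out_dir, arcname + ".tar.gz")
        with tarfile.open(archive_path, "w:gz",
                          format=tarfile.PAX_FORMAT) as archive:
            archive.add(source_dir, arcname=arcname, filter=exclude_hidden)
        with tarfile.open(archive_path, "r:gz") as archive:
            return sorted(archive.getnames())


def demo():
    """Build a tiny fake project tree and list what would be archived."""
    relative_paths = (
        "setup.py",
        ".gitignore",
        os.path.join(".git", "config"),
        os.path.join("pkg", "__init__.py"),
    )
    with tempfile.TemporaryDirectory() as project:
        for relative in relative_paths:
            path = os.path.join(project, relative)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as handle:
                handle.write("")
        return archived_names(project, "mypackage-0.1")
```

For the tree above, demo() should list only the top-level directory,
setup.py, and the pkg directory with its __init__.py; .gitignore and
the whole .git directory are left out.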


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
