[Distutils] PEP 517 - specifying build system in pyproject.toml

Paul Moore p.f.moore at gmail.com
Mon May 22 09:15:09 EDT 2017


On 22 May 2017 at 12:28, Thomas Kluyver <thomas at kluyver.me.uk> wrote:
> What if it wants to send a character which can't be encoded in the
> locale encoding? It's quite easy on Windows to end up with a character
> that you can't encode as cp1252. If the build tool uses .encode(loc_enc,
> 'replace'), then you've lost information even before it gets to the
> install tool.
>
> It's 2017, I really don't want to go down the 'locale specified
> encoding' route again. UTF-8 everywhere!

Hang on. Can we take a step back here? I just re-read the PEP and
remembered (!) that hooks are *in-process* Python entry points (I've
been working with pip's current backend-as-subprocess model, and had
mixed up the two original proposals in my mind). I think this encoding
debate may be a red herring.

If a hook is being called as a Python method call, then it can print
what it likes to stdout and stderr. And it's the backend's
responsibility to ensure that it never fails when printing - so the
*backend* has to deal with the fact that anything it wants to print
must be representable in sys.stdout.encoding, with the default (raise
an exception) error handling. Given this fact, and the fact that
sys.stdout and sys.stderr are *text* output streams, build frontends
like pip can reasonably just replace sys.std{out,err} (for example
with a StringIO object) to get hook output. There's no encoding issue
for frontends; they just capture the text sent to the stdio streams.
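
To make the frontend side concrete, here is a minimal sketch of that
capture. It assumes Python 3.5+ (for contextlib.redirect_stderr), and
the names call_hook_captured and fake_build_wheel are made up for
illustration; they are not part of the PEP or of pip.

```python
import contextlib
import io


def call_hook_captured(hook, *args, **kwargs):
    """Call an in-process hook and capture what it prints.

    Returns (result, stdout_text, stderr_text). Only text written via
    sys.stdout / sys.stderr is captured; output written to the raw file
    descriptors (e.g. by an uncaptured subprocess) bypasses this.
    """
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        result = hook(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()


def fake_build_wheel(wheel_directory):
    # Hypothetical stand-in for a backend hook, for demonstration only.
    print("building wheel in", wheel_directory)
    print("caf\u00e9 \u2603")  # the snowman is not encodable as cp1252
    return "demo-1.0-py3-none-any.whl"


result, out_text, err_text = call_hook_captured(fake_build_wheel, "dist")
```

Because StringIO stores text rather than bytes, the captured output
keeps every character the hook printed, whatever the console encoding
is.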

The rules needed for *backends* are then:

1. Backends MUST NOT write to raw IO channels; all output MUST go via
sys.stdout and sys.stderr. Build frontends MAY redirect these streams
to post-process them, but are not required to do so. As a consequence:

  1a. Backends MUST be prepared to deal with the possibility that
those IO streams have the limitations of the platform IO streams
(e.g., limited subset of Unicode allowed, fails with an exception when
invalid characters are written).

  1b. Backends MUST capture and manage the output from any
subprocesses they spawn (so that they can follow the other rules).

  1c. Backends cannot assume that they can write output that the user
will see - frontends may suppress or modify any output written to
sys.stdout or sys.stderr. Conversely, backends should not bypass the
ability of frontends to capture those streams, as frontends are
responsible for user interaction.
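
As a rough sketch of what 1a and 1b could look like on the backend
side (the helper names safe_text, emit and run_and_relay are
hypothetical, and decoding subprocess output as UTF-8 is an assumption;
a real backend would need to know what its tools actually emit):

```python
import subprocess
import sys


def safe_text(text, stream):
    """Escape any characters that the stream's encoding cannot represent."""
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        # e.g. io.StringIO installed by a frontend: accepts any str.
        return text
    # Round-trip through the stream's encoding so that writing cannot
    # raise UnicodeEncodeError; unencodable characters become \uXXXX.
    return text.encode(encoding, "backslashreplace").decode(encoding)


def emit(text, stream=None):
    """Write text via sys.stdout (looked up at call time) or a given stream."""
    if stream is None:
        stream = sys.stdout
    stream.write(safe_text(text, stream))


def run_and_relay(cmd):
    """Run a subprocess, capture its output, and pass it on via sys.stdout."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Assumption: the child writes UTF-8; undecodable bytes are replaced.
    emit(proc.stdout.decode("utf-8", "replace"))
    return proc.returncode
```

Since emit looks up sys.stdout each time it is called, a frontend that
has replaced the stream sees the subprocess output as well, and nothing
reaches the raw file descriptors behind its back.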

Some of those MUSTs could be replaced by SHOULD, if we want to allow
backends to write directly to the screen. But that is likely to
corrupt the UI of the frontend, so I'm inclined to say that we don't
allow that.

Paul

