[Distutils] PEP 517 - specifying build system in pyproject.toml
Thomas Kluyver
thomas at kluyver.me.uk
Mon May 22 07:28:17 EDT 2017
On Mon, May 22, 2017, at 12:02 PM, Paul Moore wrote:
> The only reservation I have is that the choice of UTF-8 means that on
> Windows, build backends pretty much have to explicitly manage tool
> output (as they are pretty much certain *not* to output in UTF-8).
> Build backend writers that aren't aware of this issue (most likely
> because their main platform is not Windows) could very easily choose
> to just pass through the raw bytes, and as a result *all* non-ASCII
> output would be garbled on non-UTF-8 systems.
>
> Would locale.getpreferredencoding() not be a better choice here? I
> know it has issues in some situations on Unix, but are they worse than
> the issues UTF-8 would cause on Windows? After all it's the encoding
> used by subprocess.Popen in "universal newlines" mode...

What if the build tool wants to send a character which can't be encoded
in the locale encoding? It's quite easy on Windows to end up with a
character that you can't encode as cp1252. If the build tool uses
.encode(loc_enc, 'replace'), then you've lost information before the
output even gets to the install tool.
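
A minimal sketch of that information loss, assuming cp1252 as the locale encoding. The sample string and the round_trip helper are made up for illustration; they are not part of any build tool or of the PEP.

```python
def round_trip(text, encoding, errors="strict"):
    """Encode text as a build tool might, then decode it as an install tool might."""
    return text.encode(encoding, errors).decode(encoding)


# Hypothetical tool output: "i" with diaeresis exists in cp1252,
# but the arrow and the two CJK characters do not.
sample = "na\u00efve \u2192 \u4e2d\u6587"
loc_enc = "cp1252"  # assumed locale encoding on a Western European Windows setup

# With 'replace', each unencodable character is silently turned into '?'.
via_locale = round_trip(sample, loc_enc, "replace")

# UTF-8 can represent every character, so nothing is lost.
via_utf8 = round_trip(sample, "utf-8")

print(via_locale == sample)  # False: three characters were replaced
print(via_utf8 == sample)    # True
```

Once the '?' substitution has happened on the build tool's side, the install tool has no way to recover the original characters.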

It's 2017; I really don't want to go down the 'locale specified
encoding' route again. UTF-8 everywhere!

One affordance I'd consider is recommending that install tools, if the
captured output is not valid UTF-8, dump the raw bytes to a file so
that no information is lost. I'm not sure if that recommendation needs
to be in the spec itself, though.
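
One way an install tool could follow such a recommendation, as a rough sketch only. The function name decode_build_output, the dump file naming, and the choice to also return a lossy decoded string for display are all assumptions made for this example; nothing here is specified by the PEP.

```python
import os
import tempfile


def decode_build_output(raw, dump_dir=None):
    """Decode captured backend output as UTF-8.

    Returns (text, dump_path). dump_path is None when the bytes were valid
    UTF-8; otherwise it names a file holding the untouched raw bytes.
    """
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError:
        # Preserve the exact bytes so no information is lost.
        fd, path = tempfile.mkstemp(
            prefix="build-output-", suffix=".bin", dir=dump_dir
        )
        with os.fdopen(fd, "wb") as f:
            f.write(raw)
        # Still give the user something readable; invalid bytes become U+FFFD.
        return raw.decode("utf-8", "replace"), path


# Valid UTF-8 passes straight through.
ok_text, ok_path = decode_build_output("caf\u00e9".encode("utf-8"))

# cp1252-encoded output is not valid UTF-8, so the raw bytes are dumped.
bad_raw = "caf\u00e9".encode("cp1252")
bad_text, bad_path = decode_build_output(bad_raw)
```

The dumped file can then be inspected with whatever encoding the backend actually used.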
Thomas