[Distutils] string types for paths in PEP 517

Thomas Kluyver thomas at kluyver.me.uk
Tue Sep 5 04:00:48 EDT 2017


I considered this. It's *potentially* a problem, but I think we should
not try to deal with it for now:

- Normally, temp files will go in /tmp, so it should be fine to
construct paths made up entirely of ASCII characters.
- Frontends that want the wheel to end up elsewhere can ask for it in a
tmp directory first and then move it, so there's a workaround if it
becomes an issue.
- We already have workarounds for the commonest case of UTF-8 paths + C
locale: ignore the locale and treat paths as UTF-8.
- The 'right' way to deal with it on Unix is to make all paths bytes,
which would introduce a similar issue on Windows. If paths have to be
bytes in some situations and unicode in others, both frontends and
backends need extra complexity to handle that.
- If your non-ASCII username breaks stuff on Python 2... Python 3 is
ready to make your life easier.
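
To make the second and third points concrete, here is a minimal sketch of
what a frontend could do. It is an illustration under assumptions, not part
of PEP 517: the helper names path_to_text, build_wheel_via_tmp and
fake_build_wheel are hypothetical, and fake_build_wheel only stands in for
a real backend hook (it writes an empty file rather than a real wheel). The
one detail taken from the PEP is that build_wheel receives a directory and
returns the basename of the wheel it wrote there.

```python
import os
import shutil
import tempfile


def path_to_text(raw_path):
    """Hypothetical helper: decode a bytes path as UTF-8, ignoring the locale."""
    if isinstance(raw_path, bytes):
        return raw_path.decode("utf-8")
    return raw_path


def build_wheel_via_tmp(build_wheel, final_dir):
    """Hypothetical helper: build into a temp dir, then move the wheel.

    The temp dir normally lives under /tmp (or wherever TMPDIR points),
    so the path handed to the backend is usually plain ASCII even when
    final_dir is not.
    """
    tmp_dir = tempfile.mkdtemp(prefix="pep517-")
    try:
        # The hook returns the basename of the wheel file it created.
        wheel_name = build_wheel(tmp_dir)
        if not os.path.isdir(final_dir):
            os.makedirs(final_dir)
        destination = os.path.join(final_dir, wheel_name)
        shutil.move(os.path.join(tmp_dir, wheel_name), destination)
        return destination
    finally:
        # Clean up the temp dir whether or not the build succeeded.
        shutil.rmtree(tmp_dir, ignore_errors=True)


def fake_build_wheel(wheel_directory, config_settings=None,
                     metadata_directory=None):
    """Stand-in for a backend hook: writes an empty placeholder file."""
    name = "demo-0.1-py3-none-any.whl"
    with open(os.path.join(wheel_directory, name), "wb"):
        pass
    return name
```

A frontend would pass the real backend hook in place of fake_build_wheel;
the backend only ever sees the temp path, and the move to the final
location stays entirely on the frontend side.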

Thomas

On Tue, Sep 5, 2017, at 07:33 AM, Nathaniel Smith wrote:
> Hi all,
> 
> Quick question about an arcane topic: currently, PEP 517 says that
> paths are always represented as unicode strings. For example, when the
> frontend calls build_wheel, it has to create a temporary dir to hold
> the output wheel, and it passes this in as an absolute path
> represented as a unicode string.
> 
> In Python 3 I think this is totally fine, because the surrogate-escape
> system means that all paths can be represented as unicode strings,
> even on systems like Linux where you can have paths that are invalid
> according to Python's idea of the filesystem encoding.
> 
> In Python 2, if I understand correctly (and I'm not super confident
> that I do), then there is no surrogate-escape, and it's possible to
> have paths that can't be represented as a unicode object. For example,
> if someone's home directory is /home/stéfan in UTF-8 but Python thinks
> that the locale is C, and a frontend tries to make a tmpdir in
> $HOME/.local/tmp/ and pass it to a backend then... everything blows
> up, I guess?
> 
> So I guess this is a question for those unfortunate souls who
> understand these details better than me (hi Nick!): is this actually a
> problem, and is there anything we can/should do differently?
> 
> -n
> 
> -- 
> Nathaniel J. Smith -- https://vorpus.org
> _______________________________________________
> Distutils-SIG maillist  -  Distutils-SIG at python.org
> https://mail.python.org/mailman/listinfo/distutils-sig

