On 23 Mar 2015 00:45, "Paul Moore" <p.f.moore@gmail.com> wrote:
> Something that hit me today, which might become a more common issue
> when the Windows installers move towards installing to the user
> directory, is that there appear to be some bugs in handling of
> non-ASCII paths.
> Two that I spotted are a failure of the "script wrappers" installed by
> pip to work with a non-ASCII interpreter path (reported to distlib)
> and a possible issue with the py.exe launcher when a script has
> non-ASCII in the shebang line (not reported yet because I'm not clear
> on what's going on).
> I've only seen Windows-specific issues - I don't know how common
> non-ASCII paths for the python interpreter are on Unix or OSX, or
> whether the more or less universal use of UTF-8 on Unix makes such
> issues less common.

POSIX is fine if the locale encoding is correct, but can go fairly wrong if it isn't. The last major complaints I heard were related to upstart sometimes getting it wrong in cron and for some daemonized setups (systemd appears to be more robust in setting it correctly, as it pulls the expected setting from a system-wide config file).

"LANG=C" also doesn't work well, as that tells CPython to use ASCII instead of UTF-8 or whatever the actual system encoding is. Armin Ronacher pointed out "LANG=C.UTF-8" as a good alternative, but whether that's available or not is currently distro-specific. I filed an upstream bug with the glibc devs asking for that to be made standard, and they seemed amenable to the idea, but I haven't checked back in on its progress recently.
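
For anyone wanting to check what a given environment (cron job, daemon, plain shell) actually hands to CPython, something like the following rough sketch can help. The helper name `describe_encodings` is made up for illustration; the calls it wraps are ordinary standard library ones.

```python
import locale
import sys


def describe_encodings():
    """Return the encodings CPython derived from the current environment."""
    return {
        # Encoding implied by the locale settings (LANG, LC_ALL, ...).
        "locale": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of the standard output stream (may be None if detached).
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_encodings().items():
    print(f"{name}: {value}")
```

Running it once with the normal environment and once with `LANG=C` set should show whether the interpreter is being told something other than the real system encoding.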

> But if anyone has an environment that makes
> testing on non-ASCII install paths easy, it might be worth doing some
> checks just so we can catch any major ones before 3.5 is released.

I'd suggest looking at the venv tests and using them as inspiration to create a separate "test_venv_nonascii" test file that checks:

* creating a venv in a directory whose name contains non-ASCII characters
* copying the Python binary to a temporary directory with non-ASCII characters in the name and using that to create a venv
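
As a starting point, the first of those checks might look roughly like this. It is only a sketch: the directory name, the helper `create_nonascii_venv`, and the test class are invented for illustration and are not taken from the existing venv tests, and the second check (copying the interpreter itself) is not covered here.

```python
import os
import tempfile
import unittest
import venv

# Hypothetical directory name containing non-ASCII characters.
NON_ASCII_NAME = "venv-\u00e9\u00e8\u00fc"


def create_nonascii_venv(parent_dir):
    """Create a venv (without pip) under a non-ASCII directory name."""
    env_dir = os.path.join(parent_dir, NON_ASCII_NAME)
    # Symlinks are the usual choice outside Windows; copies on Windows.
    builder = venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt"))
    builder.create(env_dir)
    return env_dir


class NonAsciiVenvTest(unittest.TestCase):
    def test_create_in_nonascii_dir(self):
        with tempfile.TemporaryDirectory() as parent:
            env_dir = create_nonascii_venv(parent)
            # pyvenv.cfg is written by venv for every environment it creates.
            cfg_path = os.path.join(env_dir, "pyvenv.cfg")
            self.assertTrue(os.path.isfile(cfg_path))
```

Saved as a test module, this can be run with `python -m unittest`.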

More generally, we should likely enhance the venv tests to actually *run* the installed pip binary to list the installed packages. That will automatically test the distlib script wrappers, as well as checking the installed package set matches what we're currently bundling.
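
A rough sketch of what running the installed pip script could look like is below. The helper names `pip_script_path` and `list_installed_packages` are hypothetical, and the sketch assumes the venv was created with pip installed and uses the usual `bin`/`Scripts` layout.

```python
import json
import os
import subprocess


def pip_script_path(env_dir):
    """Return the expected location of the pip script wrapper in a venv."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "pip.exe")
    return os.path.join(env_dir, "bin", "pip")


def list_installed_packages(env_dir):
    """Run the venv's pip script and return the installed project names."""
    result = subprocess.run(
        [
            pip_script_path(env_dir),
            "list",
            "--format=json",
            "--disable-pip-version-check",
        ],
        check=True,  # a broken script wrapper raises CalledProcessError
        capture_output=True,
        text=True,
    )
    return {entry["name"] for entry in json.loads(result.stdout)}
```

Going through the script itself, rather than `python -m pip`, is what exercises the wrapper. A test would then compare the returned set against whatever the bundled package set is expected to be.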

With those changes, the buildbots would go a long way towards ensuring that non-ASCII installation paths always work correctly, as well as making it relatively straightforward for other implementations to adopt the same checks.


> On which note, I'm assuming neither of the issues I've found are major
> blockers. "pip.exe doesn't work if Python is installed in a directory
> with non-ASCII characters in the name" can be worked around by using
> python -m pip, and the launcher issue by using a generic shebang like
> #!/usr/bin/python3.5.
> Paul