Installing Python to non-ASCII paths
Something that hit me today, which might become a more common issue when the Windows installers move towards installing to the user directory, is that there appear to be some bugs in handling of non-ASCII paths.
Two that I spotted are a failure of the "script wrappers" installed by pip to work with a non-ASCII interpreter path (reported to distlib) and a possible issue with the py.exe launcher when a script has non-ASCII in the shebang line (not reported yet because I'm not clear on what's going on).
I've only seen Windows-specific issues - I don't know how common non-ASCII paths for the python interpreter are on Unix or OSX, or whether the more or less universal use of UTF-8 on Unix makes such issues less common. But if anyone has an environment that makes testing on non-ASCII install paths easy, it might be worth doing some checks just so we can catch any major ones before 3.5 is released.
On which note, I'm assuming neither of the issues I've found are major blockers. "pip.exe doesn't work if Python is installed in a directory with non-ASCII characters in the name" can be worked around by using python -m pip, and the launcher issue by using a generic shebang like #!/usr/bin/python3.5.
Paul
On 22/03/2015 14:44, Paul Moore wrote:
On which note, I'm assuming neither of the issues I've found are major blockers. "pip.exe doesn't work if Python is installed in a directory with non-ASCII characters in the name" can be worked around by using python -m pip, and the launcher issue by using a generic shebang like #!/usr/bin/python3.5.
That can become a more major blocker if "pip doesn't work" == "ensurepip doesn't work and thus the installer crashes", as was the case with a mimetypes issue a little while back.
I'll create a £££ user (which is the easiest non-ASCII name to create on a UK keyboard) to see how cleanly the latest installer works.
TJG
On 22/03/2015 15:12, Tim Golden wrote:
On 22/03/2015 14:44, Paul Moore wrote:
On which note, I'm assuming neither of the issues I've found are major blockers. "pip.exe doesn't work if Python is installed in a directory with non-ASCII characters in the name" can be worked around by using python -m pip, and the launcher issue by using a generic shebang like #!/usr/bin/python3.5.
That can become a more major blocker if "pip doesn't work" == "ensurepip doesn't work and thus the installer crashes", as was the case with a mimetypes issue a little while back.
I'll create a £££ user (which is the easiest non-ASCII name to create on a UK keyboard) to see how cleanly the latest installer works.
Tried with "Mr £££". The installer's fine but the installed pip.exe fails while "py -3 -mpip" succeeds, as Paul notes.
TJG
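For anyone scripting around the broken wrapper, here is a minimal sketch of the workaround mentioned above: run pip as a module of the current interpreter instead of going through pip.exe. The helper names pip_command and run_pip are made up for illustration and are not part of pip or distlib.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command line that runs pip as a module of this interpreter.

    Because the interpreter is started directly, the pip.exe script
    wrapper (and the interpreter path embedded in it) is never involved.
    """
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip through the interpreter and return its exit code."""
    return subprocess.run(pip_command(*args)).returncode
```

Calling run_pip("list") is then roughly what typing "python -m pip list" does, for whichever interpreter is running the script.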
On 3/22/2015 8:12 AM, Tim Golden wrote:
I'll create a £££ user (which is the easiest non-ASCII name to create on a UK keyboard) to see how cleanly the latest installer works.
You can also copy/paste. A path with a Cyrillic, Greek, Chinese, Tibetan, Japanese, Armenian, and Romanian character, none of which are in the "Windows ANSI" character set, should suffice... Here ya go...
ț硕բ文བོདΘ
In my work with Windows, I've certainly seen that £ is much more acceptable to more programs than ț or these other ones. <http://ar.wikipedia.org/wiki/%D8%A3%D9%84%D9%81%D8%A8%D8%A7%D8%A6%D9%8A%D8%A...>
On 23/03/2015 01:46, Glenn Linderman wrote:
On 3/22/2015 8:12 AM, Tim Golden wrote:
I'll create a £££ user (which is the easiest non-ASCII name to create on a UK keyboard) to see how cleanly the latest installer works.
You can also copy/paste. A path with a Cyrillic, Greek, Chinese, Tibetan, Japanese, Armenian, and Romanian character, none of which are in the "Windows ANSI" character set, should suffice... Here ya go...
ț硕բ文བོདΘ
In my work with Windows, I've certainly seen that £ is much more acceptable to more programs than ț or these other ones. <http://ar.wikipedia.org/wiki/%D8%A3%D9%84%D9%81%D8%A8%D8%A7%D8%A6%D9%8A%D8%A...>
Thanks, Glenn. Good point. TJG
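A quick way to see why "£" is a weaker test than Glenn's string is to check which characters a given code page can represent. This is only a sketch: it assumes "Windows ANSI" means cp1252 (the Western European code page), whereas the real ANSI code page depends on the system locale, and the function name unencodable_chars is invented for the example.

```python
def unencodable_chars(text, encoding="cp1252"):
    """Return the characters of text that cannot be encoded in encoding.

    cp1252 is an assumption standing in for "Windows ANSI"; pass another
    code page name to check a different locale.
    """
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    for sample in ("Mr £££", "ț硕բ文བོདΘ"):
        print(ascii(sample), [ascii(c) for c in unencodable_chars(sample)])
```

A name that comes back with an empty list can still be round-tripped through the legacy code page, so it will not exercise code that wrongly assumes ANSI.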
Hi Paul,
Please open an issue, I can take a look. Please describe a scenario to reproduce the issue.
Victor
2015-03-22 15:44 GMT+01:00 Paul Moore <p.f.moore@gmail.com>:
Something that hit me today, which might become a more common issue when the Windows installers move towards installing to the user directory, is that there appear to be some bugs in handling of non-ASCII paths.
Two that I spotted are a failure of the "script wrappers" installed by pip to work with a non-ASCII interpreter path (reported to distlib) and a possible issue with the py.exe launcher when a script has non-ASCII in the shebang line (not reported yet because I'm not clear on what's going on).
I've only seen Windows-specific issues - I don't know how common non-ASCII paths for the python interpreter are on Unix or OSX, or whether the more or less universal use of UTF-8 on Unix makes such issues less common. But if anyone has an environment that makes testing on non-ASCII install paths easy, it might be worth doing some checks just so we can catch any major ones before 3.5 is released.
On which note, I'm assuming neither of the issues I've found are major blockers. "pip.exe doesn't work if Python is installed in a directory with non-ASCII characters in the name" can be worked around by using python -m pip, and the launcher issue by using a generic shebang like #!/usr/bin/python3.5.
Paul
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.co...
On 22 March 2015 at 19:34, Victor Stinner <victor.stinner@gmail.com> wrote:
Please open an issue, I can take a look. Please describe a scenario to reproduce the issue.
The "issue" with the launcher seems to have been a red herring. When I set up a test case properly, it worked. I suspect I messed up writing the shebang line in UTF-8.
Thanks anyway. (And the pip.exe issue is a distlib issue, which I have reported).
Paul
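One way to rule out that kind of mistake when building a test case is to write the script with an explicit encoding and read the shebang back as bytes. The sketch below assumes, as the message above suggests, that the shebang line should be UTF-8; the interpreter path in the usage comment and the helper names write_script and read_shebang are hypothetical.

```python
def write_script(path, interpreter, body):
    """Write a script whose shebang line names interpreter, encoded as UTF-8."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("#!" + interpreter + "\n")
        f.write(body)


def read_shebang(path):
    """Return the interpreter named on the first line, or None.

    The line is decoded strictly as UTF-8, so a file saved in a legacy
    code page raises UnicodeDecodeError instead of passing silently.
    """
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8").strip()


# Hypothetical usage:
# write_script("hello.py", "C:\\Users\\Mr £££\\python.exe", "print('hi')\n")
# read_shebang("hello.py")
```

If read_shebang raises UnicodeDecodeError on an existing script, the file was probably saved in something other than UTF-8.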
On 23 Mar 2015 00:45, "Paul Moore" <p.f.moore@gmail.com> wrote:
Something that hit me today, which might become a more common issue when the Windows installers move towards installing to the user directory, is that there appear to be some bugs in handling of non-ASCII paths.
Two that I spotted are a failure of the "script wrappers" installed by pip to work with a non-ASCII interpreter path (reported to distlib) and a possible issue with the py.exe launcher when a script has non-ASCII in the shebang line (not reported yet because I'm not clear on what's going on).
I've only seen Windows-specific issues - I don't know how common non-ASCII paths for the python interpreter are on Unix or OSX, or whether the more or less universal use of UTF-8 on Unix makes such issues less common.
POSIX is fine if the locale encoding is correct, but can go fairly wrong if it isn't. Last major complaints I heard related to upstart sometimes getting it wrong in cron and for some daemonized setups (systemd appears to be more robust in setting it correctly as it pulls the expected setting from a system wide config file).
"LANG=C" also doesn't work well, as that tells CPython to use ASCII instead of UTF-8 or whatever the actual system encoding is. Armin Ronacher pointed out "LANG=C.UTF-8" as a good alternative, but whether that's available or not is currently distro-specific. I filed an upstream bug with the glibc devs asking for that to be made standard, and they seemed amenable to the idea, but I haven't checked back in on its progress recently.
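To check what a given environment (cron job, daemon, shell) actually hands to CPython, something along these lines can be run inside it. This is a rough sketch using only standard library calls; the function names encoding_report and path_is_encodable are invented here, and the values reported will differ between Python versions and platforms.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encodings CPython picked up from the environment."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        "locale": locale.getpreferredencoding(False),
    }


def path_is_encodable(path):
    """Return True if path can be encoded with the filesystem encoding."""
    try:
        os.fsencode(path)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    print(encoding_report())
    print(path_is_encodable("/opt/python-\u00a3\u00a3\u00a3"))  # hypothetical path
```

Running the same script under different LANG settings shows whether the environment is the problem before any installer or pip code gets involved.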
But if anyone has an environment that makes testing on non-ASCII install paths easy, it might be worth doing some checks just so we can catch any major ones before 3.5 is released.
I'd suggest looking at the venv tests and using them as inspiration to create a separate "test_venv_nonascii" test file that checks:
* creating a venv containing non-ASCII characters
* copying the Python binary to a temporary directory with non-ASCII characters in the name and using that to create a venv
More generally, we should likely enhance the venv tests to actually *run* the installed pip binary to list the installed packages. That will automatically test the distlib script wrappers, as well as checking the installed package set matches what we're currently bundling.
With those changes, the buildbots would go a long way towards ensuring that non-ASCII installation paths always work correctly, as well as making it relatively straightforward for other implementations to adopt the same checks.
Cheers, Nick.
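As a rough illustration of the first check (not the actual CPython test suite code), a standalone script could look like the following. The directory name, the helper names venv_script and check_non_ascii_venv, and the choice to make the pip step optional are all assumptions made for this sketch; the second check (copying the Python binary) is not covered.

```python
import os
import subprocess
import tempfile
import venv

NON_ASCII_NAME = "venv-\u00a3\u021b\u6587\u0398"  # hypothetical directory name


def venv_script(env_dir, name):
    """Path of an executable inside a venv (the layout differs on Windows)."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", name + ".exe")
    return os.path.join(env_dir, "bin", name)


def check_non_ascii_venv(with_pip=False):
    """Create a venv under a non-ASCII directory and run what it installs.

    Returns True if the venv's interpreter starts and sees itself as a
    virtual environment and, when with_pip is set, if the installed pip
    executable (the script wrapper) can list packages.
    """
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = os.path.join(tmp, NON_ASCII_NAME)
        # Symlink on POSIX and copy on Windows, as "python -m venv" does.
        venv.create(env_dir, with_pip=with_pip, symlinks=(os.name != "nt"))
        code = "import sys; sys.exit(0 if sys.prefix != sys.base_prefix else 1)"
        python = venv_script(env_dir, "python")
        if subprocess.run([python, "-c", code]).returncode != 0:
            return False
        if with_pip:
            pip = venv_script(env_dir, "pip")
            result = subprocess.run([pip, "list"], stdout=subprocess.DEVNULL)
            return result.returncode == 0
        return True


if __name__ == "__main__":
    print(check_non_ascii_venv(with_pip=False))
```

Passing with_pip=True exercises the pip wrapper itself rather than "python -m pip", which is the part the wrapper bug would affect.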
On which note, I'm assuming neither of the issues I've found are major blockers. "pip.exe doesn't work if Python is installed in a directory with non-ASCII characters in the name" can be worked around by using python -m pip, and the launcher issue by using a generic shebang like #!/usr/bin/python3.5.
Paul
participants (5): Glenn Linderman, Nick Coghlan, Paul Moore, Tim Golden, Victor Stinner