On Wed, Feb 10, 2021 at 8:39 PM Paul Moore p.f.moore@gmail.com wrote:
On Wed, 10 Feb 2021 at 11:01, Inada Naoki songofacandy@gmail.com wrote:
On Wed, Feb 10, 2021 at 5:33 PM Paul Moore p.f.moore@gmail.com wrote:
So get PYTHONUTF8 added to the environment activate script. That's a simple change to venv. And virtualenv, and conda - yes, it needs to happen in multiple places, but that's still easier IMO than proposing a change to Python's already complex (and slower than many of us would like) startup process.
I am not sure this idea would work. Is the activate script always called when a venv is used on Windows?
When I use venv on Unix, I often just execute .venv/bin/some-script without activating the venv.
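A minimal sketch of why activation matters here: PYTHONUTF8 is an ordinary environment variable, so it only affects an interpreter that actually inherits it. The helper name `utf8_mode_of_child` is made up for illustration; `sys.flags.utf8_mode` is the real flag (Python 3.7+).

```python
import os
import subprocess
import sys


def utf8_mode_of_child(pythonutf8):
    """Start a child interpreter and report its sys.flags.utf8_mode.

    pythonutf8 is the value to give the PYTHONUTF8 variable, or None to
    leave it unset (as when a script is run without activating the venv).
    """
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)
    if pythonutf8 is not None:
        env["PYTHONUTF8"] = pythonutf8
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("PYTHONUTF8=1   ->", utf8_mode_of_child("1"))
    print("PYTHONUTF8=0   ->", utf8_mode_of_child("0"))
    print("variable unset ->", utf8_mode_of_child(None))
```

The result for the unset case depends on the platform, the locale, and the Python version, which is exactly the uncertainty when the activate script is skipped.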
So in your training course, tell users to activate the environment.
I am not sure about that. It's not my training course. The target users are thousands of students, and they may not use the command prompt at all.
Experienced users (like you) who can run scripts directly aren't the target of this change, are they? This is one of the frustrating points here, I'm not clear who the target is. When I say it wouldn't help me, I'm told I'm not the target. When I suggest an alternative, it apparently isn't useful because it wouldn't work for you...
I'm sorry about that. I didn't mean "it doesn't work for me". I just meant that I am not sure the activation script is always executed.
I looked at vscode-python and found that it executes the activation script. I am not sure about PyCharm yet, but this approach works there too if it behaves like vscode-python.
Clicking .exe files in the Scripts/ directory is another story, but that can be fixed by changing only the launcher exe.
Adding a per-venv UTF-8 mode is one attractive option, because we can keep python.exe untouched.
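To make the per-venv idea concrete, here is a rough sketch of how a launcher or startup hook might consult the venv's pyvenv.cfg. The `utf8mode` key is purely hypothetical: no such key exists in pyvenv.cfg, and nothing in this thread has settled on a name or a mechanism. Only the `key = value` line format of pyvenv.cfg is real.

```python
def parse_pyvenv_cfg(text):
    """Parse pyvenv.cfg-style 'key = value' lines into a dict."""
    config = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            config[key.strip().lower()] = value.strip()
    return config


def wants_utf8_mode(config):
    """Return True if the hypothetical 'utf8mode' key is switched on."""
    return config.get("utf8mode", "0").lower() in ("1", "true")


# Sample file contents; the last line is the hypothetical setting.
SAMPLE_CFG = """\
home = C:\\Python39
include-system-site-packages = false
version = 3.9.1
utf8mode = 1
"""

if __name__ == "__main__":
    cfg = parse_pyvenv_cfg(SAMPLE_CFG)
    print("enable UTF-8 mode for this venv:", wants_utf8_mode(cfg))
```

A setting like this would travel with the venv itself, so it would apply whether or not the activate script was run.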
Students may need to learn about encodings at some point. But when they first learn how to read and write files, they don't need to know what an encoding is.
Agreed.
VSCode, Notepad, and PyCharm use UTF-8 by default. Students don't need to learn how to use any encoding other than UTF-8 until they really need it.
If they only use ASCII files and a system codepage that is the same as ASCII for the first 128 characters, then it's irrelevant. If they read data from a legacy system, that is quite likely to be in the system codepage (most of the local files I use at work, for example, are not UTF-8).
But students don't know what ASCII is yet.
So I'd say that many students don't need to learn how to use *any* encoding until they need it. But I'm not a professional trainer, so my experience is limited.
We can add an "Enable the UTF-8 mode" checkbox to the installer. And we can have an "Enable the UTF-8 mode" tool in the start menu. So students don't need to edit the ini file manually.
Those options could set the environment variable. After all, that's what "Add Python to PATH" does, and people seem OK with that. No need for an ini file (that adds an extra file read to the startup time, as has already been mentioned as a downside).
The question is: should we recommend enabling UTF-8 mode globally by setting an environment variable, or should we provide a per-site UTF-8 mode setting?
What precisely do you mean by "per site"? Do you mean "per Python interpreter"? Do you view separate virtual environments as "sites"?
One installation is one site. One venv is one site. One conda env is one site. I don't know the proper term for it, but I call it a "site" because each of them has one "site-packages".
They may not want to promote UTF-8 mode until Python itself officially promotes it. So I think venv should support UTF-8 mode first.
That's fair enough. Although I'd like to point out the parallel here - you're saying "environment tools might not want to make UTF8 the default until Python does". I'm saying "Python might not want to make UTF8 the default until the OS does". I'm not completely sure why your argument is stronger than mine :-)
Oh, I am not proposing to change the default encoding for now.
Microsoft provides the "Beta: Use Unicode UTF-8 for worldwide language support" option for the whole PC. It affects all applications. It is similar to setting the PYTHONUTF8 environment variable globally.
Microsoft provides a UTF-8 code page setting (*) too. It affects only one application. It is similar to the per-site UTF-8 mode idea.
(*) https://docs.microsoft.com/en-us/windows/uwp/design/globalizing/use-utf8-cod...
So what I am proposing is no more aggressive than what Microsoft does. Microsoft already provides similar options.
Because it solves many real-world problems that many Windows users suffer from.
OK. My experience differs, but that's fine. But why wasn't this a consideration when UTF8 mode was first designed? At that point, an interpreter flag and an environment variable were considered sufficient. Why is that no longer true? Is it because the initial design of UTF8 mode ignored Windows?
When I accepted the UTF-8 mode, the main target was server applications. Some Unix server OSes (especially "minimal" container images) only have the C locale. Since the target users were server-side programmers, a command-line option and an environment variable were enough.
I knew UTF-8 mode was interesting for Windows too. But Windows users were not the main target when I accepted it. After UTF-8 mode shipped, I noticed that it is very nice for Windows users who are learning Python.
Why, if this is such a Windows-specific problem?
Unix (macOS, iPadOS, Android, ChromeOS, and Linux) desktop users already use a UTF-8 locale. Students can learn Python in a "UTF-8 is the default" environment. There, UTF-8 mode is used for server applications running in the C locale, and server-side programmers are familiar with the command line and environment variables.
On the other hand, most students learning Python on Windows are not server-side programmers. They are not familiar with the command line and environment variables. And they currently suffer from UnicodeError, because the default encoding for text files is not UTF-8.
That is the key difference.
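A small sketch of the failure students run into: an editor saves the file as UTF-8, but open() without an encoding argument decodes it with the system code page. Here cp1252 stands in for a legacy Windows code page (an assumption; the actual code page depends on the machine), and it is passed explicitly so the example behaves the same on any OS. The helper names are made up for illustration.

```python
import os
import tempfile

LEGACY_CODEPAGE = "cp1252"  # assumption: stands in for the Windows default


def read_as(path, encoding):
    """Read a text file with the given encoding; report decode failures."""
    try:
        with open(path, encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError as exc:
        return "UnicodeDecodeError: " + exc.reason


def roundtrip(text, read_encoding):
    """Write text as UTF-8 (what the editor does), then read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        return read_as(path, read_encoding)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for sample in ("café", "こんにちは"):
        print(repr(sample), "read as utf-8:",
              repr(roundtrip(sample, "utf-8")))
        print(repr(sample), "read as", LEGACY_CODEPAGE + ":",
              repr(roundtrip(sample, LEGACY_CODEPAGE)))
```

With the accented Latin sample the legacy read silently produces mojibake, while the Japanese sample fails outright because its UTF-8 bytes include a value that cp1252 leaves undefined. In UTF-8 mode, open() defaults to UTF-8 and neither problem appears.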
Sigh. To be honest, I don't have the time (or the interest) to go back over all the history here. I think I'm just going to have to drop this discussion and wait to comment when a concrete proposal is put forward. PEP 597 is the only actual PEP on the table at the moment, everything else is just speculation, and I really can't keep up with the volume of discussion in the various threads.
Paul
I'm sorry about that. I have not chosen an actual implementation yet, so I cannot write a concrete PEP yet.