Make UTF-8 mode more accessible for Windows users.
The fact that Python does not use UTF-8 as the default encoding when opening text files is an obstacle for many Windows users, especially beginners in programming.

If you search for UnicodeDecodeError, you will see that many Windows users have encountered the problem. This list is only a small part of the search results.

* https://qiita.com/Yuu94/items/9ffdfcb2c26d6b33792e
* https://www.mikan-partners.com/archives/3212
* https://teratail.com/questions/268749
* https://github.com/neovim/pynvim/issues/443
* https://www.coder.work/article/1284080
* https://teratail.com/questions/271375
* https://qiita.com/shiroutosan/items/51358b24b0c3defc0f58
* https://github.com/jsvine/pdfplumber/issues/304
* https://ja.stackoverflow.com/questions/69281/73612
* https://trend-tracer.com/pip-error/

Looking at the errors, the following are the most common cases.

* UnicodeDecodeError is raised when trying to open a text file written in UTF-8, such as JSON.
* UnicodeEncodeError is raised when trying to save text data retrieved from the web, etc.
* A user runs `pip install` and `setup.py` reads a README.md or LICENSE file written in UTF-8 without `encoding="UTF-8"`.

Users can use UTF-8 mode to solve these problems. I wrote a section for UTF-8 mode in the "3. Using Python on Windows" document.
https://docs.python.org/3/using/windows.html#utf-8-mode

However, UTF-8 mode is still not very well known. How can we make UTF-8 mode more user-friendly?

Right now, UTF-8 mode can be enabled using the `-Xutf8` option or the `PYTHONUTF8` environment variable. This is a hurdle for beginners. In particular, Jupyter users may not use the command line at all.

Is it possible to enable UTF-8 mode in a configuration file like `pyvenv.cfg`?

* Users can enable UTF-8 mode per install and per venv.
* But it is difficult to write the settings file when Python is installed system-wide (not per user) or from the Windows Store.
* Users can still enable UTF-8 mode in a venv. But many beginners don't need venv.

Is it possible to make it easier to configure?

* Put a checkbox in the installer?
* Provide a small tool to allow configuration after installation?
  * python3 -m utf8mode enable|disable?
    * Accessible only for CLI users
  * Add "Enable UTF-8 mode" and "Disable UTF-8 mode" to Start menu?

Any ideas are welcome.

-- Inada Naoki <songofacandy@gmail.com>
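For readers who have not used either switch, here is a minimal sketch of the two existing ways to enable UTF-8 mode mentioned above (the `-X utf8` option and the `PYTHONUTF8` environment variable). It starts a fresh interpreter each way and reads back `sys.flags.utf8_mode`. The helper name `utf8_mode_in_child` is made up for this illustration; it assumes Python 3.7 or later.

```python
import os
import subprocess
import sys

# Child program: report whether UTF-8 mode is active in that interpreter.
PROBE = "import sys; print(sys.flags.utf8_mode)"


def utf8_mode_in_child(extra_args=(), extra_env=None):
    """Run a fresh interpreter and return its sys.flags.utf8_mode as an int.

    Hypothetical helper for this illustration only.
    """
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # do not inherit the parent's setting
    if extra_env:
        env.update(extra_env)
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Enable UTF-8 mode with the command-line option.
via_option = utf8_mode_in_child(extra_args=["-X", "utf8"])
# Enable UTF-8 mode with the environment variable.
via_env = utf8_mode_in_child(extra_env={"PYTHONUTF8": "1"})
print(via_option, via_env)
```

Both routes require control over how the interpreter is started, which is exactly the hurdle described above for users who launch Jupyter from the Start menu.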
Thanks for working so hard to move this forward!

The "real" solution is to change the defaults not to use the system encoding at all -- which, of course, we are moving towards with PEP 597. So first a plug to do that as fast as possible! I myself would love to see PEP 597 implemented tomorrow -- for all supported versions of Python.

However, the real trick here is that Python is a programming language/library/runtime -- not an application. So the folks starting up the interpreter are very often NOT the same as the folks writing the code.

And this is why this is the issue it is -- folks write code on *nix systems, or maybe Windows with utf-8 as a system encoding, or only test with ASCII data, or ... -- then someone else actually runs the code, on Windows, and it doesn't work. Even if the person is technically writing the code, they may have copied and pasted it or who knows what? Think about it -- of all the Python code you run (libraries, etc) -- how much of it did you write yourself?

(I myself have been highly negligent with my teaching materials in this regard -- so have personally unleashed dozens of folks writing buggy code on the world.)

Anyway -- I'm afraid any combination of start-up flags, environment variables, etc. will not be enough -- is there a way to enable UTF-8 mode in the code, e.g. with a __future__ import? This may be impossible, as UTF-8 mode is an interpreter-global setting, and it could get very messy if a __future__ import in one library changes the behavior of all the other code -- but maybe there's some way to accomplish something similar?

    from __future__ import utf8_mode

Could monkey patch open() for that module, but would there be any way to have it work, on a module basis, for all other uses of TextIOWrapper?

Maybe one workaround would be for the __future__ import (or something) to set the mode, and then trigger warnings for all uses of TextIOWrapper that don't use utf-8 -- that is, turn on PEP 597.

So you'd use one library that had the __future__ import, and it wouldn't break any other code, but it would turn on Warnings.

Anyway, this is a very hard problem, but what I'm trying to get at is that we don't want the exact same code to run differently depending on what environment it's running in. Currently, it depends on the system encoding; we'd just be switching to it depending on whether UTF-8 mode is turned on, which is better, I suppose (e.g. Jupyter could choose to turn UTF-8 mode on by default), but would still have the same fundamental problem.

Imagine someone runs some code in Jupyter, and it's fine, and then they run it in plain Python, on the same machine, and it breaks -- ouch!

BTW: is there a way at runtime to check for UTF-8 mode? Then at least I could raise a warning in my code. Or maybe simply check if locale.getpreferredencoding() returns utf-8, and raise a warning if not.

That wouldn't be hard to do, but it might be worth having a small utility that does it in a __future__ import:

    from __future__ import warn_if_not_utf8

On Wed, Jan 27, 2021 at 11:35 PM Inada Naoki <songofacandy@gmail.com> wrote:
Is it possible to enable UTF-8 mode in a configuration file like `pyvenv.cfg`?
I can't see how that's any more powerful/flexible than an environment variable.

Is it possible to make it easier to configure?
* Put a checkbox in the installer? * Provide a small tool to allow configuration after installation? * python3 -m utf8mode enable|disable? * Accessible only for CLI user * Add "Enable UTF-8 mode" and "Disable UTF-8 mode" to Start menu?
This is still going to have the same fundamental problems of the same code running differently on different machines or even the same machine in different environments, installs -- someone upgrades and forgets to check that box again, etc. ...

Maybe this would be a good thing to do once there are Warnings in place?

-CHB

Any ideas are welcome.
-- Inada Naoki <songofacandy@gmail.com>
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Fri, Jan 29, 2021 at 4:00 AM Christopher Barker <pythonchb@gmail.com> wrote:
The "real" solution is to change the defaults not to use the system encoding at all -- which, of course, we are moving towards with PEP 597. So first a plug to do that as fast as possible! I myself would love to see PEP 597 implemented tomorrow -- for all supported versions of Python.
Note that PEP 597 doesn't change the default encoding. It just adds an option to emit a warning when the default encoding is used. I think it might take about 10 years to change it.
However, the real trick here is that Python is a programming language/library/runtime -- not an application. So the folks starting up the interpreter are very often NOT the same as the folks writing the code.
And this is why this is the issue it is -- folks write code on *nix systems, or maybe Windows with utf-8 as a system encoding, or only test with ASCII data, or ... -- then someone else actually runs the code, on Windows, and it doesn't work. Even if the person is technically writing the code, they may have copy and pasted it or who knows what? Think about it -- of all the Python code you run (libraries, etc) -- how much of it did you write yourself?
(I myself have been highly negligent with my teaching materials in this regard -- so have personally unleashed dozens of folks writing buggy code on the world.)
You are right. A lot of code is written by other people, and it causes UnicodeDecodeError on Windows. UTF-8 mode rescues it.
Anyway -- I'm afraid any combination of start-up flags, environment variables, etc. will not be enough -- is there a way to enable UTF-8 mode in the code, e.g. with a __future__ import? This may be impossible, as UTF-8 mode is an interpreter-global setting, and it could get very messy if a __future__ import in one library changes the behavior of all the other code -- but maybe there's some way to accomplish something similar?
Could monkey patch open() for that module, but would there be any way to have it work, on a module basis, for all other uses of TextIOWrapper?
UTF-8 mode is used to decode command-line arguments and environment variables on Unix. So UTF-8 mode can be enabled only at startup for now.

This restriction is caused by Unix, so I think we can add something like `sys._enable_utf8_mode()` only on Windows if it is really needed. But it means that code using `sys._enable_utf8_mode()` is Windows-only. It doesn't make sense.

Another way is adding a runtime option to change only the default text encoding (e.g. `io.set_default_encoding("utf-8")`). This is an option worth considering. When we add it at the top of a script or Notebook, it uses UTF-8 to open files on all platforms.

On the other hand, it adds another "xxx encoding" term to Python. Python has too many "xxx encoding"s and it confuses users. So I am cautious about adding another encoding option, and I am focusing on UTF-8 mode for now.
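As a rough illustration of the "per module" idea discussed above, a module can already shadow `open` with a thin wrapper that defaults text mode to UTF-8. This is only a sketch: `open_utf8` is a hypothetical name, it is not the proposed `io.set_default_encoding()`, and it changes nothing for other code that constructs `TextIOWrapper` or calls the built-in `open` directly.

```python
import builtins
import os
import tempfile


def open_utf8(file, mode="r", *args, **kwargs):
    """Like builtins.open, but text mode defaults to encoding="utf-8".

    Hypothetical helper: binary mode and an explicitly passed encoding
    (keyword, or second positional argument after mode) are left untouched.
    """
    encoding_given_positionally = len(args) >= 2  # args = (buffering, encoding, ...)
    if "b" not in mode and not encoding_given_positionally:
        kwargs.setdefault("encoding", "utf-8")
    return builtins.open(file, mode, *args, **kwargs)


sample = "\u3053\u3093\u306b\u3061\u306f"  # non-ASCII sample text

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open_utf8(path, "w") as f:  # written as UTF-8 on every platform
        f.write(sample)
    with open_utf8(path, "rb") as f:  # binary mode passes through unchanged
        raw = f.read()
    with open_utf8(path) as f:  # read back as UTF-8
        text = f.read()
```

A module that wants this behavior could write `open = open_utf8` near its top; the effect stays local to that module, which is both the appeal and the limitation raised in the thread.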
Maybe one workaround would be for the __future__ import (or something) to set the mode, and then trigger warnings for all uses of TextIOWrapper that don't use utf-8 -- that is, turn on PEP 597.
So you'd use one library that had the __future__ import, and it wouldn't break any other code, but it would turn on Warnings.
Please don't discuss PEP 597 in this thread. Let's focus on UTF-8 mode. They are different approaches and they are not mutually exclusive.

* UTF-8 mode helps users who see UnicodeDecodeError while running `pip install`.
* PEP 597 helps developers notice `open("README.md").read()` in `setup.py`.
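To make the `setup.py` case concrete, here is a small self-contained sketch. It writes a UTF-8 README to a temporary directory, then reads it once with `cp1252` (passed explicitly to stand in for a typical Western-European Windows locale encoding; this is a simulation, not the real locale default) and once with `encoding="utf-8"`.

```python
import os
import tempfile

readme_text = "\u3053\u3093\u306b\u3061\u306f"  # non-ASCII README content

with tempfile.TemporaryDirectory() as tmp:
    readme = os.path.join(tmp, "README.md")
    with open(readme, "w", encoding="utf-8") as f:
        f.write(readme_text)

    # Simulated legacy behavior: decode the UTF-8 file with cp1252.
    try:
        with open(readme, encoding="cp1252") as f:
            f.read()
        legacy_failed = False
    except UnicodeDecodeError:
        legacy_failed = True

    # What setup.py should do: state the encoding explicitly.
    with open(readme, encoding="utf-8") as f:
        long_description = f.read()

print(legacy_failed, long_description == readme_text)
```

With other locale encodings the wrong decode may not raise at all and silently produce mojibake instead, which is arguably worse.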
Anyway, this is a very hard problem, but what I'm trying to get at is that we don't want the exact same code to run differently depending on what environment it's running in. Currently, it depends on the system encoding, we'd just be switching to it depending on whether utf-mode is turned on, which is better, I suppose, (e.g Jupyter could choose to turn utf-mode on by default for example), but would still have the same fundamental problem.
Imagine someone runs some code in Jupyter, and it's fine, and then they run it in plain Python, on the same machine, and it breaks -- ouch!
You are right. UTF-8 mode must be accessible both for Jupyter on conda Python and for Python installed by the official installer. If UTF-8 mode is accessible enough, users can fix it by enabling UTF-8 mode.
BTW: is there a way at runtime to check for UTF8 mode? Then at least I could raise a warning in my code. Or maybe simply check if locale.getpreferredencoding() returns utf-8, and raise a warning if not.
There is `sys.flags.utf8_mode`. But most Unix users do not use UTF-8 mode because their locale encoding is already UTF-8. So checking `locale.getpreferredencoding(False)` is better. But note that `locale.getpreferredencoding(False)` may return "utf8", "utf-8", "utf_8", "UTF-8"...
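One way to cope with the different spellings is to let the codec registry normalize the name: `codecs.lookup(name).name` returns a canonical codec name. The sketch below combines that with `sys.flags.utf8_mode`; the function names `is_utf8` and `default_text_encoding_is_utf8` are made up for this example.

```python
import codecs
import locale
import sys


def is_utf8(encoding_name):
    """Return True if encoding_name is any spelling of the UTF-8 codec."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown codec name: certainly not UTF-8.
        return False


def default_text_encoding_is_utf8():
    """Return True if open() without encoding= would use UTF-8 here."""
    if sys.flags.utf8_mode:
        return True
    return is_utf8(locale.getpreferredencoding(False))


if not default_text_encoding_is_utf8():
    print("warning: the default text encoding is not UTF-8", file=sys.stderr)
```

Note that "utf-8-sig" is a different codec and is deliberately not treated as UTF-8 here.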
That wouldn't be hard to do, but it might be worth having a small utility that does it in a __future__ import:
from __future__ import warn_if_not_utf8
It seems you are misusing __future__ import. __future__ import is for compilers and parsers. It is not for runtime behavior. And I don't think we should add `warn_if_not_utf8()` for now.
Is it possible to enable UTF-8 mode in a configuration file like `pyvenv.cfg`?
I can't see how that's any more powerful/flexible than an environment variable.
It is powerful/flexible for power users. But not for beginners. Imagine users who start Jupyter from the Start menu.

* Command-line `-Xutf8` or `set PYTHONUTF8=1` is not accessible.
* A user environment variable is not accessible either, and it may affect other Python installations.
Is it possible to make it easier to configure?
* Put a checkbox in the installer? * Provide a small tool to allow configuration after installation? * python3 -m utf8mode enable|disable? * Accessible only for CLI user * Add "Enable UTF-8 mode" and "Disable UTF-8 mode" to Start menu?
This is still going to have the same fundamental problems of the same code running differently on different machines or even the same machine in different environments, installs -- someone upgrades and forgets to check that box again, etc ....
There are pros and cons.

If we use a user-wide (or system-wide) setting like `PYTHONUTF8` in a user environment variable, all Python environments use UTF-8 mode consistently. But it will break legacy applications running on an old Python environment.

If we have a per-environment option, it's easy to recommend that users enable UTF-8 mode.
Maybe this would be a good thing to do once there are Warnings in place?
Do you mean that programs that run only in UTF-8 mode should warn if UTF-8 mode is not enabled? e.g.

```
import sys

if sys.platform == "win32" and not sys.flags.utf8_mode:
    sys.exit("This program runs only in UTF-8 mode. Please enable UTF-8 mode.")
```

Then, I don't like it... A Windows-only API to enable UTF-8 mode at runtime seems better.

```
import sys

if sys.platform == "win32":
    # Proposed API; it does not exist in current Python.
    sys._win32_enable_utf8mode()
```

Regards,

-- Inada Naoki <songofacandy@gmail.com>
On Thu, Jan 28, 2021 at 4:25 PM Inada Naoki <songofacandy@gmail.com> wrote:
The "real" solution is to change the defaults not to use the system encoding at all -- which, of course, we are moving towards with PEP 597. So first a plug to do that as fast as possible! I myself would love to see PEP 597 implemented tomorrow -- for all supported versions of Python.
Note that PEP 597 doesn't change the default encoding. It just adds an option to emit a warning when the default encoding is used.
I know -- and THAT could be done soon, yes?
I think it might take about 10 years to change it.
I hope it's not that long -- having code that runs differently in different environments is not good ...
However, the real trick here is that Python is a programming language/library/runtime -- not an application. So the folks starting up the interpreter are very often NOT the same as the folks writing the code.
And this is why this is the issue it is -- folks write code on *nix systems, or maybe Windows with utf-8 as a system encoding, or only test with ASCII data, or ... -- then someone else actually runs the code, on Windows, and it doesn't work. Even if the person is technically writing the code, they may have copy and pasted it or who knows what? Think about it -- of all the Python code you run (libraries, etc) -- how much of it did you write yourself?
(I myself have been highly negligent with my teaching materials in this regard -- so have personally unleashed dozens of folks writing buggy code on the world.)
A lot of code is written by other people, and it causes UnicodeDecodeError on Windows. UTF-8 mode rescues it.
Exactly. But the trick is that UTF-8 mode is under the control of the end user / installer of Python, not the writer of the code.
UTF-8 mode is used to decode command-line arguments and environment variables on Unix. So UTF-8 mode can be enabled only at startup for now. This restriction is caused by Unix, so I think we can add something like `sys._enable_utf8_mode()` only on Windows if it is really needed. But it means that code using `sys._enable_utf8_mode()` is Windows-only. It doesn't make sense.
well, that would be a no-op on other platforms.
Another way is adding a runtime option to change only the default text encoding (e.g. `io.set_default_encoding("utf-8")`). This is an option worth considering. When we add it at the top of a script or Notebook, it uses UTF-8 to open files on all platforms.
On the other hand, it adds another "xxx encoding" terminology to Python. Python has too many "xxx encoding"s and it confuses users. So I am cautious about adding another encoding option
I appreciate that -- but I do like handing control over to the code-writer, rather than the python-installer.
Maybe one workaround would be for the __future__ import (or something) to set the mode, and then trigger warnings for all uses of TextIOWrapper that don't use utf-8 -- that is, turn on PEP 597.
So you'd use one library that had the __future__ import, and it wouldn't break any other code, but it would turn on Warnings.
Please don't discuss PEP 597 in this thread. Let's focus on UTF-8 mode. They are different approaches and they are not mutually exclusive.
Sure, but they are related. But I'll try to find the right thread for PEP 597.
Imagine someone runs some code in Jupyter, and it's fine, and then they run it in plain Python, on the same machine, and it breaks -- ouch!
You are right. UTF-8 mode must be accessible both for Jupyter on conda Python and for Python installed by the official installer. If UTF-8 mode is accessible enough, users can fix it by enabling UTF-8 mode.
Sure -- but these days folks may have multiple environments and multiple ways to run code (Jupyter, IDEs), so it's way too easy to have UTF-8 mode on in some but not others -- all on the same machine.

I'm not a Windows user (much), but users of my library are, and my students are, and I'm having a hard time figuring out what will make this work for them.

In the case of my students, I can encourage UTF-8 mode for all installations.

In the case of my library users -- it's harder, but I can do the same to some extent -- I do currently suggest a conda environment for my code -- so yes, making it easier to turn it on in an environment would be good.

Hmm -- sorry for thinking as I write here, but if UTF-8 mode could be part of an environment spec -- that would be good. So is there a way to have a package installed that turns it on? (Obviously a no-op on other platforms.) So you would specify a dependency on the utf8_mode package. At run time, if the utf8_mode package was installed, then UTF-8 mode would be turned on.

So that wouldn't quite put it in the hands of the coder -- but would put it in the hands of the application developer -- the person writing the requirements file.

So checking `locale.getpreferredencoding(False)` is better.
But note that `locale.getpreferredencoding(False)` may return "utf8", "utf-8", "utf_8", "UTF-8"...
A good reason to provide a utility for this then -- I know I have no idea of all the ways it could be spelled.
That wouldn't be hard to do, but it might be worth having a small utility that does it in a __future__ import:
from __future__ import warn_if_not_utf8
It seems you are misusing __future__ import. __future__ import is for compilers and parsers. It is not for runtime behavior.
Well, yes -- but to the "layperson" -- it's a way to say: "make this code act like it will in the future" -- which is the case here.
And I don't think we should add `warn_if_not_utf8()` for now.
I've been thinking about this -- on the one hand, if I, as a library or application author, am thinking about this issue, then I can (and should) add the ``encoding="utf-8"`` argument everywhere I open a text file in my code. So why not just do that, rather than adding an extra import or function call, or whatever?

But in fact, I know I (and my dev team) have been lazy, and have a lot of places where I should be setting the encoding and am not. And sure, I know how to use grep -- I can find all those places. But it would actually be a lot easier and more reliable to have a way to set up the future behavior.

But maybe a topic for another thread.
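As an alternative to grep, the search for `open()` calls that omit `encoding` can be sketched with the standard `ast` module. This is a rough illustration, not a complete linter: the function names are made up, it only looks at calls spelled literally `open(...)`, and it skips a call only when the mode is a string literal containing "b".

```python
import ast


def _is_binary_mode(call):
    """Return True if the call passes a literal mode string containing 'b'."""
    mode = call.args[1] if len(call.args) >= 2 else None
    for kw in call.keywords:
        if kw.arg == "mode":
            mode = kw.value
    return (
        isinstance(mode, ast.Constant)
        and isinstance(mode.value, str)
        and "b" in mode.value
    )


def find_open_without_encoding(source):
    """Return sorted line numbers of text-mode open() calls lacking encoding."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        is_open_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
        )
        if not is_open_call:
            continue
        # encoding is the fourth positional parameter of open().
        has_encoding = len(node.args) >= 4 or any(
            kw.arg == "encoding" for kw in node.keywords
        )
        if not has_encoding and not _is_binary_mode(node):
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''\
a = open("README.md").read()
b = open("data.bin", "rb").read()
c = open("notes.txt", encoding="utf-8").read()
d = open("log.txt", mode="w")
'''

flagged_lines = find_open_without_encoding(SAMPLE)
print(flagged_lines)
```

Calls such as `pathlib.Path.read_text()` or `io.open()` are not covered by this sketch; a runtime warning would catch those as well.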
Is it possible to enable UTF-8 mode in a configuration file like `pyvenv.cfg`?
I can't see how that's any more powerful/flexible than an environment variable.
It is powerful/flexible for power users. But not for beginners. Imagine users execute Jupyter from the start menu.
* Command-line `-Xutf8` or `set PYTHONUTF8=1` is not accessible. * User environment variable is not accessible too, and it may affect other Python installations.
Which is actually what I like about environment variables -- it could apply to all Python installations on the system -- which would be a good thing!

Where would Python look for a "configuration file like `pyvenv.cfg`"?

If we use user-wide (or system-wide) setting like `PYTHONUTF8` in user environment variable, all Python environments use UTF-8 mode consistently. But it will break legacy applications running on old Python environment.
Not ones old enough not to look for PYTHONUTF8 -- it would only change if the Python were upgraded. And at least some legacy applications are using py2exe and the like, and those would still be safe.
If we have per-environment option, it's easy to recommend users to enable UTF-8 mode.
Back to my idea above -- any way to have that be a pip (and conda) installable package? So it could be in a requirements file?

Do you mean programs only runs on UTF-8 mode warns if UTF-8 mode is not enabled? e.g.
```
if sys.platform == "win32" and not sys.flags.utf8_mode:
    sys.exit("This programs runs only on UTF-8 mode. Please enable UTF-8 mode.")
```
Then, I don't like it... Windows only API to enable UTF-8 mode in runtime seems better.
```
if sys.platform == "win32":
    sys._win32_enable_utf8mode()
```
I agree -- if that's possible, then it's a better option. Though I would make it simply: ``sys._enable_utf8mode()`` and have it be a no-op outside of Windows.

-CHB
On Sat, Jan 30, 2021 at 3:45 PM Christopher Barker <pythonchb@gmail.com> wrote:
On Thu, Jan 28, 2021 at 4:25 PM Inada Naoki <songofacandy@gmail.com> wrote:
The "real" solution is to change the defaults not to use the system encoding at all -- which, of course, we are moving towards with PEP 597. So first a plug to do that as fast as possible! I myself would love to see PEP 597 implemented tomorrow -- for all supported versions of Python.
Note that PEP 597 doesn't change the default encoding. It just adds an option to emit a warning when the default encoding is used.
I know -- and THAT could be done soon, yes?
Sorry for the delay. I want to do it in Python 3.10, but I am not sure the PEP will be accepted. I updated the PEP today and am working on the reference implementation now.
I think it might take about 10 years to change it.
I hope it's not that long -- having code that runs differently in different environments is not good ...
I agree. But backward compatibility is important too.
A lot of code is written by other people, and it causes UnicodeDecodeError on Windows. UTF-8 mode rescues it.
Exactly. But the trick is that UTF-8 mode is under the control of the end user / installer of Python, not the writer of the code.
Yes. But the writer of the code can specify `encoding="utf-8"`. PEP 597 helps with that.

Additionally, I am proposing a per-environment option. If the code owner distributes the application with embeddable Python or a venv, they can use UTF-8 mode too.
Please don't discuss PEP 597 in this thread. Let's focus on UTF-8 mode. They are different approaches and they are not mutually exclusive.
Sure, but they are related. But I'll try to find the right thread for PEP 597.
The thread for the PEP is https://discuss.python.org/t/pep-597-raise-a-warning-when-encoding-is-omitte...
Imagine someone runs some code in Jupyter, and it's fine, and then they run it in plain Python, on the same machine, and it breaks -- ouch!
You are right. UTF-8 mode must be accessible both for Jupyter on conda Python and for Python installed by the official installer. If UTF-8 mode is accessible enough, users can fix it by enabling UTF-8 mode.
Sure -- but these days folks may have multiple environments and multiple ways to run code (Jupyter, IDEs), so it's way too easy to have UTF-8 mode on in some but not others -- all on the same machine.
If the user doesn't have legacy applications, they can set PYTHONUTF8 as a user environment variable. Then all environments run in UTF-8 mode.
I'm not a Windows user (much), but users of my library are, and my students are, and I'm having a hard time figuring out what will make this work for them.
In the case of my students, I can encourage UTF-8 mode for all installations.
Yes, I think UTF-8 mode will help teachers and students. Maybe WSL2 will be another option.
In the case of my library users -- it's harder, but I can do the same to some extent -- I do currently suggest a conda environment for my code -- so yes, making it easier to turn it on in an environment would be good.
If PEP 597 is accepted, you can find all code omitting `encoding="utf-8"`. Your library users can run it without UTF-8 mode.
It is powerful/flexible for power users. But not for beginners. Imagine users execute Jupyter from the start menu.
* Command-line `-Xutf8` or `set PYTHONUTF8=1` is not accessible. * User environment variable is not accessible too, and it may affect other Python installations.
which is actually what I like about environment variables -- it could apply to all Python installations on the system -- which would be a good thing!
It will be great for many users. But it will not be good for some users who run legacy Python applications. So it is difficult to recommend UTF-8 mode to everyone.
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
Back to my idea above -- any way to have that be a pip (and conda) installable package? So it could be in a requirements file?
I have no idea. -- Inada Naoki <songofacandy@gmail.com>
On Sat, Jan 30, 2021 at 4:05 AM Inada Naoki <songofacandy@gmail.com> wrote:
Sorry for the delay. I want to do it in Python 3.10, but I am not sure the PEP will be accepted. I updated the PEP today and am working on the reference implementation now.
great, thanks! let us know if there's anything else we can do to help that along.
The thread for the PEP is
https://discuss.python.org/t/pep-597-raise-a-warning-when-encoding-is-omitte...
Thanks -- I guess I need to get on discuss finally :-)
If PEP 597 is accepted, you can find all code omitting `encoding="utf-8"`. Your library users can run it without UTF-8 mode.
yes -- a good reason to get that done :-)
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
I'm not a Windows expert either, but I do think that it's pretty common for Windows to use the location of the exe as part of start-up, location of config files, etc.
Back to my idea above -- any way to have that be a pip (and conda) installable package? So it could be in a requirements file?
I have no idea.
This, I think, is worth exploring -- a way for an application or library to specify that it is expecting UTF-8 mode.

Conda can put a file anywhere (within the conda environment), so a config file would be very doable. But I'm not sure if pip can put anything outside of site-packages -- and I'm not sure if Python knows where site-packages is early enough in the startup process.

So if anyone knows if there would be a way to pip-install UTF-8 mode -- I think that would be a nice feature.

BTW -- is it guaranteed that all supported systems other than Windows use utf-8? Or should UTF-8 mode be available everywhere, even though in most cases it won't make a difference?

-CHB
On 30 Jan 2021, at 12:05, Inada Naoki <songofacandy@gmail.com> wrote:
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
You can put the system default there, but each user needs to have a file that they can control to set the per-user config.

py.exe uses %LOCALAPPDATA%\py.ini

I'd suggest that you could have a %LOCALAPPDATA%\python.ini.

Barry
On Tue, Feb 2, 2021 at 6:31 AM Barry Scott <barry@barrys-emacs.org> wrote:
On 30 Jan 2021, at 12:05, Inada Naoki <songofacandy@gmail.com> wrote:
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
You can put the system default there but each user needs to have a file that they can control to set the per user config.
py.exe uses %LOCALAPPDATA%\py.ini
I'd suggest that you could have a %LOCALAPPDATA%\python.ini.
But what happens if a user has installed Python from python.org and Python from conda? A user may have two or more Pythons of the same version.

In my idea, if a user cannot change the config of the system-installed Python, the user can still create a venv and change the setting for the venv.

-- Inada Naoki <songofacandy@gmail.com>
On 2 Feb 2021, at 00:22, Inada Naoki <songofacandy@gmail.com> wrote:
On Tue, Feb 2, 2021 at 6:31 AM Barry Scott <barry@barrys-emacs.org> wrote:
On 30 Jan 2021, at 12:05, Inada Naoki <songofacandy@gmail.com> wrote:
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
You can put the system default there but each user needs to have a file that they can control to set the per user config.
py.exe uses %LOCALAPPDATA%\py.ini
I'd suggest that you could have a %LOCALAPPDATA%\python.ini.
But what happens if a user has installed Python from python.org and Python from conda? A user may have two or more Pythons of the same version.
Apple showed the way by using reversed FQDNs.

Python.org would use org.python.python.ini
Conda.io would use io.conda.python.ini
barry-emacs.org's python would use org.barrys-emacs.python.python.ini

Further, we would need to support multiple versions of python from the same org installed side-by-side. Structure the .ini so that it has default settings and version-specific settings.

---
[default]
utf8_mode = true

[3.8-64]
utf8-mode = false
---
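A layout like this can be read with the standard `configparser` module. The sketch below is only an illustration of the "default plus version-specific override" lookup: the option name `utf8_mode`, the section tag format, and the function `utf8_mode_for` are assumptions for the example, and nothing in Python reads such a file today.

```python
import configparser

# Example file content; the key is spelled utf8_mode in both sections here.
INI_TEXT = """\
[default]
utf8_mode = true

[3.8-64]
utf8_mode = false
"""


def utf8_mode_for(ini_text, version_tag):
    """Return the utf8_mode setting for version_tag, falling back to [default]."""
    # Treat [default] as the section whose values every other section inherits.
    parser = configparser.ConfigParser(default_section="default")
    parser.read_string(ini_text)
    section = version_tag if parser.has_section(version_tag) else "default"
    return parser.getboolean(section, "utf8_mode", fallback=False)


for tag in ("3.8-64", "3.9-64"):
    print(tag, utf8_mode_for(INI_TEXT, tag))
```

Here the `3.8-64` section overrides the default, while a version with no section of its own inherits `utf8_mode = true`.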
In my idea, if user can not change the config of system installed Python, user still can create venv and change the setting for the venv.
That's fine for the venv users but does not help the people that do not need/want to use venv.

Barry
-- Inada Naoki <songofacandy@gmail.com>
On Tue, Feb 2, 2021 at 11:12 AM Barry Scott <barry@barrys-emacs.org> wrote:
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
A small note here -- ideally there would be nothing Windows specific here. Yes, UTF-8 mode is Windows only, but:
1) Should it be? I'm still unsure on this, but while the vast majority of other platforms use UTF-8 -- maybe it would be more robust to have UTF-8 mode available everywhere -- essentially, ignore the "system encoding", no matter the system.

2) Perhaps UTF-8 mode isn't the only use case for this -- it would be good to have a way to have Python startup parameters that are installation / environment specific -- that could help with other issues with "global" configuration: PYTHONPATH, PYTHONHOME, PYTHONSTARTUP, other PYTHON* environment variables.
I'd suggest that you could have a %LOCALAPPDATA%\python.ini
But what happens if a user has installed Python from python.org and Python from conda? A user may have two or more Pythons of the same version.
and different environments, be they conda environments, pipenv, what have you.
Apple showed the way by using reversed FQD's.
Python.org would use org.python.python.ini
Conda.io would use io.conda.python.ini
barry-emacs.org's python would use org.barrys-emacs.python.python.ini
Further we would need to support multiple versions of python from the same org installed side-by-side. Structure the .ini so that it has default settings and version specific settings.
This would be pretty painful to manage -- it's a "bad idea" to have a single configuration file that is being managed and updated by any number of different tools. And those tools are managed by different groups of people.
And this would be VERY hard for end users to manage -- as people installed and uninstalled python versions and environments, they would get a very, very messy global ini file.
Also: "conda python" is not necessarily any different than python.org python -- it's generally built exactly the same way -- the only difference is how it's installed.
So it would be much better for the config file to be located inside the Python installation: essentially a 1:1 relationship between the python executable and the config file. So you know for sure that if you fire up the python you are intending to, you will get the configuration that comes with it.
[3.8-64] utf8-mode = false
The key issue here is that the configuration is not "version number specific" -- it's (or should be) application specific. And Python has had that issue for ages: as a run-time system (for lack of a better word), each application needs a different set of packages in various versions. And that's been addressed with "environments" of various sorts. So it would be good to leverage that, and have a config file that goes right along with the environment systems: i.e. it's part of that particular Python install.
And ideally, it could be installed with pip somehow, so that an application could specify a python configuration along with its package specifications.
-CHB
-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 2 Feb 2021, at 19:45, Christopher Barker <pythonchb@gmail.com> wrote:
On Tue, Feb 2, 2021 at 11:12 AM Barry Scott <barry@barrys-emacs.org> wrote:
Where would Python look for a "configuration file like `pyvenv.cfg`" ?
I am not a Windows expert so I am not sure. But I think it should be the same directory where `python.exe` is in.
A small note here -- ideally there would be nothing Windows specific here. Yes, UTF-8 mode is Windows only, but:
1) should it be? I'm still unsure on this, but while the vast majority of other platforms use UTF-8 -- maybe it would be more robust to have UTF-8 mode available everywhere -- essentially, ignore the "system encoding", no matter the system.
2) perhaps UTF-8 mode isn't the only use-case for this -- it would be good to have a way to have Python startup parameters that are installation / environment specific -- that could help with other issues with "global" configuration: PYTHONPATH, PYTHONHOME, PYTHONSTARTUP, other PYTHON* environment variables.
Forgive me if I'm misunderstanding your position.
Windows is successful because of its backward compatibility. That includes big problems with text encoding in the modern world.
Aside: HTML 5 even has an encoding rule that acknowledges that web pages marked utf-8 are really windows USA code page and shows how to fall back!
Are you calling env vars global config?
But it is a Windows problem only.
I write code accepting that unix (*BSD, Linux), macOS and Windows have unique quirks. Believing that you can ignore that reality is not a path that leads to happiness.
I'd suggest that you could have a %LOCALAPPDATA%\python.ini
But what happens if a user installed Python from python.org and Python from conda? The user may have two or more Pythons with the same version.
and different environments, be they conda environments, pipenv, what have you.
Apple showed the way by using reversed FQD's.
Python.org would use org.python.python.ini
Conda.io would use io.conda.python.ini
barry-emacs.org's python would use org.barrys-emacs.python.python.ini
Further we would need to support multiple versions of python from the same org installed side-by-side. Structure the .ini so that it has default settings and version specific settings.
This would be pretty painful to manage -- it's a "bad idea" to have a single configuration file that is being managed and updated by any number of different tools. And those tools are managed by different groups of people.
Why would *tools* be messing with my personal config file? No *tool* messes with py.exe config that I know of.
And this would be VERY hard for end users to manage -- as people installed and uninstalled python versions and environments, they would get a very, very messy global ini file.
End users are best served with good, practical defaults. Choosing those defaults is very hard which is the point of this thread. I do not advocate a global ini file. I do not think I asked for that. I'm suggesting a mechanism that is identical to the way the py.exe is configured.
Also: "conda python" is not necessarily any different than python.org python -- it's generally built exactly the same way -- the only difference is how it's installed.
A distro will often patch in distro specific changes. If conda needs to be configured independently from python.org kits this is an obvious requirement.
So it would be much better for the config file to be located inside the Python installation: essentially a 1:1 relationship between the python executable and the config file. So you know for sure that if you fire up the python you are intending to, you will get the configuration that comes with it.
You mean where it may well not be editable without admin privs? That is a bad thing surely?
[3.8-64] utf8-mode = false
The key issue here is that the configuration is not "version number specific" -- it's (or should be) application specific. And Python has had that issue for ages: as a run-time system (for lack of a better word), each application needs a different set of packages in various versions. And that's been addressed with "environments" of various sorts. So it would be good to leverage that, and have a config file that goes right along with the environment systems: i.e. it's part of that particular Python install,
If it's an application problem then we already know that it's a matter of coding the encode/decode utf-8 explicitly. An app could maybe benefit from a sys.set_default_text_encoding() call.
But if it's package by package, that is a far harder problem to solve.
C:\> pip install assume-utf8 assume-legacy-windows-encoding
How does that get resolved?
And ideally, it could be installed with pip somehow, so that an application could specify a python configuration along with its package specifications.
You can only do that for things that pip installs as runnable programs, right? And we would need to add the ability to set the encoding assumption in setup.py.
Barry
On Tue, Feb 2, 2021 at 1:37 PM Barry Scott <barry@barrys-emacs.org> wrote:
2) perhaps UTF-8 mode isn't the only use-case for this -- it would be good to have a way to have Python startup parameters that are installation / environment specific -- that could help with other issues with "global" configuration: PYTHONPATH, PYTHONHOME, PYTHONSTARTUP, other PYTHON* environment variables.
Forgive me if I'm misunderstanding your position.
Windows is successful because of its backward compatibility. That includes big problems with text encoding in the modern world.
Aside: HTML 5 even has an encoding rule that acknowledges that web pages marked utf-8 are really windows USA code page and shows how to fall back!
But that doesn't depend on a system setting, does it? So I don't get your point: If you think UTF8 mode is a bad idea, then don't use it. Or are you saying it should be turned on for everything on a given Windows system, which also doesn't make sense?
Are you calling env vars global config?
well, yes and no -- a number of systems will set environment variables before they start up, which does allow local configuration. But I can't see how that's practical in this case -- it would require another layer on top of the python executable to set up the environment variables.
In the common case, folks have their environment variables set in an initialization file (or the registry? I've lost track of what Windows does these days) so each user has their own set, but for one user, every instance of Python will use the same set. You run some code with Python3.8, then again with 3.9, then again with a Jupyter notebook running ??, then a virtual environment you've set up for your web app -- all the same set of Environment variables.
conda, on the other hand, does set environment variables when you activate an environment, so it could provide a separate set for each environment -- but that would be a conda-specific solution. And I'm not sure that you can have a conda package set them either, so it would need to be a new feature added to conda anyway.
But it is a Windows problem only.
I write code accepting that unix (*BSD, Linux), macOS and Windows have unique quirks. Believing that you can ignore that reality is not a path that leads to happiness.
Sure, but it would still be nice to have one fewer quirk to deal with. And this isn't about Windows-specific behavior -- it's about behavior that is potentially different among each instance of Windows. And as a practical matter -- there is far too much Python code out there that does, indeed, ignore the specific reality that not all systems have utf-8 as their default.
This would be pretty painful to manage -- it's a "bad idea" to have a single configuration file that is being managed and updated by any number of different tools. And those tools are managed by different groups of people.
Why would *tools* be messing with my personal config file? No *tool* messes with py.exe config that I know of.
are you suggesting that users will need to hand-edit, and only hand-edit their config file? and that they will be able to figure out how to set the right special code for python 3.9 installed from python.org with a particular virtual environment? even if you ignore virtualenv -- the example you gave would have to be created for three different Pythons, sourced from three different "vendors", and I'd sure hope that they'd have their installer put something in the config for you.
And this would be VERY hard for end users to manage -- as people installed and uninstalled python versions and environments, they would get a very, very messy global ini file.
End users are best served with good, practical defaults. Choosing those defaults is very hard which is the point of this thread.
we already have a default for this -- UTF-8 mode is off. And with backward compatibility, no other option is on the table.
I do not advocate a global ini file. I do not think I asked for that.
not global to the system, but global to all Pythons
I'm suggesting a mechanism that is identical to the way the py.exe is configured.
That's the python launcher, yes? Forgive my ignorance, but how IS that configured? I do know that it can look at a #! line and start up the appropriate version of Python, but is there any way to have it do any other selection? And how does it / can it be used with virtual environments?
Also: "conda python" is not necessarily any different than python.org python -- it's generally built exactly the same way -- the only difference is how it's installed.
A distro will often patch in distro specific changes. If conda needs to be configured independently from python.org kits this is an obvious requirement.
sure -- it can and it probably does, but it would be good for it not to have to use a different mechanism for this. See above -- environment variables (which already exist for UTF8 mode) may be a fine solution for conda -- not so much for other environment systems.
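The "extra layer" that the environment-variable route needs can be sketched as a small wrapper that starts a child interpreter with `PYTHONUTF8=1` set only for that process. This is a minimal sketch, not a proposal; the helper name `run_in_utf8_mode` is hypothetical, while `PYTHONUTF8` itself is the existing environment variable mentioned at the top of the thread.

```python
import os
import subprocess
import sys


def run_in_utf8_mode(code: str) -> str:
    """Run a snippet in a child Python with UTF-8 mode enabled.

    The variable is set only in the child's environment, so the
    user-wide or system-wide environment is left untouched.
    """
    env = dict(os.environ, PYTHONUTF8="1")
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The child reports whether it was started in UTF-8 mode.
    print(run_in_utf8_mode("import sys; print(sys.flags.utf8_mode)"))
```

This is roughly what an environment activation script does, which is why it works for tools that control process startup and not for a plain double-click on python.exe.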
So it would be much better for the config file to be located inside the Python installation: essentially a 1:1 relationship between the python executable and the config file. So you know for sure that if you fire up the python you are intending to, you will get the configuration that comes with it.
You mean where it may well not be editable without admin privs?
I mean where it can be controlled along with the "environment". But that is a trick -- there should be a place outside of a system python install that needs admin privileges.
If it's an application problem then we already know that it's a matter of coding the encode/decode utf-8 explicitly.
Except that a python application probably uses a LOT of code written by others. And unfortunately, that's why this is an issue at all -- there is code that should use the encoding specified, that doesn't. And that is out of the hands of the user, and out of the hands of the application author.
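For reference, this is what "coding the encode/decode utf-8 explicitly" looks like in practice. A minimal sketch; the helper names `write_text_utf8` and `read_text_utf8` are made up for the example.

```python
import locale
import os
import tempfile


def write_text_utf8(path: str, text: str) -> None:
    # Passing encoding= explicitly avoids depending on the locale default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_text_utf8(path: str) -> str:
    with open(path, encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    sample = "caf\u00e9 \u65e5\u672c\u8a9e"  # non-ASCII text, written as escapes
    with tempfile.TemporaryDirectory() as tmp:
        readme = os.path.join(tmp, "README.md")
        write_text_utf8(readme, sample)
        assert read_text_utf8(readme) == sample
    # The encoding that open() falls back to when encoding= is omitted
    # (outside UTF-8 mode) is the locale's preferred encoding.
    print(locale.getpreferredencoding(False))
```

Code that omits `encoding=` gets whatever the last line prints, which is where the third-party-code problem described above comes from.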
An app could maybe benefit from a sys.set_default_text_encoding() call.
yes, that's been brought up -- but I think that would be too late for some use-cases, like setup.py files reading the README. (though I suppose setuptools could call it)
But if it's package by package, that is a far harder problem to solve.
C:\> pip install assume-utf8 assume-legacy-windows-encoding
How does that get resolved?
I wasn't thinking that there would be an "assume-legacy-windows-encoding" package -- but I see your point. You can't correctly run a package that expects the system encoding at the same time as one that expects utf-8 defaults.
My thought on that is that we have "packages" and "applications". Packages have their dependencies declared internally (in setup.py, or pyproject.toml) -- applications have an external requirements file. So the assume-utf8 package should only be used in a requirements file, not a package's dependencies.
And I think that's a helpful level of granularity -- we don't want ALL python code run on a given machine by a given user to be in UTF-8 Mode (or not), and it's not possible to have it be module or package specific, but if it could be application specific, that could be useful.
-CHB
On 2/2/21, Christopher Barker <pythonchb@gmail.com> wrote:
In the common case, folks have their environment variables set in an initialization file (or the registry? I've lost track of what Windows does these days)
It hasn't fundamentally changed since the mid 1990s. Configurable system variables are set in the registry key "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Environment", and configurable user variables are set in "HKCU\Environment".
A process is spawned with an environment that's sourced from the parent process. Either it's inherited from the parent's environment or it's a new environment that was passed to CreateProcessW(). The ancestor of most interactive processes in a desktop session is the graphical shell, Explorer. At startup, it calls an undocumented shell32 function (RegenerateUserEnvironment) to load a new environment from scratch. It also reloads its environment in response to a WM_SETTINGCHANGE "Environment" message.
The documented way to reload the environment from scratch is CreateEnvironmentBlock(&env, htoken, FALSE) and SetEnvironmentStringsW(env).
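The two registry keys named above can be inspected from Python with the standard `winreg` module. This is a minimal sketch for looking at the configured values; the function names are made up for the example, and on non-Windows platforms it just returns empty dictionaries. It reads the stored values only and does not reproduce how Windows builds the final process environment.

```python
import sys

# Registry locations quoted from the message above.
SYSTEM_ENV_KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\Environment"
USER_ENV_KEY = r"Environment"


def read_registry_values(root, subkey):
    """Return all values under a registry key as a {name: data} dict."""
    import winreg  # Windows-only module, imported lazily

    values = {}
    with winreg.OpenKey(root, subkey) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, _value_type = winreg.EnumValue(key, index)
            values[name] = data
    return values


def configured_environment():
    """Return (system_vars, user_vars) as stored in the registry."""
    if sys.platform != "win32":
        return {}, {}
    import winreg

    system_vars = read_registry_values(winreg.HKEY_LOCAL_MACHINE, SYSTEM_ENV_KEY)
    user_vars = read_registry_values(winreg.HKEY_CURRENT_USER, USER_ENV_KEY)
    return system_vars, user_vars


if __name__ == "__main__":
    system_vars, user_vars = configured_environment()
    # Show where PYTHONUTF8 is configured, if anywhere.
    print("system:", system_vars.get("PYTHONUTF8"))
    print("user:  ", user_vars.get("PYTHONUTF8"))
```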
On 3 Feb 2021, at 02:49, Christopher Barker <pythonchb@gmail.com> wrote:
Aside: HTML 5 even has a encoding rule that acknowledges that web pages marked utf-8 are really windows USA code page and show how to fall back!
But that doesn't depend on a system setting, does it? So I don't get your point:
The bug in most (?) .net web apps apparently is that the .net libraries convert to the default system locale and do not assume utf-8. The programmer has to explicitly use utf-8. So yes, it does depend on a system setting.
I came across this in the HTML 5 specs because of working on web page content that did not decode, not because I'm a .NET developer. I raise this as this seems to be the same problem that python faces with the system locale conflicting with the wider world using utf-8.
Barry
On 3 Feb 2021, at 02:49, Christopher Barker <pythonchb@gmail.com> wrote:
Rather than reply point by point I will summarise my input.
I think that utf-8 mode is a great idea.
I think that an .INI file in the style that py.exe uses is better than an env var.
An env var on Windows could be used but there can be surprises with the way Windows merges user and system env vars. Maybe it is only with PATH that this is very odd.
I'm hoping that the solution implemented allows new users to get a great experience and also that advanced users can get control of the mode.
Personally I'd prefer to have files that I edit to configure python than registry keys. I can put files into git, for example. I have to work hard to manage registry keys via git.
Barry
Although a file adds I/O slowdown to startup (which is already slow) while an envvar doesn't.
-- --Guido (mobile)
On Fri, Feb 5, 2021 at 6:17 AM Barry Scott <barry@barrys-emacs.org> wrote:
Personally I'd prefer to have files that I edit to configure python than registry keys. I can put files into git, for example. I have to work hard to manage registry keys via git.
I 100% agree with you. And pyvenv.cfg satisfies all your needs.
Comparing pyvenv.cfg with a new py.ini-like config file:
Cons:
* Need system privilege to change the setting of a system-installed Python.
* But the user can install another Python, or create a venv anyway.
Pros:
* The file is already supported.
* No need to look up another file at startup.
* No need to edit any file outside the install location.
* Easy to cleanly uninstall
* Portable app friendly
* One file per environment
* Breaking the config file affects only one environment.
So I still prefer pyvenv.cfg.
-- Inada Naoki <songofacandy@gmail.com>
On Thu, Feb 4, 2021 at 7:12 PM Inada Naoki <songofacandy@gmail.com> wrote:
I 100% agree with you. And pyvenv.cfg satisfies all your needs.
oops, sorry I missed this -- maybe pyvenv.cfg will do the job. Though I'm a bit confused about how it might work outside of venv itself, which usually creates the pyvenv.cfg file.
Would venv and other tools need to add / change the utf-8 mode key in the pyvenv.cfg file? (I note that that might be pretty straightforward with the use of the EnvBuilder class -- though not as easy as simply installing a file)
One idea: could it simply look for a file called "UTF8-MODE" (it could be empty) next to python.exe? That would presumably be faster than having to actually open and read a file, and would be easy to install.
- CHB
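The marker-file idea needs only an existence check, with no parsing. A minimal sketch in Python of what that check amounts to (the real check would have to live in the interpreter's startup code, and the name `marker_present` is hypothetical):

```python
import os
import sys

# File name taken from the suggestion above; the file may be empty.
MARKER_NAME = "UTF8-MODE"


def marker_present(executable: str = sys.executable) -> bool:
    """Return True if a UTF8-MODE marker file sits next to the executable."""
    exe_dir = os.path.dirname(os.path.abspath(executable))
    return os.path.exists(os.path.join(exe_dir, MARKER_NAME))


if __name__ == "__main__":
    print(marker_present())
```

The trade-off is that a marker file can carry exactly one boolean, so any further per-install setting would need another file or a real config format.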
On 5 Feb 2021, at 03:11, Inada Naoki <songofacandy@gmail.com> wrote:
I 100% agree with you. And pyvenv.cfg satisfies all your needs.
When compared pyvenv.cfg with py.ini-like new config file:
Cons:
* Need system privilege to change the setting of system installed Python.
* But user can install another Python, or create venv anyway.
Pros:
* The file is already supported.
* No need to lookup another file at startup.
* No need to edit any file outside the install location.
* Easy to clean uninstall
* Portable app friendly
* One file per environment
* Breaking the config file affects only one environment.
So I still prefer pyvenv.cfg.
I'm under the impression that new users will not create a venv. Indeed I run a lot of python scripts outside of the venv world. I only use venv as part of my development pipelines.
I'm not sure that a venv cfg file would help. But a python.ini could.
Barry
On Fri, Feb 5, 2021 at 7:59 PM Barry Scott <barry@barrys-emacs.org> wrote:
I'm under the impression that new users will not create a venv. Indeed I run a lot of python scripts outside of the venv world. I only use venv as part of my development pipelines.
I'm not sure that a venv cfg file would help. But a python.ini could.
python.exe looks up pyvenv.cfg even outside of a venv. So we can write utf8mode=1 in pyvenv.cfg even outside of a venv.
The main limitation is that users cannot write a config file in the install location when Python is installed for the system, not for the user.
Regards,
-- Inada Naoki <songofacandy@gmail.com>
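To make the proposal concrete, here is a rough Python sketch of the lookup being described: search for pyvenv.cfg in the directory of the executable and one directory above, then read its `key = value` lines. The `utf8mode` key is the one proposed in this thread and is an assumption here; the function names are hypothetical, and the real lookup happens inside the interpreter before any Python code runs.

```python
import os


def find_pyvenv_cfg(exe_dir):
    """Look for pyvenv.cfg next to the executable, then one directory up."""
    for candidate_dir in (exe_dir, os.path.dirname(exe_dir)):
        candidate = os.path.join(candidate_dir, "pyvenv.cfg")
        if os.path.isfile(candidate):
            return candidate
    return None


def read_pyvenv_cfg(path):
    """Parse simple 'key = value' lines into a dict with lower-cased keys."""
    settings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, sep, value = line.partition("=")
            if sep:
                settings[key.strip().lower()] = value.strip()
    return settings


def utf8_mode_requested(exe_dir):
    """Return True if the (proposed, hypothetical) utf8mode key is set to 1."""
    path = find_pyvenv_cfg(exe_dir)
    if path is None:
        return False
    return read_pyvenv_cfg(path).get("utf8mode") == "1"
```

Because the file is found relative to the executable, each installation or venv gets its own setting without any per-user lookup.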
On 5 Feb 2021, at 11:06, Inada Naoki <songofacandy@gmail.com> wrote:
python.exe looks up pyvenv.cfg even outside of a venv. So we can write utf8mode=1 in pyvenv.cfg even outside of a venv.
Oh, I did not know that. I'm happy that a new file is not needed for the system-wide setting.
The main limitation is that users can not write config file in install location when Python is installed for system, not for user.
This is the problem that I was thinking about when I proposed using a py.ini like solution where the file is looked for in the user's config folder. I think that is the %LOCALAPPDATA% folder for py.exe.
As Chris points out in his summary of the issue:
How would this work for different versions of python being installed and needing different config?
How would this work for python installed from different vendors?
Maybe the answer is that there is only one user defined override possible and all versions use it.
Also, am I right to assume that these changes would only have an impact on Windows?
Barry
On Fri, Feb 5, 2021 at 8:15 PM Barry Scott <barry@barrys-emacs.org> wrote:
The main limitation is that users can not write config file in install location when Python is installed for system, not for user.
This is the problem that I was thinking about when I proposed using a py.ini like solution where the file is looked for in the users config folder. I think that is the %LOCALAPPDATA% folder for py.exe.
As Chris points out in his summary of the issue.
How would this work for different version of python being installed and needing different config?
Each installation has its own config file.
How would this work for python installed from different vendors?
Vendor installer should provide an option for it.
Maybe the answer is that there is only one user defined override possible and all versions use it.
Also am I right to assume that the impact of these changes would only impact on Windows?
I think we don't have any reason to restrict this to Windows. But since this idea is proposed only for Windows users, only the Windows installer will have an "Enable UTF-8 mode" option.
-- Inada Naoki <songofacandy@gmail.com>
On 5 Feb 2021, at 11:49, Inada Naoki <songofacandy@gmail.com> wrote:
Each installation has its own config file.
I'm talking about the user's override of the system default. The system default is the easy part.
I think we don't have any reason to restrict this for Windows. But since this idea is proposed only for Windows users, only Windows installer will have "Enable UTF-8 mode" option.
I'm not sure that Linux and macOS suffer from this problem. Am I wrong to think that?
Barry
I'm talking about the user's override of the system default.
This is indeed a limitation, but I'm not sure that's a bad thing. As I think I've said before, utf-8 mode is probably not a feature you want turned on for ALL instances of Python, even all instances for a single user.
So if a user wants to use UTF-8 mode, and they don't have admin privileges, and it's not the system default, then they need to use a user-controlled Python or environment -- just like they do when they want to use a different Python version or set of packages, etc. than the system version.
I don't use Windows much, but when I do, it's a very locked down system, so I have every sympathy for folks without admin privileges.
Also -- when it comes to environments, I don't recommend them to Python newbies either -- but once you start to need Python for multiple applications with different requirements (including utf-8 mode) then environments are the solution.
Maybe the answer is that there is only one user defined override possible and all versions use it.
That is exactly what I wouldn't recommend.
I'm not sure that Linux and macOS suffer from this problem.
I'm pretty sure that macOS is guaranteed (for now) to have utf-8 as the system encoding. But not sure for all other supported platforms. (All flavors of Linux?) But in practice, the vast majority are using utf-8, so it's a non-issue. And if we ever do get to utf-8 as the Python default, then it's guaranteed to be a moot point.
By the way, if we go with pyvenv.cfg then an option should be added to the venv command as well.
-CHB
On 2/5/21, Barry Scott <barry@barrys-emacs.org> wrote:
On 5 Feb 2021, at 11:06, Inada Naoki <songofacandy@gmail.com> wrote:
python.exe looks up pyvenv.cfg even outside of a venv. So we can write utf8mode=1 in pyvenv.cfg even outside of a venv.
I don't like extending "pyvenv.cfg" with generic settings. This is a file to configure a virtual environment in terms of finding the standard library and packages. I'd prefer a new configuration file that sets the default values for -X implementation-specific options. The mechanism for finding this file can support virtual environments.
This is the problem that I was thinking about when I proposed using a py.ini like solution where the file is looked for in the users config folder. I think that is the %LOCALAPPDATA% folder for py.exe.
It is standard practice and recommended to create a directory for the organization or project and optionally a child directory for each application, such as "%ProgramData%\Python\Python38-32\python.ini" and "%LocalAppData%\Python\Python38-32\python.ini". I would have preferred for the py launcher to read and merge settings for all existing configuration files in the order of "%ProgramData%\Python\py.ini" (all installations), "%__AppDir__%\py.ini" (particular installation), and "%LocalAppData%\Python\py.ini" (user).
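The "read and merge in order" behaviour described here maps directly onto `configparser.ConfigParser.read()`, which accepts a list of paths, skips the ones that do not exist, and lets later files override earlier ones. A minimal sketch, assuming a hypothetical `[python]` section and `utf8` key; the candidate paths mirror the ones in the message, and the file name used for the install directory is an assumption.

```python
import configparser
import os
import sys


def candidate_paths():
    """Least specific first: all installations, this installation, user."""
    return [
        os.path.expandvars(r"%ProgramData%\Python\python.ini"),
        os.path.join(os.path.dirname(sys.executable), "python.ini"),
        os.path.expandvars(r"%LocalAppData%\Python\python.ini"),
    ]


def load_layered_config(paths):
    """Read every existing file in order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    loaded = parser.read(paths, encoding="utf-8")  # missing files are skipped
    return parser, loaded


if __name__ == "__main__":
    parser, loaded = load_layered_config(candidate_paths())
    print("files read:", loaded)
    print("utf8:", parser.get("python", "utf8", fallback="(not set)"))
```

The cost that Guido raised earlier in the thread applies here: every extra candidate path is another file system probe at startup.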
On Sat, Feb 6, 2021 at 5:59 AM Eryk Sun <eryksun@gmail.com> wrote:
I would have preferred for the py launcher to read and merge settings for all existing configuration files in the order of "%ProgramData%\Python\py.ini" (all installations), "%__AppDir__%\py.ini" (particular installation), and "%LocalAppData%\Python\py.ini" (user).
Note that this is a setting of python, not of the py launcher.
And there is no need for all-installations and per-user settings. The environment variable covers that already.
I don't want to add many ways to configure one option without strong need. Currently, a per-install setting is not possible. So that is the only problem.
If adding an option to pyvenv.cfg does not make sense, we can add `python.ini` in the same place as pyvenv.cfg, i.e., the directory containing python.exe, or one directory above.
-- Inada Naoki <songofacandy@gmail.com>
On 2/6/21, Inada Naoki <songofacandy@gmail.com> wrote:
If adding an option to pyvenv.cfg does not make sense, we can add `python.ini` in the same place as pyvenv.cfg, i.e., the directory containing python.exe, or one directory above.
I'd rather look for "python.cfg" in the directory of the base executable (e.g. "C:\Program Files\Python310") and then in the directory of "pyvenv.cfg", if the latter is found. I wouldn't want it to check for "python.cfg" in the parent directory of the base executable.
And no need for all installations, and per-user setting. Environment variable is that already.
A configuration file in a profile data directory can target a particular version, such as "%LocalAppData%\Python\Python310-32\python.cfg". This is more flexible for the user to override a system installation, compared to setting PYTHONUTF8. However, it's not a major issue if you don't want to support the extra flexibility. That said, supporting %ProgramData% and %LocalAppData% data directories is more consistent with how this feature would be implemented in POSIX, such as "/etc/python3.10/python.cfg" and "$HOME/.config/python310/python.cfg". I think that matters because this file would be a good place to set defaults for all -X options (e.g. "utf8", "pycache_prefix", "faulthandler").
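For context, the -X options that such a file would set defaults for are already visible at run time. A small sketch of how to inspect them in CPython; `sys._xoptions` is a CPython-specific attribute, and the helper name is made up for the example.

```python
import sys


def describe_x_options():
    """Report the -X options this interpreter was started with."""
    return {
        # Mapping of -X names to values, e.g. started with "-X utf8".
        "xoptions": dict(sys._xoptions),
        # Whether UTF-8 mode is active, however it was enabled.
        "utf8_mode": bool(sys.flags.utf8_mode),
    }


if __name__ == "__main__":
    print(describe_x_options())
```

Running the script once as `python script.py` and once as `python -X utf8 script.py` shows the difference a configured default would make.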
On Fri, Feb 5, 2021 at 12:59 PM Eryk Sun <eryksun@gmail.com> wrote:
I don't like extending "pyvenv.cfg" with generic settings. This is a file to configure a virtual environment in terms of finding the standard library and packages.
Yes indeed. But why limit it to that? If there are more things to configure in an environment-specific way, why not put it in this existing location?
I'd prefer a new configuration file that sets the default values for -X implementation-specific options. The mechanism for finding this file can support virtual environments.
Then wouldn't that simply be two configuration files that will be treated the same way?
This is the problem that I was thinking about when I proposed using a py.ini like solution where the file is looked for in the users config folder. I think that is the %LOCALAPPDATA% folder for py.exe.
I'm still convinced that it is a bad idea to have user-wide Python configuration like this. The fact is that different Python apps (may) need different configurations, and environments are the way to support that. Yes, not everyone uses virtual environments, but I see that as some people using only one environment, rather than not using environments at all. So it would be really good to have a single environment be configured the same way as multiple environments.
This would also work better with conda environments, which work at a "higher" level: there could be any number of instances of, say, Python 3.9.6, and each could potentially need a different configuration. Having them all reference the same place in the %LOCALAPPDATA% folder would be a mess.
I know it seems like I'm advocating breaking from the standards already established outside of Python, but we need to remember that Python is not an application; it is a run-time environment that may support multiple applications. If there are standards for configuration of similar things, then we should look at those.
Final point for emphasis: it would be really great if the chosen solution supported conda (and maybe other) environments well.
-Chris B
-- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 2/6/21, Christopher Barker <pythonchb@gmail.com> wrote:
On Fri, Feb 5, 2021 at 12:59 PM Eryk Sun <eryksun@gmail.com> wrote:
But why limit it to that? If there are more things to configure in an environment-specific way — why not put it in this existing location?
I'd rather not limit the capability to just virtual environments.
I'd prefer a new configuration file that sets the default values for -X implementation-specific options. The mechanism for finding this file can support virtual environments.
Then wouldn’t that simply be two configuration files that will be treated the same way?
Relative to the installation, "python.cfg" should only be found in the same directory as the base executable, not its parent directory. If "pyvenv.cfg" is found, then it's a virtual environment, and "python.cfg" will also be looked for in the directory of "pyvenv.cfg", and supersedes settings in the base installation.
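As a rough illustration of the lookup order described above, here is a minimal Python sketch. Note that `python.cfg` does not exist today; the file name comes from this proposal, while the `key = value` syntax (borrowed from pyvenv.cfg), the helper names `parse_cfg` and `effective_settings`, and the setting names used with them are assumptions made only for the example.

```python
from pathlib import Path


def parse_cfg(path):
    """Parse a hypothetical python.cfg using pyvenv.cfg-like 'key = value' lines."""
    settings = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = stripped.partition("=")
        if sep:
            settings[key.strip().lower()] = value.strip()
    return settings


def effective_settings(base_exe_dir, venv_dir=None):
    """Merge settings: base installation first, then the virtual environment.

    Only the directory of the base executable is checked (never its parent).
    venv_dir counts as a virtual environment only if it contains pyvenv.cfg,
    and its python.cfg supersedes the base installation's values.
    """
    candidates = [Path(base_exe_dir) / "python.cfg"]
    if venv_dir is not None and (Path(venv_dir) / "pyvenv.cfg").is_file():
        candidates.append(Path(venv_dir) / "python.cfg")

    settings = {}
    for candidate in candidates:
        if candidate.is_file():
            settings.update(parse_cfg(candidate))  # later files win
    return settings
```

With a base file containing `utf8 = 0` and a venv file containing `utf8 = 1`, the merged result would carry the venv's value, which is the "supersedes" behaviour described in the message.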
I'm still convinced that it is a bad idea to have user-wide Python configuration like this. The fact is that different Python apps (may) need different configurations, and environments are the way to support that.
Add an option in the installed "python.cfg" to set the name of the organization and application. If not set, the organization and application respectively default to "Python" and "Python<major><minor>[-32]". Looking for system and user configuration would be parameterized using that name, i.e. "%ProgramData%\<organization>\<application>\python.cfg" and "%LocalAppData%\<organization>\<application>\python.cfg".
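To make the parameterized paths concrete, the following Python sketch builds the two candidate locations from an organization and application name. It is only an illustration of the naming scheme in this message: the function name `data_dir_candidates` and its parameters are hypothetical, and `ntpath` is used so the Windows-style joins behave the same on any platform.

```python
import ntpath


def data_dir_candidates(major, minor, is_32bit, environ,
                        organization=None, application=None):
    """Return system and user python.cfg paths for the proposed scheme.

    Defaults follow the message: organization "Python" and application
    "Python<major><minor>[-32]". `environ` is any mapping that provides
    ProgramData and LocalAppData (for example os.environ on Windows).
    """
    if organization is None:
        organization = "Python"
    if application is None:
        application = f"Python{major}{minor}" + ("-32" if is_32bit else "")

    paths = []
    for variable in ("ProgramData", "LocalAppData"):  # system first, then user
        root = environ.get(variable)
        if root:
            paths.append(ntpath.join(root, organization, application, "python.cfg"))
    return paths
```

An application that ships its own Python could set both names in its installed "python.cfg" so that its system and user configuration live apart from those of a stock installation.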
On Thu, Feb 4, 2021 at 1:17 PM Barry Scott <barry@barrys-emacs.org> wrote:
Rather than reply point by point I will summarise my input.
Thanks!
I think that utf-8 mode is a great idea.
agreed.
I'm hoping that the solution implemented allows new users to get a great experience and also that advanced users can get control of the mode.
I think we all agree here.
Personally I'd prefer to have files that I edit to configure python than registry keys.
Amen!
I think that an .INI file in the style that py.exe uses is better than env var.
Note that the env var is already available, so this is agreeing with Inada Naoki that we should add another way to set it.
By "in the style" I assume you mean (from PEP 397):
"""
Two .ini files will be searched by the launcher - py.ini in the current user's "application data" directory (i.e. the directory returned by calling the Windows function SHGetFolderPath with CSIDL_LOCAL_APPDATA, %USERPROFILE%\AppData\Local on Vista+, %USERPROFILE%\Local Settings\Application Data on XP) and py.ini in the same directory as the launcher.
"""
But are you suggesting that the py.exe launcher be used in this case? Or that there'd be two locations searched for an ini file: in the user's "application data" directory and next to the python.exe executable? If so, then yes, I think that would work, as the various "environment" systems all have copies of python.exe inside their environments, so that would allow per-environment configuration, which is what I think we need.
I'm going to step back a bit from the solution to my ideas for the specification:
1) UTF-8 mode is a global-to-the-interpreter setting. So there is no way to have something like a __future__ import that will turn it on for just one module or package.
2) A given system can have any number of different versions (python3.8, 3.9, .. python.org install vs conda install, or python shipped with software, like the ESRI ArcMap) of Python installed, AND potentially any number of environments with a given Python version.
3) A given system can also have multiple users, and each of those may have different needs.
With all these different "pythons", many folks probably wouldn't want to have them ALL configured the same way, even all within a given user's space: changing the setting to "fix" one application might break others. Python has already had this problem for years: installing / upgrading a package for one app could break other apps.
And the solution that the community has converged on is "environments". There are various implementations, but they all do a similar thing: allow multiple configurations of the same Python version on the same system, for the same user. So I think it makes all the sense in the world to have the ability to have UTF-8 mode specific to the environment.
Then we have to decide who is in "control" of the utf-8 mode setting. Certainly the end-user should be, and maybe the system administrator should be, at least for the defaults. But I think the application developer should be able to specify utf-8 mode for a given application, and it should be easy for a relatively naive user to get it as the application developer wants it. Sure, one could write in the installation instructions that this app will work best under Windows with utf-8 mode turned on, and directions for how to do that, but that might be a bit too easy to miss.
So I'd love to see a way to have a pip-installable package that turns on utf-8 mode, so one could simply put it in the application's requirements file, and end users wouldn't have to do anything special, and would simply get the right behaviour with an ordinary install.
The only missing piece is how to specify an environment that is configured a certain way. As far as I know, the only thing you can specify for pipenv or virtualenv is a package installable by pip, and I think that pip will only install things in site-packages, not "next to" the python.exe file. But I think the site-packages path hasn't been configured yet when this is needed, so that's a trick. So if this config file could be somewhere pip could install it, I think that would be helpful.
Personally, I'm a conda user, and conda can install files anywhere in the tree, so I would likely make a conda package to enforce utf-8 mode if this becomes available, so I could put it in my applications' requirements, and know my users will have it turned on when they run my code.
- Chris B -- Christopher Barker, PhD (Chris) Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On Wed, Jan 27, 2021 at 11:36 PM Inada Naoki <songofacandy@gmail.com> wrote:
* UnicodeDecodeError is raised when trying to open a text file written in UTF-8, such as JSON. * UnicodeEncodeError is raised when trying to save text data retrieved from the web, etc. * User run `pip install` and `setup.py` reads README.md or LICENSE file written in UTF-8 without `encoding="UTF-8"`
Users can use UTF-8 mode to solve these problems.
They can use it to solve *those* problems, but probably at the cost of creating different problems.
There's a selection bias here, because you aren't seeing cases where a script worked because the default encoding was the correct one. If you switch a lot of ordinary users (not power users who already use it) to UTF-8 mode, I think a lot of scripts that currently work will start failing, or worse, silently producing bogus output that won't be understood by a downstream tool. I'm not convinced this wouldn't be a bigger problem than the problem you're trying to solve.
* Put a checkbox in the installer?
I'm pretty Unicode-savvy, but even I would have no idea whether to check that box. Do I wish that everything was UTF-8? Yes. Do I want Python to assume that everything is UTF-8? Probably not.
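The two failure modes mentioned in this message (a script that starts failing, and output that is silently wrong for a downstream tool) can be reproduced in a few lines of Python. This is only a sketch: cp1252 stands in for "the legacy default encoding" as an assumption, since the actual code page depends on the user's Windows locale, and the helper name `try_decode` is made up for the example.

```python
def try_decode(data, encoding):
    """Decode bytes, returning None instead of raising on invalid input."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "café"
legacy_bytes = text.encode("cp1252")  # what a script wrote under the old default
utf8_bytes = text.encode("utf-8")     # what the same script writes in UTF-8 mode

# Case 1: an existing legacy-encoded file is now read as UTF-8 and fails loudly.
read_legacy_as_utf8 = try_decode(legacy_bytes, "utf-8")

# Case 2: a downstream tool still expecting cp1252 reads the new UTF-8 output
# without any error, but gets mojibake.
read_utf8_as_legacy = try_decode(utf8_bytes, "cp1252")

print(repr(read_legacy_as_utf8))  # None (decoding raised UnicodeDecodeError)
print(repr(read_utf8_as_legacy))  # 'cafÃ©'
```

The first case is at least visible as an exception; the second is the "silently producing bogus output" scenario, because nothing fails at the point where the data is written.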
On Fri, Jan 29, 2021 at 12:54 PM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
On Wed, Jan 27, 2021 at 11:36 PM Inada Naoki <songofacandy@gmail.com> wrote:
* UnicodeDecodeError is raised when trying to open a text file written in UTF-8, such as JSON. * UnicodeEncodeError is raised when trying to save text data retrieved from the web, etc. * User run `pip install` and `setup.py` reads README.md or LICENSE file written in UTF-8 without `encoding="UTF-8"`
Users can use UTF-8 mode to solve these problems.
They can use it to solve *those* problems, but probably at the cost of creating different problems.
There's a selection bias here, because you aren't seeing cases where a script worked because the default encoding was the correct one. If you switch a lot of ordinary users (not power users who already use it) to UTF-8 mode, I think a lot of scripts that currently work will start failing, or worse, silently producing bogus output that won't be understood by a downstream tool. I'm not convinced this wouldn't be a bigger problem than the problem you're trying to solve.
I understand that, which is why I proposed a per-install UTF-8 mode. For now, a user can set the PYTHONUTF8=1 user environment variable, but that may break existing applications. My proposal is a per-environment UTF-8 mode: when users install a new Python to learn Python, they can enable UTF-8 mode only for that new Python environment without breaking existing applications.
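For reference, the two switches that exist today (the `-X utf8` option and the `PYTHONUTF8` environment variable) can be compared with a small Python sketch that starts child interpreters and reads `sys.flags.utf8_mode`. The helper name `child_utf8_mode` is made up for this example, and it assumes Python 3.7 or later, where UTF-8 mode is available.

```python
import os
import subprocess
import sys

PROBE = "import sys; print(sys.flags.utf8_mode)"


def child_utf8_mode(x_options=(), env_value=None):
    """Run a child interpreter and report its sys.flags.utf8_mode as an int.

    x_options are passed as "-X <option>"; env_value, if given, is used as
    PYTHONUTF8 for the child only, so the parent's environment is untouched.
    """
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # start from a known state
    if env_value is not None:
        env["PYTHONUTF8"] = env_value

    cmd = [sys.executable]
    for option in x_options:
        cmd += ["-X", option]
    cmd += ["-c", PROBE]

    result = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("-X utf8      ->", child_utf8_mode(x_options=["utf8"]))
    print("PYTHONUTF8=1 ->", child_utf8_mode(env_value="1"))
    print("PYTHONUTF8=0 ->", child_utf8_mode(env_value="0"))
```

Because the variable is set only in the child's environment here, the sketch also shows why a user-wide PYTHONUTF8=1 is broader than needed: every Python started by that user inherits it, which is the breakage risk a per-environment setting is meant to avoid.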
* Put a checkbox in the installer?
Do I want Python to assume that everything is UTF-8? Probably not.
Even if you don't want that, many developers already assume the default is always UTF-8. And if UTF-8 mode can be enabled in pyvenv.cfg, you can enable it in just one venv to run such code.
-- Inada Naoki <songofacandy@gmail.com>
Participants (6)
* Barry Scott
* Ben Rudiak-Gould
* Christopher Barker
* Eryk Sun
* Guido van Rossum
* Inada Naoki