
On 15.09.2021 00:16, Guido van Rossum wrote:
Here I think we need to drop our perfectionist attitude. When I saw Marc-Andre's proposal my first response was also "but what about threads." But really, os.chdir() is the culprit here, and since it's a syscall we can't fix it. If we can live with that, we can live with the proposed os.workdir(). The docs just need a warning about threads.
Yes, you could create a thread-safe version of this by overriding open() -- and several hundred other functions that refer to pathnames natively. Such a project will inevitably have serious backwards compatibility concerns (any use of pathnames from C extensions will not work right) so it's just not going to be done.
If we don't offer this in the stdlib, users will just implement this themselves, poorly (for example, by not restoring the original when done).
Thanks for all the feedback. I'm just going to reply here in a single email. First, let me repeat the notes I had added to the proposal, since these seem to have gotten lost in the thread. I'll some more context:
Notes: - The context manager is not thread safe. There's no thread safe model for the current work dir. OTOH, scripts usually don't use threads, so not a big deal.
The CWD (current working directory) concept is a very old concept where the OS manages a pointer into the file system on a per process basis in order to handle requests for relative paths and map them to absolute paths. https://en.wikipedia.org/wiki/Working_directory Since this is per process and managed by the OS, the concept does not extend to threads (possibly) running in the process. While Python could theoretically try to emulate parts of this for threads as well, this is not within the scope of os.workdir(). It's supposed to be a simple convenient way of saying: please chdir here, do something and then return to the previously active work dir. The problem it's trying to solve is that last part... functions forgetting to restore the previous active work dir can easily break other parts of the script, since the CWD is used for resolving relative paths in Python as well. If thread-safety is seen as a big problem, we could add a concept of managing a single CWD across all threads using lock protected globals. Not sure whether this is worth the trouble, but perhaps making it optional would ease some concerns. In this case, os.chdir() would have to be adapted as well, in order to update the Python CWD state in the same way (and possibly block or warn when os.chdir() is called inside an active with os.workdir() block).
- The context manager could be made more elaborate, e.g. adding optional logging for debugging purposes.
We could add a logger parameter to the context manager for this to customize the logger, or simply use the default one from logging.
- The same could be added to pathlib's objects as .workdir() method returning a context manager.
Since people are starting to use pathlib more, this would be a natural thing to do, as others have noted already :-) The context manager will also accept a path-like object as argument (simply because it passes this directly to os.chdir() which already does). As for having the context manager method on a Path object return the Path object itself (suggested by David), this could be made to work by returning a subclass of Path, which then has the necessary extras needed for CWD processing. I'll add some more notes: - Why in the os module ? Because this is where you'd search for a way to set the CWD and indeed, we have os.chdir() and os.getcwd() in this module. Also see the note on pathlib above. - There are other global settings which could benefit from such context managers as well, such as umask (mentioned by Cameron) or the locale settings, which you sometimes want to temporarily modify, but those are not used as often. - Why the lowercase name ? Simply to make it fit the other os module APIs. Classes should normally have SnakeCase names, but since we're not interested in the class aspect here, but rather the context manager API of the resulting object, I think it's fair to stick with lower case. Many other context managers use a lower case name as well. - Chris mentioned that library code should not be changing the CWD. In an ideal world, I'd agree, but there are cases where you don't have an option and you need to change the CWD in order make certain things work, e.g. you don't control the code you want to run, but need to make it work in a specific directory by changing the CWD and then passing relative paths to the code. If you use subprocesses for this, you can make this work, since the module forks for you and then sets the CWD in the forked process on Unix, but this still leaves you with a gap in Python, since you may well want to code in the same block which e.g. relies on os.abspath() working with your modified version of the CWD. - Main use case ? Shell scripts written in Python instead of bash. - Adding a big warning to the docs. Yes, absolutely, and os.chdir() should receive one in the docs as well. - openat() et al. (mentioned by Cameron and Christian). These could be used to implement a thread-safe way of working with CWDs, essentially placing the CWD under control of Python. However, Python extensions and the OS would now know about this per-thread CWD, so you'd most likely run into problems with only part of your code working with these per-thread CWDs. This is definitely out of scope for os.workdir(). - Bike shedding on the name. "workdir" is what I use for the context manager, but e.g. os.cwd() would also do and is more in line with os.getcwd(). We could even have os.chdir() return the context manager, without introducing any new API in the os module. Still, I find "with os.workdir()" quite intuitive :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Sep 15 2021)
Python Projects, Coaching and Support ... https://www.egenix.com/ Python Product Development ... https://consulting.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/