New submission from Gregory P. Smith firstname.lastname@example.org:
The underlying API calls made by os.putenv() and os.environ[name] = value syntax are not thread safe on POSIX systems. POSIX _does not have_ any thread safe way to access the process global environment.
In a pure Python program, the GIL prevents this from being an issue. But when Python is embedded in a C/C++ program, or when extension modules launch their own threads from C, those threads could also make the invalid assumption that they can safely _read_ the environment. That is a race condition when a Python thread calls putenv() at the same time.
We should document the danger.
CPython's os module snapshots a copy of the environment into a dict at import time (during CPython startup). But os.environ assignment and os.putenv() modify the actual process global environment; os.environ assignment also updates this dict. (If an embedded interpreter is launched from a process with other threads already running, even that initial environment reading could be unsafe if the larger application has a thread that wrongly assumes it has exclusive environment access.)
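A minimal sketch of the behavior described above, for anyone who wants to see it from Python. The variable name DEMO_SNAPSHOT_VAR and the helper child_sees() are made up for illustration. It relies on the documented behavior that assignments to os.environ are translated into putenv() calls, while direct os.putenv() calls do not update the os.environ dict:

```python
import os
import subprocess
import sys

NAME = "DEMO_SNAPSHOT_VAR"  # hypothetical variable name, for illustration only


def child_sees(name):
    """Return the value a child process inherits for `name` ('' if unset)."""
    code = "import os, sys; sys.stdout.write(os.environ.get(sys.argv[1], ''))"
    result = subprocess.run(
        [sys.executable, "-c", code, name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


os.environ.pop(NAME, None)  # start from a clean state

# os.putenv() changes the process global environment only;
# the dict snapshot held by os.environ is not updated.
os.putenv(NAME, "via-putenv")
dict_value_after_putenv = os.environ.get(NAME)
child_value_after_putenv = child_sees(NAME)

# Assigning through os.environ updates the dict and also calls putenv().
os.environ[NAME] = "via-mapping"
dict_value_after_assign = os.environ.get(NAME)
child_value_after_assign = child_sees(NAME)

del os.environ[NAME]  # clean up (also unsets it in the process environment)
```

Both writes reach the real process environment, which is why either one can race with a C thread that is reading it.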
For people modifying os.environ so that the change is visible to child processes, we can recommend using the env= parameter on subprocess API calls to supply a new environment.
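A rough sketch of what that recommendation could look like in practice. The helper name run_with_extra_env() and the variable DEMO_CHILD_ONLY are hypothetical. The copy is built from Python's own os.environ dict, so the process global environment is never written to:

```python
import os
import subprocess
import sys


def run_with_extra_env(args, extra):
    """Run `args` with `extra` variables added to the child's environment only."""
    child_env = dict(os.environ)  # copy of Python's dict; the parent is untouched
    child_env.update(extra)
    return subprocess.run(
        args, env=child_env, capture_output=True, text=True, check=True
    )


# The child prints the variable it was given.
code = "import os, sys; sys.stdout.write(os.environ.get('DEMO_CHILD_ONLY', ''))"
result = run_with_extra_env([sys.executable, "-c", code], {"DEMO_CHILD_ONLY": "1"})

child_value = result.stdout
parent_value = os.environ.get("DEMO_CHILD_ONLY")  # still unset in the parent
```

Since no putenv() call happens in the parent, there is nothing for a concurrently reading C thread to race against.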
A broader question remains: should os.putenv() and os.environ assignment modify the process global environment state at all? I'll track that in another issue (to be opened).
assignee: docs@python
components: Documentation
messages: 360221
nosy: docs@python, gregory.p.smith
priority: normal
severity: normal
status: open
title: Document os.environ[x] = y and os.putenv() as thread unsafe
versions: Python 2.7, Python 3.7, Python 3.8, Python 3.9
Eryk Sun email@example.com added the comment:
The warning would not apply to Windows. The environment block is part of the Process Environment Block (PEB) record, which is protected by a critical-section lock. The runtime library acquires the PEB lock before accessing mutable PEB values. For example:
Getting an environment variable:
    >>> win32api.GetEnvironmentVariable('foo')
    Breakpoint 0 hit
    ntdll!RtlQueryEnvironmentVariable:
    00007ffc`d737a2f0 48895c2408 mov qword ptr [rsp+8],rbx ss:00000094`ec9ef470=0000000000000000
RtlQueryEnvironmentVariable acquires the PEB lock (i.e. ntdll!FastPebLock) before getting the value. The lock is passed to RtlEnterCriticalSection in register rcx:
    0:000> be 2; g
    Breakpoint 2 hit
    ntdll!RtlEnterCriticalSection:
    00007ffc`d737b400 4883ec28 sub rsp,28h
    0:000> kc 3
    Call Site
    ntdll!RtlEnterCriticalSection
    ntdll!RtlQueryEnvironmentVariable
    KERNELBASE!GetEnvironmentVariableW
    0:000> ?? @rcx == (_RTL_CRITICAL_SECTION *)@@(ntdll!FastPebLock)
    bool true
Setting an environment variable:
    >>> win32api.SetEnvironmentVariable('foo', 'eggs')
    Breakpoint 1 hit
    ntdll!RtlSetEnvironmentVar:
    00007ffc`d73bc7d0 4c894c2420 mov qword ptr [rsp+20h],r9 ss:00000094`ec9ef488=0000000000000000
RtlSetEnvironmentVar acquires the PEB lock before setting the environment variable:
    0:000> be 2; g
    Breakpoint 2 hit
    ntdll!RtlEnterCriticalSection:
    00007ffc`d737b400 4883ec28 sub rsp,28h
    0:000> kc 3
    Call Site
    ntdll!RtlEnterCriticalSection
    ntdll!RtlSetEnvironmentVar
    KERNELBASE!SetEnvironmentVariableW
    0:000> ?? @rcx == (_RTL_CRITICAL_SECTION *)@@(ntdll!FastPebLock)
    bool true
Eryk Sun firstname.lastname@example.org added the comment:
No need to remove that message.
I was discussing the wrong API. It's not directly relevant that Windows API functions that access the process environment are protected by the PEB lock. Python primarily uses [_w]environ, a copy of the process environment that's managed by the C runtime library (ucrt). os.putenv modifies this environment via _wputenv. (Ultimately this syncs with the process environment by calling SetEnvironmentVariableW.)
Functions that modify and read ucrt's environment are protected by a lock. But there's still a concern if multithreaded code reads or modifies [_w]environ directly while a _[w]putenv call is in progress. Also, [_w]getenv returns a pointer to the value in [_w]environ, so it has the same problem. A significant difference, however, is that _[w]putenv in ucrt is not POSIX compliant, since it copies the caller's string. Also, ucrt has safer [_w]getenv_s and _[w]dupenv_s functions that return a copy.
Daniel Martin email@example.com added the comment:
See also https://bugs.python.org/issue40961. That bug is not about thread safety, but about another quirk of environment variable setting that needs to be covered in the documentation for os.putenv and os.getenv, so anyone addressing this bug should probably address that one at the same time.
nosy: +Daniel Martin