[Tutor] Set LD_LIBRARY_PATH and equivalents platform-independently

eryksun eryksun at gmail.com
Sat Jan 19 15:42:33 CET 2013


On Fri, Jan 18, 2013 at 4:07 PM, Albert-Jan Roskam <fomcl at yahoo.com> wrote:
>
> Alan said:
>> Support for changing environment variables on the fly is iffy in most
>> languages and is not something I'd ever do lightly.
>
> If you put it that way... yes, I now realize that it could be (and in fact
> is) really annoying if programs do such things. ;-)

As initialized by exec, spawn, or Win32 CreateProcess, a child process
uses a copy of the parent's environment, or a new environment.
Modifying it doesn't carry over to the user's profile, or even the
parent process. That said, there's no portable ANSI C function to
modify the environment. CPython uses putenv; it's in the POSIX spec
and available in Windows. If putenv isn't supported on the platform,
then os.environ is just a dict.
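
To see the copy semantics for yourself, here's a minimal sketch in
Python 3 syntax. The variable name SPAM_DEMO is made up for the
example; the child sets it in its own environment, and the parent
never sees the change.

```python
import os
import subprocess
import sys

# Hypothetical variable name, used only for this demonstration.
os.environ.pop('SPAM_DEMO', None)

# The child modifies its own copy of the environment and reports it.
child_code = (
    "import os\n"
    "os.environ['SPAM_DEMO'] = 'eggs'\n"
    "print(os.environ['SPAM_DEMO'])\n"
)
child_output = subprocess.check_output([sys.executable, '-c', child_code])
child_value = child_output.decode().strip()

# The parent's environment is untouched by the child's change.
parent_value = os.environ.get('SPAM_DEMO')
print(child_value, parent_value)
```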

The environment consists of the environ array, with pointers to
combined strings, such as 'foo=bar', that are initially in a
contiguous block of memory. putenv realloc's environ to grow the
number of variables. On a POSIX system the caller is responsible for
allocating the new strings on the heap (not automatic memory on the
stack), and as the name implies the exact string is 'put' into
environ. Windows, in contrast, stores a copy, which is more like POSIX
setenv. Either way, CPython keeps an internal reference so the memory
doesn't get free'd. Another Windows difference is that when putenv is
first called it copies the entire environment block as separately
allocated strings.

You can expand on the simple example below to explore how memory gets
malloc'd and realloc'd outside of the initial environment.

    >>> from ctypes import *
    >>> env = POINTER(c_char_p).in_dll(CDLL(None), 'environ')
    >>> i = 0
    >>> while env[i]: i += 1  # the last entry is NULL
    ...
    >>> import os
    >>> os.environ['spam'] = 'eggs'
    >>> env[i]
    'spam=eggs'
    >>> print env[i+1]  # new NULL terminator
    None

CDLL(None), for accessing global symbols, is particular to POSIX
dlopen. On Windows it works if I use "_environ" out of the active CRT
(e.g. msvcr90.dll, msvcr100.dll, but not msvcrt.dll), which can be
found with find_library('c'):

    >>> from ctypes import *
    >>> from ctypes.util import find_library
    >>> env = POINTER(c_char_p).in_dll(CDLL(find_library('c')), '_environ')
    ...

On Windows, Python uppercases the keys set via os.environ. Windows
getenv is case insensitive, so the os.py authors opted to normalize to
uppercase so that 'Path' and 'PATH' have the same dict entry. In 2.x
it still calls putenv with mixed-case keys (as long as the values stay
consistent with the dict it doesn't matter), but 3.x only sets
uppercase keys as part of the Unicode encode/decode redesign. Unicode
also led to changes for non-Windows platforms: the surrogateescape
error handler and os.environb (bytes).
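
As a rough illustration of the 3.x behavior (a sketch, not a complete
tour), the following checks os.supports_bytes_environ before touching
os.environb, since the bytes view doesn't exist on Windows. SPAM_DEMO
is a made-up variable name.

```python
import os

# Hypothetical variable name, used only for this demonstration.
os.environ['SPAM_DEMO'] = 'eggs'

if os.supports_bytes_environ:
    # POSIX: os.environb is a bytes view kept in sync with os.environ.
    bytes_value = os.environb[b'SPAM_DEMO']
    # Undecodable bytes survive a decode/encode round trip because the
    # filesystem encoding uses the surrogateescape error handler.
    raw = b'caf\xe9'
    round_trip = os.fsencode(os.fsdecode(raw))
else:
    # Windows: there is no os.environb; keys are normalized to
    # uppercase, so a mixed-case lookup finds the same entry.
    bytes_value = None
    round_trip = None
    mixed_case_value = os.environ['Spam_Demo']
```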

>>>  As an alternative, I used os.chdir to tell the OS where to start
>>>  looking for libraries, but somebody on this list (I forgot who)
>>>  warned me that this could have nasty side effects.

That works for Windows LoadLibrary and OS X dlopen. Linux dlopen
doesn't look in the current directory for a non-path, but you can use
a relative path (e.g. "./libfoo.so"). For a non-path (i.e. no '/' in
the string), ld.so searches, in order, the ELF RPATH, LD_LIBRARY_PATH,
ELF RUNPATH, /etc/ld.so.cache, /lib, and /usr/lib.

As to nasty side effects, your process has its own cwd; changing it
doesn't affect the parent (as in the shell). If desired, you can use a
context manager to automatically return to the previous directory,
even (especially) if there's an exception.
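
A minimal sketch of such a context manager, in Python 3 syntax. The
name working_directory is my own choice, and the temp directory and
simulated exception are only there to exercise it.

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the cwd, restoring the old one on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs whether or not the body raised an exception.
        os.chdir(previous)

start = os.getcwd()
target = os.path.realpath(tempfile.gettempdir())
try:
    with working_directory(target):
        inside = os.path.realpath(os.getcwd())
        raise RuntimeError('simulated failure while loading a library')
except RuntimeError:
    pass
after = os.getcwd()
```

The try/finally is the important part: the previous directory comes
back even when the library load inside the with block fails.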

> You mean something like a setup.cfg that is read by ConfigParser? The coolest
> would be if it worked out-of-the-box, but this may be the next best thing. I
> want my C libraries and my .py files to live in the same dir (so not in a
> standard location such as /usr/lib, but in site-packages).

If you're loading libraries with ctypes, as mentioned above you can
provide an absolute or relative path. The path for the Python module
is os.path.dirname(__file__). Your libs can be installed there as
package data. If nothing was done to modify the search path, you'll
have to manage dependencies manually. For example, if libspam depends
on libeggs, you can manually load libeggs before loading libspam. For
ELF .so files this requires that libeggs has a correct SONAME tag.
Verify it with readelf -d filename. This will also show dependencies
listed as NEEDED, or you can use ldd to list the dependencies.
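
Here's one way that could look, as a sketch in Python 3 syntax. The
helper names bundled_path and load_bundled are hypothetical, as are
the file names libeggs.so and libspam.so, so the actual load is left
commented out.

```python
import ctypes
import os

# Directory of this module; fall back to the cwd when __file__ is not
# defined (e.g. in an interactive session).
try:
    PACKAGE_DIR = os.path.dirname(os.path.abspath(__file__))
except NameError:
    PACKAGE_DIR = os.getcwd()

def bundled_path(filename):
    """Return the absolute path of a library shipped next to this module."""
    return os.path.join(PACKAGE_DIR, filename)

def load_bundled(*filenames):
    """Load libraries in the given order; dependencies must come first."""
    return [ctypes.CDLL(bundled_path(name)) for name in filenames]

# Hypothetical usage: libeggs is loaded first so that the dependency is
# already in the process when libspam is loaded.
# libeggs, libspam = load_bundled('libeggs.so', 'libspam.so')
```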

