Handling linker scripts reached when dynamically loading a module
Hi,
In Haskell I experienced a situation where dynamically loaded modules were failing with "invalid ELF header" errors. This was caused by the module names actually referring to linker scripts rather than ELF binaries. I patched the GHC runtime system to deal with these scripts.
I noticed that this same patch has been ported to Ruby and Node.js, so I suggested to the libc developers that they might wish to incorporate the patch into their library, making it available to all languages. They rejected this suggestion, so I am making the suggestion to the Python devs in case it is of interest to you.
Basically, when a linker script is loaded by dlopen, an "invalid ELF header" error occurs. The patch checks to see if the file is a linker script. If so, it finds the name of the real ELF binary with a regular expression and tries to dlopen it. If successful, processing proceeds. Otherwise, the original "invalid ELF header" error message is returned.
If you want to add this code to Python, you can look at my original patch (http://hackage.haskell.org/trac/ghc/ticket/2615) or the Ruby version (https://github.com/ffi/ffi/pull/117) or the Node.js version (https://github.com/rbranson/node-ffi/pull/5) to help port it.
Note that the version in GHC 7.2.1 has been enhanced to handle another possible error, reported when the linker script is too short, so you might want to add this enhancement as well (see https://github.com/ghc/blob/master/rts/Linker.c line 1191 for the revised regular expression):
"(([^ \t()])+\\.so([^ \t:()])*):([ \t])*(invalid ELF header|file too short)"
At this point, I don't have the free time to write the Python patch myself, so I apologize in advance for not providing it to you.
HTH,
Howard B. Golden
Northridge, California, USA
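For readers who want to see the shape of such a port, here is a minimal Python sketch of the flow described above: try to load the library, match the error text against the pattern from the message, read the script, and retry on the first shared object it names. It is not the GHC, Ruby or Node.js patch. The names `load_library`, `opener`, `read_text` and `SCRIPT_TARGET` are hypothetical, and the pattern used to pick the target out of the script is an assumption made for illustration.

```python
import ctypes
import re

# The pattern quoted in the message, written as a Python raw string.
DLOPEN_ERROR = re.compile(
    r"(([^ \t()])+\.so([^ \t:()])*):([ \t])*(invalid ELF header|file too short)"
)
# Assumption for this sketch: the real library is the first absolute path
# in the script whose name contains ".so".
SCRIPT_TARGET = re.compile(r"(/[^\s()]*\.so[^\s()]*)")


def _read_text(path):
    with open(path, "r", errors="replace") as handle:
        return handle.read()


def load_library(name, opener=ctypes.CDLL, read_text=None):
    """Call opener(name); on a linker-script style failure, retry on its target."""
    read_text = read_text or _read_text
    try:
        return opener(name)
    except OSError as error:
        original = error  # keep a reference that outlives the except block
    match = DLOPEN_ERROR.search(str(original))
    if match is None:
        raise original  # unrelated failure: report it unchanged
    try:
        target = SCRIPT_TARGET.search(read_text(match.group(1)))
        if target is not None:
            return opener(target.group(1))
    except OSError:
        pass  # unreadable script or failed retry: fall through
    raise original  # report the original error, as the message describes
```

The `opener` and `read_text` parameters exist only so the logic can be exercised without a real linker script on disk; by default the sketch calls `ctypes.CDLL` and reads the file named in the error message.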
Excuse me for asking a newbie question, but what are linker scripts
and why are they important? I don't recall anyone ever having
requested this feature before.
--Guido
On Wed, Sep 7, 2011 at 12:33 PM, Howard B. Golden wrote:
[...]
_______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
-- --Guido van Rossum (python.org/~guido)
I don't know why, but some Linux distributions place scripts into .so files instead of the actual binaries. This takes advantage of a feature of GNU ld that it will process the script (which points to the actual binary) when it links the .so file.
This feature works fine when you are linking a binary, but it doesn't take into account that binaries can be loaded dynamically by interpreters (e.g., Python or GHCi). If dlopen finds a linker script, it doesn't know what to do with it. It simply diagnoses the file as either an invalid ELF header or too short.
On Gentoo Linux, some common libraries that are represented as linker scripts include libm.so, libpthread.so and libpcre.so. I know this also affects Ubuntu.
Howard
On Sat, 2011-09-10 at 14:39 -0700, Guido van Rossum wrote:
Excuse me for asking a newbie question, but what are linker scripts and why are they important? I don't recall anyone ever having requested this feature before.
--Guido
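A rough way to see what dlopen is objecting to: an ELF file starts with the four bytes 0x7F 'E' 'L' 'F', while a linker script is plain text. The helper below (the name `looks_like_elf` is made up for this sketch) only checks those four bytes; it does not validate the rest of the header.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def looks_like_elf(path):
    """Return True if the file at path begins with the ELF magic number."""
    with open(path, "rb") as handle:
        return handle.read(4) == ELF_MAGIC
```

A file such as the libpthread.so script discussed in this thread would return False here, since it begins with a C-style comment rather than the magic number.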
Odd. Let's see what other core devs say.
On Sat, Sep 10, 2011 at 2:50 PM, Howard B. Golden wrote:
[...]
-- --Guido van Rossum (python.org/~guido)
I can confirm that libpthread.so (/usr/lib/x86_64-linux-gnu/libpthread.so)
is a linker script on my Ubuntu 11.04 install. This hasn't ever caused me
any problems, though.
As for why distributions do this, here are the contents of the script:
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib/x86_64-linux-gnu/libpthread.so.0
/usr/lib/x86_64-linux-gnu/libpthread_nonshared.a )
Cheers,
Nadeem
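As an illustration only, the paths named by a script like the one above could be pulled out with a couple of regular expressions. This sketch assumes a simple script with unnested GROUP or INPUT directives; the function name `shared_objects_in_script` is hypothetical, and the script text is copied from the message above (its indentation is a guess).

```python
import re

SCRIPT = """/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
OUTPUT_FORMAT(elf64-x86-64)
GROUP ( /lib/x86_64-linux-gnu/libpthread.so.0
        /usr/lib/x86_64-linux-gnu/libpthread_nonshared.a )
"""


def shared_objects_in_script(text):
    """List the .so entries named inside GROUP(...) or INPUT(...) directives."""
    # Drop /* ... */ comments so words inside them are never mistaken for paths.
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
    found = []
    for directive in re.finditer(r"\b(?:GROUP|INPUT)\s*\(([^)]*)\)", text):
        for token in directive.group(1).split():
            # Keep "libfoo.so" and "libfoo.so.N"; skip static archives (".a").
            if re.search(r"\.so(\.|$)", token):
                found.append(token)
    return found


print(shared_objects_in_script(SCRIPT))
```

Only the shared object is kept; the `_nonshared.a` archive is something the static linker uses and is not a file dlopen could load.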
On Sun, Sep 11, 2011 at 12:24 AM, Guido van Rossum wrote:
[...]
On Sun, 2011-09-11 at 00:39 +0200, Nadeem Vawda wrote:
I can confirm that libpthread.so (/usr/lib/x86_64-linux-gnu/libpthread.so) is a linker script on my Ubuntu 11.04 install. This hasn't ever caused me any problems, though.
Let me clarify: This will only be a problem when using a foreign function interface to call a non-versioned module dynamically. In the more common situation, when one links to a package specified at link time, the linker figures out the specific, versioned name of the .so file and then the dlopen will refer to the actual binary.
So, in Python, this is likely to only affect users calling packages using ctypes. (This corresponds to GHCi loading an unversioned library, e.g., "ghci -lm" which would load the current version of the math library into the GHC interpreter.)
Howard
On Sat, Sep 10, 2011 at 4:35 PM, Howard B. Golden wrote:
So, in Python, this is likely to only affect users calling packages using ctypes. (This corresponds to GHCi loading an unversioned library, e.g., "ghci -lm" which would load the current version of the math library into the GHC interpreter.)
And it does do so on Gentoo:
$ python
Python 2.6.6 (r266:84292, Dec 26 2010, 17:43:52)
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import cdll
>>> cdll.LoadLibrary('libpthread.so')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/ctypes/__init__.py", line 431, in LoadLibrary
    return self._dlltype(name)
  File "/usr/lib/python2.6/ctypes/__init__.py", line 353, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/lib/libpthread.so: invalid ELF header
>>> cdll.LoadLibrary('libpthread.so.0')
$ cat /usr/lib/libpthread.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf32-i386)
GROUP ( /lib/libpthread.so.0 /usr/lib/libpthread_nonshared.a )
--
Ben Wolfson
"Human kind has used its intelligence to vary the flavour of drinks, which may be sweet, aromatic, fermented or spirit-based. ... Family and social life also offer numerous other occasions to consume drinks for pleasure." [Larousse, "Drink" entry]
Let me clarify: This will only be a problem when using a foreign function interface to call a non-versioned module dynamically.
As such, it won't be much of a problem for Python. In Python, we don't normally dlopen .so files, except when we know they are Python extension modules, in which case we also know that they won't be linker scripts: it just doesn't make sense to write a linker script for what should be a Python module, since you won't ever link against Python modules.
The only case where it might matter is ctypes, which is Python's "dynamic" FFI (as opposed to the C API, which is the "static" FFI). However, the libraries that are often wrapped with linker scripts don't typically get used through ctypes; e.g. libpthread won't be used in ctypes, but is used along with the thread module instead. The only common case where a library that is often a linker script is also often used in ctypes (i.e. libc) is already special-cased: ctypes knows how to find the "real" C library.
IOW, I would defer this until it becomes a real problem, at which point whoever has that problem ought to provide a patch.
Regards,
Martin
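For ctypes users who do hit this, one possible workaround (a sketch, not a claim about how ctypes special-cases libc internally) is to resolve the short library name with `ctypes.util.find_library` and load whatever file name it reports, rather than hard-coding the unversioned .so name. The wrapper name `load_by_short_name` is hypothetical.

```python
import ctypes
import ctypes.util


def load_by_short_name(short_name):
    """Resolve a name like "pthread" via find_library and load the result.

    Returns None when find_library cannot locate the library.
    """
    resolved = ctypes.util.find_library(short_name)
    if resolved is None:
        return None
    # On Linux this is typically a versioned file name rather than the
    # unversioned .so that may be a linker script.
    return ctypes.CDLL(resolved)
```

What `find_library` returns depends on the platform and on which tools (such as ldconfig or a compiler) are available, so the result is worth checking before relying on it.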
participants (5)
- "Martin v. Löwis"
- Ben Wolfson
- Guido van Rossum
- Howard B. Golden
- Nadeem Vawda