_PyImport_LoadDynamicModule questions
On AIX, building a shared library for use by extension modules and which uses the Python 'C' API is way harder than it should be**; the best workaround we could find involves hijacking _PyImport_LoadDynamicModule to load the shared library and patch up the symbol references.

_PyImport_LoadDynamicModule is undocumented, AFAICT. Am I not supposed to touch it? Is it likely to disappear or change its signature?

On a related note, I wonder about all of the _[A-Z] names in Python. I know that at least in C++, these names are reserved to the compiler implementation. I assume the same holds for 'C', and wonder if Python's use of these names is intentional, on the "to be changed someday" list, or something else.

Thanks in advance,
Dave

** details to follow in later messages; AIX is nasty but I think Python could do a few things to help.

David Abrahams
C++ Booster (http://www.boost.org)
Pythonista (http://www.python.org)
resume: http://users.rcn.com/abrahams/resume.html
email: david.abrahams@rcn.com
On AIX, building a shared library for use by extension modules and which uses the Python 'C' API is way harder than it should be**; the best workaround we could find involves hijacking _PyImport_LoadDynamicModule to load the shared library and patch up the symbol references.
_PyImport_LoadDynamicModule is undocumented, AFAICT. Am I not supposed to touch it? Is it likely to disappear or change its signature?
That's anybody's guess. I have no plans in this area, but it's only an external symbol because we wanted a logical separation between importdl.c and import.c. If someone has a great idea for refactoring this that involves changing its signature, I don't see why not. Now, maybe you're right that this *should* be documented and guaranteed API -- we've promoted internal APIs to that status before without removing the leading underscore (e.g. _PyString_Resize).
On a related note, I wonder about all of the _[A-Z] names in Python. I know that at least in C++, these names are reserved to the compiler implementation. I assume the same holds for 'C', and wonder if Python's use of these names is intentional, on the "to be changed someday" list, or something else.
They're reserved by standard C too, but I figure we'd be safe by using _Py as the prefix. Same thing really as assuming ASCII or 8-bit bytes -- standard C doesn't guarantee those either. --Guido van Rossum (home page: http://www.python.org/~guido/)
From: "Guido van Rossum" <guido@python.org>
_PyImport_LoadDynamicModule is undocumented, AFAICT. Am I not supposed to touch it? Is it likely to disappear or change its signature?
That's anybody's guess. I have no plans in this area, but it's only an external symbol because we wanted a logical separation between importdl.c and import.c. If someone has a great idea for refactoring this that involves changing its signature, I don't see why not.
Now, maybe you're right that this *should* be documented and guaranteed API -- we've promoted internal APIs to that status before without removing the leading underscore (e.g. _PyString_Resize).
Actually, I'm not going to try to convince anyone to make this a stable public API until after I've failed to convince you that maybe there ought to be a libpython.so on AIX. On AIX, the dynamic linking model is in some ways a lot like that of Windows, and it's really tough for a shared library to get ahold of symbols from an executable (witness the strange implementation of the title function of this email on AIX). If it didn't break something else, it might be a major simplification to move the bulk of Python into a shared library on this platform, just as it is on Windows... especially for poor unsuspecting souls (like me) who try to use the Python 'C' API from a shared library (not an extension module). Without some major hackery to patch up those symbols, it unceremoniously dumps core when you try to call Python functions.
On a related note, I wonder about all of the _[A-Z] names in Python. I know that at least in C++, these names are reserved to the compiler implementation. I assume the same holds for 'C', and wonder if Python's use of these names is intentional, on the "to be changed someday" list, or something else.
They're reserved by standard C too, but I figure we'd be safe by using _Py as the prefix. Same thing really as assuming ASCII or 8-bit bytes -- standard C doesn't guarantee those either.
OK; what's the intended meaning of the prefix? -Dave
Actually, I'm not going to try to convince anyone to make this a stable public API until after I've failed to convince you that maybe there ought to be a libpython.so on AIX.
You're talking to the wrong guy. I have no access to AIX, no experience with it, and no understanding of it. Somebody else will have to judge your recommendation, not me.
On AIX, the dynamic linking model is in some ways a lot like that of Windows, and it's really tough for a shared library to get ahold of symbols from an executable (witness the strange implementation of the title function of this email on AIX). If it didn't break something else, it might be a major simplification to move the bulk of Python into a shared library on this platform, just as it is on Windows... especially for poor unsuspecting souls (like me) who try to use the Python 'C' API from a shared library (not an extension module). Without some major hackery to patch up those symbols, it unceremoniously dumps core when you try to call Python functions.
Maybe Anthony Baxter can corroborate your story.
On a related note, I wonder about all of the _[A-Z] names in Python. I know that at least in C++, these names are reserved to the compiler implementation. I assume the same holds for 'C', and wonder if Python's use of these names is intentional, on the "to be changed someday" list, or something else.
They're reserved by standard C too, but I figure we'd be safe by using _Py as the prefix. Same thing really as assuming ASCII or 8-bit bytes -- standard C doesn't guarantee those either.
OK; what's the intended meaning of the prefix?
They are for names that have to be external symbols because they are used across multiple files, but are not part of the official API that we recommend to extension writers. We reserve the right to remove these APIs or change them incompatibly (although when we find that one of them has accidentally become useful in some way we usually don't do that). --Guido van Rossum (home page: http://www.python.org/~guido/)
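The naming convention described above can be sketched as a small classifier. This is only an illustration, not anything taken from the Python sources: the helper name `classify` and the regular expression are assumptions made for this example, and the reserved-identifier rule encodes the standard C rule that names beginning with an underscore followed by an uppercase letter or a second underscore belong to the implementation.

```python
import re

# Standard C reserves identifiers that begin with an underscore followed by
# an uppercase letter or a second underscore.
RESERVED_BY_C = re.compile(r"^_[A-Z_]")


def classify(name):
    """Return (api_status, reserved_by_c) for a C symbol name.

    'classify' is a hypothetical helper written for this sketch.
    """
    if name.startswith("_Py"):
        status = "internal"   # external symbol, but not recommended API
    elif name.startswith("Py"):
        status = "public"     # API recommended to extension writers
    else:
        status = "other"
    return status, bool(RESERVED_BY_C.match(name))


for symbol in ("PyImport_ImportModule",
               "_PyImport_LoadDynamicModule",
               "_PyString_Resize"):
    print(symbol, classify(symbol))
```

Note that the prefix alone does not settle the question: as mentioned earlier in the thread, _PyString_Resize kept its leading underscore even after being treated as supported API.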
From: "Guido van Rossum" <guido@python.org>
Actually, I'm not going to try to convince anyone to make this a stable public API until after I've failed to convince you that maybe there ought to be a libpython.so on AIX.
You're talking to the wrong guy. I have no access to AIX, no experience with it, and no understanding of it. Somebody else will have to judge your recommendation, not me.
By "you" I meant "the wider you", not "the royal you", nor you (Guido) specifically.
On AIX, the dynamic linking model is in some ways a lot like that of Windows, and it's really tough for a shared library to get ahold of symbols from an executable (witness the strange implementation of the title function of this email on AIX). If it didn't break something else, it might be a major simplification to move the bulk of Python into a shared library on this platform, just as it is on Windows... especially for poor unsuspecting souls (like me) who try to use the Python 'C' API from a shared library (not an extension module). Without some major hackery to patch up those symbols, it unceremoniously dumps core when you try to call Python functions.
Maybe Anthony Baxter can corroborate your story.
I hope "you" find it worth looking into at least.
"David Abrahams" <david.abrahams@rcn.com> writes:
On AIX, building a shared library for use by extension modules and which uses the Python 'C' API is way harder than it should be**; the best workaround we could find involves hijacking _PyImport_LoadDynamicModule to load the shared library and patch up the symbol references.
_PyImport_LoadDynamicModule is undocumented, AFAICT. Am I not supposed to touch it? Is it likely to disappear or change its signature?
As Guido explains, you should not use it in an extension module: it may go away without notice, or change its signature.

As for dynamic linking on AIX: It would be really good if somebody stepped forward who claims to understand the dynamic loader of AIX, and rewrote Python to make this work more "nicely". Before starting the rewrite, I'd be really curious to hear the full story of why it currently is the way it is, and how this state could be improved.

If you find that a "good" redesign requires a shared libpython, then so be it - but I'm still quite fond of the "single executable" approach, so preserving that would be even better.

Regards, Martin
As for dynamic linking on AIX: It would be really good if somebody stepped forward who claims to understand the dynamic loader of AIX, and rewrote Python to make this work more "nicely". Before starting the rewrite, I'd be really curious to hear the full story of why it currently is the way it is, and how this state could be improved.
+1
If you find that a "good" redesign requires a shared libpython, then so be it - but I'm still quite fond of the "single executable" approach, so preserving that would be even better.
Why? I thought there was serious work underway to make libpython.so a reality on Linux? --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido@python.org> writes:
If you find that a "good" redesign requires a shared libpython, then so be it - but I'm still quite fond of the "single executable" approach, so preserving that would be even better.
Why? I thought there was serious work underway to make libpython.so a reality on Linux?
It already is, with --enable-shared. However, I still think that people creating --enable-shared installations are misguided: You gain nothing (IMO), and you lose a number of benefits:

- Starting python will always require the dynamic linker to search for the library, after the system has already searched for the executable. This will cause a number of extra stat calls. Together with the need to produce PIC code, this will slow down Python.
- If Python is installed into a non-standard location (such as /usr/local on Solaris), you will need additional trickery to find the shared library. Even though the default installation achieves this trickery with a -R option, the resulting binary is no longer relocatable to a different directory (or rather, the python binary is, but libpython2.3.so isn't).

Regards, Martin
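For readers who want to see which of the two build styles a given interpreter uses, here is a minimal sketch. It assumes a present-day Python with the `sysconfig` module, which did not exist when this thread was written, and the helper name `describe_build` is made up for this example. The configuration variables `Py_ENABLE_SHARED`, `LDLIBRARY` and `LIBDIR` come from the build's Makefile on POSIX platforms and may be missing (None) elsewhere.

```python
import sysconfig


def describe_build():
    """Summarise whether this interpreter was configured with --enable-shared.

    'describe_build' is a hypothetical helper; the values may be None on
    platforms that do not record these configuration variables.
    """
    return {
        # 1 for an --enable-shared build, 0 (or None) otherwise
        "enable_shared": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # name of the library the python binary is linked against
        "library": sysconfig.get_config_var("LDLIBRARY"),
        # directory where that library is installed
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


print(describe_build())
```

On a static build the reported library is typically a libpython archive (.a); on a shared build it is a shared object whose directory must be discoverable by the dynamic linker at startup, which is the cost discussed above.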
From: "Martin v. Loewis" <martin@v.loewis.de>
Guido van Rossum <guido@python.org> writes:
If you find that a "good" redesign requires a shared libpython, then so be it - but I'm still quite fond of the "single executable" approach, so preserving that would be even better.
Why? I thought there was serious work underway to make libpython.so a reality on Linux?
It already is, with --enable-shared.
...interesting...
However, I still think that people creating --enable-shared installations are misguided: You gain nothing
On AIX, you'd gain at least a little something: there would be no need to "manually" patch up the symbol references the way _PyImport_LoadDynamicModule is currently doing, and shared libraries that were not themselves extension modules wouldn't have to duplicate that functionality (as I understand you are saying they should in lieu of hijacking the Python call). Whether that makes the separate shared library worth it or not, I can't say. Probably not in light of the other benefits you list here. The AIX problem with resolving symbols from an executable is strange enough that it can be a significant obstacle to porting systems built around Python. The only way we figured it out was by crawling through the Python source to figure out how the symbols were getting resolved. It would be really nice if there were a "documented and stable" version of the core functionality which is doing the symbol patching. Since Python is an executable which expects to share its symbols with libraries, it's not unreasonable to think that Python ought to provide the means to do it. Coming up with a code patch wouldn't be too hard, I think, but where would the documentation go? -Dave
It already is, with --enable-shared.
However, I still think that people creating --enable-shared installations are misguided: You gain nothing (IMO), and you lose a number of benefits:
Do you understand why people have always been asking for this? Are they all misguided? It really is a FAQ (3.30). Why? --Guido van Rossum (home page: http://www.python.org/~guido/)
Coming up with a code patch wouldn't be too hard, I think, but where would the documentation go?
In the C/API docs. There are enough precedents of platform-specific documentation in the core docs. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido@python.org> writes:
However, I still think that people creating --enable-shared installations are misguided: You gain nothing (IMO), and you lose a number of benefits:
Do you understand why people have always been asking for this? Are they all misguided? It really is a FAQ (3.30). Why?
People give various reasons:

- (from #400938): "Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications)." This is nonsense, of course: the interpreter executable is shared just as well.
- libraries should be shared (405931). There is often no further rationale given, but I believe "... because that saves disk space" is the common implication. Given that /usr/local/bin/python would be the only application of libpythonxy.so on most installations, this rationale is questionable.
- it simplifies embedding (with the variant "embedding is not possible without it"). Some people are simply not aware that a libpython.a is also installed. In 497102, James Henstridge argues that PyXPCOM mandates a shared libpython, as does gnome-vfs. He might have a case here, but I think a special-case shared library that exposes all Python symbols might be more appropriate.
- on the same topic, the PostgreSQL documentation claims that you cannot build PL/Python without a shared libpython. They admit that it might work to use the static library, and that it is just the fault of their build system to not support this scenario: http://www.postgresql.org/idocs/index.php?plpython-install.html
- For embedded applications, people also bring up the "allows sharing at run-time" argument. In that case, it is factually true. However, even without a shared libpython, multiple copies of the same embedded executable (or shared library) will still share code.
- The Windows port uses a python DLL.

To summarize, it probably does have advantages for people who want to embed Python in applications that are themselves shared libraries. I think those advantages are outweighed by the problems that people who never want to embed Python may get with a shared libpython.

Just my 0.02EUR, Martin
However, I still think that people creating --enable-shared installations are misguided: You gain nothing (IMO), and you lose a number of benefits:
Do you understand why people have always been asking for this? Are they all misguided? It really is a FAQ (3.30). Why?
People give various reasons:
- (from #400938): "Using a shared library should have an advantage if you're running multiple instances of Python (be it standalone interpreter or embedded applications)." This is nonsense, of course: the interpreter executable is shared just as well.
It's not nonsense if you have lots of *different* programs that embed the interpreter. If each has a static copy, those static copies aren't shared between different copies.
- libraries should be shared (405931). There is often no further rationale given, but I believe "... because that saves disk space" is the common implication. Given that /usr/local/bin/python would be the only application of libpythonxy.so on most installations, this rationale is questionable.
Agreed. Disk space is cheap.
- it simplifies embedding (with the variant "embedding is not possible without it"). Some people are simply not aware that a libpython.a is also installed. In 497102, James Henstridge argues that PyXPCOM mandates a shared libpython, as does gnome-vfs. He might have a case here, but I think a special-case shared library that exposes all Python symbols might be more appropriate.
Possibly, but this could be countered with "if we're going to have to provide a shared library anyway, it might as well be the only one we offer." I don't understand why PyXPCOM needs a shared library, but it may point to something we'll hear more in the future -- shared libraries are more open to inspection. I also wonder if the ability to slip a (compatible) newer version of a shared library in might not make a good argument for using shared libraries. Use case: you have an app that embeds Python. The embedded Python version uses the same standard library location as the installed Python, say, Python 2.1.2. Now you upgrade to Python 2.1.3. The standard library is upgraded. (a) Wouldn't it be nice if the embedding app was automatically upgraded too (rather than having to relink it, which may be a pain if the source and objects were thrown away). (b) Some new feature might be added to the core (in a binary compatible way) that is used by some library module. If the embedding app uses that module, it will fail because the statically linked app still is Python 2.1.2.
- on the same topic, the PostgreSQL documentation claims that you cannot build PL/Python without a shared libpython. They admit that it might work to use the static library, and that it is just the fault of their build system to not support this scenario: http://www.postgresql.org/idocs/index.php?plpython-install.html
I expect we'll see this more and more. After all, every library that comes with Linux is a shared lib (even / especially libc).
- For embedded applications, people also bring up the "allows sharing at run-time" argument. In that case, it is factually true. However, even without a shared libpython, multiple copies of the same embedded executable (or shared library) will still share code.
See above.
- The Windows port uses a python DLL.
To summarize, it probably does have advantages for people who want to embed Python in applications that are themselves shared libraries. I think those advantages are outweighed by the problems people may get with a shared libpython, and who never want to embed Python.
What problems (apart from the pain of getting this to build right on many platforms)? You mentioned a bit of a slowdown due to PIC code (probably not even measurable in pystone) and a slower startup due to a few stat calls. Note that all extensions are already shared libraries -- so the problems can't be too bad. :-) --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum <guido@python.org> writes:
What problems (apart from the pain of getting this to build right on many platforms)?
Building it is not the issue; running it is the problem. /usr/local/lib is searched for shared libraries only on Linux; on all other systems, you either have to add a -R option, or require users to set LD_LIBRARY_PATH. The latter is clearly undesirable, so you have to hard-code the path to libpython into the executable. In turn, the resulting binary is not relocatable anymore.
You mentioned a bit of a slowdown due to PIC code (probably not even measurable in pystone) and a slower startup due to a few stat calls. Note that all extensions are already shared libraries -- so the problems can't be too bad.
You also get a slow-down from the -R option (if you needed to use it in the first place). This will cause all library searches without a path to look into an additional directory. Regards, Martin
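The effect of an extra -R directory on library lookups can be illustrated with a deliberately simplified model. This is not the algorithm of any real dynamic loader: `find_library` is a hypothetical helper, the directory names are invented for the example, and the model assumes only that directories are probed in order until the first match, with the -R directory probed before the default one.

```python
import os
import tempfile


def find_library(name, search_dirs):
    """Return (path_or_None, probes) for a first-match directory search.

    A toy model: one filesystem probe per directory, stopping at the
    first directory that contains the file.
    """
    probes = 0
    for directory in search_dirs:
        probes += 1
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate, probes
    return None, probes


with tempfile.TemporaryDirectory() as root:
    rpath_dir = os.path.join(root, "opt_python_lib")  # stands in for the -R directory
    system_dir = os.path.join(root, "usr_lib")        # stands in for the default directory
    os.mkdir(rpath_dir)
    os.mkdir(system_dir)
    # Only the default directory contains the (empty, fake) library file.
    open(os.path.join(system_dir, "libc.so"), "w").close()

    _, without_r = find_library("libc.so", [system_dir])
    _, with_r = find_library("libc.so", [rpath_dir, system_dir])
    print(without_r, with_r)
```

In this model every library that lives in the default directory costs one additional failed probe once the -R directory is added, which is the kind of per-lookup overhead the message above describes.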
participants (3)
- David Abrahams
- Guido van Rossum
- martin@v.loewis.de