I have a problem with SSL support in _socket and the way setup.py does the autodetection: even though SSL may be installed on the system, it seems that the OpenSSL developers changed the exposed APIs between patch level releases. As a result, _socket compiles but the import fails on platforms which have the wrong OpenSSL version installed. setup.py then simply removes _socket from the extension list and builds Python without socket support which is a really Bad Thing since _socket without SSL support compiles just fine.

What can we do about this ? Since auto-detection is happening rather early in setup.py it doesn't seem possible to apply some fallback scheme depending on extra knowledge for the various modules.

Perhaps we should simply let setup.py build two extensions: _socket (without SSL) and _socketssl (with SSL) ?! If the _socketssl build or import fails for some reason, Python could still pick up the _socket extension in socket.py.

Comments ?

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/
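The fallback proposed above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not code from socket.py: `_socketssl` is the module name proposed in the message (it does not exist in a stock interpreter), and `import_first_available` is a hypothetical helper.

```python
import importlib


def import_first_available(candidates):
    """Return (module, errors) for the first importable name in `candidates`.

    `errors` maps every name that failed to the ImportError it raised,
    so a failed SSL build is recorded rather than silently ignored.
    """
    errors = {}
    for name in candidates:
        try:
            return importlib.import_module(name), errors
        except ImportError as exc:
            errors[name] = exc
    raise ImportError("none of %r could be imported" % (candidates,))


# Prefer the (hypothetical) SSL-enabled build, fall back to plain _socket.
module, errors = import_first_available(["_socketssl", "_socket"])
print("using", module.__name__, "after failures:", sorted(errors))
```

Keeping the collected errors around means a broken SSL build can still be diagnosed later while basic socket support stays available.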
Perhaps we should simply let setup.py build two extensions: _socket (without SSL) and _socketssl (with SSL) ?! If the _socketssl build or import fails for some reason, Python could still pick up the _socket extension in socket.py.
+1 --Guido van Rossum (home page: http://www.python.org/~guido/)
"M.-A. Lemburg" <mal@lemburg.com> writes:
What can we do about this ?
The standard solution is to modify Modules/Setup at installation time, to suit your local needs.
Perhaps we should simply let setup.py build two extensions: _socket (without SSL) and _socketssl (with SSL) ?! If the _socketssl build or import fails for some reason, Python could still pick up the _socket extension in socket.py.
-1: Instead of avoiding the use of an existing OpenSSL installation, it would be much better if the socket module was fixed to work with all existing versions. Of course, without a precise bug report, we cannot know whether this is possible.

Regards, Martin
"Martin v. Loewis" wrote:
"M.-A. Lemburg" <mal@lemburg.com> writes:
What can we do about this ?
The standard solution is to modify Modules/Setup at installation time, to suit your local needs.
I thought that Modules/Setup is deprecated and replaced by the auto setup tests in setup.py ? In any case, setup.py will simply remove _socket if it doesn't import correctly and so a casual sys admin or user will lose big if his OpenSSL installation happens to be out of sync with whatever we provide in _socket.
Perhaps we should simply let setup.py build two extensions: _socket (without SSL) and _socketssl (with SSL) ?! If the _socketssl build or import fails for some reason, Python could still pick up the _socket extension in socket.py.
-1: Instead of avoiding the use of an existing OpenSSL installation, it would be much better if the socket module was fixed to work with all existing versions.
Of course, without a precise bug report, we cannot know whether this is possible.
Some symbols starting with 'RAND_*' are apparently missing from OpenSSL on my notebook. On other occasions (i.e. on RedHat) I found that the system vendor had forgotten to provide a link to the 0.9 version of OpenSSL and instead used 1.0 as version number (which is completely wrong since there is no 1.0 version of OpenSSL). As a result, _socket built on a system with correctly set up libs wouldn't run on this particular RedHat installation.

In summary: _socket is just too important to lose if something in the OpenSSL support goes wrong. The two build model I suggested fixes this problem elegantly and doesn't cost anything in terms of adding tons of code -- all we need is an #ifdef for the module name in _socketmodule.c

-- Marc-Andre Lemburg
Some symbols starting with 'RAND_*' are apparently missing from OpenSSL on my notebook.
Yes, this has bitten me too. It's apparently a relatively new API in OpenSSL and the SSL code in socket.c was changed to require it almost as soon as it appeared in OpenSSL.
In summary: _socket is just too important to lose if something in the OpenSSL support goes wrong. The two build model I suggested fixes this problem elegantly and doesn't cost anything in terms of adding tons of code -- all we need is an #ifdef for the module name in _socketmodule.c
Since the SSL support mostly introduces new code that doesn't depend on other socket code (not 100% sure if this is true), can't we make the SSL support a separate module? Then socket.py (which is also used on Unix these days!!!) can glue them together. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Some symbols starting with 'RAND_*' are apparently missing from OpenSSL on my notebook.
Yes, this has bitten me too. It's apparently a relatively new API in OpenSSL and the SSL code in socket.c was changed to require it almost as soon as it appeared in OpenSSL.
In summary: _socket is just too important to lose if something in the OpenSSL support goes wrong. The two build model I suggested fixes this problem elegantly and doesn't cost anything in terms of adding tons of code -- all we need is an #ifdef for the module name in _socketmodule.c
Since the SSL support mostly introduces new code that doesn't depend on other socket code (not 100% sure if this is true), can't we make the SSL support a separate module? Then socket.py (which is also used on Unix these days!!!) can glue them together.
Good idea. Checking the code it should be easy to do. I'll look into this later this week.

Funny, BTW, that the source file is named socketmodule.c while the resulting DLL is called _socket... I suppose renaming socketmodule.c to _socket.c would be advisable.

-- Marc-Andre Lemburg
Checking the code it should be easy to do. I'll look into this later this week.
Great!
Funny, BTW, that the source file is named socketmodule.c while the resulting DLL is called _socket... I suppose renaming socketmodule.c to _socket.c would be advisable.
That requires asking the SF sysadmin a favor to move a file, or loses all the CVS history. So who cares. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido van Rossum wrote:
Checking the code it should be easy to do. I'll look into this later this week.
Great!
Done -- wasn't that easy after all, because the ssl object relies on the socket object. Please review and test. The header file chaos at the top of socketmodule.* looks scary. It works fine on Linux, but I have no idea what the situation is on other platforms. Side-note: I've added the "inter-module dynamic C API linking via Python trick" from the mx tools to the _socket module. _ssl only uses it to get at the type object, but the support can easily be extended if this should be needed for more C APIs from _socket. Also note: the non-Unix build process files need to be updated.
Funny, BTW, that the source file is named socketmodule.c while the resulting DLL is called _socket... I suppose renaming socketmodule.c to _socket.c would be advisable.
That requires asking the SF sysadmin a favor to move a file, or loses all the CVS history. So who cares.
I have left out this step. Perhaps Barry knows a way to rename the socketmodule.* files without losing the history ?!

-- Marc-Andre Lemburg
[MAL]
Side-note: I've added the "inter-module dynamic C API linking via Python trick" from the mx tools to the _socket module. _ssl only uses it to get at the type object, but the support can easily be extended if this should be needed for more C APIs from _socket.
Also note: the non-Unix build process files need to be updated.
I don't know what "inter-module dynamic C API linking via Python trick" means, but the Windows build doesn't compile anymore, even though it didn't and doesn't support SSL. I suspect it's because "inter-module" wrt sockets is really "cross-DLL" on Windows, and clever tricks are going to bite hard because of that.

It's griping here:

    static PyTypeObject PySocketSock_Type = {
    C:\Code\python\Modules\socketmodule.c(1768) : error C2491: 'PySocketSock_Type' : definition of dllimport data not allowed

and here:

    &PySocketSock_Type,
    C:\Code\python\Modules\socketmodule.c(2650) : error C2099: initializer is not a constant

The changes to socketmodule.h pretty much baffle me. Why is the body of the function PySocketModule_ImportModuleAndAPI included in the header file? Why is the body of this function skipped unless PySocket_BUILDING_SOCKET is defined? All in all, this appears to be an extremely confusing way to define a function named PySocketModule_ImportModuleAndAPI in the new _ssl.c alone. So why isn't the function just defined in _ssl.c directly? There appears to be no reason to put it in the header file, and it's confusing there.

This shows signs of adapting a complicated framework to a situation too simple to require most of what the framework does. If so, since there is no other use of this framework in Python, and the framework isn't documented in the Python codebase, the framework should be tossed, and something as simple as possible done instead.

I can't make more time to sort this out now. It would help if the code were made more transparent (see last paragraph), so it consumed less time to figure out what it's intending to do. In the meantime, the Windows build will remain broken.
Tim Peters wrote:
[MAL]
Side-note: I've added the "inter-module dynamic C API linking via Python trick" from the mx tools to the _socket module. _ssl only uses it to get at the type object, but the support can easily be extended if this should be needed for more C APIs from _socket.
Also note: the non-Unix build process files need to be updated.
I don't know what "inter-module dynamic C API linking via Python trick" means, but the Windows build doesn't compile anymore despite that it didn't and doesn't support SSL. I suspect it's because "inter-module" wrt sockets is really "cross-DLL" on Windows, and clever tricks are going to bite hard because of that.
No it's not (and that's the main advantage of the "trick").

Some explanation:

The _ssl module needs access to the type object defined in the _socket module. Since cross-DLL linking introduces a lot of problems on many platforms, the "trick" is to wrap the C API of a module in a struct which then gets exported to other modules via a PyCObject.

The code in socketmodule.c defines this struct (which currently only contains the type object reference, but could very well also include other C APIs needed by other modules) and exports it as PyCObject via the module dictionary under the name "CAPI".

Other modules can now include the socketmodule.h file which defines the needed C APIs to import and set up a static copy of this struct in the importing module.

After initialization, the importing module can then access the C APIs from the _socket module by simply referring to the static struct, e.g.

    /* Load _socket module and its C API; this sets up the global PySocketModule */
    if (PySocketModule_ImportModuleAndAPI())
        return;
    ...
    if (!PyArg_ParseTuple(args, "O!|zz:ssl",
                          PySocketModule.Sock_Type,
                          (PyObject*)&Sock,
                          &key_file, &cert_file))
        return NULL;

(Perhaps I should copy the above explanation into the source files ?!)
It's griping here:
static PyTypeObject PySocketSock_Type = { C:\Code\python\Modules\socketmodule.c(1768) : error C2491: 'PySocketSock_Type' : definition of dllimport data not allowed
and here:
&PySocketSock_Type, C:\Code\python\Modules\socketmodule.c(2650) : error C2099: initializer is not a constant
Ah, you're right, the export of the type object is not needed anymore since this is now done using the PyCObject. Sorry, my bad.
The changes to socketmodule.h pretty much baffle me. Why is the body of the function PySocketModule_ImportModuleAndAPI included in the header file? Why is the body of this function skipped unless PySocket_BUILDING_SOCKET is defined? All in all, this appears to be an extremely confusing way to define a function named PySocketModule_ImportModuleAndAPI in the new _ssl.c alone. So why isn't the function just defined in _ssl.c directly? There appears to be no reason to put it in the header file, and it's confusing there.
The reason for putting the code in the header file is to avoid duplication of code. The import API is needed by all modules wishing to use the C API of the socket module. Currently, only _ssl needs this, but I think it would be a good strategy to extend this technique to other modules as well (esp. the array module would be a good candidate).
This shows signs of adapting a complicated framework to a situation too simple to require most of what the framework does. If so, since there is no other use of this framework in Python, and the framework isn't documented in the Python codebase, the framework should be tossed, and something as simple as possible done instead.
I don't think it's overly complicated. It's been in use in mxDateTime and various database modules including mxODBC for many years and I haven't received any complaints about it in the last few years. It would be nice if we could integrate better support for it into the Python core. Then we wouldn't need the header file source code definition anymore. IMHO, it's a very useful way of doing cross-DLL "linking" in a platform independent manner. Note that the whole idea originated from a discussion I had with Jim Fulton some years ago. As I understand, the PyCObject was invented for just this purpose.
I can't make more time to sort this out now. It would help if the code were made more transparent (see last paragraph), so it consumed less time to figure out what it's intending to do. In the meantime, the Windows build will remain broken.
As I read the checkins, you've removed the type object export.

I am curious why the test_socket still fails on Windows though. Both test_socket and test_socket_ssl work just fine on Linux.

-- Marc-Andre Lemburg
[M.-A. Lemburg]
Some explanation:
The _ssl module needs access to the type object defined in the _socket module. Since cross-DLL linking introduces a lot of problems on many platforms, the "trick" is to wrap the C API of a module in a struct which then gets exported to other modules via a PyCObject.
The code in socketmodule.c defines this struct (which currently only contains the type object reference, but could very well also include other C APIs needed by other modules) and exports it as PyCObject via the module dictionary under the name "CAPI".
Other modules can now include the socketmodule.h file which defines the needed C APIs to import and set up a static copy of this struct in the importing module.
After initialization, the importing module can then access the C APIs from the _socket module by simply referring to the static struct, e.g.
    /* Load _socket module and its C API; this sets up the global PySocketModule */
    if (PySocketModule_ImportModuleAndAPI())
        return;
    ...
    if (!PyArg_ParseTuple(args, "O!|zz:ssl",
                          PySocketModule.Sock_Type,
                          (PyObject*)&Sock,
                          &key_file, &cert_file))
        return NULL;
(Perhaps I should copy the above explanation into the source files ?!)
I don't know. I really don't have time to try and understand this, but I can tell you I spent a lot of time staring at the code just trying to fix the part that didn't work, and it was slow and painful going. Without deep understanding, I can only repeat that all this machinery *seems* to be overkill in this specific case; and since there is no other case in the Python core, a mass of overly general machinery in the Python core seems out of place.
... Ah, you're right, the export of the type object is not needed anymore since this is now done using the PyCObject. Sorry, my bad.
No problem -- that part turned out to be easy, once I found it.
... The reason for putting the code in the header file is to avoid duplication of code. The import API is needed by all modules wishing to use the C API of the socket module.
But in this specific case you confirm that there is only one client:
Currently, only _ssl needs this, but I think it would be a good strategy to extend this technique to other modules as well (esp. the array module would be a good candidate).
Possibly, but it's overly elaborate in this specific case. If it needed to be hypergeneral (and it doesn't here), it seems it would be better to make the code template *more* general, so that every importer of every module could include a common (e.g.) PyImportModuleAndApi.h header file one or more times, after setting a pile of #defines to specialize it to the module at hand.
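The idea of one generic importer that works for any exporting module can be sketched from Python with ctypes. This is an illustration under stated assumptions, not the header template discussed above: `load_c_api` is a hypothetical helper, it relies on PyCapsule (the later replacement for PyCObject), and it uses the `datetime_CAPI` capsule that CPython's datetime module exposes as a ready-made exporter.

```python
import ctypes
import importlib

_get_pointer = ctypes.pythonapi.PyCapsule_GetPointer
_get_pointer.restype = ctypes.c_void_p
_get_pointer.argtypes = [ctypes.py_object, ctypes.c_char_p]


def load_c_api(module_name, attr_name):
    """Import `module_name` and return the address stored in its API capsule.

    Assumes the capsule is named "<module>.<attribute>", the convention
    also expected by PyCapsule_Import.
    """
    module = importlib.import_module(module_name)
    capsule = getattr(module, attr_name)
    capsule_name = ("%s.%s" % (module_name, attr_name)).encode("ascii")
    # Raises an exception if the object is not a capsule with this name.
    return _get_pointer(capsule, capsule_name)


api_address = load_c_api("datetime", "datetime_CAPI")
print(hex(api_address))
```

Only the module and attribute names vary between exporters, so the importer itself needs no per-module code.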
I don't think it's overly complicated.
You've confirmed that it is in this specific case, and that's the only case there is in the codebase all the Python developers work with. ...
As I read the checkins, you've removed the type object export.
Well, I removed the DL_IMPORT. The problem was more that it wasn't exported, and now it doesn't need to be imported or exported.
I am curious why the test_socket still fails on Windows though. Both test_socket and test_socket_ssl work just fine on Linux.
test_socket was a red herring. Merely trying to import socket died with NameError on Windows. That got fixed too, and the non-SSL socket tests on Windows worked fine then.
Tim Peters wrote:
[M.-A. Lemburg]
Some explanation:
The _ssl module needs access to the type object defined in the _socket module. Since cross-DLL linking introduces a lot of problems on many platforms, the "trick" is to wrap the C API of a module in a struct which then gets exported to other modules via a PyCObject.
The code in socketmodule.c defines this struct (which currently only contains the type object reference, but could very well also include other C APIs needed by other modules) and exports it as PyCObject via the module dictionary under the name "CAPI".
Other modules can now include the socketmodule.h file which defines the needed C APIs to import and set up a static copy of this struct in the importing module.
After initialization, the importing module can then access the C APIs from the _socket module by simply referring to the static struct, e.g.
    /* Load _socket module and its C API; this sets up the global PySocketModule */
    if (PySocketModule_ImportModuleAndAPI())
        return;
    ...
    if (!PyArg_ParseTuple(args, "O!|zz:ssl",
                          PySocketModule.Sock_Type,
                          (PyObject*)&Sock,
                          &key_file, &cert_file))
        return NULL;
(Perhaps I should copy the above explanation into the source files ?!)
I don't know. I really don't have time to try and understand this, but I can tell you I spent a lot of time staring at the code just trying to fix the part that didn't work, and it was slow and painful going. Without deep understanding, I can only repeat that all this machinery *seems* to be overkill in this specific case; and since there is no other case in the Python core, a mass of overly general machinery in the Python core seems out of place.
The idea of using the above framework was to get the discussion started and then perhaps extend this kind of support to other modules as well, e.g. to be able to create and access types from other modules at C level. Note that the framework only seems to be overkill at the moment (since it only exports one symbol). As soon as you add more APIs to the API struct, things look different -- e.g. a socket constructor at C level would be nice to have.
... Ah, you're right, the export of the type object is not needed anymore since this is now done using the PyCObject. Sorry, my bad.
No problem -- that part turned out to be easy, once I found it.
You should have just thrown the error message in my Inbox.
... The reason for putting the code in the header file is to avoid duplication of code. The import API is needed by all modules wishing to use the C API of the socket module.
But in this specific case you confirm that there is only one client:
Currently, only _ssl needs this, but I think it would be a good strategy to extend this technique to other modules as well (esp. the array module would be a good candidate).
Possibly, but it's overly elaborate in this specific case. If it needed to be hypergeneral (and it doesn't here), it seems it would be better to make the code template *more* general, so that every importer of every module could include a common (e.g.) PyImportModuleAndApi.h header file one or more times, after setting a pile of #defines to specialize it to the module at hand.
Right.
I don't think it's overly complicated.
You've confirmed that it is in this specific case, and that's the only case there is in the codebase all the Python developers work with.
Yeah, well, ok :-) You have to get the ball rolling somehow ;-)
I am curious why the test_socket still fails on Windows though. Both test_socket and test_socket_ssl work just fine on Linux.
test_socket was a red herring. Merely trying to import socket died with NameError on Windows. That got fixed too, and the non-SSL socket tests on Windows worked fine then.
Thanks.

-- Marc-Andre Lemburg
On Monday, February 18, 2002, at 06:19 , Tim Peters wrote:
I don't know. I really don't have time to try and understand this, but I can tell you I spent a lot of time staring at the code just trying to fix the part that didn't work, and it was slow and painful going. Without deep understanding, I can only repeat that all this machinery *seems* to be overkill in this specific case; and since there is no other case in the Python core, a mass of overly general machinery in the Python core seems out of place.
Well... The MacOS toolbox modules have a similar requirement (but currently implemented in a different way, see pymactoolboxglue.c if you're interested in the gory details) and various extension packages (such as Numeric) also have their own implementation of something similar. And there's packages like VTK which currently do hard cross-dll linking which could benefit from such a scheme. Maybe someone should try and come up with a list of requirements for inter-extension-module communication and PEP it? -- - Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack - - If I can't dance I don't want to be part of your revolution -- Emma Goldman -
Jack Jansen wrote:
On Monday, February 18, 2002, at 06:19 , Tim Peters wrote:
I don't know. I really don't have time to try and understand this, but I can tell you I spent a lot of time staring at the code just trying to fix the part that didn't work, and it was slow and painful going. Without deep understanding, I can only repeat that all this machinery *seems* to be overkill in this specific case; and since there is no other case in the Python core, a mass of overly general machinery in the Python core seems out of place.
Well... The MacOS toolbox modules have a similar requirement (but currently implemented in a different way, see pymactoolboxglue.c if you're interested in the gory details) and various extension packages (such as Numeric) also have their own implementation of something similar.
And there's packages like VTK which currently do hard cross-dll linking which could benefit from such a scheme.
Maybe someone should try and come up with a list of requirements for inter-extension-module communication and PEP it?
Good idea. I can have a go at this next weekend.

-- Marc-Andre Lemburg
Guido van Rossum <guido@python.org> writes:
Since the SSL support mostly introduces new code that doesn't depend on other socket code (not 100% sure if this is true), can't we make the SSL support a separate module? Then socket.py (which is also used on Unix these days!!!) can glue them together.
+1. Martin
"M.-A. Lemburg" <mal@lemburg.com> writes:
I thought that Modules/Setup is deprecated and replaced by the auto setup tests in setup.py ?
Not at all. It is just used less frequently. Personally, I think that is a pity. Python binary distributions, by default, on Unix, should build as many extension libraries statically into the interpreter as they can without dragging in too many additional shared libraries. IOW, _socket should be compiled statically into the interpreter, which you cannot do with distutils (by nature). The reason for linking them statically is efficiency: if used, the interpreter won't have to locate them in sys.path, they don't need to be compiled as PIC code, the dynamic linker does not need to bind that many symbols, etc; if not used, they don't consume any additional resources as they are demand-paged from the executable. Static linking is also desirable for frozen applications. For those reasons, I hope that Setup.dist continues to be maintained. Regards, Martin
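Whether a given module was linked statically into the interpreter, as advocated above, can be checked at run time with `sys.builtin_module_names`. A small sketch; the helper name `is_statically_linked` is made up for this example, and the result for `_socket` depends on how the interpreter at hand was built.

```python
import sys


def is_statically_linked(module_name):
    """Return True if the module is compiled into the interpreter binary."""
    return module_name in sys.builtin_module_names


for name in ("sys", "_socket"):
    if is_statically_linked(name):
        print(name, "is built into the interpreter")
    else:
        print(name, "is loaded from a separate file, or not available")
```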
"MAL" == M <mal@lemburg.com> writes:
    MAL> I thought that Modules/Setup is deprecated and replaced by
    MAL> the auto setup tests in setup.py ? In any case, setup.py will
    MAL> simply remove _socket if it doesn't import correctly and so a
    MAL> casual sys admin or user will lose big if his OpenSSL
    MAL> installation happens to be out of sync with whatever we
    MAL> provide in _socket.

This is a more general problem with the current setup.py stuff for the standard library. It took me /ages/ to figure out why BerkeleyDB support was broken in Python 2.2 -- not just broken, but nonexistent! "import bsddb" simply failed because the .so wasn't there.

I couldn't figure out why that was until I trolled through the build output and realized that setup.py was deleting the .so because it got an import error after building the .so. Then I had to figure out how to build the .so and keep it around so I could then learn that it had link problems and from there, I realized why BerkeleyDB support in Python 2.2 is /really/ busted (it tries to be too smart about finding its libraries).

It shouldn't have been this difficult to debug. Surely there must be some way to tell setup.py not to delete .so's it can't import so we have a prayer of finding the real problems.

-Barry
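One way to keep a failing build product around for debugging, sketched under clearly stated assumptions: `check_built_module` is a hypothetical helper, not part of setup.py or distutils, and the demo uses a plain .py file as a stand-in for a freshly built .so so that the example runs anywhere.

```python
import importlib.util
import os
import tempfile


def check_built_module(name, path):
    """Try to import the file at `path`; keep it under a '_failed' name on error.

    Returns (ok, final_path, error_text). Nothing is deleted, so the
    artifact can be inspected after a failed import.
    """
    try:
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    except Exception as exc:
        root, ext = os.path.splitext(path)
        failed_path = root + "_failed" + ext
        os.replace(path, failed_path)
        return False, failed_path, str(exc)
    return True, path, None


# Demo: a module that fails on import the way a mislinked extension might.
build_dir = tempfile.mkdtemp()
artifact = os.path.join(build_dir, "demo_ext.py")
with open(artifact, "w") as f:
    f.write("raise ImportError('undefined symbol: RAND_status')\n")

ok, kept_path, error = check_built_module("demo_ext", artifact)
print(ok, os.path.basename(kept_path), error)
```

Renaming rather than deleting keeps the broken file out of the import path while leaving the evidence, and the returned error text, available to whoever has to track down the link problem.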
participants (6)
- barry@zope.com
- Guido van Rossum
- Jack Jansen
- M.-A. Lemburg
- martin@v.loewis.de
- Tim Peters