Hi,

here's an updated proposal, adopting Marc-Andre's improvement that uses
a new field in the PyModuleDef struct to register the callback. Note
that this change no longer maintains binary compatibility, which may or
may not be acceptable for Python 3.4.

Stefan


The problem
===========

Python modules and extension modules are not set up in the same way.
For Python modules, the module is created and set up first, then the
module code is executed. For extensions, i.e. shared libraries, the
module init function is executed straight away and does both the
creation and the initialisation. This means that the init function
knows neither the __file__ the module is being loaded from nor its
package (i.e. its fully qualified module name, FQMN). This hinders
relative imports and resource loading. In Python 3, the module is also
not added to sys.modules, which means that a (potentially transitive)
re-import of the module will really try to import it again and thus run
into an infinite loop when it executes the module init function a
second time. And without the FQMN, it's not trivial to correctly add
the module to sys.modules either.

We specifically run into this for Cython generated modules, for which
it's not uncommon that the module init code has the same level of
complexity as that of any 'regular' Python module. Also, the lack of an
FQMN and a correct file path hinders the compilation of __init__.py
modules, i.e. packages, especially when relative imports are used at
module init time.


The proposal
============

I propose to split the extension module initialisation into two steps
in Python 3.4, in a way that remains backwards compatible at the source
level.

Step 1: The current module init function can be reduced to just
creating the module instance and returning it (and potentially doing
some simple C level setup). Additionally, and this is the new part, the
module init code can register a C callback function in its PyModuleDef
struct that will be called after setting up the module.

Step 2: The shared library importer receives the module instance from
the module init function, adds __file__, __path__, __package__ and
friends to the module dict, and then checks for the callback. If the
callback is non-NULL, the importer calls it to continue the module
initialisation in user code.


The callback
============

The callback is defined as follows::

    typedef int (*PyModule_init_callback)(PyObject* the_module,
                                          PyModuleInitContext* context);

"PyModuleInitContext" is a struct that is meant mostly for making the
callback more future proof by allowing additional parameters to be
passed in. For now, I can see a use case for the following fields::

    typedef struct PyModuleInitContext {
        char* module_name;
        char* qualified_module_name;
    } PyModuleInitContext;

Both names are encoded in UTF-8. As for the file path, I consider it
best to retrieve it from the module's __file__ attribute as a Python
string object to reduce filename encoding problems.

Note that this struct argument is not strictly required (the callback
could be a simple "inquiry" function), but given that this proposal
would have been much simpler if the module init function had accepted
such an argument in the first place, I consider it a good idea not to
let this chance pass by again. The counter arguments would be "keep it
simple" and "we already pass in the whole module (and its dict)
anyway". Up for debate!

The registration of the callback uses a new field "m_init" in the
PyModuleDef struct::

    typedef struct PyModuleDef{
        PyModuleDef_Base m_base;
        const char* m_name;
        const char* m_doc;
        Py_ssize_t m_size;
        PyMethodDef *m_methods;
        inquiry m_reload;
        traverseproc m_traverse;
        inquiry m_clear;
        freefunc m_free;
        /* --- original fields up to here */
        PyModule_init_callback m_init;  /* post-setup init callback */
    } PyModuleDef;


Implementation
==============

The implementation requires local changes to the extension module
importer and a new field in the PyModuleDef struct.
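To make the two-step flow more concrete, here is a minimal sketch in
plain C that simulates it without Python.h. Everything in it is a
hypothetical stand-in: "Module" models a module object, "ModuleDef"
models the two relevant PyModuleDef fields, "InitContext" models
PyModuleInitContext, "init_spam" models PyInit_spam, and "fake_import"
models the shared library importer. It only illustrates the order of
events, not a real implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PyModuleInitContext. */
typedef struct {
    const char *module_name;
    const char *qualified_module_name;
} InitContext;

typedef struct Module Module;
typedef int (*init_callback)(Module *module, InitContext *context);

/* Hypothetical stand-in for PyModuleDef (relevant fields only). */
typedef struct {
    const char *m_name;
    init_callback m_init;   /* post-setup init callback, may be NULL */
} ModuleDef;

/* Hypothetical stand-in for a module object and its dict. */
struct Module {
    const ModuleDef *def;
    const char *file;       /* models __file__ */
    const char *package;    /* models __package__ */
    int initialised;
};

/* Step 2, user side: runs only after the importer has set up the module. */
static int spam_init(Module *module, InitContext *context)
{
    const char *dot = strrchr(context->qualified_module_name, '.');
    const char *last = (dot != NULL) ? dot + 1
                                     : context->qualified_module_name;
    /* The callback can rely on the file path and the FQMN being known. */
    if (module->file == NULL || strcmp(last, context->module_name) != 0)
        return -1;
    module->initialised = 1;
    return 0;
}

static ModuleDef spam_def = { "spam", spam_init };
static Module spam_module;

/* Step 1: only create the module instance and return it. */
static Module *init_spam(void)
{
    spam_module.def = &spam_def;
    spam_module.file = NULL;
    spam_module.package = NULL;
    spam_module.initialised = 0;
    return &spam_module;
}

/* Step 2, importer side: set up the module, then call the callback. */
static int fake_import(Module *(*initfunc)(void), const char *fqmn,
                       const char *package, const char *path)
{
    Module *module = initfunc();
    if (module == NULL)
        return -1;
    module->file = path;
    module->package = package;
    if (module->def->m_init != NULL) {
        InitContext context;
        context.module_name = module->def->m_name;
        context.qualified_module_name = fqmn;
        return module->def->m_init(module, &context);
    }
    return 0;
}
```

Calling fake_import(init_spam, "pkg.spam", "pkg", "/tmp/pkg/spam.so")
runs init_spam first, fills in the file and package, and only then
invokes spam_init, which can therefore see both the file path and the
qualified name.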

Open questions
==============

It is not clear how extensions should be handled that register more
than one module in their module init function, e.g. compiled packages.
One possibility would be to leave the setup to the user, who would have
to know all FQMNs anyway in this case, although not the import file
path. Alternatively, the import machinery could use a stack to remember
for which modules a callback was registered during the last init
function call, set up all of them and then call their callbacks. It's
not clear if this meets the intention of the user. It's not guaranteed
that all of these modules will be related to the module that registered
them, in the sense that they should receive the same setup. The best
way to fix this correctly might be to make users pass the setup
explicitly into the module creation functions in Python 4 (see
alternatives below), so that the setup and the sys.modules registration
can happen directly at this point.


Alternatives
============

1) It would be possible to make extension modules optionally export
another symbol, e.g. "PyInit2_modulename", that the shared library
loader would call in addition to the required function
"PyInit_modulename". This would preserve binary compatibility. The
drawback is that it also makes it easier to write broken code, because
a Python version or implementation that does not support this second
symbol would simply not call it, without raising an error. The new
struct field would instead let the build fail if it is not supported.

2) The callback could be made available as a Python function in the
module dict, thus also removing the need for an explicit registration
API. However, this approach would add overhead on both sides, the
importer code and the user provided module init code, as it would
require additional dictionary handling and the implementation of a
one-time Python function in user code. It would also suffer from the
problem that missing support in the runtime would pass silently.

3) The original proposal used a new C-API function to register the
callback explicitly, as opposed to extending the PyModuleDef struct.
This has the advantage of preserving binary compatibility with existing
Python 3.3 extensions. It has the disadvantage of adding another
indirection to the setup procedure where a static function pointer
would suffice.

4) Pass a new context argument into the module init function that
contains all information necessary to properly and completely set up
the module at creation time. This would provide a much simpler and
cleaner solution than the one proposed here. However, it will not be
possible before Python 4, as it breaks backwards compatibility with all
existing extension modules at both the source and the binary level.
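
For comparison, alternative 4 could look roughly like the following
self-contained simulation, again without Python.h. All names are
hypothetical: "CreationContext" stands for the context argument,
"ModuleObj" for a module object, "registered_module" for a one-slot
sys.modules, and "init_eggs" for an init function with the changed
signature. It is only a sketch of the idea under those assumptions.

```c
#include <stddef.h>

/* Hypothetical context passed directly into the init function. */
typedef struct {
    const char *qualified_module_name;  /* e.g. "pkg.eggs" */
    const char *file_path;              /* models __file__ */
} CreationContext;

/* Hypothetical stand-in for a module object. */
typedef struct {
    const char *name;
    const char *file;
    int initialised;
} ModuleObj;

/* One-slot stand-in for sys.modules. */
static ModuleObj *registered_module = NULL;

static ModuleObj eggs_module;

/* Models an init function that receives the context as an argument:
 * setup and registration happen at creation time, before any user
 * initialisation code runs. */
static ModuleObj *init_eggs(const CreationContext *context)
{
    eggs_module.name = context->qualified_module_name;
    eggs_module.file = context->file_path;
    registered_module = &eggs_module;   /* a re-import would find it here */

    /* User initialisation can already rely on name and file. */
    eggs_module.initialised = (eggs_module.file != NULL);
    return &eggs_module;
}
```

No callback is needed in this variant, because the single init function
already has everything it needs when it creates the module. The price
is the changed signature, which is what breaks existing extensions.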