On Sun, Aug 11, 2013 at 6:52 AM, Stefan Behnel wrote:
Nick Coghlan, 11.08.2013 15:19:
On 11 Aug 2013 09:02, "Stefan Behnel" wrote:
BTW, this already suggests a simple module initialisation interface. The extension module would expose a function that returns a module type, and the loader/importer would then simply instantiate that. Nothing else is needed.
Actually, strike the word "module type" and replace it with "type". Is there really a reason why Python needs a module type at all? I mean, you can stick arbitrary objects in sys.modules, so why not allow arbitrary types to be returned by the module creation function?
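A minimal sketch of the sys.modules point, assuming nothing beyond the standard import machinery. The class name Settings and the module name "fake_settings" are made up for illustration:

```python
import sys


class Settings:
    """An ordinary class; it does not derive from types.ModuleType."""

    def __init__(self):
        self.debug = True


# "fake_settings" is a hypothetical name chosen for this sketch.
sys.modules["fake_settings"] = Settings()

# The import statement finds the entry in sys.modules and binds that
# object as-is, without checking that it is a real module.
import fake_settings
```

After this, fake_settings is simply the Settings instance, which is the behaviour the question above relies on.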
That's exactly what I have in mind, but the way extension module imports currently work means we can't easily do it just yet. Fortunately, importlib means we now have some hope of fixing that :)
Well, what do we need? We don't need to care about existing code, as long as the current scheme is only deprecated and not deleted. That won't happen before Py4 anyway. New code would simply export a different symbol when compiling for a CPython that supports it, which points to the function that returns the type.
Then, there's already the PyType_Copy() function, which can be used to create a heap type from a statically defined type. So extension modules can simply define an (arbitrary) additional type in any way they see fit, copy it to the heap, and return it.
Next, we need to define a signature for the type's __init__() method. This can be done in a future proof way by allowing arbitrary keyword arguments to be added, i.e. such a type must have a signature like
def __init__(self, currently, used, pos, args, **kwargs)
and simply ignore kwargs for now.
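To make the signature idea concrete, here is a rough sketch. The parameters name and file are placeholders standing in for "currently, used, pos, args"; the proposal does not say what the real arguments would be:

```python
class ExtModule:
    """Hypothetical module type as an extension might define it."""

    def __init__(self, name, file, **kwargs):
        # Extra keyword arguments are accepted and ignored, so a later
        # importer can pass more context without breaking this type.
        self.__name__ = name
        self.__file__ = file


# A future importer might pass an argument this type does not know about.
mod = ExtModule("spam", "/hypothetical/spam.so", loader=None)
```

The call succeeds even though ExtModule never mentions loader, which is the forward-compatibility property being asked for.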
Actually, we may get away with not passing all too many arguments here if we allow the importer to add stuff to the type's dict in between, specifically __file__, __path__ and friends, so that they are available before the type gets instantiated. Not sure if this is a good idea, but it would at least relieve the user from having to copy these things over from some kind of context or whatever we might want to pass in.
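In pure Python, the "set things on the type before instantiating it" variant might look roughly like this. create_module_type, load, the module name and the path are all hypothetical; a real extension module would build its type in C:

```python
import sys


def create_module_type():
    """Stand-in for the function an extension module would export."""

    class SpamModule:
        def __init__(self, **kwargs):
            # __file__ is found on the type, because the importer put it
            # there before creating the instance.
            self.origin = self.__file__

    return SpamModule


def load(name, path, create_type):
    """Hypothetical importer step: get the type, annotate it, instantiate."""
    module_type = create_type()
    module_type.__file__ = path  # lands in the type's dict
    instance = module_type()
    sys.modules[name] = instance
    return instance


mod = load("spam_sketch", "/hypothetical/spam.so", create_module_type)
```

Note that the attribute lives on the type, not on the instance, so the type needs a writable dict for this to work.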
Alternatively, we could split the instantiation up between tp_new() and tp_init(), and let the importer set stuff on the instance dict in between the two. But given that this context won't actually change once the shared library is loaded, the only reason to prefer modifying the instance instead of the type would be to avoid requiring a tp_dict for the type. Open for discussion, I guess.
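The two-phase alternative can be sketched the same way, with __new__ and __init__ playing the roles of tp_new() and tp_init(). SpamModule, load_two_phase and the path are again invented for the example:

```python
class SpamModule:
    """Hypothetical module type; a real one would be defined in C."""

    def __init__(self, **kwargs):
        # By the time this runs, the importer has already filled in
        # __file__ on the instance.
        self.origin = self.__file__


def load_two_phase(module_type, path):
    """Hypothetical importer step with allocation and init split apart."""
    instance = module_type.__new__(module_type)  # corresponds to tp_new()
    instance.__file__ = path                     # importer sets instance state
    instance.__init__()                          # corresponds to tp_init()
    return instance


mod = load_two_phase(SpamModule, "/hypothetical/spam.so")
```

Here the type itself is left untouched and only the instance dict is modified, which is the trade-off described above.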
Did I forget anything? Sounds simple enough to me so far.
Out of curiosity - can we list actual use cases for this new design? The previous thread, admittedly, deals with an esoteric corner case that comes up in overly clever tests. If we plan to seriously consider these changes - and this appears to be worth a PEP - we need a list of actual advantages over the current approach. It's not that a more conceptually pure design is an insufficient reason, IMHO, but it would be interesting to hear about other implications.
Eli