[Import-SIG] On singleton modules, heap types, and subinterpreters

Petr Viktorin encukou at gmail.com
Wed Jul 29 20:05:38 CEST 2015

On Wed, Jul 29, 2015 at 7:53 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Jul 26, 2015 4:39 AM, "Petr Viktorin" <encukou at gmail.com> wrote:
>> So it seems that extension modules that need per-module state need to
>> use heap types. And the heap types need a reference to "their" module.
>> And methods of those types need to be called with the class that
>> defined them.
>> This would be possible with regular methods. But, consider for example
>> the tp_iternext signature:
>>     PyObject* myobj_iternext(PyObject *self)
>> There's no good way for this function to get a reference to the class
>> it belongs to.
>> `Py_TYPE(self)` might be a subclass. The best way I can think of is
>> walking the MRO until I get to a class with tp_iter (or a class
>> created from "my" known PyType_Spec), but one of the requirements on
>> module state is that it needs to be efficient, so I'd rather avoid
>> walking a list.
> One thing I've considered for several years now, and perhaps even proposed
> at some point (around PEP 451?), is adding "__origin__" to objects,
> indicating where the object came from.  "Where" would be the object (or its
> qualname?) associated with the scope in which the first object was created.
> For example, for classes this would be the module (or class/func for nested
> ones).  Likewise, the class for methods.
> Something like __origin__ would help make the actual class/module explicit.
> I expect it would be sufficiently efficient.  __qualname__ gets you
> something similar but less efficiently.  Is __qualname__ set for extension
> types/functions?  Note that __origin__ provides other non-import benefits as
> well.

One thing to watch out for is reference cycles. A hard ref to a
module is fine: modules aren't designed to be unloaded frequently, so
having them collected only by a full GC run is OK.
But I think having each nested function link to its enclosing function
would create too many reference cycles, and it would keep the outer
function alive for the lifetime of the inner one.
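
A rough Python-level illustration of the lifetime concern, as a sketch
only: "__origin__" is the hypothetical attribute proposed above, not an
existing CPython feature, and build(), build_cycle() and last_inner are
made-up names. It relies on CPython's reference counting.

```python
import gc
import weakref


def build(link):
    """Create an outer/inner function pair; return (inner, weakref to outer)."""
    def outer():
        def inner():
            return 42
        return inner

    inner = outer()
    if link:
        # Hypothetical back-link; "__origin__" is not a real CPython attribute.
        inner.__origin__ = outer
    return inner, weakref.ref(outer)


def build_cycle():
    """Like build(link=True), but outer also remembers inner: a true cycle."""
    def outer():
        def inner():
            return 42
        return inner

    inner = outer()
    inner.__origin__ = outer   # inner -> outer (hypothetical back-link)
    outer.last_inner = inner   # outer -> inner closes the cycle
    return weakref.ref(outer)


# Without the back-link, outer is freed as soon as build() returns
# (CPython refcounting); with it, outer lives as long as inner does.
inner_plain, outer_ref_plain = build(link=False)
inner_linked, outer_ref_linked = build(link=True)

gc.disable()                   # keep the cycle's lifetime deterministic
try:
    cycle_ref = build_cycle()
    cycle_alive_before_gc = cycle_ref() is not None
    gc.collect()               # only the cycle collector can free the pair
    cycle_alive_after_gc = cycle_ref() is not None
finally:
    gc.enable()
```

The first pair shows the "keeps the outer function alive" effect; the
second shows that once the outer function also refers to the inner one,
nothing is freed until the cycle collector runs.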

> An alternative, could the module intra-dependencies be bound where needed?
> For example, with _csv could Error be added to reader.__dict__ (i.e. bound
> to reader)?

Putting a reference to the module on *classes* is not a problem.
Getting a reference to the module from normal methods should also not
be that hard.

The hard part is special methods, like tp_iter or tp_iternext, which
get only "self" as an argument at the C level. There's no information
about which (super)class the method is defined in.
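
To make the problem concrete, here is a pure-Python model of the MRO
walk mentioned earlier in the thread. It is a sketch under assumptions,
not C API code: Reader, SubReader, Overriding, module_state and
find_defining_class() are hypothetical names, and "__next__" stands in
for the tp_iternext slot.

```python
class Reader:
    """Stand-in for a heap type created by an extension module."""
    module_state = {"module": "mymod"}   # hypothetical per-module state

    def __init__(self, items):
        self._it = iter(items)

    def __iter__(self):
        return self

    def __next__(self):                  # models the tp_iternext slot
        return next(self._it)


class SubReader(Reader):
    """A user subclass that inherits the slot unchanged."""


class Overriding(Reader):
    """A user subclass that overrides the slot and delegates to the base."""
    def __next__(self):
        return super().__next__()


def find_defining_class(obj, slot_name, known_func):
    """Walk type(obj)'s MRO for the class whose own slot is known_func.

    The cost grows with the length of the MRO, which is the efficiency
    concern raised in the thread.
    """
    for cls in type(obj).__mro__:
        if vars(cls).get(slot_name) is known_func:
            return cls
    raise TypeError("no class in the MRO defines the expected slot")


sub = SubReader([1, 2])
over = Overriding([3])

# type(self) alone is not enough: it names the subclass, not the definer.
sub_type = type(sub)
sub_owner = find_defining_class(sub, "__next__", Reader.__next__)
over_owner = find_defining_class(over, "__next__", Reader.__next__)
```

Comparing against the known function, rather than stopping at the first
class that has any "__next__", is what keeps the walk correct when a
subclass overrides the method and calls up to the base implementation.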
