[Import-SIG] On singleton modules, heap types, and subinterpreters

Nick Coghlan ncoghlan at gmail.com
Thu Jul 30 05:30:46 CEST 2015


On 30 July 2015 at 07:00, Eric Snow <ericsnowcurrently at gmail.com> wrote:
> On Wed, Jul 29, 2015 at 2:01 PM, Petr Viktorin <encukou at gmail.com> wrote:
>> On Wed, Jul 29, 2015 at 8:57 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>>> The slot methods would have to do `type(self).<"global">`, which is
>>> what any other method would do.  So in the relevant _csv.reader
>>> methods we would use the equivalent "type(self).Error" where needed.
>>
>> We need the class that defines the method. type(self) might return a
>> subclass of that. So we either need to walk the MRO until the defining
>> class is found, or use Nick's mechanism and record the defining class
>> for each relevant special method.
>
> If you explicitly bind the module-scoped object to the class that
> needs it, then the methods of that class can access it.

This subsequent discussion made me realise that with the reverse
lookup from slot method -> defining class stored on type objects, and
a reference to the defining module similarly stored on type objects,
we can dispense entirely with my more complex "active module" idea -
the slot implementation will always have access to the type, and can
traverse from there to the defining module as needed.
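
As a rough illustration of what that traversal might look like from a
slot implementation (the helper names PyType_DefiningClass() and
PyType_DefiningModule() are placeholders invented for this sketch, not
existing API, and the module reference is assumed to be borrowed):

    #include <Python.h>

    /* Hypothetical helpers - placeholder names, not current CPython API */
    PyTypeObject *PyType_DefiningClass(PyTypeObject *type, int slot_id);
    PyObject *PyType_DefiningModule(PyTypeObject *defining_class);

    static PyObject *
    reader_iternext(PyObject *self)
    {
        /* Recover the class that provided this slot implementation,
         * even if type(self) is a subclass defined somewhere else */
        PyTypeObject *cls = PyType_DefiningClass(Py_TYPE(self),
                                                 Py_tp_iternext);
        if (cls == NULL) {
            return NULL;
        }
        /* Traverse from the defining class to the defining module, so
         * the module's current "Error" attribute can be used without
         * any process-global static state */
        PyObject *module = PyType_DefiningModule(cls);  /* borrowed */
        if (module == NULL) {
            return NULL;
        }
        PyObject *error = PyObject_GetAttrString(module, "Error");
        if (error == NULL) {
            return NULL;
        }
        PyErr_SetString(error, "could not parse row");
        Py_DECREF(error);
        return NULL;
    }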

Now, the *reason* we want to enable access to the defining module is
that we want to minimise the barrier to migrating from singleton
extension modules to subinterpreter-friendly modules, which means
providing a way to do a fast lookup of the defining module given only
a slot ID and a type instance.

For full compatibility, we can't get away without offering *some* way
of doing that - consider cases where folks want to allow rebinding
(rather than mutation) of module-level attributes and have that affect
the behaviour of special methods.

However, we also want to offer a compelling replacement for caching
things in C level static variables.
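
For reference, the kind of static-variable caching I mean looks
roughly like this (using the _csv Error example from earlier in the
thread) - the cached pointer is shared by every interpreter in the
process, which is precisely the problem:

    #include <Python.h>

    /* Cached in a C static on first use, so all subinterpreters end
     * up sharing the one pointer */
    static PyObject *cached_error = NULL;

    static int
    set_csv_error(const char *msg)
    {
        if (cached_error == NULL) {
            PyObject *module = PyImport_ImportModule("_csv");
            if (module == NULL) {
                return -1;
            }
            cached_error = PyObject_GetAttrString(module, "Error");
            Py_DECREF(module);
            if (cached_error == NULL) {
                return -1;
            }
        }
        PyErr_SetString(cached_error, msg);
        return -1;
    }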

Eric's suggestion made me realise there may be a better way to go
about addressing the *performance*-related aspects of this problem:
what if, instead of focusing on providing fast access to the Python
level module object, we focused on providing an indexed *PyObject
pointer cache* on type instances for use by extension module method
implementations? A "__typeslots__", as it were?

If __typeslots__ were defined, it would be a tuple of slot field names.
At the Python level, the named fields would become descriptors
accessing indexed slots in an underlying C level array of PyObject
pointers (unlike instance slots, this array could be allocated
separately from the type object, since type objects are already quite
sprawling beasts, but don't generally exist in the kinds of numbers
that instances do).
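
To make the idea a bit more concrete, a very rough sketch of the data
layout and indexed accessors follows (none of this exists today - the
struct and function names are invented purely for illustration):

    #include <Python.h>

    /* Hypothetical layout: a heap type would carry a pointer to a
     * separately allocated cache, with one entry per name listed in
     * __typeslots__ */
    typedef struct {
        Py_ssize_t count;     /* len(__typeslots__) */
        PyObject **entries;   /* owned references, NULL until populated */
    } typeslot_cache;

    /* Indexed read access - the Python level descriptors for the
     * named fields would end up calling something like this */
    static PyObject *
    typeslot_get(typeslot_cache *cache, Py_ssize_t index)
    {
        if (index < 0 || index >= cache->count ||
                cache->entries[index] == NULL) {
            return NULL;
        }
        Py_INCREF(cache->entries[index]);
        return cache->entries[index];
    }

    /* Indexed write access - stores a new owned reference */
    static int
    typeslot_set(typeslot_cache *cache, Py_ssize_t index, PyObject *value)
    {
        if (index < 0 || index >= cache->count) {
            return -1;
        }
        PyObject *old = cache->entries[index];
        Py_INCREF(value);
        cache->entries[index] = value;
        Py_XDECREF(old);
        return 0;
    }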

Then, if a particular extension module needed fast access to the
module object, or to any module attribute (and didn't need to worry
about lazy lookup), it could cache that object in a typeslot on first
use and thereafter look it up by index.

That way, we could walk the type hierarchy to find the defining class
on first lookup, and then cache the result on the derived type when done.
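
Putting those pieces together, the first-use slow path might look
something like the following (PyType_GetTypeslot(),
PyType_SetTypeslot(), PyType_DefiningClass() and
PyType_DefiningModule() are all placeholder names here - the first two
would just be type-level wrappers around the indexed cache sketched
above):

    #include <Python.h>

    /* Hypothetical helpers - placeholder names only */
    PyObject *PyType_GetTypeslot(PyTypeObject *type, Py_ssize_t index);
    int PyType_SetTypeslot(PyTypeObject *type, Py_ssize_t index,
                           PyObject *value);
    PyTypeObject *PyType_DefiningClass(PyTypeObject *type, int slot_id);
    PyObject *PyType_DefiningModule(PyTypeObject *cls);  /* borrowed */

    /* Index of "Error" in this type's hypothetical __typeslots__ */
    #define CSV_ERROR_TYPESLOT 0

    static PyObject *
    lookup_csv_error(PyObject *self, int slot_id)
    {
        PyTypeObject *type = Py_TYPE(self);
        /* Fast path: the (possibly derived) type already has the
         * object cached at a known index */
        PyObject *error = PyType_GetTypeslot(type, CSV_ERROR_TYPESLOT);
        if (error != NULL) {
            return error;
        }
        /* Slow path, first use only: walk back to the defining class,
         * pull the attribute out of the defining module, and cache it
         * on the derived type so later lookups are an indexed read */
        PyTypeObject *cls = PyType_DefiningClass(type, slot_id);
        if (cls == NULL) {
            return NULL;
        }
        PyObject *module = PyType_DefiningModule(cls);
        if (module == NULL) {
            return NULL;
        }
        error = PyObject_GetAttrString(module, "Error");
        if (error == NULL) {
            return NULL;
        }
        if (PyType_SetTypeslot(type, CSV_ERROR_TYPESLOT, error) < 0) {
            Py_DECREF(error);
            return NULL;
        }
        return error;
    }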

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

