[Import-SIG] On singleton modules, heap types, and subinterpreters

Thu Jul 30 10:13:49 CEST 2015

On Thu, Jul 30, 2015 at 5:30 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 30 July 2015 at 07:00, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>> On Wed, Jul 29, 2015 at 2:01 PM, Petr Viktorin <encukou at gmail.com> wrote:
>>> On Wed, Jul 29, 2015 at 8:57 PM, Eric Snow <ericsnowcurrently at gmail.com> wrote:
>>>> The slot methods would have to do `type(self).<"global">`, which is
>>>> what any other method would do.  So in the relevant _csv.reader
>>>> methods we would use the equivalent "type(self).Error" where needed.
>>>
>>> We need the class that defines the method. type(self) might return a
>>> subclass of that. So we either need to walk the MRO until the defining
>>> class is found, or use Nick's mechanism and record the defining class
>>> for each relevant special method.
>>
>> If you explicitly bind the module-scoped object to the class that
>> needs it, then the methods of that class can access it.
>
> This subsequent discussion meant I realised that with the reverse
> lookup from slot method -> defining class stored on type objects, and
> a reference to the defining module similarly stored on type objects,
> then we can dispense entirely with my more complex "active module"
> idea - the slot implementation will always have access to the type,
> and could traverse from there to the defining module as needed.
>
> Now, the *reason* we want to enable access to the defining module is
> because we want to minimise the barrier to migrating from singleton
> extension modules to subinterpreter friendly modules, which means
> providing a way to do a fast lookup of the defining module given only
> a slot ID and a type instance.
>
> For full compatibility, we can't get away without offering *some* way
> of doing that - consider cases where folks want to allow rebinding
> (rather than mutation) of module level attributes and have that affect
> the behaviour of special methods.
>
> However, we also want to offer a compelling replacement for caching
> things in C level static variables.
>
> Eric's suggestion made me realise there may be a better way to go
> about addressing those *performance* related aspects of this problem:
> what if instead of focusing on providing fast access to the Python
> level module object, we instead focused on providing an indexed
> *PyObject pointer cache* on type instances for use by extension module
> method implementations? A "__typeslots__" as it were?
>
> If __typeslots__ was defined, it would be a tuple of slot field names.
> The named fields at the Python level would become descriptors
> accessing indexed slots in an underlying C level array of PyObject
> pointers (unlike instance slots, we could allocate this array
> separately from the type object, since type objects are already quite
> sprawling beasts, but don't generally exist in the kinds of numbers
> that instances do).
>
> Then if a particular extension module needs fast access to the module,
> or to any module attribute (and didn't need to worry about lazy
> lookup), then they could cache it in a typeslot on first use, and
> thereafter look it up by index.
>
> That way, we could walk the type hierarchy to find the defining class
> on first look up, and then cache it on the derived type when done.

It sounds like classes with index-based __typeslots__ would be
incompatible with multiple inheritance.
We're trying to help Cython classes be more like Python ones, so I
think that's something to watch out for.
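
To illustrate the concern with an existing analogue: this is a sketch
using ordinary instance-level __slots__, not the proposed
__typeslots__ (which doesn't exist). The class names are made up. In
CPython, two bases that each reserve their own fixed-offset slots
can't be combined, and I'd expect index-based type slots to hit the
same kind of layout conflict unless the indices are remapped per
subclass.

```python
# Analogy only: instance __slots__ stand in for the hypothetical
# index-based __typeslots__. Both bases claim "slot 0" for themselves.
class ReaderBase:
    __slots__ = ("error",)


class WriterBase:
    __slots__ = ("dialect",)


try:
    # CPython cannot lay out both bases' slots in one instance.
    class ReaderWriter(ReaderBase, WriterBase):
        pass
except TypeError as exc:
    conflict = exc
else:
    conflict = None

print(type(conflict).__name__ if conflict else "no conflict")
```

Pure-Python classes without __slots__ don't have this restriction,
which is why it matters for making Cython classes behave like Python
ones.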

That said, it's a good idea. But it's an addition; the original problem
(getting from a slot method to its defining class) still needs to be
solved :(