[Cython] Speedup module-level lookup

Stefan Behnel stefan_ml at behnel.de
Sat Jan 21 07:00:00 CET 2012


Vitja Makarov, 19.01.2012 08:49:
> 2012/1/19 Robert Bradshaw:
>> On Wed, Jan 18, 2012 at 12:30 PM, Vitja Makarov wrote:
>>> I tried to optimize module lookups (__pyx_m) by caching internal PyDict state.
>>>
>>> In this example bar() is 1.6 times faster (500us against 842us):
>>>
>>> C = 123
>>> def foo(a):
>>>     return C * a
>>>
>>> def bar():
>>>     for i in range(10000):
>>>         foo(i)
>>>
>>> Here is proof of concept:
>>> https://github.com/vitek/cython/commit/1d134fe54a74e6fc6d39d09973db499680b2a8d9
>>>
>>> So the question is: is it worth it?
>>
>> I think the right thing to do here is make all module-level globals
>> into "cdef public" attributes, i.e. C globals with getters and setters
>> for Python space. I'm not sure whether this would best be done by
>> creating a custom dict or module subclass, but it would probably be
>> cleaner and afford much more than a 1.6x speedup.
> 
> Yes, nice idea.
> It's possible to subclass PyModuleObject and I didn't find any use of
> PyModule_CheckExact() in CPython's sources:
> 
> import types
> import sys
> 
> global_foo = 1234
> 
> class CustomModule(types.ModuleType):
>     def __init__(self, name):
>         types.ModuleType.__init__(self, name)
>         sys.modules[name] = self
> 
>     @property
>     def foo(self):
>         return global_foo
> 
>     @foo.setter
>     def foo(self, value):
>         global global_foo
>         global_foo = value
> 
> CustomModule('foo')
> 
> import foo
> print foo.foo

The one thing I don't currently see is how to get the module subtype
instantiated in a safe and portable way.

The normal way to create the module in Python 2.x is a call to
Py_InitModule*(), which internally does a PyImport_AddModule(). We may get
away with creating and registering the module object before calling into
Py_InitModule*(), so that PyImport_AddModule() finds it there. At least,
the internal checks on modules seem to use PyModule_Check() and not
PyModule_CheckExact(), so someone seems to have already thought about this.
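
A rough Python-level sketch of that pre-registration idea, not the C code
itself. The names GlobalsModule and prereg_demo are made up for
illustration, and isinstance()/type() only stand in for what
PyModule_Check() and PyModule_CheckExact() test at the C level:

```python
import sys
import types


class GlobalsModule(types.ModuleType):
    """Hypothetical module subtype, standing in for a generated one."""


# Create and register the instance before anything looks the module up,
# which is the Python-level analogue of registering it before the call
# into Py_InitModule*().
mod = GlobalsModule("prereg_demo")
sys.modules["prereg_demo"] = mod

# The lookup finds the existing entry, so no new module object is created.
import prereg_demo

same_object = prereg_demo is mod
# Subtype-tolerant check, comparable to PyModule_Check().
passes_check = isinstance(prereg_demo, types.ModuleType)
# Exact-type check, comparable to PyModule_CheckExact().
passes_exact = type(prereg_demo) is types.ModuleType
```

Here same_object and passes_check come out true while passes_exact is
false, which is why code that insisted on the exact type would reject
such a subtype.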

In Python 3.x, the situation is different. There is no lookup involved and
the module is always newly instantiated. That may mean that we have to copy
the module creation code into Cython. But that doesn't look like a huge
drawback (except for compatibility with potential future changes), because we
already do most of the module initialisation ourselves anyway, especially
now that we have CyFunction.
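
To illustrate what "doing the module creation ourselves" amounts to, here
is a minimal Python-level sketch. It is only an analogy for the C code
Cython would have to carry; create_module(), GlobalsModule and the module
name selfmade_demo are hypothetical:

```python
import sys
import types


class GlobalsModule(types.ModuleType):
    """Hypothetical module subtype, standing in for a generated one."""


def create_module(name, init_source, doc=None):
    """Build, initialise and register a module without the import machinery."""
    module = GlobalsModule(name, doc)
    # Register before running the init code, as a regular import would.
    sys.modules[name] = module
    try:
        # Run the "module init" body in the new module's namespace.
        exec(init_source, module.__dict__)
    except BaseException:
        # Do not leave a half-initialised module behind.
        del sys.modules[name]
        raise
    return module


demo = create_module(
    "selfmade_demo",
    "C = 123\ndef foo(a):\n    return C * a\n",
)
```

The point of the sketch is that instantiation, registration and
initialisation are three separate steps, and only the first one needs to
change in order to use a subtype.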

I'm starting to feel a bit like Linus Torvalds when he broke his Minix
installation and went: "ok, what else do I need to add to this terminal
emulator in order to make it an operating system?"

Stefan


