[Cython] Multiple modules in one compilation unit

Vitja Makarov vitja.makarov at gmail.com
Thu Mar 3 10:48:19 CET 2011


2011/3/3 mark florisson <markflorisson88 at gmail.com>:
> On 3 March 2011 07:43, Stefan Behnel <stefan_ml at behnel.de> wrote:
>> Lisandro Dalcin, 03.03.2011 05:38:
>>>
>>> On 2 March 2011 21:01, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
>>>>
>>>> Stefan Behnel wrote:
>>>>>
>>>>> you'd call "cython" on a package and it would output a directory with a
>>>>> single __init__.so that contains the modules compiled from all .pyx/.py
>>>>> files in that package. Importing the package would then trigger an
>>>>> import of that __init__.so, which in turn will execute code in its
>>>>> init__init__() function to register the other modules.
>>>>
>>>> I don't think it even has to be a directory with an __init__,
>>>> it could just be an ordinary .so file with the name of the
>>>> package.
>>>>
>>>> I just tried an experiment in Python:
>>>>
>>>> # onefilepackage.py
>>>> import new, sys
>>>> blarg = new.module("blarg")
>>>> blarg.thing = "This is the thing"
>>>> sys.modules["onefilepackage.blarg"] = blarg
>>>>
>>>> and two different ways of importing it:
>>>>
>>>> >>> from onefilepackage import blarg
>>>> >>> blarg
>>>> <module 'blarg' (built-in)>
>>>> >>> blarg.thing
>>>> 'This is the thing'
>>>>
>>>> >>> import onefilepackage.blarg
>>>> >>> onefilepackage.blarg.thing
>>>> 'This is the thing'
>>>>
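
For what it's worth, the `new` module used in the experiment above is gone in Python 3, so here is a rough sketch of the same idea with types.ModuleType. The names register_fake_package and onefilepackage_demo are made up for illustration. Setting __path__ and the attribute on the parent is my assumption about what keeps both import forms working. This is not what Cython generates.

```python
import importlib
import sys
import types


def register_fake_package(package_name, submodules):
    """Create a package-like module plus submodules directly in sys.modules.

    `submodules` maps a short module name to a dict of attributes.
    (Hypothetical helper, for illustration only.)
    """
    package = types.ModuleType(package_name)
    # An empty __path__ marks the module object as a package.
    package.__path__ = []
    sys.modules[package_name] = package

    for short_name, attributes in submodules.items():
        full_name = "%s.%s" % (package_name, short_name)
        module = types.ModuleType(full_name)
        module.__dict__.update(attributes)
        # Register under the dotted name so "import pkg.sub" finds it ...
        sys.modules[full_name] = module
        # ... and as an attribute so "pkg.sub" and "from pkg import sub" work.
        setattr(package, short_name, module)
    return package


register_fake_package(
    "onefilepackage_demo", {"blarg": {"thing": "This is the thing"}}
)

blarg = importlib.import_module("onefilepackage_demo.blarg")
print(blarg.thing)  # prints: This is the thing
```

A single .so could presumably do the equivalent from its init function, which is the part that still needs the C-level work discussed below.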
>>>
>>> I'm hacking along these lines. However, I'm working to maintain
>>> different modules in different C compilation units, in order to
>>> work around the obvious issue with duplicated global C symbols.
>>
>> That should be ok as a first attempt to get it working quickly. I'd still
>> like to see the modules merged in the long term in order to increase the
>> benefit of the more compact format. They'd all share the same code generator
>> and Cython's C internals, C helper functions, constants, builtins, etc., but
>> each of them would use a separate (name mangling) scope to keep the visible
>> C names separate.
>
> I was thinking that perhaps we could share declarations common to all
> cython modules (compiled with that version of Cython) in a cython.h
> header file (which is also imho nicer to maintain than a code.put() in
> e.g. ModuleNode.generate_module_preamble), and put it in e.g.
> Cython/Includes and set the -I C compiler flag to point to the
> Includes directory.
> Module-specific functions would still be declared static, of course.
> And if users want to ship generated C files to avoid Cython as a
> dependency, they could simply ship the header and adjust their
> setup.py.
>
> If you want to merge modules and the "package-approach" is chosen, it
> should have well-defined semantics for in-place builds, and
> package/__init__.py is preferred over package.so. So how would you
> solve that problem without either placing package.so in the package
> itself, or giving it another name (and perhaps star-importing it from
> __init__.py)? Basically, if people want to combine several modules
> into one they could use the 'include' statement.
>
> e.g. in spam.pyx you 'include "ham.pyx"' and in spam.pxd you 'include
> "ham.pxd"'.
>
> (although you'd probably rename ham.pyx to ham.pxi, and you'd probably
> merge spam.pxd with ham.pxd)
>
> In any case, I'm just wondering, would this functionality be more
> useful than our current include statement and a cython.h header that
> is shared by default?
>
>>>> So assuming the same thing works with a .so instead of a .py,
>>>> all you need to do is emit a .so whose init function stuffs
>>>> appropriate entries into sys.modules to make it look like
>>>> a package.
>>>>
>>>
>>> However, the import machinery does not treat .so the same as .py.
>>> For example, I have a problem with Python 3. For .py modules, before
>>> the module code starts to execute, the matching entry in sys.modules
>>> is already there.
>>
>> And it has to be, in order to prevent accidental reimports.
>>
>>
>>> The same happens in Python 2 for .so modules, as
>>> Py_InitModule() adds the entry to sys.modules early. However, in Python
>>> 3 that is not the case (only for .so modules; .py modules behave as in
>>> Py2): the import machinery adds the entry later, after the module init
>>> function has finished. I'm tempted to work around
>>> this by setting the entry in sys.modules right after the call to
>>> PyModule_Create() ... What do you think about this? Any potential
>>> issue?
>>
>> No idea. I'd say, read the source and give it a try.
>>
>> Stefan
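
A pure-Python sketch of the ordering Lisandro describes above, i.e. putting the entry into sys.modules before the module body runs. In the C extension this would happen right after PyModule_Create(). Here exec() only stands in for the init function, and the helper name init_module_registered_early is hypothetical.

```python
import sys
import types


def init_module_registered_early(name, source):
    """Run `source` as the body of module `name`, registering it first.

    (Hypothetical helper: it mimics registering the module in sys.modules
    before its initialisation code executes.)
    """
    module = types.ModuleType(name)
    # Register before executing any module code, so that code running
    # during initialisation already sees the entry.
    sys.modules[name] = module
    try:
        exec(compile(source, "<%s>" % name, "exec"), module.__dict__)
    except BaseException:
        # Do not leave a half-initialised module behind on failure.
        sys.modules.pop(name, None)
        raise
    return module


demo_source = "import sys\nseen_during_init = sys.modules.get(__name__)\n"
demo = init_module_registered_early("early_demo", demo_source)
print(demo.seen_during_init is demo)
```

Removing the entry again when initialisation fails is my own addition. Whether the C version needs the same cleanup is something I have not checked.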


Sharing common sources is a good idea. We could also share "code" in
libcython-<version>.so, but then we would have to handle ABI
compatibility problems.

-- 
vitja.

