[Cython] Utilities, cython.h, libcython

Thu Oct 6 11:45:55 CEST 2011

On 6 October 2011 01:05, Robert Bradshaw <robertwb at math.washington.edu> wrote:
> On Wednesday, October 5, 2011, mark florisson wrote:
>>
>> On 5 October 2011 01:46, Robert Bradshaw <robertwb at math.washington.edu>
>> wrote:
>> > On Tue, Oct 4, 2011 at 2:19 PM, mark florisson
>> > <markflorisson88 at gmail.com> wrote:
>> >> Hey,
>> >>
>> >> I briefly mentioned something about this in a pull request, but maybe
>> >> it deserves some actual discussion on the ML.
>> >>
>> >> So I propose that after fused types get merged we try to move as many
>> >> utility codes as possible to their utility code files (unless they are
>> >> used in pending pull requests or other branches). Preferably this will
>> >> be done in one or a few commits. How should we split up the work, any
>> >> volunteers? Perhaps people who wrote certain utilities also want to
>> >> move them? In that case, we should start a new branch and then merge
>> >> that into master when it's done.
>> >> We could actually move things before fused types get merged, as long
>> >> as we don't touch binding_cfunc_utility_code.
>> >
>> > +1 to moving towards this, but I don't see the urgency or need to do
>> > it all at once (though if there's going to be a big push, let's
>> > coordinate on a wiki or trac).
>>
>> Hm, perhaps there is no strict need to hurry, as long as we take care
>> not to modify utilities after they have been moved. The wiki could be
>> great for that, but I personally don't keep track of everyone's
>> branches, so I don't know which utility is modified by whom (if at
>> all), so strictly speaking (to avoid painful merges) I'd have to ask
>> everyone each time I wanted to move something, or dig through
>> everyone's branches.
>
> I was proposing that everyone lists the utility code sections that are
> likely to cause merge conflicts on a wiki page, and the rest are fair game.

Ah ok, that sounds good.

>>
>> >> Before we go there, Stefan, do we still want to implement the header
>> >> .ini style which can list dependencies and such? I personally don't
>> >> care very much about it, but memoryviews and the utility loaders are
>> >> merged so if someone wants to take up that job, it'd be good to do
>> >> before moving the utilities.
>> >>
>> >> Another issue is that Cython compile time is increasing with the
>> >> addition of control flow and cython utilities. If you use fused types
>> >> you're also going to combinatorially add more compile time.
>> >
>> > Yeah, this was especially obvious with, e.g. cython.compile(...). (In
>> > particular, some utility code was being parsed before it could even
>> > figure out whether it needed to do a full re-compile...)
>> >
>> >> I'm sure
>> >> this came up earlier, but I really think we should have a libcython
>> >> and a cython.h. libcython (a shared library) should contain any common
>> >> Cython-specific code not meant to be inlined, and cython.h any types,
>> >> macros and inline functions etc. This will decrease Cython and C
>> >> compile time, and will also make executables smaller.
>> >
>> > +1. Yes, we talked about this earlier, but nothing concrete was
>> > planned. It's probably worth a CEP, if anything to have a concrete
>> > plan recorded somewhere other than a series of mailing list threads
>> > (though discussion tends to work best here).
>> >
>> >> This could be
>> >> enabled using a command line option to Cython, as well as with
>> >> distutils; eventually we may decide to make it the default (let's
>> >> figure that out later). Preferably libcython.so would be installed
>> >> alongside libpython.so and cython.h inside the Python include
>> >> directory. Assuming multiple versions of Cython and multiple Python
>> >> installations, we'd need to come up with a versioning scheme for
>> >> either.
>> >
>> > I would propose a cython.h file that sits in Cython/Compiler/Include
>> > (or similar), as a first step. The .pyx -> .c pass could be configured
>> > to copy this to a specific location (for shipping just the generated
>> > .c files).
>>
>> That would be fine as well. It might be convenient for users in that
>> case if we could provide a cython.get_include() in addition to the
>> distutils hooks, and a cython-config script.
>
> For sure. We could also have a cython.get_shared_library() (common_code?
> cython_module?) which would return an Extension object to build.
>
>>
>> > One option is to build the shared library as a companion
>> > _cython_x_y_z.so module which, while not as efficient as linking at
>> > the C level, would probably be much simpler to implement in a
>> > cross-platform way. (This perhaps merits some benchmarks, but the main
>> > contents are likely to be things like shared classes and objects.)
>> > Actually linking .so files from modules that cimport each other would
>> > be a nice feature down the road anyways. Again, the associated .c file
>> > could be (optionally) generated/copied during the .pyx -> .c step.
>> > Installation would determine if the required module exists, and if not
>> > build and install it.
>>
>> Hm, that's a really good idea. I think the only overhead would be the
>> capsule unpacking and pointer duplication, but that shouldn't suddenly
>> be an issue. That means we don't have to do any versioning of the
>> libraries and the symbols to avoid clashes in a flat namespace as
>> Stefan mentioned.
>
> I'm not sure what the overhead is, if any, in calling function pointers vs.
> actually linking things together at the C level (which is essentially the
> same idea, but perhaps addresses are resolved at library load time rather
> than requiring a dereference on each call?)

I think there isn't any difference between dynamic linking and calling
through a pointer. My understanding (of ELF shared libraries) is that the
procedure linkage table will contain the actual address of the symbol
(likely only after the first call to it has been made; until then it may
have a stub that resolves the symbol and replaces its own address with the
actual address), which to me sounds like the same thing as a pointer.
I think only static linking can prevent this, i.e. directly encode the
static address into the call opcode, but I'm not an expert.
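
To make the "same thing as a pointer" point concrete, here is a minimal
sketch of the capsule route in Python, using ctypes only so that it runs
without a C compiler. It is an illustration under assumptions, not what
Cython generates: the helper _add, the prototype ADD_FUNC and the name
CAPSULE_NAME are made up for the example. The provider wraps a C function
address in a capsule; the consumer unpacks it once and afterwards calls
through its own copy of the pointer.

```python
import ctypes

# Hypothetical C-level signature of a shared helper: int add(int, int)
ADD_FUNC = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def _add(a, b):
    # Stand-in for a utility function that would live in the shared module.
    return a + b


# Provider side: a C-callable function pointer. The reference must stay
# alive for as long as anything may call through the address.
add_callback = ADD_FUNC(_add)

# Declare the two CPython capsule functions used below.
PyCapsule_New = ctypes.pythonapi.PyCapsule_New
PyCapsule_New.restype = ctypes.py_object
PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer
PyCapsule_GetPointer.restype = ctypes.c_void_p
PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

# The capsule stores this name by pointer, not by copy, so keep it alive.
CAPSULE_NAME = b"int (int, int)"

# Wrap the raw address in a capsule (no destructor).
address = ctypes.cast(add_callback, ctypes.c_void_p).value
capsule = PyCapsule_New(address, CAPSULE_NAME, None)

# Consumer side: unpack once (e.g. at import time) and keep a private copy
# of the pointer. This is the "pointer duplication" mentioned above.
unpacked = PyCapsule_GetPointer(capsule, CAPSULE_NAME)
add_ptr = ADD_FUNC(unpacked)

# Every later call is just an indirect call through the stored address.
result = add_ptr(2, 3)
print(result)
```

The unpacking cost is paid once; after that each call is an indirect call
through a stored address, much like a call through an already resolved
PLT entry.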

>>
>> >> We could also provide a static library there, for users who want to
>> >> link and ship a compiled and statically linked version of their code.
>> >> For a local Cython that isn't built, we can ignore the header and
>> >> shared library option and issue a warning or some such.
>> >>
>> >> Lastly, I think we also should figure out a way to serialize Entry
>> >> objects from CythonUtilities, which could easily and swiftly be loaded
>> >> when creating the cython scope. It's quite a pain to declare all
>> >> entries for utilities you write manually, so what I mostly did was
>> >> parse the utility up to and including AnalyseDeclarationsTransform,
>> >> and then retrieve the entries from there.
>> >
>> > This would be really nice too. Way back in the day I did some work
>> > with trying to pickle full module scopes, but that soon became too
>> > painful as there are so many far-reaching references. Pickling
>> > individual Entries and re-building modules will probably be a more
>> > tractable goal. Eventually, I'd like to see a way to cache the full
>> > pxd pipeline.
>> >
>> > - Robert
>> > _______________________________________________
>> > cython-devel mailing list
>> > cython-devel at python.org
>> > http://mail.python.org/mailman/listinfo/cython-devel