[Cython] Gsoc project

Dag Sverre Seljebotn d.s.seljebotn at astro.uio.no
Wed Mar 28 05:08:41 CEST 2012


On 03/27/2012 08:05 PM, Dag Sverre Seljebotn wrote:
> On 03/27/2012 02:17 PM, Philip Herron wrote:
>> Hey
>>
>> I got linked to your idea
>> http://groups.google.com/group/cython-users/browse_thread/thread/cb8aa58083173b97/cac3cf12d438b122?show_docid=cac3cf12d438b122&pli=1
>>
>> by David Malcolm on his plugin mailing list.
>>
>> I am looking to apply to GSoC once again this year. I did GSoC in
>> 2010 and 2011 on GCC, implementing my own GCC front-end for Python,
>> which is still in very early stages since it's a huge task. But I am
>> tempted to apply to this project to implement a more self-contained
>> project and give back to the community more promptly, while hacking
>> on my own front-end on my own time. And I think it would benefit me
>> to understand in more detail different aspects of Python, which
>> is what I need, and I would gain a lot of experience from it.
>
> Excellent! After talking to lots of people at PyCon about Cython, it is
> obvious that auto-generation of pxd files is *the* most missed feature
> in Cython today. If you do this, lots of Cython users will be very
> grateful.
>
>>
>> I was wondering if you could give me some more details on how this
>> could all work. I am not 100% familiar with Cython, but I think I
>> understand it to a good extent from playing with it for most of my
>> evening. I just want to make sure I understand the basic use case of
>> this fully. A user could have something like:
>>
>> -header foo.h
>>
>> extern int add (int, int);
>>
>> -source foo.c
>>
>> #include "foo.h"
>>
>> int add (int x, int y)
>> {
>>     return x+y;
>> }
>>
>> We use the plugin to go over the decls created and create a pxd file
>> like:
>>
>> cdef int add (int a, int b):
>>     return a + b
>>
>> Although this is a really basic example, I just want to make sure I
>> understand what's going on. Maybe some more of you have input? I guess
>> this would be best suited as a proposal for Python rather than GCC?
>
> This isn't quite what should be done. Cython generates C code that
> includes C header files; what the pxd files are needed for is to provide
> declarations for Cython about what is available on the C side (during
> the Cython->C translation/compilation).
>
> So: "foo.c" is irrelevant to Cython. And, foo.h should turn into foo.pxd
> like this:
>
> cdef extern from "foo.h":
>     int add(int, int)
>
> Let us know if you have any questions; you may want to look at examples
> for using Cython to wrap C code, such as
>
> https://github.com/zeromq/pyzmq/blob/master/zmq/core/libzmq.pxd
>
> and the rest of the pyzmq code.
>
> Moving over to the idea of making this a GSoC:
>
> First, we have a policy of requiring patches from prospective students
> in addition to their application. Often, this has been to fix a bug or
> two in Cython. However, given that pxd generation can be done without
> much digging into Cython itself, I think that something like a crude
> prototype of the pxd generator (supporting only a subset of C) would be
> a better fit (other devs, what do you think?)
>
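To make the "crude prototype" idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: it uses a regular expression instead of a gcc plugin, it understands nothing but one-line function prototypes, and the names `PROTOTYPE` and `header_to_pxd` are made up for this example.

```python
import re

# Assumption: only one-line prototypes such as "extern int add (int, int);"
# are recognised. Structs, typedefs, macros and multi-line declarations are
# silently skipped. A real generator would get declarations from the compiler.
PROTOTYPE = re.compile(
    r"^\s*(?:extern\s+)?(.+?)\b(\w+)\s*\(([^()]*)\)\s*;\s*$"
)


def header_to_pxd(header_name, header_text):
    """Return pxd text declaring the simple prototypes found in header_text."""
    lines = [f'cdef extern from "{header_name}":']
    for raw in header_text.splitlines():
        match = PROTOTYPE.match(raw)
        if match is None:
            continue  # not a plain prototype; ignored in this sketch
        return_type, name, args = (part.strip() for part in match.groups())
        args = ", ".join(arg.strip() for arg in args.split(","))
        if args == "void":
            args = ""  # C's f(void) is written f() in a pxd file
        lines.append(f"    {return_type} {name}({args})")
    if len(lines) == 1:
        lines.append("    pass")  # keep the extern block syntactically valid
    return "\n".join(lines) + "\n"


print(header_to_pxd("foo.h", "extern int add (int, int);\n"), end="")
# cdef extern from "foo.h":
#     int add(int, int)
```

Feeding it the foo.h from earlier in this thread gives the foo.pxd shown above; anything more complicated than that is exactly what the real project would have to handle.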
> The project should contain at least:
>
> - The wrapper generator itself
> - Tests for it (including the task of figuring out how to test this,
> possibly both unit tests and integration tests)
> - A strategy for testing it for all relevant versions of gcc; one should
> probably set up Jenkins jobs for it
>
> Even then, I feel that this is rather small for a full GSoC. Even when
> supporting the subset of C++ supported by Cython, I would estimate a
> month or so (and GSoC is two months). So it should be extended in one
> direction or another. Some ideas:

I should stress that even if you only include the above in the proposal,
it would definitely still get consideration. It may well be better to go
slowly and create something rock solid than to have lots of bells and
whistles.

It is also possible to label the above as the core features, and whatever
you decide on in addition as "optional bonus goals" in your proposal.

Dag

>
> - Very often one is not interested in the full header file. One really
> wants "the API", not a translation of the C header. This probably
> requires a) some heuristics, and b) a way to write, as easily as
> possible, selectors/configuration for what should be included and what
> should not. Making that end-user-friendly is perhaps a challenge, I'm
> not sure.
>
> One idea here is to make possible an interplay where you look at the pyx
> file to see what needs to be wrapped. I.e. you first try to use a function in
> the pyx file as if it had already been declared, then run the pxd
> generator feeding in the pyx files (and .h files), and out comes the
> required pxd file bridging the two (containing only the used subset).
>
> - Support using clang, in addition to gcc, to parse C code
>
> - There's a problem in that an often-used Cython approach is:
>
> 1) Generate C file from pyx and pxd files
> 2) Ship to other computers
> 3) Compile C file
>
> However, this is fragile when combined with auto-generated pxd files,
> because the resulting pxd may be different depending on whether -DFOO is
> given to gcc or not.
>
> The above 3 steps are possible because Cython often does not care about
> the exact type of something, just basic type and signedness. So if you do
>
> cdef extern from "foo.h":
>     ctypedef int sometype_t
>
> then sometype_t can actually be a short or a char, and Cython doesn't
> care. (Similarly, not all fields of a struct need to be exposed, only
> the ones that form part of the API.)
>
> However, I'm not sure if the quality of an auto-generated pxd file is
> good enough for this approach.
>
> So either a) the wrapper generator and Cython must be plugged into the
> typical setup.py build, or b) one figures out something clever (or,
> likely, more than one clever thing) which allows one to continue using
> the above workflow.
>
> Either a) or b), or both, could be part of the project. a) essentially
> requires looking at Cython.Distutils. For b), it *may* involve hooking
> into gcc *before* the preprocessor is run and taking into account #ifdef
> etc., if that is even possible, and new features in Cython for specifying
> in a pxd file that "there's an #ifdef here", to see if that can somehow
> result in intelligently generated C code.
>
> PS. I should stress that a pxd generator is *very* useful -- because it
> would do 90% of the job, and even if humans need to do the last 10% it
> is still a major timesaver.
>
> - More straightforward than the above: Parse Fortran through the
> gfortran GCC frontend. The Fwrap program
> (https://github.com/fwrap/fwrap) has been dormant in terms of
> development for the past couple of years, but is still the most
> promising way of bringing Fortran and Cython together.
>
> Part of Fwrap's problem is the existing parser. Switching to gfortran
> as the parser would be spectacular, and would probably revive the
> project. It has a solid test suite, so one would basically replace the
> parser component of Fwrap, make sure the test suite passes, and that
> would be it.
>
> (Of course, few people outside the scientific community care anything
> about Fortran.)
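On the pyx-driven idea in the first bullet above (emit only the declarations the pyx file actually uses), the selection step could be sketched roughly like this in Python. Everything here is an assumption made for illustration: the function names `used_subset` and `render_pxd` are invented, the declarations are given as a ready-made dict, and "used" is approximated by scanning the pyx text for identifiers rather than by asking Cython's parser.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def used_subset(declarations, pyx_source):
    """Keep only declarations whose name occurs as an identifier in the pyx.

    declarations maps a C name to its pxd line. Caveat of this sketch: a
    name that appears only in a comment or a string still counts as used.
    """
    used_names = set(IDENTIFIER.findall(pyx_source))
    return {name: decl for name, decl in declarations.items()
            if name in used_names}


def render_pxd(header_name, declarations):
    """Format the selected declarations as a cdef extern block."""
    body = [f"    {decl}" for _, decl in sorted(declarations.items())]
    if not body:
        body = ["    pass"]
    return "\n".join([f'cdef extern from "{header_name}":'] + body) + "\n"


# Hypothetical input: what a header parser might have produced for foo.h.
declarations = {
    "add": "int add(int, int)",
    "sub": "int sub(int, int)",
}
pyx_source = "def py_add(a, b):\n    return add(a, b)\n"

print(render_pxd("foo.h", used_subset(declarations, pyx_source)), end="")
# cdef extern from "foo.h":
#     int add(int, int)
```

Only `add` survives, because `py_add` is a single identifier and `sub` never appears. A real implementation would presumably take the set of undeclared names from Cython itself.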
>
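On the -DFOO fragility from the third bullet above, a toy demonstration in Python may help show why the generated pxd depends on the build flags. This is a sketch under stated assumptions: `preprocess` is a made-up stand-in that handles only #ifdef, #ifndef, #else and #endif, which is far less than a real C preprocessor does, and the header content is invented.

```python
def preprocess(header_text, defines=()):
    """Return the header lines that survive #ifdef/#ifndef/#else/#endif.

    Toy model only: no #if expressions, no #define, no macro expansion.
    """
    defined = set(defines)
    active = []  # one flag per open conditional: is the current branch kept?
    kept = []
    for line in header_text.splitlines():
        words = line.split()
        directive = words[0] if words else ""
        if directive in ("#ifdef", "#ifndef"):
            is_defined = words[1] in defined
            active.append(is_defined if directive == "#ifdef" else not is_defined)
        elif directive == "#else":
            active[-1] = not active[-1]
        elif directive == "#endif":
            active.pop()
        elif all(active):
            kept.append(line)  # kept only if every enclosing branch is active
    return kept


# Hypothetical header: add_checked exists only when FOO is defined.
HEADER = """\
int add(int, int);
#ifdef FOO
int add_checked(int, int, int *);
#endif
"""

print(preprocess(HEADER))
# ['int add(int, int);']
print(preprocess(HEADER, defines=["FOO"]))
# ['int add(int, int);', 'int add_checked(int, int, int *);']
```

A pxd generated on a machine that passes -DFOO would declare `add_checked`, and C code generated from it could then fail to compile on a machine that does not.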
> Those are some ideas. Remember: This is *your* project, so make sure you
> focus on features you'd find fun to play with and implement. And do NOT
> take all of the above, that's way too much :-), just find one or two
> extra features that help make the GSoC application really appealing.
>
> Dag